facebookarchive / caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
https://caffe2.ai
Apache License 2.0

Example LSTM-network for time series binary classification #849

Open GeorgyKonoplich opened 7 years ago

GeorgyKonoplich commented 7 years ago

Hello everyone! I would like to know how to create a model for time series binary classification. I am trying to modify the char_rnn.py example. Here is the code:

    model = model_helper.ModelHelper(name="rnn")

    input_blob, seq_lengths, hidden_init, cell_init, target = \
        model.net.AddExternalInputs(
            'input_blob',
            'seq_lengths',
            'hidden_init',
            'cell_init',
            'target',
        )

    hidden_output_all, self.hidden_output_last, _, self.cell_state = LSTM(
        model, input_blob, seq_lengths, (hidden_init, cell_init),
        3, self.hidden_size, scope="LSTM")
    output = brew.fc(
        model,
        self.hidden_output_last,
        None,
        dim_in=self.hidden_size,
        dim_out=1,
        axis=2
    )

    pred = model.net.Sigmoid(output, 'pred')
    loss = model.net.SigmoidCrossEntropyWithLogits([pred, target], 'loss')
    model.AddGradientOperators([loss])
    build_sgd(
        model,
        base_learning_rate=0.1,
        policy="step",
        stepsize=1,
        gamma=0.9999
    )

But I get an Exception: No gradient registered for RecurrentNetwork. Exception from creating the gradient op: [enforce fail at operator_gradient.h:137] goutput.at(i).IsDense(). Gradient of output LSTM/hidden_t_all is either sparse or not provided.

Any thoughts on how to create the model? Thanks!

akyrola commented 7 years ago

I think the problem is that the LSTM only supports gradients through the hidden_output_all blob. Instead, you are passing the last hidden output to the FC layer and then to the loss.
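One possible way around this is to build the classifier input from hidden_output_all and pick out each sequence's last valid timestep, so the loss depends only on the blob that does receive a gradient. The plain-Python sketch below shows only the indexing logic, not Caffe2 code: the helper name `last_valid_step` is hypothetical, and the `[T][N][D]` (time, batch, hidden) layout is an assumption about how the all-timesteps output is arranged. In a real Caffe2 net the same selection would have to be done with graph operators.

```python
def last_valid_step(hidden_all, seq_lengths):
    """Return, for each sequence in the batch, its hidden vector at the last valid step.

    hidden_all:  nested list shaped [T][N][D] (time, batch, hidden); this layout
                 is an assumption made for illustration.
    seq_lengths: list of N ints, each between 1 and T.
    """
    num_steps = len(hidden_all)
    picked = []
    for n, length in enumerate(seq_lengths):
        if not 1 <= length <= num_steps:
            raise ValueError("sequence %d has invalid length %d" % (n, length))
        # Steps are 0-indexed, so the last valid step is length - 1.
        picked.append(hidden_all[length - 1][n])
    return picked


# Toy data: T=3 timesteps, N=2 sequences, D=2 hidden units.
hidden_all = [
    [[0.1, 0.2], [1.1, 1.2]],
    [[0.3, 0.4], [1.3, 1.4]],
    [[0.5, 0.6], [1.5, 1.6]],
]
seq_lengths = [3, 2]  # the second sequence is padded after step 2

last_hidden = last_valid_step(hidden_all, seq_lengths)
print(last_hidden)  # [[0.5, 0.6], [1.3, 1.4]]
```

When every sequence has the full length T, this reduces to taking the final time slice of hidden_output_all, which should hold the same values as the last hidden output.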