zomux / deepy

A highly extensible deep learning framework
MIT License

RNN state #5

Closed · ssamot closed 9 years ago

ssamot commented 9 years ago

Hello, is there a way to keep the state (the activations) of the RNN without resetting it, so that it can be used incrementally in an online mode (i.e., train whenever a new example is given, while keeping all the past information)?

zomux commented 9 years ago

Persisting state in an RNN is not a good idea in batch training, as you have to reset some states conditionally.

But anyway, it's easy to do in deepy: you can create a new layer, or just reuse the RNN layer in deepy.

Here is an example:

import theano

from deepy import *

class MyRNNWithPersistHiddenValue(NeuralLayer):

    def __init__(self):
        super(MyRNNWithPersistHiddenValue, self).__init__("MyRNNWithPersistHiddenValue")

    def setup(self):
        self.rnn = RNN(hidden_size=100, input_type="sequence", output_type="sequence").connect(input_dim=self.input_dim)
        self.output_dim = self.rnn.output_dim
        self.register_inner_layers(self.rnn)

        # The persistent hidden state, kept in a shared variable
        self.h0 = self.create_vector(100, name="hidden cell")

        # To manipulate h0, you can register updates like this
        # (this example update zeroes out the negative entries of h0)
        self.register_updates((self.h0, (self.h0 >= 0) * self.h0))

        # Or register callbacks like this, but it's slower
        self.register_training_callbacks(self.iter_callback)
        self.register_testing_callbacks(self.iter_callback)

    def iter_callback(self):
        # "something" stands in for a real reset condition
        # (e.g. a sequence boundary); as written it is always true
        if "something":
            self.h0.set_value(self.h0.get_value(borrow=True) * 0)

    def output(self, x):
        # Swap dims of x
        # batch, time, data ---> time, batch, data
        seq = x.dimshuffle((1,0,2))
        # The step function of the RNN receives (sequence, hidden) as
        # input and outputs the updated hidden state
        hiddens, _ = theano.scan(self.rnn.step,
                                 sequences=[seq],
                                 outputs_info=[self.h0])

        # Restore dimensions
        return hiddens.dimshuffle((1,0,2))
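
For reference, the same pattern can be written in plain Theano, outside the deepy wrapper: the persistent state lives in a shared variable, and theano.scan threads it through the sequence. This is a minimal runnable sketch, not deepy's internals; the weights, sizes, and step function are all illustrative.

import numpy as np
import theano
import theano.tensor as T

floatX = theano.config.floatX
input_size, hidden_size = 20, 100
rng = np.random.RandomState(3)

# Illustrative weights for a vanilla RNN transition
W_in = theano.shared(rng.uniform(-0.1, 0.1, (input_size, hidden_size)).astype(floatX))
W_rec = theano.shared(rng.uniform(-0.1, 0.1, (hidden_size, hidden_size)).astype(floatX))

# The persistent state lives in a shared variable, playing the role of h0 above
h0 = theano.shared(np.zeros(hidden_size, dtype=floatX), name="h0")

x = T.matrix("x")  # a single sequence: (time, data)

def step(x_t, h_prev):
    # One vanilla RNN transition
    return T.tanh(T.dot(x_t, W_in) + T.dot(h_prev, W_rec))

hiddens, _ = theano.scan(step, sequences=[x], outputs_info=[h0])
f = theano.function([x], hiddens)
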
ssamot commented 9 years ago

But unless you update self.h0 with the hiddens, it will always keep the initial values, right? Where is this done?
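
Continuing the plain-Theano sketch above: the write-back is indeed not part of the example. One way it could be added, as a sketch, is an update pair that stores scan's final hidden state back into h0 whenever the compiled function runs:

hiddens, _ = theano.scan(step, sequences=[x], outputs_info=[h0])
# The update pair writes the final hidden state back into h0, so each
# call to f_stateful resumes from where the previous call stopped.
f_stateful = theano.function([x], hiddens, updates=[(h0, hiddens[-1])])

# Independent sequences still need an explicit reset at the boundary:
# h0.set_value(np.zeros(hidden_size, dtype=floatX))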

zomux commented 9 years ago

Well, implementing a language model in the fashion of RNNLM in Theano is actually not easy. I will dig into this issue in the coming days, and hopefully provide an example of RNN-based language model training.

ssamot commented 9 years ago

Thanks! I think state is needed for a lot of reasons: it will make life easier in online learning as well. It might also prove useful if one wants to train on sequences of uneven length without padding.
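
Continuing the same plain-Theano sketch, a rough illustration of how persistent state sidesteps padding: sequences of different lengths are fed one at a time, and the state is reset explicitly at each sequence boundary. Lengths and data here are made up.

for seq_len in (7, 3, 12):
    seq_data = rng.randn(seq_len, input_size).astype(floatX)
    f_stateful(seq_data)  # the state is carried forward across calls
    h0.set_value(np.zeros(hidden_size, dtype=floatX))  # reset at the boundary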

zomux commented 9 years ago

Well, although some code refactoring remains to be done, you can see https://github.com/uaca/deepy/blob/master/experiments/lm/baseline_rnnlm.py for an example implementation of a vanilla RNN language model.

ssamot commented 9 years ago

Thanks for this, I will look into it.