I am filing this primarily for documentation: as mentioned in https://github.com/torch/cunn/issues/233, `nn.Sequencer`'s `:forget()` method (and possibly other methods) does not work properly with `nn.DataParallelTable`, since it only applies to the replica on the first GPU.
I'm not sure what the cleanest way to fix this in the rnn library is. One option is to define a `:forget()` method on `DataParallelTable` that forwards the call to every replica, but the simplest way of doing that would introduce a dependency on `cunn`, which would not be ideal.
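As a user-side stopgap (keeping `cunn` out of the rnn library itself), one could monkey-patch a `:forget()` passthrough onto `nn.DataParallelTable`. This is only a sketch, and it assumes the replicas are reachable through `self.modules`; some `cunn` versions keep only the master module there and hold the per-GPU clones inside the implementation object, in which case this would reset just the master copy:

```lua
require 'nn'
require 'cunn'  -- provides nn.DataParallelTable
require 'rnn'   -- provides nn.Sequencer with :forget()

-- Hypothetical user-side patch: forward :forget() to every module that
-- understands it, so recurrent state is cleared on each replica.
function nn.DataParallelTable:forget()
   -- Assumption: replicas live in self.modules (not true for every
   -- cunn version; the threaded implementation stores clones elsewhere).
   for _, module in ipairs(self.modules) do
      module:apply(function(m)
         if m.forget then m:forget() end
      end)
   end
   return self
end
```

With that in place, `dpt:forget()` could be called between independent sequences just as one would call it on a plain `nn.Sequencer`.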