I am filing this primarily for documentation: as mentioned in https://github.com/torch/cunn/issues/233, `nn.Sequencer`'s `:forget()` method (and possibly other methods) does not work properly with `nn.DataParallelTable`, since it only applies to the replica on the first GPU.
I'm not sure what the cleanest way to fix this in the rnn library is. One option is to define a `:forget()` method on `DataParallelTable` that forwards the call to every replica, but the simplest way of doing that would introduce a dependency on `cunn`, which would not be ideal.
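As a user-side stopgap (keeping `cunn` out of the rnn library itself), one could monkey-patch a `:forget()` passthrough onto `nn.DataParallelTable`. This is only a sketch, and it assumes the replicas are reachable through `self.modules`; some `cunn` versions keep only the master module there and hold the per-GPU clones inside the implementation object, in which case this would reset just the master copy:

```lua
require 'nn'
require 'cunn'  -- provides nn.DataParallelTable
require 'rnn'   -- provides nn.Sequencer with :forget()

-- Hypothetical user-side patch: forward :forget() to every module that
-- understands it, so recurrent state is cleared on each replica.
function nn.DataParallelTable:forget()
   -- Assumption: replicas live in self.modules (not true for every
   -- cunn version; the threaded implementation stores clones elsewhere).
   for _, module in ipairs(self.modules) do
      module:apply(function(m)
         if m.forget then m:forget() end
      end)
   end
   return self
end
```

With that in place, `dpt:forget()` could be called between independent sequences just as one would call it on a plain `nn.Sequencer`.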