viorik / ConvLSTM

Spatio-temporal video autoencoder with convolutional LSTMs

UntiedConvLSTM: Error In Back-propagation #18

Closed by rmalav15 8 years ago

rmalav15 commented 8 years ago

I am getting the following error while training a network containing UntiedConvLSTM():

~/torch/install/share/lua/5.1/nn/ConcatTable.lua:55: bad argument #2 to 'add' (sizes do not match at /home/rmalav/torch/extra/cutorch/lib/THC/generic/THCTensorMathPointwise.cu:10)

After some testing, I found that the network is unable to backpropagate: model:forward() works fine, but model:backward() throws the above error. I have created a simple Jupyter notebook just to show the problem.

I think I am missing something obvious. Can you help me with it? Thank you.

viorik commented 8 years ago

It seems to me the problem comes from the fact that you are not specifying the batchsize. If you look in ConvLSTM (the parent of UntiedConvLSTM), the batchsize is set to nil by default. But for your input data, the batchsize should be explicitly set to 1. Hence use:

convlstm = nn.UntiedConvLSTM(3,3,6,3,5,1,1)

Let me know if this helps.
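To make the role of the seventh argument concrete, here is a minimal plain-Lua sketch. It is not the library's constructor: `constructorArgs` is a hypothetical stand-in, and the parameter names are assumptions borrowed from the option names in the demo call quoted later in this thread. It only shows that a trailing argument left out of a Lua call arrives as nil.

```lua
-- Illustration only: a hypothetical stand-in, NOT the ConvLSTM constructor.
-- Parameter names are assumptions based on the demo's option names.
local function constructorArgs(nInput, nOutput, nSeq, kernelSize,
                               kernelSizeMemory, stride, batchSize)
   return {
      nInput = nInput,
      nOutput = nOutput,
      nSeq = nSeq,
      kernelSize = kernelSize,
      kernelSizeMemory = kernelSizeMemory,
      stride = stride,
      batchSize = batchSize, -- nil when the seventh argument is omitted
   }
end

-- Same values as the suggested call, with and without the batch size.
local batched = constructorArgs(3, 3, 6, 3, 5, 1, 1)
local unbatched = constructorArgs(3, 3, 6, 3, 5, 1)

print(batched.batchSize)   -- 1
print(unbatched.batchSize) -- nil
```

With six arguments the batch size is simply absent, which matches the nil default described above.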

rmalav15 commented 8 years ago

Thank you so much @viorik. It works now.

I thought it was not mandatory to specify the batch size, since in "model-demo-ConvLSTM.lua" UntiedConvLSTM was initialized without it:

net:add(nn.UntiedConvLSTM(opt.nFiltersMemory[1],opt.nFiltersMemory[2], opt.nSeq, opt.kernelSize, opt.kernelSizeMemory, opt.stride))

But again, I should have gone through the constructor details of both novel layers first.

viorik commented 8 years ago

It was initialised without batchsize since the data in that demo is not structured in batches, so there indeed the batchsize is nil.
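One way to picture why the two cases behave differently is the plain-Lua sketch below. The helpers `stateShape` and `sameShape` are hypothetical and not part of ConvLSTM, and the dimension layout (batch first, then feature maps, height, width) and the 32x32 frame size are assumptions made for illustration. Under those assumptions, a layer built without a batch size keeps three-dimensional state, which cannot line up with a four-dimensional batched input.

```lua
-- Hypothetical helpers, NOT part of ConvLSTM; the dimension layout is an assumption.

-- Shape of the layer's internal state for a given configuration.
local function stateShape(nOutput, height, width, batchSize)
   if batchSize then
      return { batchSize, nOutput, height, width } -- batched: 4 dimensions
   end
   return { nOutput, height, width }               -- unbatched: 3 dimensions
end

-- True when two shapes have the same length and the same entries.
local function sameShape(a, b)
   if #a ~= #b then return false end
   for i = 1, #a do
      if a[i] ~= b[i] then return false end
   end
   return true
end

-- Example sizes only: a batch of one 3-channel 32x32 frame.
local input = { 1, 3, 32, 32 }

print(sameShape(input, stateShape(3, 32, 32)))    -- false: batchsize left as nil
print(sameShape(input, stateShape(3, 32, 32, 1))) -- true: batchsize set to 1
```

A mismatch of this kind would be consistent with the "sizes do not match" message reported above, although the exact tensors involved in the original error are not shown in the thread.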

rmalav15 commented 8 years ago

Thanks again @viorik. It's clear now.