viorik / ConvLSTM

Spatio-temporal video autoencoder with convolutional LSTMs

understanding ConvLSTM constructor #17

Closed rmalav15 closed 8 years ago

rmalav15 commented 8 years ago

```lua
function ConvLSTM:__init(inputSize, outputSize, rho, kc, km, stride, batchSize)
```

The description for the above constructor says:

- inputSize - number of input feature planes
- outputSize - number of output feature planes
- rho - recurrent sequence length
- kc - convolutional filter size to convolve input
- km - convolutional filter size to convolve cell; usually km > kc
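For concreteness, a minimal construction call might look like the following sketch (the sizes are illustrative only; batchSize is omitted on the assumption that it is optional):

```lua
require 'rnn'
require 'ConvLSTM'  -- this repo's module

-- inputSize=3, outputSize=64, rho=10, kc=7, km=7, stride=1
local lstm = nn.ConvLSTM(3, 64, 10, 7, 7, 1)
```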

I am confused about the difference between kc and km: how does the filter that convolves the cell differ from the filter that convolves the input? As far as I can tell from "model-demo-ConvLSTM.lua", UntiedConvLSTM is initialized as

```lua
if opt.untied then
  net:add(nn.UntiedConvLSTM(opt.nFiltersMemory[1], opt.nFiltersMemory[2],
                            opt.nSeq, opt.kernelSize, opt.kernelSizeMemory, opt.stride))
end
```

with opt.kernelSize = 7 and opt.kernelSizeMemory = 7.

Is it simply referring to a 7x7 (read as: kc x km) convolution kernel? Am I missing something? The NETWORK PARAMETERS section of "Spatio-temporal video autoencoder with differentiable memory" says:

> The memory module (LSTM) has 64 filters, size 7 × 7, and the optical flow regressor Θ has 2 convolutional layers,

which also seems to point to a kc x km convolution kernel.

My question is: do kc and km serve the same purpose as kW and kH in the code snippet below?

```lua
module = nn.SpatialConvolution(nInputPlane, nOutputPlane, kW, kH, [dW], [dH], [padW], [padH])
```

I have another newbie doubt: if I use ConvLSTM with a Sequencer, will the 'rho' parameter still be valid? The Sequencer additionally calls forget() before each call to forward(), so if we use seqlen < rho, it will forget the previous steps after every seqlen steps, no matter what "rho" is.

Thank You

viorik commented 8 years ago

Hi @rmalav15

If you look at equation 1 in the paper, you will see that the activations of each gate and of the cell are the result of two convolutions plus a bias: one convolution on the input x_t and one convolution on the previous state h_{t-1}. The convolution on the input is performed with a kc x kc filter, and the convolution on the previous state (memory) with a km x km filter.
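For reference, each gate then has the schematic form below (a sketch of the structure just described, not a verbatim copy of the paper's equation 1):

```latex
g_t = \sigma\left( W_x \ast x_t + W_h \ast h_{t-1} + b \right)
```

where W_x is a kc x kc filter and W_h is a km x km filter. This also answers the kW/kH question above: kc and km are not the two dimensions of a single kernel; each is the side length of its own square filter, i.e. kW = kH for each of the two convolutions. A hypothetical mapping onto nn.SpatialConvolution (illustrative only, not the repo's actual internals):

```lua
-- each gate pre-activation is the sum of two square convolutions:
local conv_input  = nn.SpatialConvolution(inputSize,  outputSize, kc, kc)  -- on x_t
local conv_memory = nn.SpatialConvolution(outputSize, outputSize, km, km)  -- on h_{t-1}
```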

When ConvLSTM is used within a Sequencer, the rho parameter does indeed get overwritten.
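A minimal sketch of that behaviour (assuming the rnn package's table-of-tensors convention for Sequencer; the sizes are illustrative):

```lua
require 'rnn'
require 'ConvLSTM'

-- rho = 10, but Sequencer calls forget() per sequence, so the
-- effective sequence length is the length of the input table
local net = nn.Sequencer(nn.ConvLSTM(3, 64, 10, 7, 7, 1))

local inputs = {}
for t = 1, 5 do inputs[t] = torch.rand(3, 32, 32) end  -- a 5-step sequence
local outputs = net:forward(inputs)  -- 5 outputs; rho is effectively ignored
```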

Hope it's clearer now.

rmalav15 commented 8 years ago

@viorik Many thanks, Ma'am.