Element-Research / rnn

Recurrent Neural Network library for Torch7's nn
BSD 3-Clause "New" or "Revised" License

Problem with ClassNLLCriterion and Sequencer using three-dimensional dataset #368

Closed mtiezzi closed 7 years ago

mtiezzi commented 7 years ago

Hi, I'm using SeqLSTM, and my net structure is the following:

rnn = nn.Sequential()
rnn:add(nn.SeqLSTM(132, 20))
rnn:add(nn.SeqLSTM(20, 4))

I was previously using MSECriterion, wrapped in a SequencerCriterion:

criterion = nn.MSECriterion()
criterion = nn.SequencerCriterion(criterion)

My learning set has the following size:

 360
  60
 132
[torch.LongStorage of size 3]

in the classical form seqlen x batchsize x inputsize.

The target has size:
 360
  60
   4
[torch.LongStorage of size 3]

Each of the 4 elements in the target is 0 or 1, depending on whether an event occurs. Everything works perfectly: for each time instant from 1 to 360 the batch is forwarded and backwarded, and learning works.

Now I have to use ClassNLLCriterion with the same dataset.

The criterion readme says:

The input given through a forward() is expected to contain log-probabilities of each class: input has to be a 1D Tensor of size n.
This criterion expects a class index (1 to the number of class) as target

So I decided to use a target of the following size:


 360
  60
   1
[torch.LongStorage of size 3]

where the entry along the third dimension is the class index (1, 2, 3, or 4) of the input.

I added a LogSoftMax layer, as found in the examples, so now the net is:

rnn = nn.Sequential()
rnn:add(nn.SeqLSTM(132, 20))
rnn:add(nn.SeqLSTM(20, 4))
rnn:add(nn.Sequencer(nn.LogSoftMax()))

and I've changed the criterion:

criterion = nn.ClassNLLCriterion()
criterion = nn.SequencerCriterion(criterion)

Using the same code that worked with MSECriterion (I'm using optim, so it has to be the same) I got an error in the first forward of the criterion:

Start learning:

/home/matteo/torch/install/bin/luajit: /home/matteo/torch/install/share/lua/5.1/nn/THNN.lua:110: multi-target not supported at /tmp/luarocks_nn-scm-1-9731/nn/lib/THNN/generic/ClassNLLCriterion.c:20
stack traceback:
        [C]: in function 'v'
        /home/matteo/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'ClassNLLCriterion_updateOutput'
        ...teo/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:43: in function 'forward'
        ...o/torch/install/share/lua/5.1/rnn/SequencerCriterion.lua:55: in function 'forward'
        classrpropck.lua:144: in function 'opfunc'
        /home/matteo/torch/install/share/lua/5.1/optim/rprop.lua:41: in function 'rprop'
        classrpropck.lua:163: in main chunk
        [C]: in function 'dofile'
        ...tteo/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
        [C]: at 0x00406670

It seems that the criterion receives a target with the wrong dimensions, but I don't understand why.

Using CrossEntropyCriterion instead (and therefore dropping the nn.LogSoftMax() layer), it works, but the error seems too high compared with MSECriterion, and I don't know whether that is correct.

Am I doing something wrong with the dimensions? Why does it work with CrossEntropyCriterion? (I've also posted this in the Torch Google group.)

JoostvDoorn commented 7 years ago

Instead of 360x60x1 use 360x60. See the documentation here: https://github.com/torch/nn/blob/master/doc/criterion.md#nn.ClassNLLCriterion

Sequencer only takes care of the first dimension, i.e. the time dimension.
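
The reshaping suggested above can be sketched as follows. This is a minimal illustration, not the poster's actual code: the random log-probabilities stand in for the network output, and the variable names are made up for the example. It assumes Torch7 with the rnn package installed. The target is built as 360x60x1 and then reduced to 360x60 with squeeze(3), so that each time step hands ClassNLLCriterion a 1D target of size 60. CrossEntropyCriterion appears to squeeze its target internally, which would explain why the 360x60x1 target was accepted there, but that is an assumption worth checking against the installed nn version.

```lua
require 'rnn'  -- assumed to be installed; it pulls in nn and torch as well

local seqlen, batchsize, nclass = 360, 60, 4

-- Stand-in for the network output: log-probabilities of size
-- seqlen x batchsize x nclass (random data, for illustration only).
local logprobs = nn.LogSoftMax()
   :forward(torch.randn(seqlen * batchsize, nclass))
   :view(seqlen, batchsize, nclass)

-- Target as originally built in the thread: seqlen x batchsize x 1,
-- holding class indices between 1 and nclass.
local target3d = torch.Tensor(seqlen, batchsize, 1):random(1, nclass)

-- Drop the singleton third dimension: seqlen x batchsize.
local target = target3d:squeeze(3)

local criterion = nn.SequencerCriterion(nn.ClassNLLCriterion())
local loss = criterion:forward(logprobs, target)

print(target:size())
print(loss)
```

target3d:select(3, 1) or target3d:view(seqlen, batchsize) should give the same 360x60 result; squeeze(3) is used here only because it names the dimension being dropped.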

mtiezzi commented 7 years ago

Thank you very much, this solved the problem. I'll close the issue. Just one more thing: do you have any idea why the error, obtained with the following code,

for i = start_i, num_iterations do
    -- optim.rprop returns the updated parameters and a table holding the loss from feval
    _, fs = optim.rprop(feval, x, rprop_params)
    if i % opt.print_every == 0 then
        print('error for iteration ' .. i .. ' is ' .. fs[1] / rho)
    end
end

gets stuck after very few iterations? Maybe the dataset is not very informative?

Start learning
error for iteration 1 is 1.3862942457199
error for iteration 2 is 1.1019926213556
error for iteration 3 is 0.74603102687332
error for iteration 4 is 0.71663362615638
error for iteration 5 is 0.71454483444492
error for iteration 6 is 0.714112207873
error for iteration 7 is 0.71400824098123
error for iteration 8 is 0.71399763805999
error for iteration 9 is 0.71399728076326
error for iteration 10 is 0.71399727662404
error for iteration 11 is 0.71399727662404
error for iteration 12 is 0.71399727662404
error for iteration 13 is 0.71399727662404
error for iteration 14 is 0.71399727662404
error for iteration 15 is 0.71399727662404
error for iteration 16 is 0.71399727662404
error for iteration 17 is 0.71399727662404
error for iteration 18 is 0.71399727662404
error for iteration 19 is 0.71399727662404
error for iteration 20 is 0.71399727662404
error for iteration 21 is 0.71399727662404
error for iteration 22 is 0.71399727662404
error for iteration 23 is 0.71399727662404
error for iteration 24 is 0.71399727662404

I know that the code works because I used it for another dataset with the same dimensions. Previously I used the same network structure, using MSECriterion and optim.

Thanks again
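
One reference point for reading the log above: the value printed at iteration 1 agrees with ln(4) to about seven digits. ln(K) is the negative log-likelihood of a prediction that gives each of K classes the same probability, so the first value is consistent with a freshly initialized net guessing uniformly over the four classes. This reading assumes that fs[1] / rho is the average per-time-step loss, which the thread does not state explicitly. A small plain-Lua sketch of the comparison (variable names are illustrative):

```lua
-- Negative log-likelihood of a uniform guess over K classes:
-- -log(1 / K) = log(K)
local K = 4                             -- number of classes in this thread
local baseline = math.log(K)

-- Value reported for iteration 1 in the log above.
local first_logged = 1.3862942457199

print(string.format('ln(%d) = %.7f', K, baseline))
print(string.format('first logged error = %.7f', first_logged))
print(string.format('difference = %.2e', math.abs(baseline - first_logged)))
```

A loss that starts at this baseline and then plateaus at a fixed value suggests the net has stopped extracting anything further from the inputs, which is a cue to inspect the data.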

mtiezzi commented 7 years ago

Oh, I found out that this new dataset was malformed: it was all zeroes. So there is no problem. Bye
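
A quick check for this kind of problem can be run before training. The sketch below assumes Torch7; isAllZero is a hypothetical helper written for this example, not part of torch, nn, or rnn, and the tensor is a stand-in with the same shape as the learning set described earlier.

```lua
require 'torch'

-- Hypothetical helper (not a torch or rnn function):
-- returns true when every element of the tensor is zero.
local function isAllZero(t)
   -- t:ne(0) marks the non-zero entries; their count must be zero.
   return t:ne(0):sum() == 0
end

-- Stand-in for a malformed learning set: seqlen x batchsize x inputsize.
local inputs = torch.zeros(360, 60, 132)
print(isAllZero(inputs))

-- A single non-zero entry is enough to change the answer.
inputs[1][1][1] = 0.5
print(isAllZero(inputs))
```

The comparison uses ne(0) rather than the method form of abs because t:abs() would modify the tensor in place.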