torch / demos

Demos and tutorials around Torch7.

demos contain errors, please fix them or indicate how we can fix them #56

Closed kirk86 closed 7 years ago

kirk86 commented 7 years ago

Hi everyone, I just cloned the demos repo and ran the digit recognizer example out of the box, without modifying anything:

th train-on-mnist.lua -f -p -o "LBFGS" -b 256 -t 8

This is what I get. Searching online turned up similar reports where the suggested fix is to remove labels with a value of zero. In this case I don't see where zeros could come from, since I checked the minimum and maximum label values and there are none.

PANIC: unprotected error in call to Lua API (...tro/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:49: Assertion `cur_target >= 0 && cur_target < n_classes' failed.  at /home/user/torch/extra/nn/lib/THNN/generic/ClassNLLCriterion.c:57
stack traceback:
    [C]: in function '__index'
    ...tro/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:49: in function 'forward'
    train-on-mnist.lua:209: in function 'opfunc'
    /home/jmitro/torch/install/share/lua/5.1/optim/lbfgs.lua:66: in function 'lbfgs'
    train-on-mnist.lua:245: in function 'train'
    train-on-mnist.lua:345: in main chunk
    [C]: in function 'dofile'
    ...itro/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00405d50)

My guess is that the problem is here:

      -- create mini batch
      local inputs = torch.Tensor(opt.batchSize,1,geometry[1],geometry[2])
      local targets = torch.Tensor(opt.batchSize)
      local k = 1
      for i = t,math.min(t+opt.batchSize-1,dataset:size()) do
         -- load new sample
         local sample = dataset[i]
         local input = sample[1]:clone()
         local _,target = sample[2]:clone():max(1)
         target = target:squeeze()
         inputs[k] = input
         targets[k] = target
         k = k + 1
      end
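One thing worth ruling out in the loop above: torch.Tensor(opt.batchSize) allocates memory without initializing it, so any slot the loop never writes keeps an arbitrary value. The sketch below is plain Lua (no Torch needed) and only counts how many slots get filled for a given starting index. It assumes that t advances in steps of the batch size starting at 1 and that -f selects 60000 training samples; neither is confirmed in this thread, and filledSlots is a hypothetical helper, not part of the demo.

```lua
-- Count how many slots of a fixed-size batch buffer the mini-batch loop fills.
-- Mirrors: for i = t, math.min(t + batchSize - 1, datasetSize) do ... end
local function filledSlots(t, batchSize, datasetSize)
   local last = math.min(t + batchSize - 1, datasetSize)
   return math.max(last - t + 1, 0)
end

-- Assumed values: batch size from the -b flag, 60000 samples for the full set.
local batchSize, datasetSize = 256, 60000

-- Start index of the final batch when t steps by batchSize from 1.
local lastStart = math.floor((datasetSize - 1) / batchSize) * batchSize + 1
local filled = filledSlots(lastStart, batchSize, datasetSize)

-- Prints the last start index, the filled slots, and the unfilled slots.
print(lastStart, filled, batchSize - filled)
```

Under those assumptions the final batch starts at index 59905 and fills 96 of the 256 slots, leaving 160 entries of targets unwritten. Whether this is what triggers the assertion here is not established; it would only explain garbage values in a partial last batch.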

If I print targets just before the error occurs, the output is all zeros plus some nan values.
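The assertion in the trace, cur_target >= 0 && cur_target < n_classes, is checked on the C side. Since Torch7 indexes from 1, the Lua-side targets are presumably expected to be whole numbers in 1..n_classes, so both zeros and nan would fail it. A minimal sketch of a check that could be run on the targets before calling the criterion is shown below. It is plain Lua operating on an ordinary table, firstBadTarget is a hypothetical helper, and the values of a tensor would first have to be copied into a table.

```lua
-- Return the index and value of the first invalid target, or nil if all pass.
-- A target passes when it is a whole number in 1..nClasses; NaN never passes.
local function firstBadTarget(targets, nClasses)
   for i, v in ipairs(targets) do
      local isNaN = (v ~= v)  -- NaN is the only value not equal to itself
      if isNaN or v < 1 or v > nClasses or v ~= math.floor(v) then
         return i, v
      end
   end
   return nil
end

print(firstBadTarget({3, 10, 1}, 10))  -- every entry is within 1..10
print(firstBadTarget({3, 0, 7}, 10))   -- the zero at index 2 is out of range
print(firstBadTarget({5, 0/0}, 10))    -- the NaN at index 2 is rejected
```

Reporting the first offending index makes it easier to tell whether the bad values sit at the start of the batch (a label conversion problem) or only at the tail (slots that were never written).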