facebookarchive / fb.resnet.torch

Torch implementation of ResNet from http://arxiv.org/abs/1512.03385 and training scripts

error when trying to run ResNet on Torch 7 #172

Open paras42 opened 7 years ago

paras42 commented 7 years ago

I'm new to Torch, and trying to run a ResNet on my own data (consisting of two classes). I put the data into train and val directories, further divided into "class1" and "class2."
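The folder layout described above can be sketched as follows. This is only an illustration of the structure named in the post: `DATA_ROOT` is a hypothetical stand-in (a temporary directory here), and the class names `class1` and `class2` are taken from the description rather than from the repository's documentation.

```shell
#!/bin/sh
# Hypothetical data root (assumption): a throwaway temp directory standing in
# for the real dataset folder that would be passed to -data.
DATA_ROOT="$(mktemp -d)"

# One subdirectory per class under both train/ and val/, as described in the post.
for split in train val; do
  for class in class1 class2; do
    mkdir -p "$DATA_ROOT/$split/$class"
  done
done

# List the resulting folders so the layout can be checked by eye.
ls -d "$DATA_ROOT"/*/*
```

Images for each class would then go into the matching `train/` and `val/` subfolders.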

However, running it produced this error: "attempt to call method 'squeeze' (a nil value)". More details below:

$ th main.lua -backend cudnn -nEpochs 5 -nClasses 2 -depth 18 -batchSize 8 -data [path to my data folder]
=> Creating model from file: models/resnet.lua
 | ResNet-18 ImageNet
=> Training epoch # 1
/usr/bin/luajit: /usr/share/lua/5.1/nn/CrossEntropyCriterion.lua:11: attempt to call method 'squeeze' (a nil value)
stack traceback:
        /usr/share/lua/5.1/nn/CrossEntropyCriterion.lua:11: in function 'forward'
        ./train.lua:58: in function 'train'
        main.lua:52: in main chunk
        [C]: in function 'dofile'
        /usr/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
        [C]: at 0x00406670

pranerd commented 7 years ago

I'd advise you to update your packages; the error may come from an install that is older than the latest available version. I found that CudaLongTensor in earlier versions of cutorch doesn't support squeeze(), but the latest version does.
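One way to act on this advice is to reinstall the relevant rocks with luarocks. The sketch below is a dry run by default: it only prints the commands. The package list is an assumption (the core Torch rocks plus the CUDA ones a `-backend cudnn` run would load), `update_torch_packages` and `RUN` are hypothetical names, and a system-wide install like the one in the traceback may additionally need elevated permissions.

```shell
#!/bin/sh
# Dry run by default: prints the commands instead of executing them.
# Set RUN=1 to actually reinstall the packages.
RUN="${RUN:-0}"

# Hypothetical helper. The package list is an assumption, not taken from the thread.
update_torch_packages() {
  for pkg in torch nn cutorch cunn cudnn; do
    if [ "$RUN" = "1" ]; then
      luarocks install "$pkg"
    else
      echo "luarocks install $pkg"
    fi
  done
}

update_torch_packages
```

After updating, rerunning the original `th main.lua` command shows whether the 'squeeze' error is gone.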

fangchangma commented 7 years ago

I got the exact same error.

holibert commented 6 years ago

I got the exact same error.

holibert commented 6 years ago

I solved the problem by referring to TRAINING.md.

I guess it's a dataset format problem. I had been using cifar10-python.tar.gz or cifar10-binary.tar.gz, which is wrong.

You should run

$ th main.lua -dataset cifar10 -nGPU 1 -batchSize 128 -depth 20

to download the Torch-format cifar10 dataset (or another dataset).