karpathy / neuraltalk2

Efficient Image Captioning code in Torch, runs on GPU
train.lua run error #207

Open xuliang-a opened 4 years ago

xuliang-a commented 4 years ago

`th train.lua -input_h5 coco/cocotalk.h5 -input_json coco/cocotalk.json`

When I run train.lua with the command above, I get the error shown in the output log below. I hope someone can help me. Thank you very much.

========================================================================

`DataLoader loading json file: coco/cocotalk.json
vocab size is 9567
DataLoader loading h5 file: coco/cocotalk.h5
read 123287 images of size 3x256x256
max sequence length in data is 16
assigned 113287 images to split train
assigned 5000 images to split val
assigned 5000 images to split test
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 553432081
Successfully loaded model/VGG_ILSVRC_16_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
converting first layer conv filters from BGR to RGB...
total number of parameters in LM: 11908448
total number of parameters in CNN: 136358208
constructing clones inside the LanguageModel
/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
/root/torch/install/share/lua/5.1/cudnn/init.lua:145: Error in CuDNN: CUDNN_STATUS_INTERNAL_ERROR
stack traceback:
    [C]: in function 'error'
    /root/torch/install/share/lua/5.1/cudnn/init.lua:145: in function 'getHandle'
    /root/torch/install/share/lua/5.1/cudnn/init.lua:156: in function 'call'
    /root/torch/install/share/lua/5.1/cudnn/init.lua:163: in function 'errcheck'
    /root/torch/install/share/lua/5.1/cudnn/init.lua:178: in function 'toDescriptor'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:35: in function 'resetWeightDescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:93: in function 'checkInputChanged'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:117: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:177: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:175>
    [C]: in function 'xpcall'
    /root/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    /root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    train.lua:246: in function 'lossFun'
    train.lua:297: in main chunk
    [C]: in function 'dofile'
    /root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
    [C]: in function 'error'
    /root/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
    /root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    train.lua:246: in function 'lossFun'
    train.lua:297: in main chunk
    [C]: in function 'dofile'
    /root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50`

========================================================================

My Installed rocks:

argcheck scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

cudnn scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

cunn scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

cutorch scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

cwrap scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

dok scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

env scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

gnuplot scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

graph scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

hdf5 0-0 (installed) - /root/torch/install/lib/luarocks/rocks

image 1.1.alpha-0 (installed) - /root/torch/install/lib/luarocks/rocks

loadcaffe 1.0-0 (installed) - /root/torch/install/lib/luarocks/rocks

lua-cjson 2.1.0-1 (installed) - /root/torch/install/lib/luarocks/rocks

luaffi scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

luafilesystem 1.6.3-1 (installed) - /root/torch/install/lib/luarocks/rocks

moses 1.6.1-1 (installed) - /root/torch/install/lib/luarocks/rocks

nn scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

nngraph scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

nnx 0.1-1 (installed) - /root/torch/install/lib/luarocks/rocks

optim 1.0.5-0 (installed) - /root/torch/install/lib/luarocks/rocks

paths scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

penlight scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

qtlua scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

qttorch scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

sundown scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

sys 1.1-0 (installed) - /root/torch/install/lib/luarocks/rocks

threads scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

torch scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

totem 0-0 (installed) - /root/torch/install/lib/luarocks/rocks

trepl scm-1 (installed) - /root/torch/install/lib/luarocks/rocks

xlua 1.0-0 (installed) - /root/torch/install/lib/luarocks/rocks
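
A possible way to narrow this down (a sketch, not a confirmed fix): the trace shows the error raised from `getHandle` in cudnn/init.lua, reached on the first CNN forward pass at train.lua:246, so it may help to re-run the same command with cuDNN out of the picture. This kind of error is often environmental (for example a cuDNN/CUDA/driver mismatch, or a GPU with little free memory), but that is an assumption here and nothing in the log confirms it. The script below only prints the commands to try. It assumes train.lua accepts `-backend` (nn|cudnn) and `-gpuid` (-1 for CPU); `th train.lua -help` should list the options that are really available. The helper name `train_cmd` is made up for illustration.

```shell
#!/bin/sh
# Sketch: print variants of the failing command that avoid cuDNN.
# ASSUMPTION: train.lua supports -backend (nn|cudnn) and -gpuid (-1 = CPU).
# Confirm with `th train.lua -help` before relying on either flag.

INPUT_H5="coco/cocotalk.h5"
INPUT_JSON="coco/cocotalk.json"

# Hypothetical helper: echo the training command for a backend and GPU id.
train_cmd() {
    backend="$1"
    gpuid="$2"
    echo "th train.lua -input_h5 $INPUT_H5 -input_json $INPUT_JSON -backend $backend -gpuid $gpuid"
}

# 1. Same GPU, but plain nn/cunn layers instead of cuDNN.
train_cmd nn 0

# 2. CPU only, which bypasses CUDA entirely (very slow; a smoke test only).
train_cmd nn -1

# 3. Show GPU memory and driver state if the NVIDIA tool is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi || true
fi
```

If the `-backend nn` run gets past the first forward pass, the problem is likely in the cuDNN installation or its Torch bindings rather than in the data or in train.lua. If it fails the same way, the GPU or driver state is the next thing to look at.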