simonwsw / deep-soli

Gesture Recognition Using Neural Networks with Google's Project Soli Sensor
MIT License

Fix: #9

Closed by graulef 7 years ago

graulef commented 7 years ago

The `--useCuda` argument given on the command line does not seem to propagate through the whole program. This leads to a mixed usage of CUDA and regular Float tensors, which in my case produced the following error:

imi@imi-All-Series:~/graulef/deep-soli$ th net/main.lua --file ../datapre --list config/file_half.json --load ../uni_image_np_50.t7 --inputsize 32 --inputch 4 --label 13 --datasize 32 --datach 4 --batch 16 --maxseq 40 --cuda --cudnn
Cuda enabled
[eval] data with 1364 seq
[net] loading model ../uni_image_np_50.t7
nn.Sequencer @ nn.Recursor @ nn.MaskZero @ nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> output]
  (1): cudnn.SpatialConvolution(4 -> 32, 3x3, 2,2)
  (2): nn.SpatialBatchNormalization (4D) (32)
  (3): cudnn.ReLU
  (4): cudnn.SpatialConvolution(32 -> 64, 3x3, 2,2)
  (5): nn.SpatialBatchNormalization (4D) (64)
  (6): cudnn.ReLU
  (7): nn.SpatialDropout(0.400000)
  (8): cudnn.SpatialConvolution(64 -> 128, 3x3, 2,2)
  (9): nn.SpatialBatchNormalization (4D) (128)
  (10): cudnn.ReLU
  (11): nn.SpatialDropout(0.400000)
  (12): nn.Reshape(1152)
  (13): nn.Linear(1152 -> 512)
  (14): nn.BatchNormalization (2D) (512)
  (15): cudnn.ReLU
  (16): nn.Dropout(0.5, busy)
  (17): nn.Linear(512 -> 512)
  (18): nn.LSTM(512 -> 512)
  (19): nn.Dropout(0.5, busy)
  (20): nn.Linear(512 -> 13)
  (21): cudnn.LogSoftMax
}
/home/imi/torch/install/bin/luajit: /home/imi/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
/home/imi/torch/install/share/lua/5.1/cudnn/init.lua:92: attempt to index a nil value
stack traceback:
        /home/imi/torch/install/share/lua/5.1/cudnn/init.lua:92: in function 'scalar'
        ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:195: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186>
        [C]: in function 'xpcall'
        /home/imi/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
        /home/imi/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'updateOutput'
        /home/imi/torch/install/share/lua/5.1/rnn/MaskZero.lua:94: in function 'updateOutput'
        /home/imi/torch/install/share/lua/5.1/rnn/Recursor.lua:27: in function 'updateOutput'
        /home/imi/torch/install/share/lua/5.1/rnn/Sequencer.lua:94: in function 'forward'
        ./net/rnntrain.lua:34: in function 'batchEval'
        ./net/train.lua:25: in function 'epochEval'
        ./net/train.lua:47: in function 'train'
        net/main.lua:47: in main chunk
        [C]: in function 'dofile'
        .../imi/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x004065d0

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
        [C]: in function 'error'
        /home/imi/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
        /home/imi/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'updateOutput'
        /home/imi/torch/install/share/lua/5.1/rnn/MaskZero.lua:94: in function 'updateOutput'
        /home/imi/torch/install/share/lua/5.1/rnn/Recursor.lua:27: in function 'updateOutput'
        /home/imi/torch/install/share/lua/5.1/rnn/Sequencer.lua:94: in function 'forward'
        ./net/rnntrain.lua:34: in function 'batchEval'
        ./net/train.lua:25: in function 'epochEval'
        ./net/train.lua:47: in function 'train'
        net/main.lua:47: in main chunk
        [C]: in function 'dofile'
        .../imi/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x004065d0

I worked around this by simply adding `useCuda = true` in the constructor of RnnTrain. This is a dirty fix, but it worked for me. I will try to find out where the actual mistake in the code is.
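For illustration, the workaround might look roughly like this in `net/rnntrain.lua`. This is a sketch only: the constructor signature and field names are assumptions, not the repository's exact code.

```lua
-- net/rnntrain.lua (sketch; names are illustrative, not the actual source)
local RnnTrain = torch.class('RnnTrain')

function RnnTrain:__init(opt)
  -- Dirty fix: force CUDA unconditionally, so the input batches are
  -- converted to CudaTensors and match the cudnn modules inside the
  -- loaded model. Without this, the flag from the command line never
  -- reaches this class and Float tensors hit cudnn layers.
  self.useCuda = true

  -- The proper fix would be to propagate the option instead, e.g.:
  -- self.useCuda = opt.cuda
end
```

The underlying problem is that a model converted with `:cuda()` (or built from `cudnn` modules) can only consume `CudaTensor` inputs, so any code path that keeps feeding it `FloatTensor` batches fails inside the first `cudnn.SpatialConvolution`, exactly as the stack trace above shows.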

graulef commented 7 years ago

Found the issue and opened a pull request!