I'm getting the above error a couple of seconds after the first training epoch starts:
nClasses: 1000
nTest: 50000
==> doing epoch on training data:
==> online epoch # 1
cudnnFindConvolutionForwardAlgorithm failed: 2 convDesc=[mode : CUDNN_CROSS_CORRELATION datatype : CUDNN_DATA_FLOAT] hash=-dimA800,3,224,224 -filtA96,3,11,11 800,96,55,55 -padA2,2 -convStrideA4,4 CUDNN_DATA_FLOAT
/home/drodo/torch/install/bin/luajit: /home/drodo/torch/install/share/lua/5.1/threads/threads.lua:179: [thread 2 endcallback] /home/drodo/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
/home/drodo/torch/install/share/lua/5.1/cudnn/find.lua:483: cudnnFindConvolutionForwardAlgorithm failed, sizes: convDesc=[mode : CUDNN_CROSS_CORRELATION datatype : CUDNN_DATA_FLOAT] hash=-dimA800,3,224,224 -filtA96,3,11,11 800,96,55,55 -padA2,2 -convStrideA4,4 CUDN
stack traceback:
[C]: in function 'error'
/home/drodo/torch/install/share/lua/5.1/cudnn/find.lua:483: in function 'forwardAlgorithm'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:190: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186>
[C]: in function 'xpcall'
/home/drodo/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/drodo/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
/home/drodo/xnornet/XNOR-Net/train.lua:176: in function </home/drodo/xnornet/XNOR-Net/train.lua:157>
[C]: in function 'xpcall'
/home/drodo/torch/install/share/lua/5.1/threads/threads.lua:174: in function 'dojob'
/home/drodo/torch/install/share/lua/5.1/threads/threads.lua:223: in function 'addjob'
/home/drodo/xnornet/XNOR-Net/train.lua:108: in function 'train'
main.lua:50: in main chunk
[C]: in function 'dofile'
...rodo/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670
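For reference, the shape hash in the message can be decoded with a short sketch like the one below. It is only an illustration: it assumes the fields are, in order, input dimensions, filter dimensions, output dimensions, padding and stride, and that the tensors are 4-byte floats (as `CUDNN_DATA_FLOAT` suggests). The helper names (`decode`, `conv_out`, `float_bytes`) are made up for this example and are not part of Torch or cuDNN.

```python
import re

# Shape hash copied verbatim from the error message above.
HASH = "-dimA800,3,224,224 -filtA96,3,11,11 800,96,55,55 -padA2,2 -convStrideA4,4"


def decode(hash_str):
    """Split the hash into named integer lists (field meaning is an assumption)."""
    m = re.match(
        r"-dimA([\d,]+) -filtA([\d,]+) ([\d,]+) -padA([\d,]+) -convStrideA([\d,]+)",
        hash_str,
    )
    keys = ("input", "filter", "output", "pad", "stride")
    return {k: [int(v) for v in g.split(",")] for k, g in zip(keys, m.groups())}


def conv_out(size, kernel, pad, stride):
    """Standard output-size formula for a strided, zero-padded convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def float_bytes(shape):
    """Bytes needed for a dense float32 tensor of the given shape."""
    n = 1
    for d in shape:
        n *= d
    return n * 4


shapes = decode(HASH)

# The spatial output size implied by the input, filter, padding and stride.
print("expected output side:", conv_out(224, 11, 2, 4))

# Memory for the first layer's input and output activations only.
for name in ("input", "output"):
    print(name, shapes[name], float_bytes(shapes[name]) / 1e9, "GB")
```

Under these assumptions the first convolution sees a batch of 800 images, so its input and output activations alone take roughly 0.48 GB and 0.93 GB, before any cuDNN workspace or the remaining layers are counted.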
The dataset is prepared exactly as described in the README.md, and the CUDA setup is working. Has anyone come across a similar error when running this ConvNet?
Cheers & Thanks,
-- Dimitrios