albarji / neural-style-docker

A dockerized version of neural style transfer algorithms
MIT License

Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED (cudnnFindConvolutionForwardAlgorithm) #7

Closed: alagerald closed this issue 7 years ago

alagerald commented 7 years ago

I tried to run fake-it.sh, and got the following error:

./scripts/fake-it.sh docker.png fauvism.jpg
tput: No value for $TERM and no -T specified
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
Successfully loaded /neural-style/models/VGG_ILSVRC_19_layers.caffemodel
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer 2 : relu1_1
Setting up style layer 7 : relu2_1
Setting up style layer 12 : relu3_1
Setting up style layer 21 : relu4_1
/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/nn/Container.lua:67:
In 17 module of nn.Sequential:
/root/torch/install/share/lua/5.1/cudnn/init.lua:58: Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED (cudnnFindConvolutionForwardAlgorithm)
stack traceback:
    [C]: in function 'error'
    /root/torch/install/share/lua/5.1/cudnn/init.lua:58: in function 'errcheck'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:185: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:366: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:363>
    [C]: in function 'xpcall'
    /root/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    /root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    /neural-style/neural_style.lua:204: in function 'main'
    /neural-style/neural_style.lua:515: in main chunk
    [C]: in function 'dofile'
    /root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

WARNING: If you see a stack trace below, it doesn't point to the place where this error occured. Please use only the one above.
stack traceback:
    [C]: in function 'error'
    /root/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
    /root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    /neural-style/neural_style.lua:204: in function 'main'
    /neural-style/neural_style.lua:515: in main chunk
    [C]: in function 'dofile'
    /root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

real 0m9.684s
user 0m0.083s
sys 0m0.019s

I checked the nvidia-docker installation:

sudo nvidia-docker run --rm nvidia/cuda nvidia-smi
Tue Jan 3 09:49:52 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 750 Ti  Off  | 0000:01:00.0      On |                  N/A |
| 42%   32C    P8     1W /  38W |    402MiB /  1998MiB |     11%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

nvidia-docker seems to be okay.

Any suggestions?

Thanks,

Gerald

albarji commented 7 years ago

This is a memory allocation error on the GPU. How much memory does your GPU have? I would say around 3 GB is required for this to work.
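As a rough illustration of that check, here is a minimal Python sketch that reads a Memory-Usage field in the form printed by nvidia-smi (for example `402MiB / 1998MiB`) and compares the free memory with the roughly 3 GB figure suggested above. The function names and the `SUGGESTED_MIB` constant are hypothetical, and 3 GB is only an estimate from this thread, not a measured requirement.

```python
import re

# Rough figure suggested in this thread (about 3 GB); an assumption,
# not a measured requirement of the style transfer script.
SUGGESTED_MIB = 3 * 1024


def parse_memory_usage(field):
    """Parse an nvidia-smi Memory-Usage field such as '402MiB / 1998MiB'.

    Returns a (used_mib, total_mib) tuple of integers.
    """
    match = re.fullmatch(r"\s*(\d+)MiB\s*/\s*(\d+)MiB\s*", field)
    if match is None:
        raise ValueError(f"unrecognized Memory-Usage field: {field!r}")
    used, total = (int(group) for group in match.groups())
    return used, total


def memory_report(field, needed_mib=SUGGESTED_MIB):
    """Summarize used, total and free GPU memory against a threshold."""
    used, total = parse_memory_usage(field)
    free = total - used
    return {
        "used_mib": used,
        "total_mib": total,
        "free_mib": free,
        "enough": free >= needed_mib,
    }


if __name__ == "__main__":
    # Value copied from the nvidia-smi output posted in this issue.
    print(memory_report("402MiB / 1998MiB"))
```

With the value from the report above, the card has 1596 MiB free out of 1998 MiB, which is well under the suggested figure, so an allocation failure is plausible.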

alagerald commented 7 years ago

The GPU has 2 GB of memory. Got it. Thanks!