jcjohnson / neural-style

Torch implementation of neural style algorithm

Error: libcudnn (R4) not found in library path. How do I fix this? #154

Open ProGamerGov opened 8 years ago

ProGamerGov commented 8 years ago
user@user-XPS-8500:~/neural-style$ th neural_style.lua -gpu 0 -backend cudnn
nil 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: /home/user/torch/install/share/lua/5.1/cudnn/ffi.lua:1279: 'libcudnn (R4) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
Then make sure files named as libcudnn.so.4 or libcudnn.4.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)

stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
    neural_style.lua:64: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 
jcjohnson commented 8 years ago

Download the shared library files from the NVIDIA website and put them in your CUDA install directory. For example, on Ubuntu you'll probably need to put them in /usr/local/cuda/lib64.

ProGamerGov commented 8 years ago

I followed step 6 from this guide here: https://github.com/jcjohnson/neural-style/blob/master/INSTALL.md

jcjohnson commented 8 years ago

Oops, I guess I should update that for cuDNN R4, which was released recently. It should be the same, except that I think the cuDNN .tgz has a different filename now.

ProGamerGov commented 8 years ago

It's not exactly the same.


user@user-XPS-8500:~$ tar -xzvf cudnn-7.0-linux-x64-v4.0-prod.tgz
cuda/lib64/libcudnn.so
cuda/lib64/libcudnn.so.4
cuda/lib64/libcudnn.so.4.0.7
cuda/lib64/libcudnn_static.a
cuda/include/cudnn.h
user@user-XPS-8500:~$ cd cuda/
user@user-XPS-8500:~/cuda$ sudo cp libcudnn* /usr/local/cuda-7.0/lib64
[sudo] password for user: 
cp: cannot stat ‘libcudnn*’: No such file or directory
user@user-XPS-8500:~/cuda$ sudo cp cudnn.h /usr/local/cuda-7.0/include
cp: cannot stat ‘cudnn.h’: No such file or directory
user@user-XPS-8500:~/cuda$ 

cuDNN is under /home/user/cuda/include/cudnn.h, and no libcudnn file exists.

jcjohnson commented 8 years ago

Seems like the directory structure changed a bit; you need to copy the .so files from cuda/lib64 (from the unpacked .tgz) to /usr/local/cuda-7.0/lib64
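
A minimal sketch of that copy step as a reusable function. The name install_cudnn and the SUDO variable are hypothetical, not part of neural-style or cuDNN. It assumes the unpacked archive has lib64/ and include/ subdirectories, as in the tar listing above, and that libcudnn.so and libcudnn.so.4 may be symbolic links to the versioned file, which is why it uses cp -P to copy links as links.

```shell
# Hypothetical helper: copy cuDNN files from an unpacked archive into a CUDA tree.
# $1 = unpacked archive root (contains lib64/ and include/)
# $2 = CUDA install root (for example /usr/local/cuda-7.0)
install_cudnn() {
    src=$1
    dest=$2
    # -P copies symbolic links as links instead of following them.
    # ${SUDO:-} expands to nothing unless the caller sets SUDO=sudo.
    ${SUDO:-} cp -P "$src"/lib64/libcudnn* "$dest"/lib64/ &&
    ${SUDO:-} cp "$src"/include/cudnn.h "$dest"/include/
}

# Example (not run here): SUDO=sudo install_cudnn ~/cuda /usr/local/cuda-7.0
```

Running the copy from the archive root, rather than from inside cuda/, avoids the "cannot stat" errors shown earlier, because the glob is resolved against lib64/ explicitly.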

ProGamerGov commented 8 years ago

Here's what happened trying to install the older version of cuDNN:


user@user-XPS-8500:~$ tar -xzvf cudnn-6.5-linux-x64-v2.tgz
cudnn-6.5-linux-x64-v2/
cudnn-6.5-linux-x64-v2/INSTALL.txt
cudnn-6.5-linux-x64-v2/CUDNN_License.pdf
cudnn-6.5-linux-x64-v2/cudnn.h
cudnn-6.5-linux-x64-v2/libcudnn_static.a
cudnn-6.5-linux-x64-v2/libcudnn.so.6.5
cudnn-6.5-linux-x64-v2/libcudnn.so.6.5.48
cudnn-6.5-linux-x64-v2/libcudnn.so
user@user-XPS-8500:~$ cd cudnn-6.5-linux-x64-v2/
user@user-XPS-8500:~/cudnn-6.5-linux-x64-v2$ sudo cp libcudnn* /usr/local/cuda-7.0/lib64
user@user-XPS-8500:~/cudnn-6.5-linux-x64-v2$ sudo cp cudnn.h /usr/local/cuda-7.0/include
user@user-XPS-8500:~/cudnn-6.5-linux-x64-v2$ luarocks install cudnn

Installing https://raw.githubusercontent.com/torch/rocks/master/cudnn-scm-1.rockspec...
Using https://raw.githubusercontent.com/torch/rocks/master/cudnn-scm-1.rockspec... switching to 'build' mode
Cloning into 'cudnn.torch'...
remote: Counting objects: 43, done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 43 (delta 13), reused 25 (delta 3), pack-reused 0
Receiving objects: 100% (43/43), 41.44 KiB | 0 bytes/s, done.
Resolving deltas: 100% (13/13), done.
Checking connectivity... done.
cmake -E make_directory build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/user/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1" && make

-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found Torch7 in /home/user/torch/install
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/luarocks_cudnn-scm-1-4344/cudnn.torch/build
cd build && make install
Install the project...
-- Install configuration: "Release"
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialSoftMax.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/VolumetricMaxPooling.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/LogSoftMax.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialConvolution.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/convert.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/VolumetricBatchNormalization.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/ReLU.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/ffi.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Pooling3D.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialCrossMapLRN.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialMaxPooling.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialDivisiveNormalization.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Tanh.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/env.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialAveragePooling.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SoftMax.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialBatchNormalization.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/init.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Sigmoid.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/BatchNormalization.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialLogSoftMax.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/functional.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/VolumetricAveragePooling.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Pointwise.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/TemporalConvolution.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialFullConvolution.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/VolumetricConvolution.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialCrossEntropyCriterion.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Pooling.lua
Updating manifest for /home/user/torch/install/lib/luarocks/rocks
cudnn scm-1 is now built and installed in /home/user/torch/install/ (license: BSD)

user@user-XPS-8500:~/cudnn-6.5-linux-x64-v2$ 
user@user-XPS-8500:~/cudnn-6.5-linux-x64-v2$ th neural_style.lua -gpu 0 -backend cudnn
/home/user/torch/install/bin/luajit: cannot open neural_style.lua: No such file or directory
stack traceback:
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/cudnn-6.5-linux-x64-v2$ cd neural-style
bash: cd: neural-style: No such file or directory
user@user-XPS-8500:~/cudnn-6.5-linux-x64-v2$ th neural_style.lua -gpu 0 -backend cudnn
/home/user/torch/install/bin/luajit: cannot open neural_style.lua: No such file or directory
stack traceback:
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/cudnn-6.5-linux-x64-v2$ cd
user@user-XPS-8500:~$ cd neural-style
user@user-XPS-8500:~/neural-style$ th neural_style.lua -gpu 0 -backend cudnn
nil 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: /home/user/torch/install/share/lua/5.1/cudnn/ffi.lua:1279: 'libcudnn (R4) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
Then make sure files named as libcudnn.so.4 or libcudnn.4.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)

stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
    neural_style.lua:64: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 

I'll try copying them.

jcjohnson commented 8 years ago

You shouldn't use the old version; R4 is much faster, and cudnn.torch master now expects R4.

ProGamerGov commented 8 years ago

Should I change the permissions for /usr/local/cuda-7.0/lib64 so that I can paste the files in?

jcjohnson commented 8 years ago

No, just sudo cp them.

ProGamerGov commented 8 years ago
user@user-XPS-8500:~$ sudo cp /home/user/cuda/lib64/libcudnn.so /usr/local/cuda-7.0/lib64
user@user-XPS-8500:~$ sudo cp /home/user/cuda/lib64/libcudnn.so.4 /usr/local/cuda-7.0/lib64
user@user-XPS-8500:~$ sudo cp /home/user/cuda/lib64/libcudnn.so.4.0.7 /usr/local/cuda-7.0/lib64
user@user-XPS-8500:~$ sudo cp /home/user/cuda/lib64/libcudnn_static.a /usr/local/cuda-7.0/lib64
user@user-XPS-8500:~$ cd neural-style
user@user-XPS-8500:~/neural-style$ th neural_style.lua -gpu 0 -backend cudnn
nil 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: /home/user/torch/install/share/lua/5.1/cudnn/ffi.lua:1279: 'libcudnn (R4) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
Then make sure files named as libcudnn.so.4 or libcudnn.4.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)

stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
    neural_style.lua:64: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 
user@user-XPS-8500:~/neural-style$ luarocks install cudnn
Installing https://raw.githubusercontent.com/torch/rocks/master/cudnn-scm-1.rockspec...
Using https://raw.githubusercontent.com/torch/rocks/master/cudnn-scm-1.rockspec... switching to 'build' mode
Cloning into 'cudnn.torch'...
remote: Counting objects: 43, done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 43 (delta 13), reused 25 (delta 3), pack-reused 0
Receiving objects: 100% (43/43), 41.44 KiB | 0 bytes/s, done.
Resolving deltas: 100% (13/13), done.
Checking connectivity... done.
cmake -E make_directory build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/user/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1" && make

-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found Torch7 in /home/user/torch/install
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/luarocks_cudnn-scm-1-8790/cudnn.torch/build
cd build && make install
Install the project...
-- Install configuration: "Release"
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialSoftMax.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/VolumetricMaxPooling.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/LogSoftMax.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialConvolution.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/convert.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/VolumetricBatchNormalization.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/ReLU.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/ffi.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Pooling3D.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialCrossMapLRN.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialMaxPooling.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialDivisiveNormalization.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Tanh.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/env.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialAveragePooling.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SoftMax.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialBatchNormalization.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/init.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Sigmoid.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/BatchNormalization.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialLogSoftMax.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/functional.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/VolumetricAveragePooling.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Pointwise.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/TemporalConvolution.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialFullConvolution.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/VolumetricConvolution.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/SpatialCrossEntropyCriterion.lua
-- Installing: /home/user/torch/install/lib/luarocks/rocks/cudnn/scm-1/lua/cudnn/Pooling.lua
Updating manifest for /home/user/torch/install/lib/luarocks/rocks
cudnn scm-1 is now built and installed in /home/user/torch/install/ (license: BSD)

user@user-XPS-8500:~/neural-style$ th neural_style.lua -gpu 0 -backend cudnn
nil 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: /home/user/torch/install/share/lua/5.1/cudnn/ffi.lua:1279: 'libcudnn (R4) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
Then make sure files named as libcudnn.so.4 or libcudnn.4.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)

stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
    neural_style.lua:64: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 

libcudnn.so.4 is in /usr/local/cuda-7.0/lib64.

I also copied the header: sudo cp /home/user/cuda/include/cudnn.h /usr/local/cuda-7.0/include

Not sure what to do at this point. I'm still pretty new to Linux.
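
The error message asks for libcudnn.so.4 to be on the library load path, so having the file on disk is only half of the requirement. A minimal sketch for checking the other half: the function find_lib is hypothetical, and it only inspects a colon-separated list such as LD_LIBRARY_PATH. It does not consult the system loader's default directories or cache, so a miss here is a hint, not proof.

```shell
# Hypothetical helper: print the first directory in the colon-separated
# list $2 that contains a file named $1. Returns non-zero if none does.
find_lib() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        # Skip empty entries produced by leading, trailing, or doubled colons.
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example (not run here): find_lib libcudnn.so.4 "$LD_LIBRARY_PATH"
```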

ProGamerGov commented 8 years ago

@jcjohnson

"Seems like the directory structure changed a bit; you need to copy the .so files from cuda/lib64 (from the unpacked .tgz) to /usr/local/cuda-7.0/lib64"

This does not work.

ProGamerGov commented 8 years ago

Is this a problem with neural-style, or is my issue the result of something else I have done?

jcjohnson commented 8 years ago

There is something wrong with your cuDNN installation. It should be very simple:

tar -xzvf cudnn-7.0-linux-x64-v4.0-prod.tgz
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-7.0/lib64/
sudo cp cuda/include/cudnn.h /usr/local/cuda-7.0/include/

Also check your LD_LIBRARY_PATH:

echo $LD_LIBRARY_PATH

You should see /usr/local/cuda-7.0/lib64 along with possibly other things.
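
Reading a long LD_LIBRARY_PATH by eye is error-prone, so the check can be scripted. This is a minimal sketch; path_contains is a hypothetical name. It matches whole entries only, so /usr/local/cuda-7.0/lib64 and /usr/local/cuda-7.0/lib64/ (with a trailing slash) count as different entries.

```shell
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated list $2.
path_contains() {
    # Wrapping both sides in colons makes the first and last entries
    # match the same pattern as the middle ones.
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (not run here):
#   path_contains /usr/local/cuda-7.0/lib64 "$LD_LIBRARY_PATH" && echo found
```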

jcjohnson commented 8 years ago

I've also updated the instructions in INSTALL.md

ProGamerGov commented 8 years ago

My LD_LIBRARY_PATH:

user@user-XPS-8500:~$ echo $LD_LIBRARY_PATH
/home/user/torch/install/lib:/home/user/torch/install/lib:
user@user-XPS-8500:~$ 

Looks like I may have managed to seriously mess things up.

jcjohnson commented 8 years ago

That's your problem. You need to add something like this to your .bashrc or another startup script:

export LD_LIBRARY_PATH=/usr/local/cuda-7.0/lib64:$LD_LIBRARY_PATH

Then run source ~/.bashrc.
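
If the export line is added by a script rather than by hand, it helps to avoid appending it again on every run. A minimal sketch, assuming a POSIX shell with grep; ensure_line is a hypothetical helper name.

```shell
# Hypothetical helper: append line $2 to file $1 unless an identical
# line is already present.
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex).
    # A missing file counts as "not present", so the line is then appended.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (not run here). Single quotes keep $LD_LIBRARY_PATH unexpanded,
# so it is evaluated each time the startup script runs:
#   ensure_line ~/.bashrc 'export LD_LIBRARY_PATH=/usr/local/cuda-7.0/lib64:$LD_LIBRARY_PATH'
```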

ProGamerGov commented 8 years ago

Thanks for the help; that seems to have fixed the issue, though now I am receiving another error:

user@user-XPS-8500:~/neural-style$ th neural_style.lua -gpu 0 -backend cudnn
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer      2   :   relu1_1 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/cudnn/init.lua:45: Error in CuDNN: CUDNN_STATUS_INTERNAL_ERROR
stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/cudnn/init.lua:45: in function 'getHandle'
    /home/user/torch/install/share/lua/5.1/cudnn/init.lua:53: in function 'errcheck'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:41: in function 'resetWeightDescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:362: in function 'updateOutput'
    /home/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    neural_style.lua:204: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 
ProGamerGov commented 8 years ago

Maybe I should do a full reinstall, unless it's a simple fix.

austingg commented 8 years ago

@jcjohnson could you also update the speed benchmarks for cuDNN R4?

ProGamerGov commented 8 years ago

I'm going to reinstall everything; I probably messed things up in my attempts to fix issues.

ProGamerGov commented 8 years ago

So, I set it up correctly, but I am still having issues:

user@user-XPS-8500:~/neural-style$ th neural_style.lua -image_size 256 -gpu 0 -backend cudnn
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer      2   :   relu1_1 
Setting up style layer      7   :   relu2_1 
Setting up style layer      12  :   relu3_1 
Setting up style layer      21  :   relu4_1 
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-2986/cutorch/lib/THC/generic/THCStorage.cu line=40 error=2 : out of memory
/home/user/torch/install/bin/luajit: ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:142: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-2986/cutorch/lib/THC/generic/THCStorage.cu:40
stack traceback:
    [C]: in function 'resize'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:142: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function 'updateOutput'
    /home/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    neural_style.lua:204: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 

and

user@user-XPS-8500:~/neural-style$ th neural_style.lua -image_size 256 -gpu 0 -backend cudnn -cudnn_autotune -optimizer lbfgs
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer      2   :   relu1_1 
Setting up style layer      7   :   relu2_1 
Setting up style layer      12  :   relu3_1 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/cudnn/init.lua:58: Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED
stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/cudnn/init.lua:58: in function 'errcheck'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function 'updateOutput'
    /home/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    neural_style.lua:204: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 

and

user@user-XPS-8500:~/neural-style$ th neural_style.lua -image_size 256 -gpu 0 -backend cudnn -cudnn_autotune -optimizer adam
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer      2   :   relu1_1 
Setting up style layer      7   :   relu2_1 
Setting up style layer      12  :   relu3_1 
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/cudnn/init.lua:58: Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED
stack traceback:
    [C]: in function 'error'
    /home/user/torch/install/share/lua/5.1/cudnn/init.lua:58: in function 'errcheck'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186: in function 'createIODescriptors'
    ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function 'updateOutput'
    /home/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    neural_style.lua:204: in function 'main'
    neural_style.lua:500: in main chunk
    [C]: in function 'dofile'
    .../user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670
user@user-XPS-8500:~/neural-style$ 

and finally:

user@user-XPS-8500:~$ nvidia-smi
Thu Mar  3 19:58:45 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.63     Driver Version: 352.63         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 660     Off  | 0000:01:00.0     N/A |                  N/A |
| 28%   38C    P8    N/A /  N/A |    289MiB /  1532MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
user@user-XPS-8500:~$ 

Could my card be the problem?
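
The nvidia-smi table above reports 289MiB used out of 1532MiB. Together with the "out of memory" and CUDNN_STATUS_ALLOC_FAILED errors, that suggests available GPU memory may be the limit, though the thread does not confirm it. A minimal sketch for pulling the free amount out of the table: gpu_free_mib is a hypothetical helper, and it assumes the "used / total" column layout shown above, which may differ between driver versions.

```shell
# Hypothetical helper: read nvidia-smi table output on stdin and print
# the free memory in MiB (total minus used) for the first GPU row.
gpu_free_mib() {
    # Pick out "<used>MiB / <total>MiB" from the memory-usage column.
    sed -n 's/.*| *\([0-9][0-9]*\)MiB *\/ *\([0-9][0-9]*\)MiB *|.*/\1 \2/p' |
        head -n 1 |
        { read -r used total && echo $((total - used)); }
}

# Example (not run here): nvidia-smi | gpu_free_mib
```

For the table above this gives 1532 - 289 = 1243 MiB free before neural_style.lua starts.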

ghost commented 8 years ago

@ProGamerGov I'm new to Torch, Lua, and neural networks, and not very adept with Linux. Can you guide me on how best to remove or uninstall everything related to torch, cunn, cutorch, and neural-style so that I can start fresh? Will removing the torch and neural-style directories be enough?