AaronJackson / vrn

:man: Code for "Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression"
http://aaronsplace.co.uk/papers/jackson2017recon/
MIT License

./run.sh failed! #9

Status: Closed (xtttttttttx closed this issue 6 years ago)

xtttttttttx commented 6 years ago

I've installed CUDA 9.0 and cuDNN v7. What is *.raw?

▶ ./run.sh                                                 
/home/artprog/usr/local/torch/install/bin/luajit: cannot open main.lua: No such file or directory
stack traceback:
    [C]: in function 'dofile'
    ...ocal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
ls: cannot access '*.txt': No such file or directory
Found Environment variable CUDNN_PATH = /usr/lib/x86_64-linux-gnu/libcudnn.so7/home/artprog/usr/local/torch/install/bin/luajit: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...prog/usr/local/torch/install/share/lua/5.1/cudnn/ffi.lua:1592: /usr/lib/x86_64-linux-gnu/libcudnn.so7: cannot open shared object file: No such file or directory
stack traceback:
    [C]: in function 'error'
    ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    process.lua:17: in main chunk
    [C]: in function 'dofile'
    ...ocal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
ls: cannot access '*.raw': No such file or directory

I've also tried running it from MATLAB, but it complained as follows:

device =

gpu

zsh:1: command not found: th
Error using run (line 38)
Failed to run Torch7 script.

But Torch works in the terminal.

AaronJackson commented 6 years ago

Hi, did you clone the git repo recursively? It looks like I forgot to add the --recursive flag to the vrn git clone command. I will correct this immediately.

xtttttttttx commented 6 years ago

Still the same error.

AaronJackson commented 6 years ago

Can you please be more specific? In your first post, the first error was that it was unable to find main.lua. If you cloned recursively, this file will exist under face-alignment/main.lua.
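For readers hitting the same "cannot open main.lua" error, a minimal sketch of how to check for the submodule and fetch it if it is missing. The helper name `submodule_present` is hypothetical, and the clone URL is inferred from the repository name shown at the top of this page rather than quoted from the thread.

```shell
# Hypothetical helper: returns 0 if the face-alignment submodule looks
# populated in the checkout at $1, and 1 otherwise.
# It only tests for face-alignment/main.lua, the file named above.
submodule_present() {
    [ -f "$1/face-alignment/main.lua" ]
}

# Intended use from the top of an existing vrn checkout (not executed here):
#   submodule_present . || git submodule update --init --recursive
#
# For a fresh clone (URL is an assumption based on the repository name):
#   git clone --recursive https://github.com/AaronJackson/vrn.git
```

If the helper still reports the file as missing after `git submodule update --init --recursive`, the problem is probably somewhere other than the clone step.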

xtttttttttx commented 6 years ago

@AaronJackson The face-alignment folder does contain main.lua. Now when I run it in MATLAB, it complains like this:

device =

gpu

/home/artprog/usr/local/torch/install/bin/luajit: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...sr/local/torch/install/share/lua/5.1/luarocks/loader.lua:117: error loading module 'fb.python.lib' from file '/home/artprog/usr/local/torch/install/lib/lua/5.1/fb/python/lib.so':
    /usr/local/MATLAB/R2015b/sys/os/glnxa64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /home/artprog/usr/local/torch/install/lib/lua/5.1/fb/python/lib.so)
stack traceback:
    [C]: in function 'error'
    ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    main.lua:8: in main chunk
    [C]: in function 'dofile'
    ...ocal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
Error using run (line 38)
Failed to run Torch7 script.

AaronJackson commented 6 years ago

I believe the MATLAB error might be due to an old version of MATLAB, which ships its own older libstdc++. One quick fix for this might be to start MATLAB like so:

LD_PRELOAD_PATH=/usr/lib64/libstdc++.so.6 matlab

Can you also test the run.sh script?

xtttttttttx commented 6 years ago

MATLAB still says:

device =

gpu

/home/artprog/usr/local/torch/install/bin/luajit: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...sr/local/torch/install/share/lua/5.1/luarocks/loader.lua:117: error loading module 'fb.python.lib' from file '/home/artprog/usr/local/torch/install/lib/lua/5.1/fb/python/lib.so':
    /usr/local/MATLAB/R2015b/sys/os/glnxa64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /home/artprog/usr/local/torch/install/lib/lua/5.1/fb/python/lib.so)
stack traceback:
    [C]: in function 'error'
    ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    main.lua:8: in main chunk
    [C]: in function 'dofile'
    ...ocal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
Error using run (line 38)
Failed to run Torch7 script.

run.sh complained like this:

▶ ./run.sh 
Found Environment variable CUDNN_PATH = /usr/lib/x86_64-linux-gnu/libcudnn.so.7/home/artprog/usr/local/torch/install/bin/luajit: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...prog/usr/local/torch/install/share/lua/5.1/cudnn/ffi.lua:1618: These bindings are for CUDNN 5.x (5005 <= cudnn.version > 6000) , while the loaded CuDNN is version: 7002  
Are you using an older or newer version of CuDNN?
stack traceback:
    [C]: in function 'error'
    ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    main.lua:13: in main chunk
    [C]: in function 'dofile'
    ...ocal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
ls: cannot access '*.txt': No such file or directory
Found Environment variable CUDNN_PATH = /usr/lib/x86_64-linux-gnu/libcudnn.so.7/home/artprog/usr/local/torch/install/bin/luajit: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: ...prog/usr/local/torch/install/share/lua/5.1/cudnn/ffi.lua:1618: These bindings are for CUDNN 5.x (5005 <= cudnn.version > 6000) , while the loaded CuDNN is version: 7002  
Are you using an older or newer version of CuDNN?
stack traceback:
    [C]: in function 'error'
    ...rog/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    process.lua:17: in main chunk
    [C]: in function 'dofile'
    ...ocal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
ls: cannot access '*.raw': No such file or directory

Do I need to reinstall cuDNN with version 5?

AaronJackson commented 6 years ago

Which version of Linux are you using? Can you run find /usr -name "libstdc++.so.6" to locate your system's libstdc++, and then use that path with the LD_PRELOAD_PATH environment variable?

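The version of CuDNN might also be an issue: according to your run.sh output, the bindings expect CuDNN 5.x, but version 7002 is being loaded.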

AaronJackson commented 6 years ago

Ah, it should be LD_PRELOAD, not LD_PRELOAD_PATH.
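Putting the two suggestions together, here is a rough sketch of how one might find a libstdc++ that actually provides the missing CXXABI_1.3.8 tag and preload it. The helper names `lib_has_version` and `find_libstdcxx` are hypothetical, and the check is a plain substring search of the library file, not a proper symbol-version query.

```shell
# Hypothetical helper: returns 0 if the file at $1 contains the version tag $2
# (for example CXXABI_1.3.8). grep -a reads the binary as text and -F matches
# the tag literally, so this is only a substring check.
lib_has_version() {
    grep -aqF "$2" "$1"
}

# Hypothetical helper: prints the first libstdc++.so.6 found under $2
# (default /usr) that contains the version tag $1. Prints nothing if none does.
find_libstdcxx() {
    find "${2:-/usr}" -name 'libstdc++.so.6' 2>/dev/null |
    while IFS= read -r lib; do
        if lib_has_version "$lib" "$1"; then
            printf '%s\n' "$lib"
            break
        fi
    done
}

# Intended use (not executed here):
#   lib=$(find_libstdcxx CXXABI_1.3.8)
#   [ -n "$lib" ] && LD_PRELOAD="$lib" matlab
```

If `find_libstdcxx` prints nothing, the system libstdc++ may itself be too old for the Torch build, in which case preloading it is unlikely to help.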

1adrianb commented 6 years ago

@TianxinTse Cudnn 7.0 support is experimental as far as I know. From what I can see, you are using the wrong cudnn package. Please make sure you are using the R7 branch https://github.com/soumith/cudnn.torch/tree/R7 or, better, use cudnn5 :)
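As an illustration of the mismatch, a small sketch that tests whether a version number as printed in the error (for example 7002) falls inside the 5.x range the bindings report. The helper name `cudnn_is_5x` is hypothetical, and the range 5005 to below 6000 is one reading of the error text above. The commented install commands for the R7 branch are an assumption about how the bindings are usually built from source, so check that branch's README before relying on them.

```shell
# Hypothetical helper: returns 0 when the cuDNN version number $1 lies in the
# 5.x range named in the error message (read here as 5005 <= version < 6000).
cudnn_is_5x() {
    [ "$1" -ge 5005 ] && [ "$1" -lt 6000 ]
}

# Intended use with the version from the log above (not executed here):
#   cudnn_is_5x 7002 || echo "loaded cuDNN does not match the 5.x bindings"
#
# Assumed way to switch the Torch bindings to the R7 branch instead
# (rockspec file name is an assumption; verify it in the branch):
#   git clone https://github.com/soumith/cudnn.torch.git -b R7
#   cd cudnn.torch && luarocks make cudnn-scm-1.rockspec
```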

xtttttttttx commented 6 years ago

@AaronJackson @1adrianb Thank you both for your help and advice. I will try cudnn5.