aminemarref closed 2 years ago
Thanks for the easy-to-read GitHub issue. :+1:
Please try removing these lines from utils.lua
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import matplotlib.patches as patches
Thank you for your prompt reply,
I performed the required modifications to the file utils.lua, re-installed numpy and matplotlib through pip, recompiled torch, thpp, and fblualib; and now running run.sh yields:
amine@Dell-Optiplex-990:~/Work/vrn$ ./run.sh
Found Environment variable CUDNN_PATH = /usr/local/cuda/lib64/libcudnn.so.5...ork/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: module 'matio' not found:No LuaRocks module found for matio
no field package.preload['matio']
no file '/home/amine/.luarocks/share/lua/5.1/matio.lua'
no file '/home/amine/.luarocks/share/lua/5.1/matio/init.lua'
no file '/home/amine/Work/usr/local/torch/install/share/lua/5.1/matio.lua'
no file '/home/amine/Work/usr/local/torch/install/share/lua/5.1/matio/init.lua'
no file './matio.lua'
no file '/home/amine/Work/usr/local/torch/install/share/luajit-2.1.0-beta1/matio.lua'
no file '/usr/local/share/lua/5.1/matio.lua'
no file '/usr/local/share/lua/5.1/matio/init.lua'
no file '/home/amine/.luarocks/lib/lua/5.1/matio.so'
no file '/home/amine/Work/usr/local/torch/install/lib/lua/5.1/matio.so'
no file '/home/amine/Work/usr/local/torch/install/lib/matio.so'
no file './matio.so'
no file '/usr/local/lib/lua/5.1/matio.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
warning: <matio> could not be loaded (is it installed?)
...ork/usr/local/torch/install/share/lua/5.1/trepl/init.lua:389: module 'npy4th' not found:No LuaRocks module found for npy4th
no field package.preload['npy4th']
no file '/home/amine/.luarocks/share/lua/5.1/npy4th.lua'
no file '/home/amine/.luarocks/share/lua/5.1/npy4th/init.lua'
no file '/home/amine/Work/usr/local/torch/install/share/lua/5.1/npy4th.lua'
no file '/home/amine/Work/usr/local/torch/install/share/lua/5.1/npy4th/init.lua'
no file './npy4th.lua'
no file '/home/amine/Work/usr/local/torch/install/share/luajit-2.1.0-beta1/npy4th.lua'
no file '/usr/local/share/lua/5.1/npy4th.lua'
no file '/usr/local/share/lua/5.1/npy4th/init.lua'
no file '/home/amine/.luarocks/lib/lua/5.1/npy4th.so'
no file '/home/amine/Work/usr/local/torch/install/lib/lua/5.1/npy4th.so'
no file '/home/amine/Work/usr/local/torch/install/lib/npy4th.so'
no file './npy4th.so'
no file '/usr/local/lib/lua/5.1/npy4th.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
warning: <npy4th> could not be loaded (is it installed?)
Scanning directory for data...
Found 5 images
5 images require a face detector
Initialising python libs...
Initialising detector...
/home/amine/Work/usr/local/torch/install/bin/luajit: main.lua:51: Invalid numpy data type 9
stack traceback:
[C]: in function 'detect'
main.lua:51: in main chunk
[C]: in function 'dofile'
...ocal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
ls: cannot access '*.txt': No such file or directory
Found Environment variable CUDNN_PATH = /usr/local/cuda/lib64/libcudnn.so.5ls: cannot access '*.raw': No such file or directory
amine@Dell-Optiplex-990:~/Work/vrn$ python -c "import numpy; print(numpy.version.version); print(numpy.__file__)"
1.14.2
/home/amine/.local/lib/python2.7/site-packages/numpy/__init__.pyc
amine@Dell-Optiplex-990:~/Work/vrn$
So on the plus side, I am able to execute Main.lua further down using the pip-installed packages, i.e. without recourse to the apt python packages; but on the minus side, I am again hit by the Invalid numpy data type error. Sigh...
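For anyone repeating the version check above, here is a minimal sketch that reports which file Python would load for a module without importing it. It assumes Python 3 (`importlib.util.find_spec` is not available on the Python 2.7 used in this thread, where `imp.find_module` plays a similar role), and `locate` is a hypothetical helper name, not part of vrn.

```python
import importlib.util


def locate(module_name):
    """Return the path Python would load for module_name, or None if absent."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised for some malformed or half-initialised modules.
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # A path under ~/.local points at a pip --user install; a path under
    # /usr/lib points at a distribution package (apt or yum).
    for name in ("numpy", "matplotlib"):
        print(name, "->", locate(name))
```

A `None` result means the interpreter cannot find the module at all, which is a different failure from finding the wrong copy.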
Hmm, yes a few people have had that error. I'm not sure what causes it. Are you running Ubuntu? It seems to happen to Ubuntu users.
I have made yet another fresh CentOS 7 installation (configured for development workstation) and followed the vrn installation guide, and got the usual segmentation-fault error. When I deleted the three lines from utils.lua, I got the invalid-numpy data-type error. I think the error is reproducible on CentOS 7.
I noticed that when executing the command sudo ./install-deps, which is part of Torch's installation, the packages numpy and matplotlib get installed although they were initially installed via pip, as shown in the following extract of the standard output:
Dependencies Resolved
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
python-ipython noarch 3.2.1-1.el7 epel 13 k
Installing for dependencies:
PyQt4 x86_64 4.10.1-13.el7 base 2.9 M
agg x86_64 2.5-18.el7 base 145 k
atlas x86_64 3.10.1-12.el7 base 4.5 M
blas x86_64 3.4.2-8.el7 base 399 k
kde-filesystem x86_64 4-47.el7 base 48 k
lapack x86_64 3.4.2-8.el7 base 5.4 M
numpy x86_64 1:1.7.1-11.el7 base 2.8 M
pexpect noarch 2.3-11.el7 base 142 k
phonon x86_64 4.6.0-10.el7 base 205 k
phonon-backend-gstreamer x86_64 2:4.6.3-3.el7 base 140 k
python-ipython-console noarch 3.2.1-1.el7 epel 1.6 M
python-ipython-gui noarch 3.2.1-1.el7 epel 177 k
python-matplotlib x86_64 1.2.0-15.el7 base 26 M
python-mistune x86_64 0.8.3-1.el7 epel 137 k
python-nose noarch 1.3.7-1.el7 base 276 k
python-path noarch 5.2-1.el7 epel 47 k
python-pillow x86_64 2.0.0-19.gitd1c6db8.el7 base 438 k
python-pygments noarch 1.4-10.el7 base 599 k
python-repoze-lru noarch 0.4-3.el7 epel 13 k
python-simplegeneric noarch 0.8-7.el7 epel 12 k
python-zmq x86_64 14.3.1-1.el7 epel 468 k
python2-jsonschema noarch 2.5.1-3.el7 epel 75 k
sip x86_64 4.14.6-4.el7 base 122 k
t1lib x86_64 5.1.2-14.el7 base 166 k
texlive-base noarch 2:2012-38.20130427_r30134.el7 base 325 k
texlive-dvipng noarch 2:svn26689.1.14-38.el7 base 44 k
texlive-dvipng-bin x86_64 2:svn26509.0-38.20130427_r30134.el7 base 63 k
texlive-kpathsea noarch 2:svn28792.0-38.el7 base 140 k
texlive-kpathsea-bin x86_64 2:svn27347.0-38.20130427_r30134.el7 base 40 k
texlive-kpathsea-lib x86_64 2:2012-38.20130427_r30134.el7 base 78 k
Transaction Summary
================================================================================
I thought I would remove them before building torch, thpp, and fblualib. So I deleted torch, and re-performed the following steps.
# Install the Torch distribution.
$ cd $HOME/Work/usr/local
$ git clone https://github.com/torch/distro.git
$ mv distro torch
$ cd torch
$ sudo ./install-deps
[New!] $ sudo yum remove numpy [hit tab for full package name]
[New!] $ sudo yum remove python-matplotlib [hit tab for full package name]
$ sudo ./install.sh
$ source $HOME/Work/usr/local/torch/install/bin/torch-activate
# Install THPP and fb.python for the face alignment code
$ cd $HOME/Work/usr/src
$ git clone https://github.com/1adrianb/thpp.git
$ cd thpp/thpp
$ export Torch_DIR="/home/amine/Work/usr/local/torch/pkg/torch/build/cmake-exports" [if needed]
$ export Torch_DIR="/home/amine/Work/usr/local/torch/install/share/cmake/torch" [xor if needed]
$ THPP_NOFB=1 ./build.sh [sudo does not work here]
# Install fb.python.
$ cd $HOME/Work/usr/src
$ git clone https://github.com/facebook/fblualib.git
$ cd fblualib/fblualib/python
$ luarocks make rockspec/*
The above process did not complain about the two packages I deleted and finished successfully. When I ran vrn.sh (after removing the three suggested lines), I got the same error.
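To check whether more than one copy of a package is still visible after a cleanup like the one above (for example one from yum and one from pip), something along these lines could be used. This is only a sketch: `find_copies` is a hypothetical helper, and it only recognises regular packages that have an `__init__.py`.

```python
import os
import sys


def find_copies(package, search_paths=None):
    """List every directory in search_paths that holds a copy of package."""
    if search_paths is None:
        search_paths = sys.path
    hits = []
    for entry in search_paths:
        # An empty string on sys.path stands for the current directory.
        directory = entry or os.getcwd()
        pkg_init = os.path.join(directory, package, "__init__.py")
        if os.path.isfile(pkg_init) and directory not in hits:
            hits.append(directory)
    return hits


if __name__ == "__main__":
    for name in ("numpy", "matplotlib"):
        copies = find_copies(name)
        # The first entry is the one an import would normally pick up.
        print(name, "found in", copies if copies else "no sys.path entry")
```

More than one directory in the output means an older copy may shadow, or be shadowed by, the intended one, depending on the order of `sys.path`.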
I have been putting too much focus on Python's libraries setup because it appeared to me from reading the related threads that Python's configuration is the culprit.
Anyway, that's about what my time and expertise allow me to do. I hope someone can share with me the exact versions/configurations of anything related (OS, Python, etc.) and in which month, in which day, at what time, and what the exact Cartesian coordinates of the coffee cup on the desk were for the successful installation of this tool chain (ideally the coffee brand as well).
Cheers.
I managed to debug this on someone's Ubuntu 14.04 workstation today. The changes required to get it working are to face-alignment/utils.lua:
- local detections = py.reval('[np.asarray([d.left(), d.top(), d.right(), d.bottom()]) for i, d in enumerate(dets)]',{dets=dets})
+ local detections = py.reval('[np.asarray([d.left(), d.top(), d.right(), d.bottom()],dtype=float) for i, d in enumerate(dets)]',{dets=dets})
If you are also having this problem on CentOS then try the above. Hopefully it'll sort it out.
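A small numpy-only sketch of what the added dtype=float does. The box values are made up and stand in for one dlib rectangle. My reading, which the thread does not confirm, is that the number in "Invalid numpy data type 9" is numpy's internal type number, and that 9 is the C long long integer type; forcing float sidesteps whatever integer type the list would otherwise produce.

```python
import numpy as np

# Stand-in for (d.left(), d.top(), d.right(), d.bottom()) of one detection.
box = [10, 20, 110, 140]

as_inferred = np.asarray(box)             # dtype inferred from the integer inputs
as_float = np.asarray(box, dtype=float)   # dtype requested explicitly, as in the fix

# dtype.num is numpy's internal type number, the kind of value the error reports.
print("inferred:", as_inferred.dtype, "type number", as_inferred.dtype.num)
print("forced:  ", as_float.dtype, "type number", as_float.dtype.num)
print("long long type number:", np.dtype(np.longlong).num)
```

The inferred integer type can differ between platforms and Python versions, which may be why the error only shows up on some setups; the float version is the same everywhere.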
Thanks Aaron,
I confirm that this fix (on file facedetection_dlib.lua, by the way) works both on Ubuntu 16 and CentOS 7.
So to recap, the following change has been performed on vrn/face-alignment/utils.lua:
REMOVE LINE: from mpl_toolkits.mplot3d import Axes3D
REMOVE LINE: import matplotlib.pyplot as plt
REMOVE LINE: import matplotlib.patches as patches
and the following change has been performed on vrn/face-alignment/facedetection_dlib.lua:
REPLACE LINE: local detections = py.reval('[np.asarray([d.left(), d.top(), d.right(), d.bottom()]) for i, d in enumerate(dets)]',{dets=dets})
BY LINE: local detections = py.reval('[np.asarray([d.left(), d.top(), d.right(), d.bottom()],dtype=float) for i, d in enumerate(dets)]',{dets=dets})
Now I get out-of-CUDA-memory issues, but that's a story for another thread perhaps.
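For anyone who prefers to script the two edits from the recap, here is a rough sketch that works on the file contents as text. The function names are hypothetical, and it assumes each import sits on its own line and that the np.asarray call is spelled exactly as quoted above; keep a backup of both files before writing anything back.

```python
IMPORTS_TO_DROP = {
    "from mpl_toolkits.mplot3d import Axes3D",
    "import matplotlib.pyplot as plt",
    "import matplotlib.patches as patches",
}
OLD_CALL = "np.asarray([d.left(), d.top(), d.right(), d.bottom()])"
NEW_CALL = "np.asarray([d.left(), d.top(), d.right(), d.bottom()],dtype=float)"


def drop_matplotlib_imports(text):
    """Remove the three matplotlib import lines (the utils.lua change)."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if line.strip() not in IMPORTS_TO_DROP
    ]
    return "".join(kept)


def force_float_detections(text):
    """Add dtype=float to the detection array (the facedetection_dlib.lua change)."""
    return text.replace(OLD_CALL, NEW_CALL)


if __name__ == "__main__":
    demo = "import numpy as np\nimport matplotlib.pyplot as plt\n"
    print(drop_matplotlib_imports(demo), end="")
    print(force_float_detections("x = " + OLD_CALL))
```

Both functions leave the text unchanged if the expected lines are not there, so running them twice does no harm.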
Cheers.
Nice! Thanks for confirming that this fix works. What GPU are you using? If you have 2GB then you can run the face-alignment network but not the 3D reconstruction network. However, the 3D reconstruction will work fairly well on the CPU anyway, so you can change gpu to cpu in the run.sh file.
Yep you guessed it right, I have a teeny-tiny GPU memory :-) I changed the device from gpu to cpu and it works great now. You saved me a lot of hair-tearing troubleshooting :-) Thanks a lot.
:+1: I'm going to leave this open for a while to stop people asking the same question. :)
Hello,
I went through the threads about not being able to run vrn.sh and getting a segmentation fault but I could not find a solution there. After two weeks of trying to run the script, I am giving up and reaching for help.
I installed a fresh Ubuntu 16.04 for this purpose on an i5 machine with GeForce GTX 1050. I followed the install instructions to the letter (including the supported Cuda/Cudnn versions). In particular, the required Python libraries were installed this way (in case the fresh installation comes with conflicting Python modules):
When everything finished I got the following error running "vrn.sh".
Stepping through the code in Torch yields:
Another Torch stepping right after the previous one yields (Notice that the error tstate mix-up disappears):
Another Torch run complained about some null state, but I could not reproduce the error to attach it here.
So my conclusion was that I am unable to get past line 8 of "Main.lua".
I decided to install the Python libraries using another route: installing numpy and matplotlib through apt (Ignore the unnecessary steps, after two weeks of being annoyed I developed the habit of not trusting the Python/Linux relationship):
After this, I re-installed Torch, THPP, and Fblualib, and when I ran the script "vrn.sh" I got:
So now the script executes up to line 51 of "Main.lua" then complains about "Invalid numpy data type" --- an error for which I found exactly two Google-search entries, neither of which was terribly useful for my limited understanding.
At this stage I could not think anymore. It looks as if the numpy/matplotlib libraries installed through apt get me further in code execution but the complaint in line 51 is mysterious.
For the sake of completeness (or verbosity), I show the initial install process.
S.O.S.