AaronJackson / vrn

:man: Code for "Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression"
http://aaronsplace.co.uk/papers/jackson2017recon/
MIT License

Segmentation fault #22

Closed yar-resh closed 6 years ago

yar-resh commented 6 years ago

Hello. I have a problem with running ./run.sh script:

./run.sh: line 30: 11690 Segmentation fault      (core dumped) th main.lua -model 2D-FAN-300W.t7 -input ../$INPUT/ -detectFaces true -mode generate -output ../$INPUT/ -device gpu -outputFormat txt
ls: cannot access '*.txt': No such file or directory
ls: cannot access '*.raw': No such file or directory

I have Ubuntu 16.04 LTS, Python 2.7.12, CUDA Version 9.0.176, CUDNN v5.0, and an nVidia GTX 1080 GPU.

Do you have any idea how to resolve this problem? And what additional information can I provide?

Thanks.

AaronJackson commented 6 years ago

Please use CUDA 7.

AaronJackson commented 6 years ago

Sorry, I meant to say CUDA 7.5 or 8.0

yar-resh commented 6 years ago

Thank you for your advice, and sorry for my late response.

As you suggested, I tried installing CUDA 7.5 and later CUDA 8.0. I also switched from Ubuntu 16.04 to Ubuntu 14.04. I could not use CUDA 7.5 because (if I understood correctly) my video card is too recent. But with CUDA 8.0 I get the same error: segmentation fault.

[screenshot from 2017-11-15 11 09 03]

Do you have any idea how else it can be fixed? Thank you.

AaronJackson commented 6 years ago

It definitely does not work with CUDA 9, so that's the first part ruled out. If you open Adrian's face-alignment main.lua in a text editor and open the th interpreter in a terminal, you can run it line by line to see what fails. That will help us figure out where the problem lies.

GregaVrbancic commented 6 years ago

Hi @AaronJackson !

Would this project work without cuda gpu? I'm trying to run it with "device" set to "cpu" and I'm also getting the same output as @yar-resh.

Thank you!

AaronJackson commented 6 years ago

Hi @GregaVrbancic. Same result, but for different reasons. While the VRN code will run on the CPU, the face-alignment code unfortunately will not, as it uses nngraph to build the model, and converting it to float doesn't seem to work.

yzhang559 commented 6 years ago

Hi @AaronJackson, could you indicate a suitable GPU RAM size?

[screenshot from 2017-11-16 17-00-02]

Is this caused by RAM size? Thanks.

AaronJackson commented 6 years ago

@yzhang559 Yes. What GPU do you have? You can check with nvidia-smi
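For anyone who wants to script the nvidia-smi check suggested above, here is a minimal sketch. It assumes nvidia-smi is on the PATH and uses its CSV query mode. The function names are made up for this example and are not part of the vrn repository. The thread does not state how much GPU memory is actually required, so the sketch only reports what the card has.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse lines such as 'GeForce GTX 645, 972' into (name, total_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a GPU name containing commas stays intact.
        name, total = line.rsplit(",", 1)
        gpus.append((name.strip(), int(total.strip())))
    return gpus


def query_gpu_memory():
    """Ask nvidia-smi for each GPU's name and total memory (reported in MiB)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_gpu_memory(out)
```

Calling query_gpu_memory() returns one (name, total_mib) pair per GPU, which can be pasted into a report like this one instead of a screenshot.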

yzhang559 commented 6 years ago

@AaronJackson It's NVIDIA GeForce GTX 645, only 972MB

AaronJackson commented 6 years ago

@yzhang559 Yes, that is not enough.

slamgo commented 6 years ago

Hi @AaronJackson,

[screenshot 2017-11-16 18-53-08]

device =

gpu

/bin/bash: line 1: 5771 Segmentation fault (core dumped) CUDA_VISIBLE_DEVICES=0 th main.lua -model 2D-FAN-300W.t7 -input ../examples/ -detectFaces true -mode generate -output ../examples/ -device gpu -outputFormat txt 2>&1 > /dev/null
cd face-alignment;CUDA_VISIBLE_DEVICES=0 th main.lua -model 2D-FAN-300W.t7 -input ../examples/ -detectFaces true -mode generate -output ../examples/ -device gpu -outputFormat txt 2>&1 > /dev/null;: Segmentation fault
Error using run (line 38)
Failed to run Torch7 script.

I'm going crazy. I have been trying to get this to run on my computer for a week, but it still doesn't work. Do you have any idea how to resolve this problem?

I have CentOS 7, CUDA 8.0, CUDNN v5.1, nVidia GTX 1080 GPU, RAM 8G, CPU i5 7400.

AaronJackson commented 6 years ago

Open a torch shell and run through the code from face alignment, line by line. The issue is probably from one of the requires.

leon-nn commented 6 years ago

Hello @AaronJackson, thanks for the support. Also, how much memory is needed?

I'm also getting an error similar to @yar-resh's. When I go through the requirements in main.lua line by line in th, I get a Segmentation fault (core dumped) for the line local utils = require 'utils'. Moving on to utils.lua, I get a Segmentation fault (core dumped) when I try to run the py.exec lines that import the Python libraries. I then tried importing each library individually, and it seems that the matplotlib libraries cause the segmentation fault. So I chose to import only the numpy library, and th utils.lua runs fine. When I then run run.sh with these lines commented out, I get the following error:

Scanning directory for data...
/home/leon/torch/install/bin/luajit: /home/leon/torch/install/share/lua/5.1/torch/Tensor.lua:466: Wrong size for view. Input size: 1x68x2. Output size: 2
stack traceback:
    [C]: in function 'error'
    /home/leon/torch/install/share/lua/5.1/torch/Tensor.lua:466: in function 'view'
    ./utils.lua:186: in function 'bounding_box'
    ./utils.lua:308: in function 'getFileList'
    main.lua:25: in main chunk
    [C]: in function 'dofile'
    ...leon/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
ls: cannot access '*.txt': No such file or directory
Found Environment variable CUDNN_PATH = /usr/local/cuda/lib64/libcudnn.so.5
ls: cannot access '*.raw': No such file or directory

I'm using Ubuntu 16.04 LTS, CUDA Version 8.0.61, CUDNN R5, NVIDIA GeForce GTX TITAN X, and Python 2.7.14 from Anaconda.
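The "import each library individually" step described above can be automated. The sketch below is an illustration, not code from the repository. It assumes Python 3.3+ on Linux for the driver script and tries each import in a separate child interpreter, so a crash in one library cannot take down the check itself. The function name probe_imports is hypothetical, and the module list should be taken from the py.exec lines in your own copy of utils.lua.

```python
import subprocess
import sys


def probe_imports(modules, python=sys.executable):
    """Import each module in its own interpreter and record the exit code.

    Exit code 0 means the import succeeded. On POSIX systems a negative
    code means the child was killed by a signal; -11 is SIGSEGV on Linux,
    i.e. a segmentation fault.
    """
    results = {}
    for name in modules:
        results[name] = subprocess.call(
            [python, "-c", "import " + name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    return results
```

To test the interpreter that Torch actually uses, pass its path as the python argument, for example probe_imports(["numpy", "matplotlib"], python="/path/to/python2.7"), where the path is a placeholder for your own installation.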

AaronJackson commented 6 years ago

@longu How did you end up with an input of 1x68x2?

leon-nn commented 6 years ago

@AaronJackson I'm not sure, but I found that I had some .t7 files in the examples folder from playing around with the opts parameters (I had changed the outputFormat from txt to t7), and having those files in the input folder caused that error. I deleted them and it runs fine now -- with the code I can produce and visualize the .raw files. However, I still get a segmentation fault message from the line in run.sh that calls vis.py when displaying some of the .raw files, despite being able to view all of them. It seems like for me there are certain issues in calling visualization libraries from Python (matplotlib and visvis): in the case of importing matplotlib in utils.lua, it causes the code to fail as previously described.
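Since leftover .t7 files in the input folder triggered the "Wrong size for view" error here, a quick check before running run.sh may save some time. This is a minimal sketch with a hypothetical helper name. Only the .t7 extension comes from this thread; whether other file types cause similar trouble is not known.

```python
import os


def find_stray_files(folder, stray_exts=(".t7",)):
    """Return sorted names of regular files in `folder` whose extension is in `stray_exts`."""
    return sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
        and os.path.splitext(name)[1].lower() in stray_exts
    )
```

For example, find_stray_files("examples") lists any .t7 files left over from an earlier run with -outputFormat t7, which can then be moved or deleted.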

yzhang559 commented 6 years ago

@longu Maybe you can try adding 'do' at the beginning and 'end' at the end of utils.lua, like this:

[screenshot from 2017-11-17 13-19-22]

JimmyLauren commented 6 years ago

Hi @AaronJackson, I ran your code and it showed up like this. What should I do to get a 3D model?

[screenshot 2017-11-17 17-35-19]

AaronJackson commented 6 years ago

Note to everyone: If your issue does not have anything to do with a segmentation fault, please open a separate support ticket. This is getting impossible to keep track of. I will close this ticket now, but we can continue discussing.

@JimmyLauren haha, oh dear. Did you change any of the lighting code? It is being displayed, but it seems like the lighting code has been messed with, or you have some issues with OpenGL. If you haven't changed the lighting code in render.m, please try starting MATLAB with matlab -softwareopengl -desktop

CalmCapK commented 5 years ago

@yar-resh @AaronJackson I have the same problem