Closed OswaldoBornemann closed 5 years ago
I changed the device from GPU to CPU, and the output is below:
zeng_ruihong@GPU-server:~/vrn$ ./run.sh
Scanning directory for data...
Found 5 images
5 images require a face detector
Initialising python libs...
Initialising detector...
/home/zeng_ruihong/torch/install/bin/luajit: ...eng_ruihong/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
.../zeng_ruihong/torch/install/share/lua/5.1/cudnn/init.lua:171: assertion failed!
stack traceback:
[C]: in function 'assert'
.../zeng_ruihong/torch/install/share/lua/5.1/cudnn/init.lua:171: in function 'toDescriptor'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:123: in function 'createIODescriptors'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:188: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186>
[C]: in function 'xpcall'
...eng_ruihong/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
...ng_ruihong/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'func'
..._ruihong/torch/install/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
..._ruihong/torch/install/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
main.lua:63: in main chunk
[C]: in function 'dofile'
...hong/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...eng_ruihong/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
...ng_ruihong/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'func'
..._ruihong/torch/install/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
..._ruihong/torch/install/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
main.lua:63: in main chunk
[C]: in function 'dofile'
...hong/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
Cropped and scaled AFLW_image00046.jpg
Cropped and scaled AFLW_image00095.jpg
Cropped and scaled AFLW_image00190.jpg
Cropped and scaled AFLW_image00656.jpg
Cropped and scaled asj.jpg
Processed AFLW_image00190.
Processed asj.
Processed AFLW_image00095.
Processed AFLW_image00656.
Processed AFLW_image00046.
: cannot connect to X server
: cannot connect to X server
: cannot connect to X server
: cannot connect to X server
: cannot connect to X server
CPU mode isn't supported. Run nvidia-smi and confirm that you do actually have memory free on the GPU.
May I ask how I can specify the GPU device that torch7 uses? I am new to Lua. Thanks @AaronJackson
You got the idea in Python, but to make it apply to all applications run from the current shell, you need to export the device you want to use, i.e.:
export CUDA_VISIBLE_DEVICES=2
./run.sh
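If you are not sure which index to export, a small helper can choose the GPU with the most free memory. This is only a sketch: `pick_free_gpu` is a hypothetical name, and it assumes the input is the CSV that `nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits` prints. CUDA may also number devices differently from nvidia-smi unless `CUDA_DEVICE_ORDER=PCI_BUS_ID` is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "index, free_MiB" lines on stdin and prints
# the index of the GPU with the most free memory.
pick_free_gpu() {
  awk -F, 'NR == 1 || $2 + 0 > max { max = $2 + 0; idx = $1 + 0 } END { print idx }'
}

# Usage on a machine with the NVIDIA driver installed (not run here):
#   export CUDA_DEVICE_ORDER=PCI_BUS_ID
#   export CUDA_VISIBLE_DEVICES="$(nvidia-smi --query-gpu=index,memory.free \
#       --format=csv,noheader,nounits | pick_free_gpu)"
#   ./run.sh
```

The helper only parses text, so it can be tried on sample input before it is pointed at a real GPU server.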
@AaronJackson Following your instructions, I got the same error:
zeng_ruihong@GPU-server:~/vrn$ export CUDA_VISIBLE_DEVICES=2
zeng_ruihong@GPU-server:~/vrn$ ./run.sh
Scanning directory for data...
Found 5 images
5 images require a face detector
Initialising python libs...
Initialising detector...
THCudaCheck FAIL file=/home/zeng_ruihong/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/zeng_ruihong/torch/install/bin/luajit: .../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:351: cuda runtime error (2) : out of memory at /home/zeng_ruihong/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
[C]: in function 'read'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:351: in function <.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:245>
[C]: in function 'read'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
...e/zeng_ruihong/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
...e/zeng_ruihong/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
...
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:353: in function 'readObject'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
..._ruihong/torch/install/share/lua/5.1/nngraph/gmodule.lua:495: in function 'read'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
.../zeng_ruihong/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'
main.lua:29: in main chunk
[C]: in function 'dofile'
...hong/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
Cropped and scaled AFLW_image00046.jpg
Cropped and scaled AFLW_image00095.jpg
Cropped and scaled AFLW_image00190.jpg
Cropped and scaled AFLW_image00656.jpg
Cropped and scaled asj.jpg
THCudaCheck FAIL file=/home/zeng_ruihong/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/zeng_ruihong/torch/install/bin/luajit: /home/zeng_ruihong/torch/install/share/lua/5.1/nn/utils.lua:11: cuda runtime error (2) : out of memory at /home/zeng_ruihong/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
[C]: in function 'resize'
/home/zeng_ruihong/torch/install/share/lua/5.1/nn/utils.lua:11: in function 'torch_Storage_type'
/home/zeng_ruihong/torch/install/share/lua/5.1/nn/utils.lua:57: in function 'recursiveType'
...e/zeng_ruihong/torch/install/share/lua/5.1/nn/Module.lua:160: in function 'type'
/home/zeng_ruihong/torch/install/share/lua/5.1/nn/utils.lua:45: in function 'recursiveType'
/home/zeng_ruihong/torch/install/share/lua/5.1/nn/utils.lua:41: in function 'recursiveType'
...e/zeng_ruihong/torch/install/share/lua/5.1/nn/Module.lua:160: in function 'cuda'
process.lua:18: in main chunk
[C]: in function 'dofile'
...hong/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
: cannot connect to X server
: cannot connect to X server
: cannot connect to X server
: cannot connect to X server
: cannot connect to X server
Show me the output of nvidia-smi please
-- Aaron Jackson - M6PIU http://aaronsplace.co.uk/
@AaronJackson
Mon Oct 15 10:06:03 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 00000000:02:00.0 Off |                  N/A |
| 56%   83C    P2   112W / 250W |  11787MiB / 12207MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | 00000000:04:00.0 Off |                  N/A |
| 87%   88C    P2   201W / 250W |  11462MiB / 12207MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX TIT...  Off  | 00000000:84:00.0 Off |                  N/A |
| 22%   35C    P8    18W / 250W |     11MiB / 12207MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     81088      C   python                                     11774MiB |
|    1    105752      C   python                                     11451MiB |
+-----------------------------------------------------------------------------+
Ah, it has been a while since I looked at that script. The variable is exported in run.sh itself, which overrides whatever you set in your shell, so if you open it and change the CUDA_VISIBLE_DEVICES line to 2, you should be good to go.
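An alternative to hard-coding the index is to let the script fall back to a default only when the caller has not set the variable. This is a sketch, not the actual contents of run.sh: the default of 0 is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical replacement for the export line in run.sh:
# keep a value inherited from the calling shell, otherwise fall back to 0.
export CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-0}"
echo "Using GPU(s): $CUDA_VISIBLE_DEVICES"
```

With this form, `CUDA_VISIBLE_DEVICES=2 ./run.sh` would select GPU 2 without editing the script again.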
@AaronJackson glad to hear that. I am very grateful. Thanks!! Now the output is below:
zeng_ruihong@GPU-server:~/vrn$ ./run.sh
Scanning directory for data...
Found 5 images
5 images require a face detector
Initialising python libs...
Initialising detector...
Cropped and scaled AFLW_image00046.jpg
Cropped and scaled AFLW_image00095.jpg
Cropped and scaled AFLW_image00190.jpg
Cropped and scaled AFLW_image00656.jpg
Cropped and scaled asj.jpg
Processed AFLW_image00190.
Processed asj.
Processed AFLW_image00095.
Processed AFLW_image00656.
Processed AFLW_image00046.
qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.
qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.
qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.
qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.
qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.
After I ran export QT_QPA_PLATFORM='offscreen', the output is now:
zeng_ruihong@GPU-server:~/vrn$ ./run.sh
Scanning directory for data...
Found 5 images
5 images require a face detector
Initialising python libs...
Initialising detector...
Cropped and scaled AFLW_image00046.jpg
Cropped and scaled AFLW_image00095.jpg
Cropped and scaled AFLW_image00190.jpg
Cropped and scaled AFLW_image00656.jpg
Cropped and scaled asj.jpg
Processed AFLW_image00190.
Processed asj.
Processed AFLW_image00095.
Processed AFLW_image00656.
Processed AFLW_image00046.
./run.sh: line 90: 91420 Segmentation fault (core dumped) python ../vis.py --image ../$INPUT/scaled/$fname.jpg --volume $fname.raw
./run.sh: line 90: 91424 Segmentation fault (core dumped) python ../vis.py --image ../$INPUT/scaled/$fname.jpg --volume $fname.raw
./run.sh: line 90: 91428 Segmentation fault (core dumped) python ../vis.py --image ../$INPUT/scaled/$fname.jpg --volume $fname.raw
./run.sh: line 90: 91432 Segmentation fault (core dumped) python ../vis.py --image ../$INPUT/scaled/$fname.jpg --volume $fname.raw
./run.sh: line 90: 91436 Segmentation fault (core dumped) python ../vis.py --image ../$INPUT/scaled/$fname.jpg --volume $fname.raw
@AaronJackson
Well the vis script can't display anything because there is no X server. Either connect with X11 forwarding or modify the scripts to use raw2obj instead of vis.
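One way to avoid the crash on a headless server is to check for an X display before calling the visualiser. The guard below is a sketch: `has_display` is a hypothetical helper, and it only tests whether `DISPLAY` is set, not whether the server is actually reachable.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeeds only when an X display is configured.
has_display() {
  [ -n "${DISPLAY:-}" ]
}

if has_display; then
  echo "X display found at $DISPLAY; the vis script can open a window"
else
  echo "No X display; reconnect with 'ssh -X' or use raw2obj instead of vis"
fi
```

Connecting with `ssh -X user@host` normally sets `DISPLAY` on the remote side, so the first branch would be taken in that case.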
Thanks @AaronJackson. Everything is fine now. But when I open the .obj file, the object is all black rather than colored. How can I output the texture image?
I have three GPUs, but only one (11G) is free. I wrote the code below in 'face-aligment/utils.lua'
When I run './run.sh', the output shows: