Open Spoon94 opened 6 years ago
I tested it on a Jetson TX2 and there were no errors.
Hi @Spoon94, we haven't tested Detectron on TX1 ourselves but given that Caffe2 supports TX1 you should be able to run Detectron on TX1. It seems that @CarryJzzZ managed to run it on TX2 so maybe he can share potential additional tips with you.
Two things to double-check:
Did you install Caffe2 following the installation instructions for Tegra?
Did you confirm that the Caffe2 GPU build was successful?
# This must print a number > 0 in order to use Detectron
python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
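Both checks can also be run from one small script. This is only a sketch: `caffe2_gpu_status` is a hypothetical helper name, and the only Caffe2 call it relies on is `workspace.NumCudaDevices()` from the one-liner above.

```python
from __future__ import print_function


def caffe2_gpu_status():
    """Return (importable, num_gpus) for the local Caffe2 install."""
    try:
        from caffe2.python import workspace
    except ImportError:
        # Caffe2 is not importable in this environment.
        return False, 0
    return True, int(workspace.NumCudaDevices())


if __name__ == "__main__":
    importable, num_gpus = caffe2_gpu_status()
    if not importable:
        print("Failure: caffe2.python could not be imported")
    elif num_gpus == 0:
        print("Caffe2 imported, but no CUDA devices are visible")
    else:
        print("Success: Caffe2 sees %d CUDA device(s)" % num_gpus)
```

A device count greater than 0 only shows that Caffe2 can see the GPU. As the reports in this thread show, the Detectron operator tests can still fail after this check passes.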
@ir413 The following command prints 1, but errors occur when I run the Caffe2 tests.
# This must print a number > 0 in order to use Detectron
python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
@CarryJzzZ Did you build with cmake or with the shell script?
Hello @Spoon94, did you solve the issue?
@ir413 I am getting the same error as @Spoon94.
The output of python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
is 4.
I installed Caffe2 using conda:
conda install -c caffe2 caffe2-cuda8.0-cudnn7
Below is the output of nvidia-smi. I am using Detectron, and therefore CUDA, for the first time. Can you please help here?
Fri Mar 2 14:51:07 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111 Driver Version: 384.111 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 TITAN Xp Off | 00000000:05:00.0 On | N/A |
| 23% 37C P8 11W / 250W | 695MiB / 12181MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 TITAN Xp Off | 00000000:06:00.0 Off | N/A |
| 23% 40C P8 11W / 250W | 2MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 TITAN Xp Off | 00000000:09:00.0 Off | N/A |
| 23% 41C P8 9W / 250W | 2MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 TITAN Xp Off | 00000000:0A:00.0 Off | N/A |
| 23% 34C P8 10W / 250W | 2MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1263 G /usr/lib/xorg/Xorg 433MiB |
| 0 2383 G compiz 186MiB |
| 0 16658 G ...-token=1299DE8AA85A5380C75C435BF9B1C466 49MiB |
| 0 20369 G /usr/lib/firefox/firefox 23MiB |
+-----------------------------------------------------------------------------+
I have also seen this error on a desktop machine with Ubuntu 16.04 and an NVIDIA 1060 card.
(detectron) mfe@mfe-ubuntu:~/code/detectron$ python2 tests/test_spatial_narrow_as_op.py
No handlers could be found for logger "caffe2.python.net_drawer"
net_drawer will not run correctly. Please install the correct dependencies.
E0406 12:55:08.916733 3932 init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0406 12:55:08.916744 3932 init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0406 12:55:08.916746 3932 init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
Found Detectron ops lib: /home/mfe/anaconda3/envs/detectron/lib/libcaffe2_detectron_ops_gpu.so
E.E
======================================================================
ERROR: test_large_forward (__main__.SpatialNarrowAsOpTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "tests/test_spatial_narrow_as_op.py", line 68, in test_large_forward
self._run_test(A, B)
File "tests/test_spatial_narrow_as_op.py", line 39, in _run_test
workspace.RunOperatorOnce(op)
File "/home/mfe/anaconda3/envs/detectron/lib/python2.7/site-packages/caffe2/python/workspace.py", line 165, in RunOperatorOnce
return C.run_operator_once(StringifyProto(operator))
RuntimeError: [enforce fail at context_gpu.h:155] . Encountered CUDA error: invalid device function Error from operator:
input: "A" input: "B" output: "C" name: "" type: "SpatialNarrowAs" device_option { device_type: 1 cuda_gpu_id: 0 }
======================================================================
ERROR: test_small_forward_and_gradient (__main__.SpatialNarrowAsOpTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "tests/test_spatial_narrow_as_op.py", line 59, in test_small_forward_and_gradient
self._run_test(A, B, check_grad=True)
File "tests/test_spatial_narrow_as_op.py", line 39, in _run_test
workspace.RunOperatorOnce(op)
File "/home/mfe/anaconda3/envs/detectron/lib/python2.7/site-packages/caffe2/python/workspace.py", line 165, in RunOperatorOnce
return C.run_operator_once(StringifyProto(operator))
RuntimeError: [enforce fail at context_gpu.h:155] . Encountered CUDA error: invalid device function Error from operator:
input: "A" input: "B" output: "C" name: "" type: "SpatialNarrowAs" device_option { device_type: 1 cuda_gpu_id: 0 }
----------------------------------------------------------------------
Ran 3 tests in 0.354s
FAILED (errors=2)
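To compare reports like this one, the failing operator and the CUDA error can be pulled out of the unittest output with a few regular expressions. This is a sketch only: `summarize_failures` is a hypothetical helper, and the patterns assume the log layout pasted above.

```python
from __future__ import print_function
import re

# Patterns assume the Caffe2 error layout shown in the pasted output.
CUDA_ERROR_RE = re.compile(r"Encountered CUDA error: (.+?)\s+Error from operator")
OP_TYPE_RE = re.compile(r'type: "([^"]+)"')
GPU_ID_RE = re.compile(r"cuda_gpu_id: (\d+)")


def summarize_failures(log_text):
    """Return unique (cuda_error, op_type, gpu_id) tuples in order of appearance."""
    summary = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        err = CUDA_ERROR_RE.search(line)
        if err is None:
            continue
        # The operator description follows the error message, either on the
        # rest of the same line or on the next line.
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        context = line[err.end():] + " " + next_line
        op = OP_TYPE_RE.search(context)
        gpu = GPU_ID_RE.search(context)
        entry = (
            err.group(1),
            op.group(1) if op else None,
            int(gpu.group(1)) if gpu else None,
        )
        if entry not in summary:
            summary.append(entry)
    return summary


if __name__ == "__main__":
    sample = "\n".join([
        "RuntimeError: [enforce fail at context_gpu.h:155] . "
        "Encountered CUDA error: invalid device function Error from operator:",
        'input: "A" input: "B" output: "C" name: "" type: "SpatialNarrowAs" '
        "device_option { device_type: 1 cuda_gpu_id: 0 }",
    ])
    print(summarize_failures(sample))
```

Both failing tests in the output above reduce to the same entry: the `SpatialNarrowAs` operator on GPU 0 with `invalid device function`.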
I can confirm that caffe2 was installed correctly and can see my gpu:
(detectron) mfe@mfe-ubuntu:~/code/detectron$ python2 -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"
Success
(detectron) mfe@mfe-ubuntu:~/code/detectron$ python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
1
and:
(detectron) mfe@mfe-ubuntu:~/code/detectron$ nvidia-smi
Fri Apr 6 12:58:53 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130 Driver Version: 384.130 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... Off | 00000000:01:00.0 On | N/A |
| 0% 37C P8 9W / 156W | 792MiB / 6069MiB | 1% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1078 G /usr/lib/xorg/Xorg 412MiB |
| 0 1742 G /opt/teamviewer/tv_bin/TeamViewer 1MiB |
| 0 1951 G compiz 108MiB |
| 0 2299 G ...-token=098A14C533A842315853BF4DEAA8A6E9 136MiB |
| 0 25321 G ...-token=A9684E55600A374D0EC8C5B0E9B4F86E 88MiB |
| 0 25420 G ...-token=1276F752EEF983320812E87494AEEE42 42MiB |
+-----------------------------------------------------------------------------+
The output of make is:
(detectron) mfe@mfe-ubuntu:~/code/detectron/lib$ make
python2 setup.py develop --user
running develop
running egg_info
writing Detectron.egg-info/PKG-INFO
writing top-level names to Detectron.egg-info/top_level.txt
writing dependency_links to Detectron.egg-info/dependency_links.txt
reading manifest file 'Detectron.egg-info/SOURCES.txt'
writing manifest file 'Detectron.egg-info/SOURCES.txt'
running build_ext
building 'utils.cython_bbox' extension
creating build
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/utils
gcc -pthread -B /home/mfe/anaconda3/envs/detectron/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/mfe/anaconda3/envs/detectron/lib/python2.7/site-packages/numpy/core/include -I/home/mfe/anaconda3/envs/detectron/include/python2.7 -c utils/cython_bbox.c -o build/temp.linux-x86_64-2.7/utils/cython_bbox.o -Wno-cpp
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/utils
gcc -pthread -shared -B /home/mfe/anaconda3/envs/detectron/compiler_compat -L/home/mfe/anaconda3/envs/detectron/lib -Wl,-rpath=/home/mfe/anaconda3/envs/detectron/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-2.7/utils/cython_bbox.o -L/home/mfe/anaconda3/envs/detectron/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/utils/cython_bbox.so
building 'utils.cython_nms' extension
gcc -pthread -B /home/mfe/anaconda3/envs/detectron/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/mfe/anaconda3/envs/detectron/lib/python2.7/site-packages/numpy/core/include -I/home/mfe/anaconda3/envs/detectron/include/python2.7 -c utils/cython_nms.c -o build/temp.linux-x86_64-2.7/utils/cython_nms.o -Wno-cpp
gcc -pthread -shared -B /home/mfe/anaconda3/envs/detectron/compiler_compat -L/home/mfe/anaconda3/envs/detectron/lib -Wl,-rpath=/home/mfe/anaconda3/envs/detectron/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-2.7/utils/cython_nms.o -L/home/mfe/anaconda3/envs/detectron/lib -lpython2.7 -o build/lib.linux-x86_64-2.7/utils/cython_nms.so
copying build/lib.linux-x86_64-2.7/utils/cython_bbox.so -> utils
copying build/lib.linux-x86_64-2.7/utils/cython_nms.so -> utils
Creating /home/mfe/.local/lib/python2.7/site-packages/Detectron.egg-link (link to .)
Detectron 0.0.0 is already the active version in easy-install.pth
Installed /home/mfe/code/detectron/lib
Processing dependencies for Detectron==0.0.0
Finished processing dependencies for Detectron==0.0.0
Any ideas what's going on?
I have also encountered this problem. Caffe2 and Detectron seem to be installed correctly, with output like:
(detectron) [gaoyefei@dlgpu1 ~/project/Detectron-master]$ python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
4
(detectron) [gaoyefei@dlgpu1 ~/project/Detectron-master/lib]$ make
python2 setup.py develop --user
Compiling utils/cython_bbox.pyx because it changed.
Compiling utils/cython_nms.pyx because it changed.
[1/2] Cythonizing utils/cython_bbox.pyx
[2/2] Cythonizing utils/cython_nms.pyx
...
Installed /home/gaoyefei/project/Detectron-master/lib
Processing dependencies for Detectron==0.0.0
Finished processing dependencies for Detectron==0.0.0
However, when I run:
(detectron) [gaoyefei@dlgpu1 ~/project/Detectron-master]$ python tests/test_spatial_narrow_as_op.py
the error occurs...
ERROR: test_small_forward_and_gradient (__main__.SpatialNarrowAsOpTest)
Traceback (most recent call last):
File "tests/test_spatial_narrow_as_op.py", line 59, in test_small_forward_and_gradient
self._run_test(A, B, check_grad=True)
File "tests/test_spatial_narrow_as_op.py", line 39, in _run_test
workspace.RunOperatorOnce(op)
File "/home/gaoyefei/miniconda3/envs/detectron/lib/python2.7/site-packages/caffe2/python/workspace.py", line 165, in RunOperatorOnce
return C.run_operator_once(StringifyProto(operator))
RuntimeError: [enforce fail at context_gpu.h:155] . Encountered CUDA error: invalid device function Error from operator:
input: "A" input: "B" output: "C" name: "" type: "SpatialNarrowAs" device_option { device_type: 1 cuda_gpu_id: 2 }
Ran 3 tests in 2.006s
FAILED (errors=2)
I am using conda with CUDA 8 and cuDNN 7... sad day.
@YefeiGao Did you solve this problem?
@ljd16 I compiled Caffe2 from source instead of using conda, and it works well now.
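A possible explanation for why the source build helps, offered as an assumption since nobody in this thread confirmed it: CUDA reports `invalid device function` when the requested kernel does not exist or was not compiled for the GPU's architecture, and rebuilding from source lets the build target the local GPU. The sketch below applies the CUDA compatibility rule for desktop GPUs to the desktop cards reported here. The compute capabilities come from NVIDIA's published tables. The `built_for` list is hypothetical, because the architectures the conda package was built for are not stated in this thread.

```python
from __future__ import print_function

# Compute capabilities (major, minor) of the desktop cards reported in this
# thread. Jetson boards are left out because the rule below is documented
# for desktop GPUs.
COMPUTE_CAPABILITY = {
    "TITAN Xp": (6, 1),
    "GeForce GTX 1060": (6, 1),
}


def can_run(device_cc, cubin_archs, ptx_archs=()):
    """Apply the CUDA compatibility rule for desktop GPUs.

    A cubin built for sm_X.y runs on devices X.z with z >= y (same major).
    PTX built for compute_X.y can be JIT-compiled on any device >= X.y.
    """
    major, minor = device_cc
    for a_major, a_minor in cubin_archs:
        if a_major == major and a_minor <= minor:
            return True
    return any(tuple(ptx) <= (major, minor) for ptx in ptx_archs)


if __name__ == "__main__":
    # HYPOTHETICAL build targets, for illustration only.
    built_for = [(3, 5), (5, 2)]
    for name, cc in sorted(COMPUTE_CAPABILITY.items()):
        status = "ok" if can_run(cc, built_for) else "no matching kernel image"
        print(name, cc, status)
```

Under these made-up build targets neither card would find a usable kernel, which would match the symptom. Whether that is what happened with the conda package is not established here.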
It looks like Detectron supports the Jetson TX1 (correct me if I am wrong), so I think this issue can be closed.
@CarryJzzZ Hi, did you run into any problems when installing Caffe2 on the TX2?
I have installed Caffe2 successfully, as well as the COCO API. When I run python2 $DETECTRON/tests/test_spatial_narrow_as_op.py, errors occur as follows:
ERROR: test_large_forward (__main__.SpatialNarrowAsOpTest)
Traceback (most recent call last):
File "./tests/test_spatial_narrow_as_op.py", line 68, in test_large_forward
self._run_test(A, B)
File "./tests/test_spatial_narrow_as_op.py", line 39, in _run_test
workspace.RunOperatorOnce(op)
File "/usr/local/caffe2/python/workspace.py", line 176, in RunOperatorOnce
return C.run_operator_once(StringifyProto(operator))
RuntimeError: [enforce fail at context_gpu.h:170] . Encountered CUDA error: invalid device function Error from operator:
input: "A" input: "B" output: "C" name: "" type: "SpatialNarrowAs" device_option { device_type: 1 cuda_gpu_id: 0 }
======================================================================
ERROR: test_small_forward_and_gradient (__main__.SpatialNarrowAsOpTest)
Traceback (most recent call last):
File "./tests/test_spatial_narrow_as_op.py", line 59, in test_small_forward_and_gradient
self._run_test(A, B, check_grad=True)
File "./tests/test_spatial_narrow_as_op.py", line 39, in _run_test
workspace.RunOperatorOnce(op)
File "/usr/local/caffe2/python/workspace.py", line 176, in RunOperatorOnce
return C.run_operator_once(StringifyProto(operator))
RuntimeError: [enforce fail at context_gpu.h:170] . Encountered CUDA error: invalid device function Error from operator:
input: "A" input: "B" output: "C" name: "" type: "SpatialNarrowAs" device_option { device_type: 1 cuda_gpu_id: 0 }
Ran 3 tests in 1.526s
FAILED (errors=2)
I have no idea about this.