NVlabs / Deep_Object_Pose

Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)

Starting the project #30

Closed Dutzel closed 5 years ago

Dutzel commented 5 years ago

Hi, for my master's thesis, I am interested in trying out this repo for object recognition. Unfortunately, I get the same error every time, even after updating CUDA and torch as required.

Here is the error message:

Loading DOPE parameters from '/home/dustin/catkin_ws/src/dope/config/config_pose.yaml'...
Parameters loaded.
Loading DOPE model '/home/dustin/catkin_ws/src/dope/weights/soup_60.pth'...
/home/dustin/.local/lib/python2.7/site-packages/torch/cuda/__init__.py:114: UserWarning: Found GPU0 GeForce RTX 2080 which requires CUDA_VERSION >= 9000 for optimal performance and fast startup time, but your PyTorch was compiled with CUDA_VERSION 8000. Please install the correct PyTorch binary using instructions from http://pytorch.org
  warnings.warn(incorrect_binary_warn % (d, name, 9000, CUDA_VERSION))
Model loaded in 468.114012003 seconds.
Running DOPE...  (Listening to camera topic: '/dope/webcam_rgb_raw')
Ctrl-C to stop
Traceback (most recent call last):
  File "/home/dustin/catkin_ws/src/dope/src/dope.py", line 271, in <module>
    run_dope_node(params)
  File "/home/dustin/catkin_ws/src/dope/src/dope.py", line 208, in run_dope_node
    config_detect
  File "/home/dustin/catkin_ws/src/dope/src/inference/detector.py", line 265, in detect_object_in_image
    out, seg = net_model(image_torch)
  File "/home/dustin/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dustin/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 112, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/dustin/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dustin/catkin_ws/src/dope/src/inference/detector.py", line 105, in forward
    out1 = self.vgg(x)
  File "/home/dustin/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dustin/.local/lib/python2.7/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/dustin/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dustin/.local/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDNN_STATUS_MAPPING_ERROR


Do you have any idea what I can do to fix this issue? I hope to hear from you. Thank you very much!

Dutzel commented 5 years ago

This is the current state of the CUDA Device Query:

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce RTX 2080"
  CUDA Driver Version / Runtime Version: 10.1 / 9.0
  CUDA Capability Major/Minor version number: 7.5
  Total amount of global memory: 7951 MBytes (8337227776 bytes)
MapSMtoCores for SM 7.5 is undefined.  Default to use 64 Cores/SM
MapSMtoCores for SM 7.5 is undefined.  Default to use 64 Cores/SM
  (46) Multiprocessors, ( 64) CUDA Cores/MP: 2944 CUDA Cores
  GPU Max Clock rate: 1710 MHz (1.71 GHz)
  Memory Clock rate: 7000 Mhz
  Memory Bus Width: 256-bit
  L2 Cache Size: 4194304 bytes
  Maximum Texture Dimension Size (x,y,z): 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers: 1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers: 2D=(32768, 32768), 2048 layers
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 49152 bytes
  Total number of registers available per block: 65536
  Warp size: 32
  Maximum number of threads per multiprocessor: 1024
  Maximum number of threads per block: 1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch: 2147483647 bytes
  Texture alignment: 512 bytes
  Concurrent copy and kernel execution: Yes with 3 copy engine(s)
  Run time limit on kernels: Yes
  Integrated GPU sharing Host Memory: No
  Support host page-locked memory mapping: Yes
  Alignment requirement for Surfaces: Yes
  Device has ECC support: Disabled
  Device supports Unified Addressing (UVA): Yes
  Supports Cooperative Kernel Launch: Yes
  Supports MultiDevice Co-op Kernel Launch: Yes
  Device PCI Domain ID / Bus ID / location ID: 0 / 8 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 9.0, NumDevs = 1
Result = PASS

TontonTremblay commented 5 years ago

I think you did not install the right version of pytorch for your cuda driver. The code should be compatible with pytorch 1.0. Please let me know if this helps you.
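One way to check for such a mismatch is to compare the CUDA version PyTorch was built against with what the GPU needs. The sketch below is only an illustration: the helper names `decode_cuda_version` and `build_is_too_old` are made up for this example, and it assumes the integers in the warning above (8000, 9000) use CUDA's usual `major * 1000 + minor * 10` encoding. The torch calls are guarded so the script still runs when torch is not installed for the current interpreter.

```python
def decode_cuda_version(code):
    """Decode a CUDA integer version (major*1000 + minor*10) into (major, minor)."""
    return code // 1000, (code % 1000) // 10


def build_is_too_old(compiled_code, required_code):
    """Return True if the compiled CUDA version is older than the required one."""
    return decode_cuda_version(compiled_code) < decode_cuda_version(required_code)


if __name__ == "__main__":
    # Values taken from the UserWarning in the original report.
    print("build too old: %s" % build_is_too_old(8000, 9000))

    try:
        import torch
    except ImportError:
        print("torch is not installed for this interpreter")
    else:
        # torch.version.cuda is the CUDA version the binary was built with.
        print("torch %s built for CUDA %s" % (torch.__version__, torch.version.cuda))
        if torch.cuda.is_available():
            name = torch.cuda.get_device_name(0)
            capability = torch.cuda.get_device_capability(0)
            print("GPU 0: %s, compute capability %d.%d" % (name, capability[0], capability[1]))
```

If the reported build version is lower than what the warning asks for, installing a wheel built for a newer CUDA version is the likely fix.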

Dutzel commented 5 years ago

@TontonTremblay Thanks for that! It worked... I just had some hidden issues...

What solved my problem:

  1. I went to: https://pytorch.org/get-started/previous-versions/

  2. Downloaded "cu90/torch-1.0.0-cp27-cp27mu-linux_x86_64.whl" (the wheel for Python 2.7; the "cp27mu" tag denotes a wide-unicode, i.e. UCS4, build)

  3. Went to the downloads folder and ran "pip install cu90/torch-1.0.0-cp27-cp27mu-linux_x86_64.whl"

  4. Run: (1) a terminal for "roscore"; (2) a terminal for "rosrun dope camera.py"; (3) a terminal for "rosrun dope dope.py"

  5. Run "rosrun rviz rviz" for visualization
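Picking the right wheel in step 2 depends on the tags in its file name. The sketch below splits a wheel name into its parts and shows how a Python 2.7 interpreter's unicode width maps to the "cp27m" or "cp27mu" ABI tag. The function names `parse_wheel_name` and `py27_abi_tag` are hypothetical helpers written for this example, and it assumes a wheel name without the optional build tag.

```python
import sys


def parse_wheel_name(path):
    """Split a wheel file name into its five dash-separated tags."""
    filename = path.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel name: %r" % filename)
    keys = ("name", "version", "python", "abi", "platform")
    return dict(zip(keys, parts))


def py27_abi_tag(maxunicode):
    """Map sys.maxunicode of a CPython 2.7 build to its wheel ABI tag."""
    # Wide (UCS4) builds report 0x10FFFF; narrow (UCS2) builds report 0xFFFF.
    return "cp27mu" if maxunicode > 0xFFFF else "cp27m"


if __name__ == "__main__":
    wheel = parse_wheel_name("cu90/torch-1.0.0-cp27-cp27mu-linux_x86_64.whl")
    print("wheel ABI tag: %s" % wheel["abi"])
    # The comparison is only meaningful when run under Python 2.7 itself.
    if sys.version_info[:2] == (2, 7):
        print("this interpreter needs: %s" % py27_abi_tag(sys.maxunicode))
```

Run it with the same interpreter that ROS uses (for example `python2.7`), so the reported ABI tag matches the environment the wheel will be installed into.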

Don't be confused by this THCudaCheck message:

Model loaded in 5.1633849144 seconds.
Running DOPE...  (Listening to camera topic: '/dope/webcam_rgb_raw')
Ctrl-C to stop
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument

CUDA Version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

PyTorch Version

Python 2.7.12 (default, Nov 12 2018, 14:36:49)
>>> import torch
>>> print(torch.__version__)
1.0.0

TontonTremblay commented 5 years ago

The code might not be compatible with the newest version of pytorch.

TontonTremblay commented 5 years ago

Were you successful in solving this issue?

g1y5x3 commented 5 years ago

I had the same issue with a GeForce RTX 2060 and CUDA driver 9.0.

I got the exact same error message at the exact same location.

After running the following installation command:

python2.7 -m pip install torch==1.0.0 -f https://download.pytorch.org/whl/cu90/torch-1.0.0-cp27-cp27mu-linux_x86_64.whl

we were able to run the package successfully!

avinashsen707 commented 4 years ago

I also dealt with the same issue when installing PyTorch. When I looked into it, I found that both Python 2.7 and Python 3 were installed on my system, and the installation was going to Python 3.

After that, I switched to Python 2.7 by using pip2 for the install command, and that solved it.

I am sharing this in case it helps. Thank you!
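A quick way to avoid installing into the wrong Python is to invoke pip through the interpreter itself. The sketch below is illustrative only: `pip_install_command` and `interpreter_summary` are hypothetical helper names, and it merely builds and prints the command rather than running it.

```python
import sys


def pip_install_command(python_exe, requirement):
    """Build a pip command bound to one specific interpreter."""
    # "python -m pip" installs into the interpreter that runs it,
    # regardless of which "pip" executable comes first on PATH.
    return [python_exe, "-m", "pip", "install", requirement]


def interpreter_summary():
    """Describe the interpreter executing this script."""
    return "%s (Python %d.%d)" % (
        sys.executable, sys.version_info[0], sys.version_info[1])


if __name__ == "__main__":
    print(interpreter_summary())
    print(" ".join(pip_install_command(sys.executable, "torch==1.0.0")))
```

Running this script with `python2.7` and then with `python3` shows which interpreter each name resolves to, and the printed command targets that same interpreter.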