leggedrobotics / darknet_ros

YOLO ROS: Real-Time Object Detection for ROS
BSD 3-Clause "New" or "Revised" License

Unable to access GPU Acceleration #327

Closed devrostech closed 3 years ago

devrostech commented 3 years ago

Issue: Hi, I am unable to get GPU acceleration working for this package on my GPU (NVIDIA RTX 3060).

Initially, I installed CUDA 10.1 along with cuDNN and followed all the steps in this tutorial. I set the arch code to 86 for my GPU, but it seems that CUDA 10.1 does not support my GPU architecture.

Which versions of CUDA, cuDNN, TensorFlow, OpenCV, etc. do I need to install, and what procedure should I follow?

System:

Ar-Ray-code commented 3 years ago

Hi.

I don't have an RTX 30 card, so I can't say for sure, but I'd like to offer some suggestions.

The RTX 30 series is an Ampere architecture GPU, so it needs CUDA 11 or later to work.

Please check this link.

So, I would recommend installing CUDA 11.3.

CUDA 11.3 on Ubuntu18 -> https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=18.04&target_type=deb_local
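The version requirement above could be checked at configure time. The fragment below is a minimal sketch, not part of darknet_ros: it assumes the build uses CMake's legacy FindCUDA module (which sets `CUDA_FOUND` and `CUDA_VERSION`), and it assumes 11.1 as the first CUDA release whose `nvcc` accepts `sm_86` targets. The message texts are illustrative.

```cmake
# Sketch only: fragment for a CMakeLists.txt that uses the legacy FindCUDA module.
find_package(CUDA QUIET)

if(NOT CUDA_FOUND)
  # No toolkit detected, so no GPU code can be compiled at all.
  message(STATUS "CUDA not found: building without GPU acceleration")
elseif(CUDA_VERSION VERSION_LESS "11.1")
  # Assumption: sm_86 (RTX 30 series) is only accepted by nvcc from CUDA 11.1 on.
  message(WARNING "CUDA ${CUDA_VERSION} found, but sm_86 needs CUDA 11.1 or later")
else()
  message(STATUS "CUDA ${CUDA_VERSION} found: sm_86 can be targeted")
endif()
```

With CUDA 10.1 installed, as in the original report, this would take the warning branch instead of failing later inside `nvcc`.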

darknet_ros works without installing cuDNN or TensorFlow.

Please contact me if you make any progress.

I hope it works.

devrostech commented 3 years ago

Hi @Ar-Ray-code, after posting this question I solved the error myself. I already had CUDA 11.3 and cuDNN 8+ installed. I removed the other architectures and added sm_86 for my GPU arch, then I built it and it worked.

I also added some lines to the CMake file of darknet_ros:

```cmake
IF (CUDA_VERSION VERSION_LESS "11.0")
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_30,code=sm_30")
ENDIF()
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_35,code=sm_35")

set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_50,code=sm_50")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_52,code=sm_52")

IF (CUDA_VERSION VERSION_GREATER "7.6")
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_60,code=sm_60")
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_61,code=sm_61")
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_62,code=sm_62")
ENDIF()

IF ((CUDA_VERSION VERSION_GREATER "9.0") OR (CUDA_VERSION VERSION_EQUAL "9.0"))
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_70,code=sm_70")
ENDIF()

IF ((CUDA_VERSION VERSION_GREATER "10.0") OR (CUDA_VERSION VERSION_EQUAL "10.0"))
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_75,code=sm_75")
ENDIF()

IF ((CUDA_VERSION VERSION_GREATER "11.0") OR (CUDA_VERSION VERSION_EQUAL "11.0"))
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_80,code=sm_80")
ENDIF()

IF ((CUDA_VERSION VERSION_GREATER "11.2") OR (CUDA_VERSION VERSION_EQUAL "11.2"))
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_86,code=sm_86")
ENDIF()
```
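For readers who only need the RTX 30 series, the "remove the other architectures and keep sm_86" step mentioned in this comment might look roughly like the fragment below. This is a sketch under assumptions, not the exact change made in this thread: it assumes the legacy FindCUDA module and its `CUDA_NVCC_FLAGS` variable, as in the snippet above, and it assumes CUDA 11.1 as the minimum version for `sm_86`.

```cmake
# Sketch only: compile device code for a single architecture (compute capability 8.6).
find_package(CUDA QUIET)

# NOT ... VERSION_LESS is used instead of VERSION_GREATER_EQUAL so the
# condition also parses on CMake releases older than 3.7.
if(CUDA_FOUND AND NOT CUDA_VERSION VERSION_LESS "11.1")
  set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_86,code=sm_86")
endif()
```

Targeting one architecture keeps compile time and binary size down, at the cost that the resulting build contains no device code for GPUs of other compute capabilities.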

devrostech commented 3 years ago

@Ar-Ray-code I need help with training on a custom dataset. Could we connect?

Ar-Ray-code commented 3 years ago

I am also working on a darknet tutorial. Please check out this repository: https://github.com/Ar-Ray-code/Darknet-quick-training-tutorial.

Please close this issue.