matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
Other
24.52k stars 11.68k forks

Tensorflow 2.6 and RTX3090 #2708

Open smittyjaggerman opened 2 years ago

smittyjaggerman commented 2 years ago

I've got the following problem:

I am training a TensorFlow 2.6 port of Matterport's Mask R-CNN on my RTX 3090. I've installed CUDA 11.4 and cuDNN. The GPU is shown in TensorBoard.

While training, the GPU sits idle at 0% GPU-util and then suddenly spikes after a while as it processes a batch of images. Beforehand, the CPU (64 cores) runs at 100% on ~2 cores.

Why is there so much idle time? TensorBoard says there is no input-pipeline "lag".

Any suggestions?

Specs:

Linux, RTX3090, 64-core CPU (EPYC), CUDNN 8.0.2, CUDA 11.4.1

AndySung320 commented 2 years ago

I work with an RTX 3060, Python 3.8, cuda_11.0.2_451.48_win10, cudnn-11.0-windows-x64-v8.0.4.30, TF 2.3.0, and Keras 2.4.3, and it runs well on the GPU. By the way, I use https://github.com/leekunhee/Mask_RCNN for TF 2.x.

BlueTurtle01 commented 2 years ago

I found that transferring the model to my 3090 took about 75% of the total time. Inference was actually faster on the CPU, but I was only predicting one new image at a time. If you were doing batches, though, at some point the GPU's faster inference would outweigh the extra time needed to move the model and data onto it.
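The trade-off described above can be put into rough numbers. This is only a back-of-the-envelope sketch: it assumes a fixed one-time transfer cost and constant per-image times, and the function name and all timings are hypothetical, not measurements from this thread.

```python
import math


def break_even_images(transfer_s, cpu_per_image_s, gpu_per_image_s):
    """Smallest image count at which the GPU path is strictly faster overall.

    Simplified model (an assumption, not a measurement):
        GPU total = transfer_s + n * gpu_per_image_s
        CPU total = n * cpu_per_image_s
    """
    saving_per_image = cpu_per_image_s - gpu_per_image_s
    if saving_per_image <= 0:
        # The GPU is not faster per image, so it never recovers the transfer cost.
        return None
    # GPU wins once n > transfer_s / saving_per_image.
    return math.floor(transfer_s / saving_per_image) + 1


# Hypothetical timings: 6 s to move the model to the GPU,
# 3 s per image on the CPU, 1 s per image on the GPU.
print(break_even_images(6.0, 3.0, 1.0))  # 4
```

With a single image the transfer cost dominates, which matches the observation above; only measuring your own transfer and per-image times will tell you where the real break-even point is.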

noelquah commented 2 years ago

It's quite likely that you have incompatible CUDA and cuDNN versions. I believe your CUDA version should be 11.2 for TensorFlow 2.6.

I would recommend using TensorFlow 2.5, though. I ran into some inference issues with 2.6, and 2.5 uses the same CUDA version.
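One way to check for the kind of version mismatch suggested above is to compare the installed CUDA and cuDNN versions against the configurations the TensorFlow wheels were tested with. The sketch below is illustrative only: the table is filled in from memory of TensorFlow's published tested build configurations and should be verified against the official install docs, and the helper names are hypothetical. The optional last part uses `tf.sysconfig.get_build_info()` and `tf.config.list_physical_devices()` to print what an installed TensorFlow was built against; it is skipped if TensorFlow is not installed.

```python
# TensorFlow release -> (CUDA, cuDNN) it was tested with.
# Assumption: copied from memory of the official "tested build configurations"
# table; verify against the TensorFlow install docs before relying on it.
TESTED_CONFIGS = {
    "2.4": ("11.0", "8.0"),
    "2.5": ("11.2", "8.1"),
    "2.6": ("11.2", "8.1"),
}


def major_minor(version):
    """Reduce a version string such as '11.4.1' to '11.4'."""
    return ".".join(version.split(".")[:2])


def check_versions(tf_version, cuda_version, cudnn_version):
    """Return a list of differences from the tested configuration."""
    key = major_minor(tf_version)
    if key not in TESTED_CONFIGS:
        return [f"no entry for TensorFlow {key} in this table"]
    expected_cuda, expected_cudnn = TESTED_CONFIGS[key]
    problems = []
    if major_minor(cuda_version) != expected_cuda:
        problems.append(
            f"CUDA {cuda_version} installed, TensorFlow {key} was tested with {expected_cuda}"
        )
    if major_minor(cudnn_version) != expected_cudnn:
        problems.append(
            f"cuDNN {cudnn_version} installed, TensorFlow {key} was tested with {expected_cudnn}"
        )
    return problems


if __name__ == "__main__":
    # Versions taken from the original post in this thread.
    for line in check_versions("2.6.0", "11.4.1", "8.0.2"):
        print(line)

    # Optional: report what the installed TensorFlow build expects.
    try:
        import tensorflow as tf

        build = tf.sysconfig.get_build_info()
        print("built with CUDA:", build.get("cuda_version"))
        print("built with cuDNN:", build.get("cudnn_version"))
        print("visible GPUs:", tf.config.list_physical_devices("GPU"))
    except ImportError:
        print("TensorFlow is not installed in this environment")
```

A reported difference does not prove the setup is broken, since newer CUDA 11.x releases are often able to run binaries built for older ones, but it is a cheap first thing to rule out.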