AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

'CUDA error' and exit when I use 'darknet.exe' train and generate text RNN #2084

Open chinapsu opened 5 years ago

chinapsu commented 5 years ago

Hello, I built YOLOv3 on Windows 10, and training on images with the GPU succeeds. Training and generating text with an RNN, as described at https://pjreddie.com/darknet/rnns-in-darknet/, also succeeds when I use 'darknet_no_gpu.exe'.

But 'darknet.exe' displays 'CUDA error' and exits when I use it to train and generate text.

AlexeyAB commented 5 years ago

@chinapsu Hi,

Try copying the files cublas64_100.dll, cudart64_100.dll, cufft64_100.dll, cusolver64_100.dll, cusparse64_100.dll and cudnn64_7.dll into the same directory as darknet.exe

from C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin

and from C:\Program Files\NVIDIA GPU Computing Toolkit\cudnn7.3.1\bin
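The copy step above can be scripted. The following is only a minimal sketch in Python: the helper name `copy_dlls` and the destination path in the usage section are hypothetical, while the DLL names and source directories are the ones listed above. It reports which files were not found, which can help spot a CUDA or cuDNN version mismatch.

```python
import shutil
from pathlib import Path

# DLL names as listed in the comment above (CUDA 10.0 and cuDNN 7 builds).
DLL_NAMES = [
    "cublas64_100.dll", "cudart64_100.dll", "cufft64_100.dll",
    "cusolver64_100.dll", "cusparse64_100.dll", "cudnn64_7.dll",
]


def copy_dlls(source_dirs, dest_dir, names=DLL_NAMES):
    """Copy each named DLL from the first source directory that contains it.

    Returns a (copied, missing) pair of lists of file names.
    The destination directory must already exist.
    """
    dest = Path(dest_dir)
    copied, missing = [], []
    for name in names:
        for src in map(Path, source_dirs):
            candidate = src / name
            if candidate.is_file():
                shutil.copy2(candidate, dest / name)
                copied.append(name)
                break
        else:
            # No source directory had this file.
            missing.append(name)
    return copied, missing


if __name__ == "__main__":
    sources = [
        r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin",
        r"C:\Program Files\NVIDIA GPU Computing Toolkit\cudnn7.3.1\bin",
    ]
    # Hypothetical location of darknet.exe; adjust to your own build output.
    copied, missing = copy_dlls(sources, r"C:\darknet\build\darknet\x64")
    print("copied:", copied)
    print("missing:", missing)
```

Any name printed under "missing" was not present in either source directory, which would suggest that a different CUDA or cuDNN version is installed than the one darknet.exe was built against.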

chinapsu commented 5 years ago

I copied all the CUDA 8 DLLs into the same directory as darknet.exe. It then displayed a CUDA error about no memory, so I changed subdivisions=1 to subdivisions=16 in the rnn.train.cfg file. But the program still crashes after running for about 10 seconds. The system log reports an error in the cudnn64_6.dll module, exception code: 0xc0000005. Is this caused by insufficient memory? Is there any other solution?
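For reference, the cfg change described above can be sketched in Python. This assumes the usual Darknet behaviour of splitting each batch into `subdivisions` smaller chunks, so a larger value lowers GPU memory use per step. The helper names `set_cfg_option` and `per_step_batch` are hypothetical, and the sample cfg text (including batch=128) is illustrative, not the real contents of rnn.train.cfg.

```python
import re


def set_cfg_option(cfg_text, key, value):
    """Return cfg_text with the first `key=...` line set to `value`."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)[^\r\n]*", re.MULTILINE
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + str(value), cfg_text, count=1
    )
    if count == 0:
        raise KeyError(f"{key} not found in cfg")
    return new_text


def per_step_batch(batch, subdivisions):
    """Samples processed on the GPU at once, assuming batch / subdivisions."""
    return batch // subdivisions


# Illustrative cfg fragment only; values are assumptions for the example.
sample = "[net]\nsubdivisions=1\nbatch=128\n"
updated = set_cfg_option(sample, "subdivisions", 16)
print(updated)
print(per_step_batch(128, 16))
```

If the crash persists regardless of the subdivisions value, that would point away from memory pressure as the only cause.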

AlexeyAB commented 5 years ago

@chinapsu

chinapsu commented 5 years ago

I tried setting subdivisions=128, but the program still crashes after running for about 10 seconds.

I tried two computers, and both have this problem. Both use a GTX 1060 6GB graphics card. Below is the nvidia-smi output from the two machines.

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi.exe
Thu Dec 20 23:37:23 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 391.35                 Driver Version: 391.35                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106... WDDM  | 00000000:01:00.0  On |                  N/A |
| 38%   31C    P8     9W / 150W |    176MiB /  6144MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1380    C+G   Insufficient Permissions                   N/A      |
|    0      3740    C+G   ...x64__8wekyb3d8bbwe\Microsoft.Photos.exe N/A      |
|    0      4152    C+G   ...hell.Experiences.TextInput.InputApp.exe N/A      |
|    0      6884    C+G   C:\Windows\explorer.exe                    N/A      |
|    0      7332    C+G   ...2.0_x64__8wekyb3d8bbwe\WinStore.App.exe N/A      |
|    0      7924    C+G   ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A      |
|    0      8108    C+G   ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A      |
|    0      8572    C+G   ...36.52.0_x64__kzf8qxf38zg5c\SkypeApp.exe N/A      |
|    0      8808    C+G   ...osoft.LockApp_cw5n1h2txyewy\LockApp.exe N/A      |
|    0     10408    C+G   ...mmersiveControlPanel\SystemSettings.exe N/A      |
+-----------------------------------------------------------------------------+

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi.exe
Thu Dec 20 23:31:19 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 416.94       Driver Version: 416.94       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106... WDDM  | 00000000:01:00.0  On |                  N/A |
|  0%   36C    P8    10W / 130W |    262MiB /  6144MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2952    C+G   C:\Windows\system32\Dwm.exe                N/A      |
|    0      4008    C+G   ...6)\Google\Chrome\Application\chrome.exe N/A      |
+-----------------------------------------------------------------------------+

C:\Program Files\NVIDIA Corporation\NVSMI>

jlorrain commented 5 years ago

I get the same error after a few minutes of training on my GTX 1080 8GB!