AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

CUDA error with LSTM model after many attempts. All files provided! #6781

Open arnaud-nt2i opened 4 years ago

arnaud-nt2i commented 4 years ago

@AlexeyAB I'm trying to train one of the YOLOv3 LSTM models from #3114, but with all of them I get the same error:

CUDA status = cudaDeviceSynchronize() Error: file: C:/Users/nt2i_arnaud_pauwelyn/Documents/darknet-master(02-10)/darknet-master/src/blas_kernels.cu : axpy_ongpu_offset() : line: 741 : build time: Oct  2 2020 - 10:35:39

CUDA Error: an illegal memory access was encountered

My configuration is the one below, built from the latest version of the repo (I also tried the August 15 version of the repo, and CUDA 10.0). Note that every model without LSTM works great for both training and inference on the same dataset: YOLOv3, YOLOv4, SAM, PAN, and so on.

Attached files: yolo_v3_tiny_lstm.cfg.txt, train.txt, image samples (sample.zip), obj.names.txt, obj.data.txt

 .\darknet.exe detector train data/obj.data cfg/yolo_v3_tiny_lstm.cfg yolov3-tiny.conv.14 -map -dont_show -cuda_debug_sync -benchmark_layers
 CUDA-version: 10020 (10020), cuDNN: 7.6.5, CUDNN_HALF=1, GPU count: 1
 CUDNN_HALF=1
 OpenCV version: 3.4.0
 Prepare additional network for mAP calculation...
 0 : compute_capability = 750, cudnn_half = 1, GPU: GeForce GTX 1660 Ti
net.optimized_memory = 0
mini_batch = 1, batch = 1, time_steps = 1, train = 0

I am not the only one encountering issues when training LSTM models: see #6531 and #6708. What should we do to solve this?

arnaud-nt2i commented 4 years ago

OK, I have found a workaround: using "Yolo v3 optimal" from here: https://github.com/AlexeyAB/darknet/releases/tag/darknet_yolo_v3_optimal

aotiansysu commented 3 years ago

Setting bottleneck=1 for every conv_lstm layer in yolo_v3_tiny_lstm.cfg works for me:

[conv_lstm]
batch_normalize=1
size=3
pad=1
output=128
peephole=0
bottleneck=1
activation=leaky
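
If the cfg has many [conv_lstm] sections, editing each one by hand is error-prone. Below is a minimal sketch of a script that applies the same change to all of them. It is not part of darknet: the function name set_conv_lstm_bottleneck is hypothetical, and it assumes the cfg uses plain `[section]` headers and `key=value` lines, as in the snippet above. Back up the cfg before overwriting it.

```python
def _finish_section(out, in_lstm, seen, value):
    """If a [conv_lstm] section ended without a bottleneck key, add one."""
    if in_lstm and not seen:
        # Insert before any trailing blank lines so the section stays compact.
        idx = len(out)
        while idx > 0 and not out[idx - 1].strip():
            idx -= 1
        out.insert(idx, f"bottleneck={value}")


def set_conv_lstm_bottleneck(cfg_text: str, value: int = 1) -> str:
    """Return cfg_text with bottleneck=<value> in every [conv_lstm] section.

    Hypothetical helper (not a darknet API). Assumes '[section]' headers
    and 'key=value' lines; other sections are left untouched.
    """
    out = []
    in_lstm = False   # are we currently inside a [conv_lstm] section?
    seen = False      # did this section already contain a bottleneck key?

    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A new section starts: close the previous one first.
            _finish_section(out, in_lstm, seen, value)
            in_lstm = stripped == "[conv_lstm]"
            seen = False
        elif in_lstm and stripped.split("=")[0].strip() == "bottleneck":
            # Overwrite an existing setting instead of duplicating the key.
            line = f"bottleneck={value}"
            seen = True
        out.append(line)

    # Close the last section of the file.
    _finish_section(out, in_lstm, seen, value)
    return "\n".join(out) + "\n"
```

To use it, read the cfg into a string (for example with `pathlib.Path("cfg/yolo_v3_tiny_lstm.cfg").read_text()`), pass it to the function, and write the result to a new file, which you then give to `darknet.exe detector train` in place of the original cfg.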