AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

csresnext50-panet-spp CUDA Error: unspecified launch failure #4562

Open nhaxin204 opened 4 years ago

nhaxin204 commented 4 years ago

(next mAP calculation at 7392 iterations)
Last accuracy mAP@0.5 = 66.38 %, best = 66.38 %
7392: 0.476720, 0.489567 avg loss, 0.001000 rate, 3.874000 seconds, 295680 images

calculation mAP (mean average precision)...
92
detections_count = 1847, unique_truth_count = 763
class_id = 0, name = 0, ap = 68.86% (TP = 455, FP = 133)

for conf_thresh = 0.25, precision = 0.77, recall = 0.60, F1-score = 0.67
for conf_thresh = 0.25, TP = 455, FP = 133, FN = 308, average IoU = 56.60 %

IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.688620, or 68.86 %
Total Detection Time: 18.000000 Seconds

Set -points flag:
-points 101 for MS COCO
-points 11 for PascalVOC 2007 (uncomment difficult in voc.data)
-points 0 (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset

mean_average_precision (mAP@0.5) = 0.688620
New best mAP!
Saving weights to backup//csresnext50-panet-spp_best.weights
Loaded: 0.001000 seconds
v3 (mse loss, Normalizer: (iou: 0.750000, cls: 1.000000) Region 115 Avg (IOU: 0.829679, GIOU: 0.824828), Class: 0.991411, Obj: 0.035573, No Obj: 0.000014, .5R: 1.000000, .75R: 1.000000, count: 1, loss = 0.315224, class_loss = 0.237647, iou_loss = 0.077577
v3 (mse loss, Normalizer: (iou: 0.750000, cls: 1.000000) Region 126 Avg (IOU: -nan(ind), GIOU: -nan(ind)), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.000009, .5R: -nan(ind), .75R: -nan(ind), count: 0, loss = 0.000145, class_loss = 0.000145, iou_loss = 0.000000
v3 (mse loss, Normalizer: (iou: 0.750000, cls: 1.000000) Region 137 Avg (IOU: 0.729193, GIOU: 0.724001), Class: 0.998351, Obj: 0.719568, No Obj: 0.001550, .5R: 1.000000, .75R: 0.333333, count: 3, loss = 0.488564, class_loss = 0.224932, iou_loss = 0.263632
v3 (mse loss, Normalizer: (iou: 0.750000, cls: 1.000000) Region 115 Avg (IOU: 0.621749, GIOU: 0.567630), Class: 0.997423, Obj: 0.000001, No Obj: 0.000094, .5R: 1.000000, .75R: 0.000000, count: 1, loss = 0.982542, class_loss = 0.596237, iou_loss = 0.386305
v3 (mse loss, Normalizer: (iou: 0.750000, cls: 1.000000) Region 126 Avg (IOU: 0.749441, GIOU: 0.742319), Class: 0.998545, Obj: 0.191126, No Obj: 0.000229, .5R: 1.000000, .75R: 0.333333, count: 3, loss = 1.006724, class_loss = 0.700831, iou_loss = 0.305894
v3 (mse loss, Normalizer: (iou: 0.750000, cls: 1.000000) Region 137 Avg (IOU: 0.788961, GIOU: 0.787725), Class: 0.998753, Obj: 0.247293, No Obj: 0.000678, .5R: 1.000000, .75R: 0.500000, count: 2, loss = 0.492125, class_loss = 0.336246, iou_loss = 0.155879
CUDA status Error: file: d:\darknet-master\src\dark_cuda.c : cuda_push_array() : line: 443 : build time: Dec 21 2019 - 07:56:27
CUDA Error: unspecified launch failure
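The precision, recall, and F1-score in the log follow directly from the reported TP, FP, and FN counts (note that TP + FN = 763, the reported unique_truth_count). A minimal Python sketch to verify them; the function name `detection_metrics` is hypothetical and not part of Darknet:

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) computed from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts taken from the log line for conf_thresh = 0.25.
precision, recall, f1 = detection_metrics(tp=455, fp=133, fn=308)
print(f"precision = {precision:.2f}, recall = {recall:.2f}, F1-score = {f1:.2f}")
# prints: precision = 0.77, recall = 0.60, F1-score = 0.67
```

The rounded values match the log, so the evaluation output itself looks internally consistent; the failure occurs afterwards, once training resumes.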

sctrueew commented 4 years ago

@AlexeyAB Hi,

When I try to train csresnext50-panet-spp, I get this error:

When I set CUDNN_HALF=1:
CUDA status Error: file: ....\src\dark_cuda.c : cuda_free() : line: 428 : build time: Dec 21 2019 - 10:37:45
CUDA Error: invalid device pointer
CUDA Error: invalid device pointer: No error
Assertion failed: 0, file ....\src\utils.c, line 297

I have updated to the latest version.

AlexeyAB commented 4 years ago

@zpmmehrdad @nhaxin204 Hi,

What CUDA, cuDNN, and OpenCV versions do you use?
What params do you use in the Makefile?
What command do you use for training?
Check bad.list and bad_label.list.
Do you train with random=1 in the last [yolo] section of the cfg-file?
Show the output of these commands:

nvcc --version
nvidia-smi
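
For anyone collecting this information for a report, here is a small Python sketch that runs the two commands above and prints their output. The helper name `run_diagnostic` is hypothetical; the script assumes only that the tools, if installed, are on PATH.

```python
import shutil
import subprocess


def run_diagnostic(command):
    """Run a command and return its combined output, or a note if it is missing."""
    if shutil.which(command[0]) is None:
        return f"{command[0]}: not found on PATH"
    result = subprocess.run(command, capture_output=True, text=True)
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    # The two commands requested in the comment above.
    for command in (["nvcc", "--version"], ["nvidia-smi"]):
        print("$ " + " ".join(command))
        print(run_diagnostic(command))
        print()
```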

sctrueew commented 4 years ago

@AlexeyAB Hi,

I use CUDA 10, cuDNN 7.4, and OpenCV 4. Yes, I'm using random=1. I don't have bad.list or bad_label.list.
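
When bad.list and bad_label.list are absent, the label files can also be checked independently. The sketch below assumes the usual YOLO label layout of `class_id x_center y_center width height` with coordinates normalized to the image size. The function name `check_label_line` and the sample lines are hypothetical, and the rules only approximate this kind of check; they are not Darknet's exact validation.

```python
def check_label_line(line, num_classes):
    """Return a list of problems found in one YOLO-format label line."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        x, y, w, h = (float(v) for v in parts[1:])
    except ValueError:
        return ["non-numeric field"]
    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class_id {class_id} outside [0, {num_classes})")
    # Normalized coordinates and sizes are expected to fall in (0, 1].
    for name, value in (("x", x), ("y", y), ("w", w), ("h", h)):
        if not 0.0 < value <= 1.0:
            problems.append(f"{name} = {value} outside (0, 1]")
    return problems


# Hypothetical label lines for a single-class dataset.
samples = ["0 0.5 0.5 0.2 0.3", "1 0.5 0.5 0.2 0.3", "0 1.2 0.5 0.2 0.3"]
for sample in samples:
    print(repr(sample), "->", check_label_line(sample, num_classes=1) or "ok")
```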

WongKinYiu commented 4 years ago

@zpmmehrdad

Could you train with the repo as it was before yesterday? I tried the 2019.12.19 version of the repo with CUDNN_HALF=1 for training, and it works.

nhaxin204 commented 4 years ago

I use CUDA 10, cuDNN 7.6 and OpenCV 3.4.6

sctrueew commented 4 years ago

@WongKinYiu @AlexeyAB I solved my problem: I recompiled and it works now. However, I still have the earlier problem that GPU utilization reaches at most 40%. What should I do? I have two RTX 2080 Ti GPUs.
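
To see whether the 40% figure is steady or fluctuates during training, per-GPU utilization can be sampled with `nvidia-smi --query-gpu`. This is a minimal Python sketch that only measures utilization and does not diagnose its cause; the function names are hypothetical.

```python
import shutil
import subprocess
import time


def parse_utilization(csv_text):
    """Parse one integer percentage per line (csv,noheader,nounits output)."""
    values = (line.strip() for line in csv_text.splitlines())
    return [int(value) for value in values if value.isdigit()]


def sample_gpu_utilization():
    """Return the current utilization of each GPU, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
    )
    return parse_utilization(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for _ in range(3):  # three samples, one second apart
        print(sample_gpu_utilization())
        time.sleep(1)
```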

MrCuiHao commented 4 years ago

@zpmmehrdad

Could you train with the repo as it was before yesterday? I tried the 2019.12.19 version of the repo with CUDNN_HALF=1 for training, and it works.

@WongKinYiu Could you share your 2019.12.19 copy of the repo via Google Drive?