low IoU. After 1000 iteration mAP is dropping down, high loss

Whisper94 commented 3 years ago

@AlexeyAB Hello! Thank you for your great work and constant support! I have been working with YOLO for about 8 months for my university thesis, trying different configurations. I read through many issues here before opening a new one, but this time I couldn't find any relevant information. So I have collected all my open questions, and I hope you can answer them and help me!

Since I got a new GPU for new tests, I could try my new approach to detection. Before, I did not work with darknet directly (I used Python forks). Now I have tested my (small) custom dataset at different resolutions (416x288, 736x544, 1024x736).

What is the issue? After about 1000 iterations, my mAP during training drops by more than half, from about 98% to 0-50%, and it is no longer stable. All tested resolutions show the same effect. To see this, I changed detector.c so that mAP is calculated every 50 iterations. This showed very precisely that, for my approach, the best weights are between 500 and 900 iterations. Before and after this range, the mAP is very low at all tested resolutions. OK, maybe that is not an issue in itself. But in all these tests I only get an IoU between 50% and 75% (and below 30% after the drop, like the mAP).

Dataset: What is my dataset? I have similar images with 2 classes. The objects (circles) are really simple to classify, but they are very small. The images in my dataset have 4k resolution, and because of CUDA memory I can train at a maximum of 1024x736 (with the resizing factor of 1.4 I get "out of memory"). My objects are about 120x120 pixels (bounding box) in the original image. Reducing the resolution from 4k to 1024 shouldn't cost any performance because the objects are so simple.

After training (1000 iterations) I tested several weights files and found that with a threshold of 50% I can detect all objects pretty well! Most objects have a confidence a bit above 50%. But in some images they are only detected with a threshold of about 25%. How can I reach a higher confidence (90%+) and still detect all objects? Probably with longer training, but after 1000 iterations the mAP drops... (chart uploaded)

The next question is how to reach an IoU higher than just 60%. I've also found that some objects get double bounding boxes. To eliminate them I have to increase the threshold, but then some objects are no longer detected. How can I generally eliminate double bounding boxes during detection?
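For readers unfamiliar with how double boxes are usually removed: the standard technique is non-maximum suppression (NMS), which keeps the most confident box and drops others that overlap it too much. The following is a minimal, generic sketch of greedy NMS, not darknet's own implementation. The function names, the box format `(x1, y1, x2, y2)`, the sample detections, and the 0.45 threshold are all illustrative assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections, iou_threshold=0.45):
    """Greedy NMS over detections of ONE class.

    detections: list of (confidence, (x1, y1, x2, y2)).
    iou_threshold: illustrative value, not taken from the post.
    """
    kept = []
    # Visit boxes from most to least confident; keep a box only if it
    # does not overlap an already kept box by more than the threshold.
    for conf, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((conf, box))
    return kept


# Hypothetical detections: two boxes on the same object, one on another.
sample = [
    (0.55, (100, 100, 132, 132)),
    (0.30, (103, 102, 135, 134)),  # near-duplicate of the first box
    (0.52, (300, 200, 332, 232)),  # a separate object
]
for conf, box in greedy_nms(sample):
    print(conf, box)
```

With this approach the duplicate is removed because of its overlap with a stronger box, so the confidence threshold can stay low enough to keep weakly detected objects.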

By the way, during training I have a very high loss (90+) after about 600 iterations, and it decreases only minimally. I haven't gotten below 60 yet, but it has no influence on the mAP calculation or on my accuracy, so I don't worry about it (you have also mentioned this in some posts). For better detection of small objects, I've changed the layer to 23 and the stride to 4 according to your manual/FAQ.
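To make the "small objects" point concrete, here is a rough sketch of how many pixels a 120x120 px object occupies at each tested network resolution. It assumes that "4k" means 3840x2160 and that the image is simply stretched to the network size without letterboxing; neither is stated in the post, and the function name is hypothetical.

```python
SOURCE_W, SOURCE_H = 3840, 2160  # assumption: "4k" means 3840x2160
OBJECT_PX = 120                  # from the post: boxes of about 120x120 px

NETWORK_SIZES = [(416, 288), (736, 544), (1024, 736)]  # from the post


def scaled_object_size(net_w, net_h, obj_px=OBJECT_PX,
                       src_w=SOURCE_W, src_h=SOURCE_H):
    """Object width and height in pixels after a plain resize
    (no letterbox, aspect ratio not preserved) to net_w x net_h."""
    return obj_px * net_w / src_w, obj_px * net_h / src_h


for net_w, net_h in NETWORK_SIZES:
    w, h = scaled_object_size(net_w, net_h)
    print(f"{net_w}x{net_h}: object is about {w:.1f} x {h:.1f} px")
```

Under these assumptions the objects shrink to roughly 13 to 41 px per side, and the two axes are scaled by different factors, so circular objects would no longer be exactly round at network resolution.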

Which commands have I used?

Training: darknet.exe detector train data.txt yolov4.cfg yolov4.conv.137 -map

Testing mAP: darknet.exe detector map data.txt yolov4.cfg yolov4_xxx.weights (xxx is the iteration number with the best mAP according to the chart)

Testing images: darknet.exe detector test data.txt yolov4.cfg yolov4_xxx.weights

Chart: Example at 416x288 resolution chart_yolov4-loetstellen

Example with 736x544 chart_yolov4-loetstellen

I've checked my dataset: bad.list is empty (it has been empty since compiling darknet). Images with no detections in test mode are also listed, but that is only 2 test images. 'bad_label.list' doesn't exist. A random check of the augmentation (-show_imgs) also showed no abnormalities.

yolov4-1024.txt

I also tried using a different resolution for detection (a resize factor of up to 1.5 relative to the training resolution).

For training I have installed CUDA 11.1 and cuDNN 8. For testing I use another machine with an RTX 2070. There, darknet is compiled with CUDA 10.2; I also have 11.0, 10.1 and 10.0 installed, but they are not referenced. Image: darknet on the testing machine.

Would you recommend trying the new YOLOv4x-mish or Scaled YOLO Px for my issue, or should I stay with YOLOv4 until a stable version exists?

Now I hope this information is enough to find a solution for my issue.

I would like to try darknet.py, but compiling yolo_cpp_dll.dll fails. I have opened another issue about that. There are various errors when compiling on the RTX 3090 with CUDA 11.1. But with CUDA 10.x there are also some errors that I can't solve myself.

I would be very glad if you could answer my questions and help improve my experience with YOLO. THANK YOU!

(Have a nice Christmas!)

Whisper94 commented 3 years ago

Update: I've tried different versions (x-mish, csp, tiny...). Only CSP doesn't drop after 1000 iterations, and it has the best performance on my dataset. I could achieve an mAP of 99.99% with some changes in the cfg file. But I'd like to improve the IoU so that mAP is also over 90% at IoU@90. At the moment I only have 99% mAP at IoU@50; at IoU@90 it is only 14% or less. Could you help me improve this?
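One reason a gap between IoU@50 and IoU@90 can appear with small objects is that even a one-pixel localization error costs a lot of IoU on a small box. The sketch below illustrates this with plain arithmetic. The 32 px box size is a hypothetical value (roughly a 120 px object scaled to a 1024 px wide input, if the source is 3840 px wide), and the function name is an assumption for illustration.

```python
def iou_after_shift(size, shift):
    """IoU of a size x size box and the same box shifted
    diagonally by `shift` pixels (in both x and y)."""
    overlap = max(0.0, size - shift)
    inter = overlap * overlap
    union = 2 * size * size - inter
    return inter / union


for shift in (0, 1, 2, 4):
    value = iou_after_shift(32, shift)
    print(f"shift {shift} px on a 32 px box: IoU = {value:.2f}")
# prints IoU = 1.00, 0.88, 0.78, 0.62 for shifts of 0, 1, 2, 4 px
```

In this simplified setting a prediction that is off by a single pixel in each direction already falls below an IoU of 0.9, while it still passes the 0.5 threshold comfortably.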

Thank you!

sandeepwadhwa0101 commented 3 years ago

Hi there, I am experiencing similar behavior. Could you share which changes in your cfg file helped you increase mAP?

namasang1 commented 3 years ago

How do you calculate mAP before 1000 iterations?

AlexeyAB / darknet

low IoU. After 1000 iteration mAP is dropping down, high loss #7175