AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

Updated repository YOLOv3 and YOLOv4 slower and showing no-detections on data detected with previous version of Yolov3 #5951

Closed. Wazaki-Ou closed this issue 4 years ago.

Wazaki-Ou commented 4 years ago

I had been training on my custom dataset (humans and a dog in the same room) for a while using YOLOv3 before the repository got updated, but I had no-detection issues in cases where both humans and a dog were in the same room (it mainly detects the humans only).

I tried YOLOv4 hoping it would solve the issue, but I was surprised to notice that YOLOv4 failed to even detect humans when they were alone, something the previous model had no problem doing. Even for the dog, the old model's detections seem better on the same data.

In order to improve detection, I tried adding negative data (the empty room with a few different setups) and changing the network size to 608x608, but it still did not solve the issue.
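For reference, "changing the size" here means editing the [net] section of the cfg file; a minimal sketch (width and height must be multiples of 32):

```
[net]
width=608
height=608
```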

Even more surprisingly, a YOLOv3 model trained with the latest update seems to have the same problems as YOLOv4: both are much slower when detecting and both fail to detect humans. I was expecting YOLOv3 from this update to at least behave like the one from the previous release, but it behaves more like YOLOv4.

If there is no way to improve detection with this updated release, I am willing to keep using the previous one (I have both built and working), but then I would really appreciate some help with improving detection of both the human and the dog when they are in the same room (especially when they are a little close). To that end, I have a few concerns/ideas about what could be wrong (see the sketches after these two points):

1- In many of the train/valid images, I have both classes (human and dog) present, but only one of them is labeled. Say a picture contains 2 humans and 1 dog; the label file would cover only one human. In some cases I reuse the picture under a different name with a different label, say for the dog. Would this be an issue? My thinking is that the model, while learning on these pictures, would at some point detect all the subjects, but then find a label for only one and hence learn to ignore the others. Is it possible to have multiple labels for the same picture, or should I just avoid using pictures with more than one object to label?

2- Many pictures in the negative data are duplicates, since I am retrieving frames from videos and it is hard to change the room setup many times. By duplicates I mean a lot of them: sometimes 500 or more images are actually the same frame. I don't think this affects training negatively, but I am mentioning it just in case.
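On point 1: Darknet's label format supports multiple objects per image natively; each image gets a .txt file with one line per object, `<class_id> <x_center> <y_center> <width> <height>`, all normalized to [0, 1]. A sketch for a frame with 2 humans and 1 dog, assuming class 0 is human and class 1 is dog in obj.names (coordinates are made up):

```
0 0.31 0.54 0.12 0.45
0 0.62 0.50 0.10 0.40
1 0.48 0.78 0.20 0.18
```

On point 2: byte-identical frames add no information and just slow training, so they are easy to prune. A minimal sketch, assuming the negatives live in a hypothetical data/negatives folder with empty .txt labels alongside them:

```python
import hashlib
from pathlib import Path

neg_dir = Path("data/negatives")  # hypothetical folder of negative frames

seen = {}       # content digest -> first file kept with that content
removed = 0
for img in sorted(neg_dir.glob("*.jpg")):
    digest = hashlib.md5(img.read_bytes()).hexdigest()
    if digest in seen:
        img.unlink()                                     # byte-identical duplicate
        img.with_suffix(".txt").unlink(missing_ok=True)  # and its empty label file
        removed += 1
    else:
        seen[digest] = img

print(f"kept {len(seen)} unique negatives, removed {removed} duplicates")
```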

I'm sorry for the long message. I just wanted to put as much information as possible to help understand my issue. I would really appreciate any help I can get here.

Thanks a lot!!

AlexeyAB commented 4 years ago

If you have an issue with training (no detections / NaN avg-loss / low accuracy), read: https://github.com/AlexeyAB/darknet/wiki/FAQ---frequently-asked-questions

Wazaki-Ou commented 4 years ago

Thank you for the link @AlexeyAB. I have already checked it before, but I still cannot identify the issue. What do you think about the two points I mentioned in my question (1 and 2)? Could they be the cause? And is it normal that YOLOv3 in the updated repository behaves like YOLOv4 in terms of detection speed and no-detection behavior? Shouldn't it behave like YOLOv3 from the previous release? Thanks a lot!!

AlexeyAB commented 4 years ago
  1. Yes, that is an issue: every object you want detected should be labeled in every image it appears in.
  2. No problems.

Read: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection


  1. Download the latest Darknet
  2. Train YOLOv4 from the beginning using the whole dataset
  3. Check and show mAP on the training dataset
  4. Check and show mAP on the validation dataset
  5. Show chart.png with Loss and mAP (example commands below)
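For reference, these steps map onto the repo's standard commands; a sketch with placeholder file names (the obj.data, cfg, and weights paths are your own):

```
# steps 1-2: retrain from the beginning on the whole dataset;
# the -map flag plots Loss and mAP into chart.png (step 5)
./darknet detector train data/obj.data cfg/yolov4-custom.cfg yolov4.conv.137 -map

# steps 3-4: 'detector map' evaluates on the valid= list in obj.data,
# so point valid= at train.txt for training-set mAP, then at valid.txt
./darknet detector map data/obj.data cfg/yolov4-custom.cfg backup/yolov4-custom_best.weights
```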
Wazaki-Ou commented 4 years ago

@AlexeyAB Thanks again for your reply. I have already followed the instructions in that section as well, but the problem was not solved. I will work on the dataset again to make sure all the objects present are labelled (the cases I mentioned in question 1), then try again with the latest Darknet and let you know in a couple of days whether the issue is solved.

AlexeyAB commented 4 years ago

show mAP on training dataset
show mAP on validation dataset

Wazaki-Ou commented 4 years ago

@AlexeyAB There you go.

For the training set:

[screenshot: mAP results on the training set]

For the validation set:

[screenshot: mAP results on the validation set]

AlexeyAB commented 4 years ago

Show examples of bad detections, and show similar training images with their bounding boxes.
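There is no built-in darknet command for overlaying ground-truth boxes on a training image, so a small script helps; a minimal OpenCV sketch (the paths and the human/dog class order are assumptions):

```python
import cv2  # pip install opencv-python

def draw_yolo_labels(img_path, label_path, names=("human", "dog")):
    """Draw Darknet-format labels (class x_center y_center w h, normalized) on an image."""
    img = cv2.imread(img_path)
    h, w = img.shape[:2]
    with open(label_path) as f:
        for line in f:
            cls, xc, yc, bw, bh = line.split()
            xc, yc = float(xc) * w, float(yc) * h
            bw, bh = float(bw) * w, float(bh) * h
            x1, y1 = int(xc - bw / 2), int(yc - bh / 2)
            x2, y2 = int(xc + bw / 2), int(yc + bh / 2)
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(img, names[int(cls)], (x1, max(12, y1 - 5)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imwrite("labeled_preview.jpg", img)

draw_yolo_labels("data/obj/img1.jpg", "data/obj/img1.txt")  # hypothetical paths
```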


if you get high mAP for both the training and validation datasets, but the network detects objects poorly in real life, then your training dataset is not representative; add more images from real life to it

for each object you want to detect, there must be at least 1 similar object in the training dataset with about the same shape, side of the object, relative size, angle of rotation, tilt, and illumination. It is therefore desirable that your training dataset include images of objects at different scales, rotations, and lightings, from different sides, and on different backgrounds. You should preferably have 2000 different images for each class or more, and you should train for 2000*classes iterations or more (a concrete cfg sketch follows)
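As a concrete instance of the 2000*classes rule for this two-class (human, dog) case, following the README's custom-training guidance; a cfg sketch (the class order in obj.names is an assumption):

```
# yolov4-custom.cfg, per the README: max_batches = classes*2000,
# but not less than 6000 and not less than the number of training images
max_batches=6000     # 2 classes * 2000 = 4000, raised to the 6000 floor
steps=4800,5400      # 80% and 90% of max_batches

# in each of the three [yolo] layers:
classes=2
# and in the [convolutional] layer just before each [yolo] layer:
filters=21           # (classes + 5) * 3
```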

Wazaki-Ou commented 4 years ago

@AlexeyAB Thank you so much for all the quick replies. I don't have access to the computer I am using for training right now, but I will get you some examples tomorrow when I go to school. As for the data, I am sure it is representative, since all the data used for training, validation, and the final detection checks was collected in the same environment with the same subjects. However, I will also work on the data for a couple of days to make sure it is all well labelled (especially the pictures containing more than one object). Again, thank you for all your help so far. I'll post again tomorrow!!

Wazaki-Ou commented 4 years ago

So sorry for the long wait. I went through all the data, reviewed my script (fixed a small bug), and added multiple labels to the pictures that contained more than one object. I also downloaded the latest repository again, just in case. To make it short, my issue seems to be solved. It took me a long time to add the extra labels and double-check everything, and the training took some time; then I wanted to test different scenarios to be sure. All the issues I previously had are gone. What I noticed:

I am sure refining the dataset helped solve part of the issue, but YOLOv4 definitely outperforms YOLOv3.