AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

At which point adding new data to a training set, will not improve training accuracy #3115

Open corentin87 opened 5 years ago

corentin87 commented 5 years ago

Hi,

I guess this is more of a general question about training a CNN, but YOLO is the one I'm using. I started building my training set for 'person' detection by labelling data from videos from different cameras (in similar environments). Every time I added new data for a new camera, I retrained YOLO, which improved detection for that camera. For training, I split my data randomly into training and validation sets, and I use the validation set to compute accuracy. I don't think this is overfitting, as all the previous data are also used in the training.
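The random training/validation split described above can be sketched roughly as follows. This is only an illustration: the function name `split_dataset`, the 20% validation fraction, the fixed seed and the example image paths are all assumptions, not details from my actual setup.

```python
import random


def split_dataset(image_paths, val_fraction=0.2, seed=0):
    """Randomly split image paths into (train, val) lists.

    val_fraction and seed are illustrative defaults (assumptions).
    """
    paths = sorted(image_paths)          # sort first so the split is reproducible
    rng = random.Random(seed)
    rng.shuffle(paths)
    n_val = int(len(paths) * val_fraction)
    return paths[n_val:], paths[:n_val]  # (train, val)


if __name__ == "__main__":
    # Hypothetical paths: 3 cameras with 10 labelled frames each.
    images = [f"data/cam{c}/frame_{i:05d}.jpg" for c in range(3) for i in range(10)]
    train, val = split_dataset(images)
    print(len(train), "training images,", len(val), "validation images")
```

Each image ends up in exactly one of the two lists, and frames from every camera are mixed together before the split.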

Now I've gathered more than 100,000 labelled samples. I was expecting not to have to train anymore at this point, as my data set is pretty big, but it looks like I still need to. If I get video from a new camera, label 500-1000 samples, add them to my huge data set and train again, the accuracy improves for this camera. I don't really understand why. Why do I still need to add new data to my set? Why does the accuracy improve so much on the new data when those samples are 'drowned' among the thousands of existing ones? Is there a point where I will be able to stop training because adding new data will no longer improve the accuracy?

Thanks for sharing your thoughts and ideas!

AlexeyAB commented 5 years ago

@corentin87 Hi,

When the new images are similar to the existing images, adding new data will not improve training accuracy.

Check that for each object from the new video there is at least 1 similar object in the old training dataset: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

For each object which you want to detect, there must be at least 1 similar object in the training dataset with about the same shape, side of object, relative size, angle of rotation, tilt, and illumination. So it is desirable that your training dataset includes images with objects at different scales, rotations, and lightings, from different sides, and on different backgrounds. You should preferably have 2000 different images for each class or more, and you should train for 2000*classes iterations or more.
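The two numeric rules of thumb above (about 2000 images per class, and 2000*classes iterations) can be checked with a few lines of arithmetic. This is a minimal sketch; the helper name `rule_of_thumb` and the image counts passed to it are assumptions for illustration.

```python
def rule_of_thumb(num_classes, images_per_class):
    """Apply the two guidelines quoted above.

    Returns (min_iterations, short), where `short` maps each class that has
    fewer than 2000 images to its current image count.
    """
    min_images_per_class = 2000
    min_iterations = 2000 * num_classes
    short = {name: n for name, n in images_per_class.items()
             if n < min_images_per_class}
    return min_iterations, short


if __name__ == "__main__":
    # Illustrative numbers: a single 'person' class with 21,000 images.
    min_iterations, short = rule_of_thumb(1, {"person": 21000})
    print("train for at least", min_iterations, "iterations")
    print("classes below 2000 images:", short)
```

These are lower bounds only. Meeting them says nothing about whether the dataset covers the scales, angles and lighting seen by a new camera.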

corentin87 commented 5 years ago

Thanks for your answer, Alex. My training set for one class includes 21,000 images with 170,000 bounding boxes. They are all taken from similar environments (fisheye, indoor, ...), so I believe that the new images are already similar to some existing images. Maybe I haven't gone deep enough into understanding CNNs, but by adding 100 new images to a pool of 21,000 previous images for training (which is a small portion), I'm getting much better accuracy on the new videos. Does this sound right to you? I mean the fact that the network can still learn new features when it already had a lot to learn from. Thanks

AlexeyAB commented 5 years ago

It means that new images are very different from old images.