Class imbalance - Githubissues

AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)

http://pjreddie.com/darknet/

Class imbalance #4144

Open May-forever opened 4 years ago

May-forever commented 4 years ago

Dear Dr. AlexeyAB,

I want to train a network in Darknet for two-class object detection.

However, one class has 1000 examples, while the other one has only 50 examples.

I have learned that I can use data augmentation, such as flip, cut, and so on.

Apart from data augmentation, is there any other way to address class imbalance?

I look forward to hearing from you. Thank you in advance.

Best regards, May

AlexeyAB commented 4 years ago

Data augmentation is used automatically.

Class imbalance is handled by using objectness rather than focal_loss; see https://arxiv.org/pdf/1804.02767v1.pdf

> However, one class has 1000 examples, while the other one has only 50 examples.

You can also duplicate lines in your train.txt file for these 50 images.
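
The duplication suggested above can be scripted rather than done by hand. What follows is only a minimal sketch, assuming each image listed in train.txt has a label file next to it with the same base name and a .txt extension, holding one `class_id x_center y_center width height` row per object. The function names and the `rare_class` and `factor` parameters are hypothetical and not part of Darknet.

```python
from pathlib import Path


def classes_in_label_file(label_path):
    """Return the set of class ids found in one Darknet-style label file."""
    path = Path(label_path)
    if not path.exists():
        return set()
    ids = set()
    for row in path.read_text().splitlines():
        fields = row.split()
        if fields:
            # The first field of every row is the integer class id.
            ids.add(int(fields[0]))
    return ids


def oversample(image_paths, rare_class, factor):
    """List every image containing `rare_class` `factor` times, others once."""
    result = []
    for image in image_paths:
        # Assumption: the label file sits next to the image.
        label = Path(image).with_suffix(".txt")
        copies = factor if rare_class in classes_in_label_file(label) else 1
        result.extend([image] * copies)
    return result
```

For the 1000-versus-50 case in this thread, a factor of about 20 would bring the two counts roughly level, provided each image mostly contains one class. The returned list can then be written back out, one path per line, as the new train.txt.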

sctrueew commented 4 years ago

@AlexeyAB Hi,

I have this problem too. Some of my classes have fewer than 100 images and some have almost 2k images. So if I add duplicate files to the train.txt and I balance all of the classes to 2k, I can avoid the overfitting. Is it right? If I add 1.9k duplicate files to the train.txt for each class, it's okay?

Thanks in advance.

AlexeyAB commented 4 years ago

> So if I add duplicate files to the train.txt and I balance all of the classes to 2k, I can avoid the overfitting. Is it right?

Yes, there will not be overfitting due to automatic data augmentation.

> If I add 1.9k duplicate files to the train.txt for each class, it's okay?

Yes.

spaul13 commented 4 years ago

@zpmmehrdad Could you please tell me how to tackle the class imbalance problem for a classification model (not using yolo layers)?

WilburZjh commented 4 years ago

Hi @AlexeyAB

> You can also duplicate lines in your train.txt file for these 50 images.

I am facing the same problem. Since each of my training images contains multiple objects, some objects occur more frequently than others. I do not think duplicating files is suitable in my case. Can you suggest some other solutions?

Best

syjeon121 commented 4 years ago

@WilburZjh Hi, wouldn't this issue help you? https://github.com/AlexeyAB/darknet/issues/4792 See counters_per_class or focal_loss.

WilburZjh commented 4 years ago

Hi @syjeon121, thanks for the reply! Where should I modify counters_per_class or focal_loss? Or should I add them manually? I am using yolov4-custom.cfg and did not find any related information.

Best

syjeon121 commented 4 years ago

@WilburZjh Yes, you should add them in the yolo layer, like this: https://github.com/AlexeyAB/darknet/issues/4792#issuecomment-581393331

WilburZjh commented 4 years ago

Hi @syjeon121, this is helpful. I will add them to the net structure. Should I add them in each yolo layer?

syjeon121 commented 4 years ago

Hi @WilburZjh, I think so, but try various cases.
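
For anyone trying the counters_per_class suggestion from this thread, the per-class object counts have to come from somewhere. The following is a minimal sketch that tallies them from the label files and formats a line for the cfg. It assumes that counters_per_class takes a comma-separated list of object counts in class-id order, and that label rows start with the integer class id. The helper names are hypothetical and not part of Darknet, and whether the line belongs in every yolo layer is left open above.

```python
from collections import Counter
from pathlib import Path


def count_objects_per_class(label_paths, num_classes):
    """Count labelled objects per class id across Darknet-style label files."""
    counts = Counter()
    for label_path in label_paths:
        for row in Path(label_path).read_text().splitlines():
            fields = row.split()
            if fields:
                # The first field of every row is the integer class id.
                counts[int(fields[0])] += 1
    # Classes that never appear get a count of 0.
    return [counts[c] for c in range(num_classes)]


def counters_per_class_line(counts):
    """Format the counts as a single cfg line (assumed syntax, see lead-in)."""
    return "counters_per_class=" + ",".join(str(n) for n in counts)
```

Because this counts objects rather than images, it also covers the case raised above where one image contains several classes at once.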