AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Multiclasses custom problem #5988

Open marcusbrito opened 4 years ago

marcusbrito commented 4 years ago

Hello, I am working on an application whose goal is to detect a series of classes that are not in the COCO dataset. The classes are unrelated to each other and, in general, quite different.

My question is: Should I create a single model for all of these classes, or a model for each one?

I am currently developing separate models for each of the classes, and the results are relatively good, but this implies a higher computational cost.

Thank you!

iaslannascimento commented 4 years ago

I have the same problem!

AlexeyAB commented 4 years ago

I would use one model, but I need to know the details of the task.

marcusbrito commented 4 years ago

> I would use one model, but I need to know the details of the task.

@AlexeyAB

I am developing a project related to public security, in which I try to identify objects such as firearms, paper documents, cash, and some others.

There are 6 classes in total, and the number of training images varies between 600 and 4000 per class.

I'm using the Python imgaug library for data augmentation, multiplying the dataset by 10x by randomly applying flip, crop, rotation, translation, brightness and contrast variation, Gaussian blur, and noise.
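For detection data, geometric augmentations such as flips have to be applied to the bounding boxes as well as to the pixels (imgaug can augment boxes together with images). As a minimal sketch that does not use imgaug itself, the function below mirrors one Darknet-style label line (`class_id x_center y_center width height`, all normalized to 0..1) for a horizontally flipped image. The function name is hypothetical.

```python
def hflip_yolo_label(line: str) -> str:
    """Mirror one YOLO-format label line for a horizontally flipped image."""
    class_id, x_c, y_c, w, h = line.split()
    # Coordinates are normalized to [0, 1], so only the x centre changes;
    # y centre, width and height are unaffected by a horizontal flip.
    flipped_x = 1.0 - float(x_c)
    return (
        f"{class_id} {flipped_x:.6f} {float(y_c):.6f} "
        f"{float(w):.6f} {float(h):.6f}"
    )


print(hflip_yolo_label("0 0.25 0.40 0.10 0.20"))
# prints: 0 0.750000 0.400000 0.100000 0.200000
```

Rotation and crop need more care, because the box size and visibility can change as well.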

The position of the objects is not very important; the main concern is just to indicate whether these classes are present in the image.
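If only presence matters, the detector output can be reduced to a set of class IDs. The sketch below assumes the detections have already been parsed into hypothetical `(class_id, confidence)` pairs; how they are obtained depends on the inference code.

```python
from typing import Iterable, Set, Tuple


def classes_present(
    detections: Iterable[Tuple[int, float]], conf_thresh: float = 0.25
) -> Set[int]:
    """Return the class IDs that have at least one detection above the threshold."""
    return {class_id for class_id, conf in detections if conf >= conf_thresh}


# Hypothetical detections for one image: (class_id, confidence)
dets = [(0, 0.91), (0, 0.40), (3, 0.12), (5, 0.30)]
print(sorted(classes_present(dets)))  # prints: [0, 5]
```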

At the moment I have developed 6 different models, using yolov3-tiny with 416x416 input. I'm following all the instructions on the GitHub page, and training with: ./darknet detector train data/obj.data cfg/yolov3-tiny_custom.cfg yolov3-tiny.conv.15 -map -dont_show

I'm using the same training data for train and valid. The final average loss varies between 0.4 and 2, and the mAP varies between 70% and 90%, depending on the class.

I am considering changing my approach and using a single yolov4 model, with the aim of increasing accuracy at the cost of prediction time. In my tests, yolov4 takes approximately 10x longer to process an image than yolov3-tiny, but since I'm running 6 tiny models at the moment, the time would go from 6t to 10t (about a 67% increase).
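The arithmetic behind that estimate, as a small sketch. It assumes the six tiny models run one after another on the same image and that `t` is the time of a single yolov3-tiny pass.

```python
def relative_increase(old: float, new: float) -> float:
    """Fractional increase when going from `old` to `new`."""
    return (new - old) / old


t = 1.0                 # time of one yolov3-tiny pass (arbitrary unit)
six_tiny = 6 * t        # six single-class tiny models run in sequence
one_yolov4 = 10 * t     # one yolov4 pass, measured at about 10x a tiny pass

print(f"{relative_increase(six_tiny, one_yolov4):.1%}")  # prints: 66.7%
```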

The models are being trained on a Quadro GV100, but in the final application the prediction is being made on the CPU (at least for now), with OpenCV.

AlexeyAB commented 4 years ago

> but in the final application the prediction is being made on the CPU (at least for now), with OpenCV.

YOLOv4 at 512x512 runs at about 3.5 FPS on a Core i7-6700K CPU using OpenCV, and faster at 416x416 or 320x320.
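To compare that figure with a per-image time budget, frames per second can be converted to latency. A trivial sketch:

```python
def latency_ms(fps: float) -> float:
    """Per-image latency in milliseconds for a given throughput in FPS."""
    return 1000.0 / fps


print(f"{latency_ms(3.5):.0f} ms per image")  # prints: 286 ms per image
```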

> I am considering changing my approach and using a single yolov4 model, with the aim of increasing accuracy at the cost of prediction time. In my tests, yolov4 takes approximately 10x longer to process an image than yolov3-tiny, but since I'm running 6 tiny models at the moment, the time would go from 6t to 10t (about a 67% increase).

Yes, try to train a single yolov4 model; just use a different class_id for each class.
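When merging six single-class datasets, every label file currently uses class_id 0, so the IDs need to be remapped before training one model. The sketch below rewrites the class_id in the text of one label file and computes the `filters` value for the convolutional layer before each `[yolo]` layer, using the `(classes + 5) x 3` rule from the Darknet README. The function names and the placeholder class names are hypothetical; only firearms, paper documents, and cash are named in this thread.

```python
def remap_class_id(label_text: str, new_class_id: int) -> str:
    """Rewrite the class_id of every line in one YOLO label file."""
    out = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        parts[0] = str(new_class_id)  # box coordinates stay untouched
        out.append(" ".join(parts))
    return "\n".join(out)


def yolo_filters(num_classes: int, masks_per_layer: int = 3) -> int:
    """filters = (classes + 5) * 3 for the conv layer before each [yolo] layer."""
    return (num_classes + 5) * masks_per_layer


# Hypothetical class order; the index in this list is the class_id.
CLASS_NAMES = ["firearm", "document", "cash", "class_3", "class_4", "class_5"]

cash_labels = "0 0.5 0.5 0.2 0.2\n0 0.1 0.1 0.05 0.05"
print(remap_class_id(cash_labels, CLASS_NAMES.index("cash")))
print(yolo_filters(len(CLASS_NAMES)))  # prints: 33
```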

> I am developing a project related to public security, in which I try to identify objects such as firearms, paper documents, cash, and some others.

Just check that if a training image with paper also contains firearms, cash, etc., all of these objects are labeled too.

marcusbrito commented 4 years ago

> Yes, try to train a single yolov4 model; just use a different class_id for each class.

Ok, I will try it!

> Just check that if a training image with paper also contains firearms, cash, etc., all of these objects are labeled too.

Thanks for the reminder. I believe this will be the part that takes the most time before training.

Just one last question: what do you think is the best way to choose the confidence threshold? At the end of training, the log says:

for conf_thresh = 0.25, precision = 0.90, recall = 0.82, F1-score = 0.86

So I believe that 0.25 would be recommended as a good starting point. I'm trying to optimize the F1-score.
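For reference, the F1-score in that log line is the harmonic mean of precision and recall, and a threshold can be chosen by evaluating several values and keeping the one with the highest F1. In the sketch below, only the 0.25 row comes from the log above; the other two rows are made-up placeholders to illustrate the selection, not measured results.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reproduces the log line: precision = 0.90, recall = 0.82
print(round(f1_score(0.90, 0.82), 2))  # prints: 0.86

# (conf_thresh, precision, recall); rows other than 0.25 are hypothetical.
sweep = [
    (0.15, 0.84, 0.87),
    (0.25, 0.90, 0.82),
    (0.40, 0.94, 0.73),
]
best = max(sweep, key=lambda row: f1_score(row[1], row[2]))
print(best[0])  # prints: 0.25
```

Ideally the precision and recall values for each threshold would be measured on images that were not used for training.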