Average Precision of a class is lower when trained as a single class configuration compared to training as a multi class configuration

indrajitkurmi commented 4 years ago

Hi all, I recently started working with darknet and trained on the newly available FLIR thermal dataset with a multi-class configuration (3 classes: person, bicycle, car). In this training run, the person class average precision at 4000, 5000 and 6000 iterations was 77.25%, 76.29% and 72.41%. However, I only need single-class detection, i.e. detecting persons in the FLIR thermal dataset, so I changed the configuration files, dataset annotations and names file as described here:

https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

i.e. for the single-class config files, filters in the [convolutional] layer preceding each [yolo] layer is set to (classes + 5) * 3, i.e. 18 for a single class, and classes within each [yolo] layer is set to classes = 1:

[convolutional]
filters=18
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=1
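As a quick sanity check on the filters value, the darknet README gives the rule filters=(classes + 5)x3 for the [convolutional] layer directly before each [yolo] layer. The sketch below only restates that arithmetic; the function name `yolo_filters` and its parameter names are hypothetical, not part of darknet.

```python
def yolo_filters(num_classes: int, anchors_per_layer: int = 3) -> int:
    """Filter count for the [convolutional] layer directly before a [yolo] layer.

    Hypothetical helper, not a darknet API. Each anchor in the layer's mask
    predicts 4 box coordinates, 1 objectness score and one score per class.
    """
    return (num_classes + 5) * anchors_per_layer


# mask = 6,7,8 selects 3 anchors, so anchors_per_layer stays at 3.
single_class_filters = yolo_filters(1)  # person only
multi_class_filters = yolo_filters(3)   # person, bicycle, car
print(single_class_filters, multi_class_filters)
```

The same value has to be set before each of the three [yolo] layers in the cfg file.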

and the .txt file for each image looks like this for the single class:

0 0.100781 0.474609 0.020313 0.039062

However, after training with this new single-class configuration, the person class average precision is lower at almost all iterations; for example, at 4000, 5000 and 6000 iterations it is 73.45%, 64.24% and 68.61%.

Could anyone point out why training with a single-class configuration always yields lower average precision than a multi-class configuration? Am I making an error in the configuration, or is there a general explanation for this?

Thank you for your time and help.

Thanks and Best Regards Indrajit Kurmi

AlexeyAB commented 4 years ago

@indrajitkurmi Hi,

Do you use the same number of images in both cases?
Did you check your Training and Validation dataset by using Yolo_mark?
Did you check mAP on Training or Validation dataset?
Did you change classes= and filters= in 3 places in yolov3-spp.cfg?

indrajitkurmi commented 4 years ago

@AlexeyAB Hi, Thank you for your quick response.

Do you use the same number of images in both cases? I am training with the same number of images in both cases. However, some of them do not contain the specified class (persons) in the new configuration, so the annotation text file for those images is now empty.
Did you check your Training and Validation dataset by using Yolo_mark? No, I did not check the training and validation dataset using Yolo_mark. The dataset was annotated in COCO format and I used an available Python script to convert the annotations from COCO to darknet format.
Did you check mAP on Training or Validation dataset? I always check the mAP on the validation dataset.
Did you change classes= and filters= in 3 places in yolov3-spp.cfg? Yes, I changed classes= and filters= in all 3 places (the [yolo] layers) in the yolov3-spp.cfg file.
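Since the labels here came from a COCO-to-darknet conversion script, it may help to spell out what such a conversion has to do. The following is a minimal sketch, not the script actually used in this thread: it assumes COCO boxes are stored as [x_min, y_min, width, height] in pixels, and the function names and the `person_category_id` parameter are hypothetical. The real category id must be read from the dataset's own annotation file.

```python
def coco_box_to_darknet(bbox, img_w, img_h, class_id=0):
    """Convert one COCO box [x_min, y_min, width, height] (pixels) into a
    darknet label line: class x_center y_center width height, all relative
    to the image size."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {w / img_w:.6f} {h / img_h:.6f}"


def person_only_lines(annotations, img_w, img_h, person_category_id):
    """Keep only person annotations and remap them to class 0.

    `annotations` is assumed to be a list of COCO-style dicts with
    "category_id" and "bbox" keys for a single image.
    """
    return [
        coco_box_to_darknet(a["bbox"], img_w, img_h, class_id=0)
        for a in annotations
        if a["category_id"] == person_category_id
    ]


# Illustrative values only, not taken from the FLIR dataset.
example = [
    {"category_id": 1, "bbox": [0, 0, 64, 32]},
    {"category_id": 3, "bbox": [100, 100, 50, 40]},
]
print(person_only_lines(example, img_w=640, img_h=512, person_category_id=1))
```

An image whose annotations contain no person yields an empty list, which corresponds to the empty .txt files mentioned above.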

AlexeyAB commented 4 years ago

In this training instance for different iterations such as 4000, 5000 and 6000 iterations ( person class average precision are 77.25%, 76.29% and 72.41%).

However after training with these new configurations for single-class, for almost all iterations such as 4000, 5000 and 6000 iterations ( person class average precision are 73.45, 64.24 and 68.61%)

Just try training for more iterations. If the mAP is still higher in the first case, then check the annotations using Yolo_mark: https://github.com/AlexeyAB/Yolo_mark

indrajitkurmi commented 4 years ago

@AlexeyAB Hi,

Just try to train more iterations. For both configurations, I have trained the network up to 17000 iterations and the pattern remains the same (i.e. average precision for the person class is higher in the multi-class configuration than in the single-class configuration). For the multi-class run, the person class average precision at 15000 and 17000 iterations is 71.40% and 76.31%, whereas for the single-class run it is 70.42% and 68.62% at the same iterations, which is again lower.
if the mAP will be still higher in the first case, then check the annotations using yolo_mark. I am in the process of checking the annotations for all annotated images. However, from a cursory look through the dataset, I could identify some instances of the person class that are visible in images but not annotated (in both training and validation images). I was not able to identify any wrongly annotated image, at least in this cursory review.

Could these person instances that are visible in images but not annotated be one of the reasons for this behavior? Does YOLOv3 become very specific when trained on a single class and detect persons who are not annotated? If this is the reason, why is it different in the multi-class case? Why are the same unannotated person instances not detected as persons there, causing the same drop in average precision?

I am going to annotate all the instances of person class in all the images of the dataset and retrain to see if that helps.

However, it would be really helpful if you could provide any further insight or details regarding this learning behaviour.

Thanks for your help and guidance.

AlexeyAB commented 4 years ago

I could identify some instances of person class visible in images but not annotated or labeled (both in training and validation images).

Do you want to detect Persons in both cases? Then all objects (Persons) must be labeled on all images.

Read: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

Check that each object that you want to detect is labeled in your dataset: no object in your dataset should be without a label. Most training issues are caused by wrong labels in the dataset (labels produced by some conversion script, marked with a third-party tool, ...). Always check your dataset by using: https://github.com/AlexeyAB/Yolo_mark
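A visual check with Yolo_mark is the only way to find persons that are visible but unlabeled. A small script can still narrow down where to look, for example by counting empty label files and flagging malformed lines. The sketch below is a hypothetical helper, not part of darknet or Yolo_mark; it assumes one .txt label file per image, all in one directory, in the `class x_center y_center width height` format shown earlier in this thread.

```python
from pathlib import Path


def audit_labels(label_dir, num_classes=1):
    """Summarize darknet label files in `label_dir` (hypothetical helper).

    Returns counts of label files, empty files and valid boxes, plus a list
    of (file name, line number, line) entries that do not parse as
    `class x_center y_center width height` with relative values in [0, 1].
    """
    stats = {"files": 0, "empty": 0, "boxes": 0, "bad_lines": []}
    for path in sorted(Path(label_dir).glob("*.txt")):
        stats["files"] += 1
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
        if not lines:
            # Empty file: the image is treated as containing no objects.
            stats["empty"] += 1
            continue
        for number, line in enumerate(lines, start=1):
            parts = line.split()
            valid = len(parts) == 5
            if valid:
                try:
                    class_id = int(parts[0])
                    coords = [float(p) for p in parts[1:]]
                    valid = 0 <= class_id < num_classes and all(
                        0.0 <= c <= 1.0 for c in coords
                    )
                except ValueError:
                    valid = False
            if valid:
                stats["boxes"] += 1
            else:
                stats["bad_lines"].append((path.name, number, line))
    return stats
```

The images behind the empty files are good candidates to open first in Yolo_mark, since an unlabeled person in such an image is trained as pure background. Note that the glob also picks up any other .txt files in the directory, such as image list files, so those should be kept elsewhere or filtered out.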

Could these instances of person class visible in images but not annotated be one of the reasons for this behavior? Does YoloV3 become super specific in identifying instances of a single class and identify persons who are not annotated?

This is the reason. Any human or ML algorithm will be confused if it is simultaneously trained to detect an object and not to detect the same object.

If this is the reason, I was wondering why it is different in multi-class instances and the same not annotated instances of person class not identified as the person class causing the same drop in average precision?

I don't know.

AlexeyAB / darknet

Average Precision of a class is lower when trained as a single class configuration compared to training as a multi class configuration #4143