HCA97 / Mosquito-Classifiction

7th place solution of Aicrowd Mosquito Alert Competition
GNU General Public License v3.0

Finetune YOLO model more to increase mAP score #22

Closed HCA97 closed 10 months ago

HCA97 commented 11 months ago

The training loss curves of YOLO don't look good. I think the reason might be that we are fine-tuning, and the initial learning rate is either too high or too low.

results

HCA97 commented 11 months ago

Hi,

I looked at the YOLO documentation and figured out why our learning curves were strange (https://docs.ultralytics.com/modes/train/#arguments).

The first sudden increase happens because of the warm-up step. I believe they set the starting learning rate too high (0.01). After 3 epochs, they lowered it to 0.0001. The sharp drop at the end of training is because of data augmentation. They turned off mosaic data augmentation, causing a sudden decrease in loss.

I tested this idea by reducing the initial learning rate and turning off mosaic data augmentation. This gave me a much smoother learning curve. I divided the dataset into two groups: genus and species, and fine-tuned the YOLO model again.
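A minimal sketch of that adjusted run, assuming the Ultralytics `train()` arguments `lr0` and `mosaic`; the checkpoint name, dataset YAML, and epoch count below are placeholders, not the values actually used:

```python
from ultralytics import YOLO

# Placeholder checkpoint; the actual pretrained weights may differ.
model = YOLO("yolov8s.pt")

model.train(
    data="mosquito.yaml",  # placeholder dataset config
    epochs=30,             # placeholder epoch count
    lr0=0.0001,            # lower initial LR to avoid the warm-up spike
    mosaic=0.0,            # turn mosaic augmentation off entirely
)
```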

TWO CLASS (lower LR + w/o mosaic data augmentation)

results

TWO CLASS (default params)

results

HCA97 commented 11 months ago

I am playing with the YOLO params, but I doubt I will get anywhere.

HCA97 commented 11 months ago

I've trained another YOLO model and made some adjustments. I changed the starting learning rate to avoid sudden jumps at the beginning, and I also used mosaic data augmentation. When I looked at the cases where the IoU score was less than 0.75, I noticed something interesting: in some cases, the YOLO annotations seemed to be even better than the ground truth annotations. One thing I observed was that the YOLO boxes were usually bigger than the ground truth ones. To address this, I came up with a workaround: I shrank the boxes by 10 pixels. This helped me go from 440 failed cases to 304.

I'm not entirely sure if this is a good approach though! 😊
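The failed-case count comes from comparing predicted and ground-truth boxes at an IoU threshold of 0.75. A minimal version of that check (the box format and helper names here are my own) might look like:

```python
def iou(a, b):
    """IoU of two boxes in [x0, y0, x1, y1] pixel format."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def count_failures(preds, gts, thr=0.75):
    """Count predictions whose IoU with the ground truth is below thr."""
    return sum(1 for p, g in zip(preds, gts) if iou(p, g) < thr)
```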

Results

mosaic yolo results: https://drive.google.com/file/d/1GOtLaPkWlJGUXLtTITJHtxnuxKdmHoZ8/view?usp=drive_link

mosaic yolo -10 boxes results: https://drive.google.com/file/d/1E0ioAvIq-Hs57CFM4GNN5P8M7ya20D4o/view?usp=drive_link


HCA97 commented 11 months ago

hmm it seems like the result is worse than our original YOLO model (https://gitlab.aicrowd.com/hca97/mosquitoalert-2023-phase2-starter-kit/-/issues/122)

This submission doesn't contain the box reduction that I mentioned above.

fkemeth commented 11 months ago

Hi @HCA97 ,

thank you for the analysis - this is very interesting! For the "bad" misclassifications, it seems 95% are due to multiple mosquitoes in an image, so I think the model is quite decent already. For the shift in bounding boxes, I completely agree! This links to my question earlier - if we get the rough location right, we might just keep finetuning the last layer, or, as you do, postprocess the bounding boxes.

In the images that you show, are the Yolo bounding boxes in blue? It might be worth finding the optimal difference in pixels with which we should postprocess. The 0.75 threshold is not differentiable, but maybe we can evaluate different values (like 2, 5, 8, 10, 15) and check for which we get the most examples above IoU=0.75.
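That sweep is easy to script. A sketch under my own assumptions (pixel `[x0, y0, x1, y1]` boxes, equal shrink on every side, helper names are mine):

```python
def shrink(box, k):
    """Shrink a [x0, y0, x1, y1] box by k pixels on every side."""
    return [box[0] + k, box[1] + k, box[2] - k, box[3] - k]

def iou(a, b):
    """IoU of two boxes in [x0, y0, x1, y1] pixel format."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def best_shrink(preds, gts, candidates=(0, 2, 5, 8, 10, 15), thr=0.75):
    """Pick the shrink value that puts the most boxes above the IoU threshold."""
    scores = {
        k: sum(iou(shrink(p, k), g) >= thr for p, g in zip(preds, gts))
        for k in candidates
    }
    return max(scores, key=scores.get)
```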

How do you postprocess exactly - do you add 5 to x0, y0 and subtract 5 from x_width, y_width?

fkemeth commented 11 months ago

Your f1-score is, given the low IoU, surprisingly good it seems. Do you use the hierarchical classifier?

fkemeth commented 11 months ago

It might be worth finding the optimal difference in pixels with which we should postprocess.

We could also do this for the Yolo model you had before, if you think that is better.

HCA97 commented 11 months ago

Your f1-score is, given the low IoU, surprisingly good it seems. Do you use the hierarchical classifier?

Nope, the hierarchical classifier performed very poorly, so I am using our best model. I am impressed by how resilient the model is.

We could also do this for the Yolo model you had before if you think this better.

Yes, that makes sense, I will look into it. I want both YOLO and CLIP to use the same train and validation split so I can find the best parameters (how much should I shrink the box, how should I fuse YOLO's predictions with the CLIP model, etc.).

This links to my question earlier - if we get the rough location right, we might just keep finetuning the last layer, or, as you do, postprocess the bounding boxes.

Yes, I am not sure how I can fine-tune only the last layer; I will look into that. One downside of YOLO is that its API is very high level, so there is not much customization you can do.

How do you postprocess exactly - do you add 5 to x0, y0 and subtract 5 from x_width, y_width?

I added +5 to x0 and y0 and -5 to x1 and y1 -> `[bbox[0]+5, bbox[1]+5, bbox[2]-5, bbox[3]-5]`

In the images that you show, are the Yolo bounding boxes in blue?

Yes, blue is YOLO and Green is ground truth.

HCA97 commented 11 months ago

Hi @fkemeth,

I tried shrinking the box by 12 pixels, and it improved the detection performance.

With Shrinkage: 0.79 (Link)

Without Shrinkage: 0.77 (Link)

However, I lost track of our experiments, so now I'm not sure which YOLO model is the best one. This is a bit frustrating :smiling_face_with_tear:.

Additionally, I conducted an experiment by training on YOLO annotations instead of using the challenge annotations. Maybe this will give us a score that's close to our local evaluation.
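Training on YOLO's own annotations means writing the predicted boxes back out in YOLO's normalized `class x_center y_center width height` label-file format. A small conversion helper (the function name and box format are my own) could look like:

```python
def xyxy_to_yolo_line(cls_id, box, img_w, img_h):
    """Convert a pixel [x0, y0, x1, y1] box to a YOLO label-file line."""
    x0, y0, x1, y1 = box
    xc = (x0 + x1) / 2 / img_w   # normalized box center x
    yc = (y0 + y1) / 2 / img_h   # normalized box center y
    w = (x1 - x0) / img_w        # normalized box width
    h = (y1 - y0) / img_h        # normalized box height
    return f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"
```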

HCA97 commented 10 months ago

I found one of our best YOLO models and re-submitted it with shrinkage. It actually performed worse. Hmm, maybe it sometimes works and sometimes doesn't.

With Shrinkage: 0.78 https://www.aicrowd.com/challenges/mosquitoalert-challenge-2023/submissions/239529

Without Shrinkage: 0.82 https://www.aicrowd.com/challenges/mosquitoalert-challenge-2023/submissions/239507