ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Overfit with small dataset, like KITTI #821

Closed: intellisense-team closed this issue 4 years ago

intellisense-team commented 4 years ago

❔Question

How can I reduce overfitting when training this repo on datasets other than COCO?

I have trained on the KITTI dataset with the original YOLOv3 (https://pjreddie.com/darknet/yolo/) and the PyTorch version (https://github.com/DeNA/PyTorch_YOLOv3). Both repos achieve good results, matching those reported in the paper (https://arxiv.org/abs/1904.04620).

Given the great improvements in YOLOv4 and YOLOv5, I wanted to train KITTI with this repo. However, I can't get a comparable result even against the YOLOv3 repos; it overfits very easily! I ran two experiments with all the default parameters (epochs=300): one training from scratch, the other training from the COCO-pretrained weights, and got the results below.

Both experiments appear to overfit, especially in obj_loss, and the mAP@0.5 is quite low compared with the YOLOv3 results.

Additional context

Could the hyperparameters being optimized for COCO be one of the reasons?

github-actions[bot] commented 4 years ago

Hello @intellisense-team, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients.

For more information please visit https://www.ultralytics.com.

glenn-jocher commented 4 years ago

@intellisense-team that's an on-point question. It helps to understand that the Ultralytics repos previously targeted good performance when training COCO from scratch for 300 epochs.

That is obviously not the main use case, however; the main use case is finetuning one of our pretrained models for 50 epochs on your custom dataset. This requires a second set of hyperparameters better suited to finetuning. Just recently we moved the hyps into their own YAML files and duplicated them for the two tasks.

For finetuning: https://github.com/ultralytics/yolov5/blob/master/data/hyp.finetune.yaml

For training from scratch: https://github.com/ultralytics/yolov5/blob/master/data/hyp.scratch.yaml

The repo is smart: if you assign pretrained weights, it will automatically assign hyps from hyp.finetune.yaml, and if you assign --weights '' it will know you intend to train from scratch and will use hyp.scratch.yaml. That said, until very recently these two files were identical, because we had neither the time nor the resources to run dedicated hyp evolution for finetuning scenarios.
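
In code, that selection amounts to roughly the following (a minimal sketch, not the exact train.py source; the flag names mirror the CLI, and the final print is illustrative):

import argparse
import yaml

parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='yolov5s.pt', help="pretrained weights, or '' to train from scratch")
parser.add_argument('--hyp', type=str, default='', help='optional explicit hyp yaml')
opt = parser.parse_args()

# If --hyp is not given explicitly, choose based on the weights argument:
# non-empty --weights implies finetuning, --weights '' implies scratch training.
if not opt.hyp:
    opt.hyp = 'data/hyp.finetune.yaml' if opt.weights else 'data/hyp.scratch.yaml'

with open(opt.hyp) as f:
    hyp = yaml.safe_load(f)  # dict of lr0, momentum, loss gains, augmentation settings, etc.
print(f'Using hyperparameters from {opt.hyp}')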

So the current finetuning hyps are simply our best educated guess for what works better there, and include more aggressive augmentation, reduced obj gain, etc. If you want to reproduce our finetuning results, our command is here:

python train.py --batch 64 --weights yolov5m.pt --data voc.yaml --epochs 50 --cache --img 512

If you have significant GPU resources available, we can send you Docker commands so you can participate in finetuning evolution studies, which would allow us to update the finetuning hyps sooner.

glenn-jocher commented 4 years ago

@intellisense-team BTW, KITTI is a fairly common dataset; if you'd like to submit a PR adding it to the /data folder, that would help everyone else get started with it much faster.

See https://github.com/ultralytics/yolov5/blob/master/data/voc.yaml for a VOC example, which goes along with a download script in https://github.com/ultralytics/yolov5/tree/master/data/scripts for data autodownload.
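
As a starting point, converting KITTI's label txt files into the YOLO txt format this repo expects might look like the sketch below (the class list, paths, and helper name are illustrative assumptions, not part of this repo):

from pathlib import Path
from PIL import Image

# KITTI labels store pixel box corners (left, top, right, bottom) in columns 4-7;
# YOLO txt labels want normalized (class x_center y_center width height).
CLASSES = ['Car', 'Pedestrian', 'Cyclist']  # example subset; extend as needed

def kitti_to_yolo(label_file, image_file, out_file):
    w_img, h_img = Image.open(image_file).size  # normalize by the true image size
    out_lines = []
    for line in Path(label_file).read_text().splitlines():
        fields = line.split()
        if fields[0] not in CLASSES:  # skip DontCare, Misc, etc.
            continue
        x1, y1, x2, y2 = map(float, fields[4:8])
        xc, yc = (x1 + x2) / 2 / w_img, (y1 + y2) / 2 / h_img
        bw, bh = (x2 - x1) / w_img, (y2 - y1) / h_img
        out_lines.append(f'{CLASSES.index(fields[0])} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}')
    Path(out_file).write_text('\n'.join(out_lines) + '\n')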

intellisense-team commented 4 years ago

@glenn-jocher Thanks for your response and your great work! I have done more experiments and found that the reason for the different results is the different splitting of the dataset. In the Gaussian YOLOv3 paper (link), the KITTI training set was split randomly into two halves, one for training and one for validation. When I followed this split, I could get a comparable result.

The experiment I did before follows another paper (Frustum PointNet), training on a special split of the KITTI training set (3712 images for training and 3769 images for validation). That paper claims this split ensures the training and validation sets do not come from the same video sequences, so I think this split is harder to train on. Maybe I will pretrain on the BDD dataset and do transfer learning to KITTI.
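
For reference, the random half/half split described above can be reproduced with something like this sketch (the paths and seed are assumptions; KITTI's left-color images live in training/image_2 in the standard layout):

import random
from pathlib import Path

images = sorted(Path('kitti/training/image_2').glob('*.png'))
random.seed(0)  # fixed seed so the split is reproducible
random.shuffle(images)

half = len(images) // 2
Path('kitti/train.txt').write_text('\n'.join(map(str, images[:half])) + '\n')
Path('kitti/val.txt').write_text('\n'.join(map(str, images[half:])) + '\n')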

glenn-jocher commented 4 years ago

@intellisense-team ah, got it. I've started hyperparameter evolution on VOC and have some early results that are surprisingly good, and also shockingly different from what I was expecting: momentum, lr0, and the anchor threshold are far lower than their generation-0 starting points.

You might want to try plugging these VOC-evolved hyps into your training to see if it helps KITTI also.

Command:

python train.py --batch 64 --weights yolov5m.pt --data voc.yaml --img 512 --epochs 50 --hyp hyp_evolved.yaml
# Hyperparameter Evolution Results
# Generations: 58
# Metrics:      0.656     0.915     0.887      0.67    0.0112   0.00805   0.00188

lr0: 0.0053
momentum: 0.871
weight_decay: 0.00065
giou: 0.0287
cls: 0.381
cls_pw: 0.531
obj: 0.518
obj_pw: 0.956
iou_t: 0.2
anchor_t: 2.0
fl_gamma: 0.0
hsv_h: 0.0205
hsv_s: 0.9
hsv_v: 0.604
degrees: 0.508
translate: 0.153
scale: 0.9
shear: 0.987
perspective: 0.0
flipud: 0.00987
fliplr: 0.395
mixup: 0.262
Vottivott commented 2 years ago

Hi @intellisense-team! I have a question regarding what you said here:

I have trained on the KITTI dataset with the original YOLOv3 (https://pjreddie.com/darknet/yolo/) and the PyTorch version (https://github.com/DeNA/PyTorch_YOLOv3). Both repos achieve good results, matching those reported in the paper (https://arxiv.org/abs/1904.04620).

You mentioned that you trained KITTI using YOLOv3 and got good results. Did you use the random split for that, or the Frustum PointNet split? If you used the Frustum PointNet split, that would still mean YOLOv3 performed better than YOLOv5 on the same split, right?