WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0

Custom dataset training vs Transfer-Learning #2012

Closed · shubzk closed this issue 7 months ago

shubzk commented 7 months ago

I have trained the model on a custom dataset of 5000 images (1024x1024). After that, I trained the model on 500 more images (3072x3072). However, there was no improvement in the detections; if anything, they got worse. My question is: when I do incremental training, should I continue training from the previous weights, or do transfer learning?

dsbyprateekg commented 7 months ago

You must train on all 5000+500=5500 images together in a single run.

shubzk commented 7 months ago

So that means once I get a weight file after training on the initial 5000 images (1024x1024), there is no way to improve that weight file with new data?

dsbyprateekg commented 7 months ago

Yes, there is no other way. You have to train with the combined data, which will generate one weight file. May I know why you were doing it that way?

shubzk commented 7 months ago

I had collected about 5000 labelled images initially to train the model on a custom dataset. However, I am getting new images every day which need to be used for training the model. How can I go about doing this? If I have to retrain the model with all the images every time I get new images, I am not getting any benefit from transfer learning.

To give you a better picture of what I am trying to accomplish, I trained the model on the 5000 images (1024x1024) using: python train.py --workers 8 --device 0 --batch-size 8 --data data/data.yaml --img 1024 1024 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml

After this, I got a 'best.pt' weight file, which I then used as the starting weights for transfer learning on the 500 images: python train.py --workers 8 --device 0 --batch-size 4 --data data/custom.yaml --img 3072 3072 --cfg cfg/training/yolov7-custom.yaml --weights 'best.pt' --name yolov7-custom --hyp data/hyp.scratch.custom.yaml

I am using an NVIDIA A100 80 GB GPU virtual machine.

dsbyprateekg commented 7 months ago

As I said, you have to add the new data to your existing dataset and retrain the model from scratch. I am assuming you want to retrain with the new images because your existing trained model is unable to detect objects in them, right?

shubzk commented 7 months ago

So I have a set of 20 images I am using for benchmarking different versions of the model. Let's call the weight file from training on the 5000 images 5000.pt. I then use transfer learning on the new 500 images, starting from 5000.pt as the initial weights.

However, when testing, the detections from 5000.pt are better than those from the new weight file I get after training on the additional 500 images.

While writing this, something occurred to me: do I have to change the number of classes in the cfg file and use that file in the following command? python train.py --workers 8 --device 0 --batch-size 4 --data data/custom.yaml --img 3072 3072 --cfg cfg/training/yolov7-custom.yaml --weights 'best.pt' --name yolov7-custom --hyp data/hyp.scratch.custom.yaml

dsbyprateekg commented 7 months ago

Are you saying the number of classes in the 5000-image dataset is different from the 500-image dataset?

shubzk commented 7 months ago

The number of classes stays the same. So should I make a new yolov7-custom.yaml file with my number of classes and then use that? I think I might be missing some fundamental knowledge, because to me it sounds like this model can't be continuously improved with new data.

dsbyprateekg commented 7 months ago

The number of classes (nc) must be changed from the default (80) to your own value in both data/custom.yaml and cfg/training/yolov7-custom.yaml.
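For reference, a minimal sketch of what those two edits could look like; the paths, class count, and class names below are placeholders for your own dataset, and only the nc lines matter here:

```yaml
# data/custom.yaml -- dataset definition (placeholder paths and class names)
train: ./dataset/images/train
val: ./dataset/images/val
nc: 2                            # your number of classes, not the default 80
names: ['class_a', 'class_b']    # one name per class
```

```yaml
# cfg/training/yolov7-custom.yaml -- a copy of cfg/training/yolov7.yaml
# with only the class count changed; every other line stays as released
nc: 2                 # must match nc in data/custom.yaml
depth_multiple: 1.0   # model depth multiple
width_multiple: 1.0   # layer channel multiple
# ... anchors, backbone and head sections unchanged ...
```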

shubzk commented 7 months ago

So if I do this, can I do incremental training? Meaning, suppose I get 500 new images every week and want to improve the model: can I train on those 500 images each week, starting from the previous week's weight file?

dsbyprateekg commented 7 months ago

No, you cannot. You have to add the new images to your existing 5000-image dataset and then start training with the command: python train.py --workers 8 --device 0 --batch-size 8 --data data/data.yaml --img 1024 1024 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml

Here the model is trained from scratch (--weights ''), not from the weights you trained earlier, such as 5000.pt.
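For reference, one way to set up that combined run is sketched below with placeholder paths. The YOLOv5-style dataloader that yolov7 uses generally accepts either a single folder or a list of image directories (or .txt image lists) for train:, so the original 5000 images and the new 500 can be listed together; if your version does not accept a list, physically merging the folders achieves the same thing.

```yaml
# data/data.yaml -- combined 5000 + 500 dataset (placeholder paths)
train:
  - ./dataset/old_5000/images/train   # original 5000 annotated images
  - ./dataset/new_500/images/train    # newly annotated 500 images
val: ./dataset/images/val
nc: 2                                 # same class list as before
names: ['class_a', 'class_b']
```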

shubzk commented 7 months ago

So basically, every time I get new data, I need to train the model from scratch? So I can't improve the existing model using only the new data?

dsbyprateekg commented 7 months ago

This is the only way we follow with our custom data. Every time we get new images where detection is failing, we annotate them, add them to the existing dataset, and retrain the model from scratch.

shubzk commented 7 months ago

So could you please give me a use case for transfer learning? (I think I am unable to understand the point of the transfer-learning command in this case.) Thank you for your help.

dsbyprateekg commented 7 months ago

This is exactly transfer learning: training on a custom dataset, starting from a pre-trained weight, so the model detects our custom classes. It is just that the way you were doing it was wrong.
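For context, the transfer-learning recipe documented in the yolov7 README follows this same pattern: start from one of the released COCO-pretrained checkpoints (yolov7_training.pt from the releases page) rather than from a checkpoint trained on your own data, and point --data, --cfg, and --hyp at your custom files. A sketch, with image size and batch size as illustrative values only:

```bash
# transfer learning from a released pre-trained checkpoint (sketch; sizes are illustrative)
python train.py --workers 8 --device 0 --batch-size 32 \
  --data data/custom.yaml --img 640 640 \
  --cfg cfg/training/yolov7-custom.yaml \
  --weights 'yolov7_training.pt' \
  --name yolov7-custom --hyp data/hyp.scratch.custom.yaml
```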

shubzk commented 7 months ago

Can I get in touch with you through WhatsApp? I need some guidance, and you seem very experienced in this matter. My email ID: shubzkumbhar@gmail.com.

whxuexi commented 1 month ago

> This is the only way we follow with our custom data. Every time we get new images where detection is failing, we annotate them, add them to the existing dataset, and retrain the model from scratch.

Let's say I had 5000 images and got best1.pt, and now I have 500 more images. Can I train on the combined 5000+500=5500 images, decrease the learning rate, and continue training incrementally from best1.pt to get best2.pt?
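To make that question concrete, below is a sketch of the run being described. Nothing in this thread confirms that it performs better than retraining from scratch on the combined data; the learning-rate value and the file names data_5500.yaml and hyp.finetune.custom.yaml are placeholders.

```bash
# hypothetical fine-tuning run (unconfirmed in this thread; names and values are placeholders)
# 1) build data/data_5500.yaml pointing at the combined 5000+500 images
# 2) copy data/hyp.scratch.custom.yaml to data/hyp.finetune.custom.yaml
#    and lower lr0 (e.g. 0.01 -> 0.001)
python train.py --workers 8 --device 0 --batch-size 8 \
  --data data/data_5500.yaml --img 1024 1024 \
  --cfg cfg/training/yolov7-custom.yaml \
  --weights 'best1.pt' \
  --name yolov7-finetune --hyp data/hyp.finetune.custom.yaml
```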
