AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

How to Pretrain a dataset and do Transfer Learning on Darknet? #5940

Closed renjiraND closed 4 years ago

renjiraND commented 4 years ago

I'm currently doing research on object detection methods, and I'm trying to improve results with a pipeline that uses two datasets: the model needs to be pretrained on one and then trained further on the other. Is there any way to do that in this version of darknet?

After some searching, I found some information saying that it's better to do fine-tuning rather than transfer learning, which is a gray area to me. I currently have no problems training darknet on just one dataset, but I don't know how to do pretraining or transfer learning with this darknet.

Should I just train again on another dataset after training my model, or is there a more proper way to do it? Thank you, and sorry if I'm not really good with this yet. I'm new to darknet and YOLO, and I'm still learning...

klauscf commented 4 years ago

same question as you

renjiraND commented 4 years ago

Eventually, the research didn't go that way, and I ended up doing different stuff on darknet. But anyway, I found that you can do partial on darknet using the command I found here:

./darknet partial cfg/yolov4.cfg yolov4.weights yolov4.137 137

That will give you the frozen model up to layer 137 (this is what I get, cmiiw).

After that, you retrain it with another dataset. I used it to retrain yolov4-tiny on another dataset, because I couldn't find a pretrained partial model for it on GitHub (only the COCO-trained one), unlike yolov4, which has the pretrained partial.

So for the partial, you need to find the number of the layer just before the first yolo layer (in my case, yolov4-tiny, it's the 30th layer), so I did

./darknet partial cfg/yolo-tiny.cfg yolo-tiny.weights yolo-tiny.30 30

Then I retrained with the new dataset, and it works nicely.

Another thing you can do is train from scratch: use the training command without the weights in the arguments and run it, then do the partial after training from scratch. I think that is how you would do the pretraining. Sorry for my bad English, still learning it.

Hope it helps anyone out there!
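
To find the layer number for partial without counting by hand, one option is to count the section headers in the cfg. The sketch below is only an illustration: layer_types and first_yolo_index are hypothetical helper names, the cfg text is a made-up toy rather than a real YOLO config, and it assumes layers are numbered from 0 with the leading [net] section not counted.

```python
def layer_types(cfg_text):
    """List the section names of a Darknet-style cfg in order, skipping [net]."""
    types = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1].strip().lower()
            # Assumption: [net]/[network] holds hyperparameters and is not a layer.
            if name not in ("net", "network"):
                types.append(name)
    return types


def first_yolo_index(cfg_text):
    """0-based index of the first [yolo] section (ValueError if there is none)."""
    return layer_types(cfg_text).index("yolo")


# Toy cfg for illustration only; not a real model definition.
TOY_CFG = """
[net]
batch=64

[convolutional]
filters=16

[maxpool]
size=2

[convolutional]
filters=32

[convolutional]
filters=18

[yolo]
classes=1

[route]
layers=-3

[convolutional]
filters=18

[yolo]
classes=1
"""

if __name__ == "__main__":
    idx = first_yolo_index(TOY_CFG)
    print("first [yolo] layer index:", idx)
    print("layer just before it:", idx - 1, layer_types(TOY_CFG)[idx - 1])
```

On a real file you would pass the contents of something like cfg/yolov4.cfg instead of TOY_CFG. Whether the right cutoff is that index or a smaller one depends on how partial counts layers, so it is worth checking the result against the numbers given in the README.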

klauscf commented 4 years ago

First, I already read #2139 carefully.

I don't know about yolov4; I am still working on yolov3. Here is my case: I used 30k images to train yolov3, and it works nicely. But when it comes to another scene, e.g. night scenes, which were not included in the previous training dataset, I want to transfer it.

So I made 1k new training images of night scenes. First I used

./darknet partial cfg/yolov3.cfg yolov3_last.weights yolov3.conv.74 74
./darknet.exe detector train data/obj.data yolov3.cfg yolov3.conv.74 -map

where yolov3_last.weights is what I got from the previous 30k images. When I check the newly trained model, the new night scenes work well, but on the original scenes the newly trained model performs worse than the original model (I mean yolov3_last.weights from the 30k images), especially in accuracy. I don't know why.

Then I read all the issues and found some cases like mine, so I did

./darknet partial cfg/yolov3.cfg yolov3_last.weights yolov3.conv.81 81
./darknet.exe detector train data/obj.data yolov3.cfg yolov3.conv.81 -map

Things got better: the new model now works better on the original scenes (the ones yolov3_last.weights was trained on with the 30k images). But I am still confused: should I set stopbackward=1, and where should I set it? I did not set it, but I want to try. Should I? I mean, training yolo takes a lot of time...

Then another confusion came up: why 81? Are the first 81 layers the feature-extraction layers, so that I freeze them when I do the transfer? But then why not 80, 79, 78, or 77, and so on?

And when I want to freeze some layers, should I set stopbackward=1 after that layer, or should I ignore stopbackward=1 and just do

./darknet partial cfg/yolov3.cfg yolov3_last.weights yolov3.conv.k k
./darknet.exe detector train data/obj.data yolov3.cfg yolov3.conv.k -map

where k means k layers

thank you very much!
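
The two-step pattern above (cut the weights at layer k, then train from the cut file) can be parameterised so that trying a different k is a one-line change. This is a minimal sketch that only builds the argument lists; partial_cmd and train_cmd are hypothetical helper names, and the arguments simply mirror the commands quoted in this comment.

```python
def partial_cmd(cfg, weights, out_file, k, darknet="./darknet"):
    """Argument list for: darknet partial <cfg> <weights> <out_file> <k>."""
    return [darknet, "partial", cfg, weights, out_file, str(k)]


def train_cmd(data, cfg, weights, darknet="./darknet", show_map=True):
    """Argument list for: darknet detector train <data> <cfg> <weights> [-map]."""
    cmd = [darknet, "detector", "train", data, cfg, weights]
    if show_map:
        cmd.append("-map")
    return cmd


if __name__ == "__main__":
    k = 81  # cutoff to try; 74 and 81 are the two values used in this comment
    out_file = f"yolov3.conv.{k}"
    steps = (
        partial_cmd("cfg/yolov3.cfg", "yolov3_last.weights", out_file, k),
        train_cmd("data/obj.data", "yolov3.cfg", out_file),
    )
    for cmd in steps:
        # Only print; to really run a step, pass cmd to subprocess.run(cmd, check=True).
        print(" ".join(cmd))
```

Building lists rather than shell strings avoids quoting problems if a path contains spaces.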

renjiraND commented 4 years ago

Wow, I'm surprised you found that! I think that answers all my questions and curiosity about transfer learning in this issue.

For your night model, my personal opinion (cmiiw) is that your model is overfitting to the new dataset (or just forgetting what it was trained on before), so it has good accuracy on the night dataset but not on the day dataset it was trained on earlier.

About stopbackward: if you set it to 1 before the last convolutional layer, it will train faster but will have lower accuracy than if left untouched. You can read about it here, where it says:

to speedup training (with decreasing detection accuracy) set param stopbackward=1 for layer-136 in cfg-file

Note that it's set in layer 136, just before 137 (which is the last layer of yolov4 before the [yolo] layers, afaik).

renjiraND commented 4 years ago

But then I read it again: using stopbackward freezes the layers (the weights already trained for them stay unchanged). I think this is the one to use for transfer learning and fine-tuning. Training without stopbackward will just train all the weights (including the initial weights).

So I think you should probably set stopbackward after training on the day dataset and before training on the night dataset.
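
Since stopbackward=1 is a line inside one layer's section of the cfg file, it can be added with a small script instead of by hand. The sketch below is an illustration under assumptions: add_stopbackward is a hypothetical helper, the cfg is a made-up toy, and layers are assumed to be numbered from 0 with [net] not counted.

```python
def add_stopbackward(cfg_text, layer_index):
    """Return cfg_text with 'stopbackward=1' added to the given layer's section."""
    out = []
    idx = -1          # index of the layer section we are currently inside
    inserted = False
    for raw in cfg_text.splitlines():
        out.append(raw)
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # Assumption: [net]/[network] is not a layer and is not counted.
            if line[1:-1].strip().lower() not in ("net", "network"):
                idx += 1
                if idx == layer_index:
                    out.append("stopbackward=1")
                    inserted = True
    if not inserted:
        raise IndexError(f"cfg has no layer with index {layer_index}")
    return "\n".join(out) + "\n"


# Toy cfg for illustration only; not a real model definition.
TOY_CFG = """[net]
batch=64
[convolutional]
filters=8
[convolutional]
filters=16
[yolo]
classes=1
"""

if __name__ == "__main__":
    print(add_stopbackward(TOY_CFG, 1))
```

For the yolov4 case quoted from the README, the index would be 136. Writing the result to a new cfg file rather than overwriting the original makes it easy to compare training with and without the setting.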

hussienWehbi commented 3 years ago

Hi @renjiraND

Other stuff you can do is to train from scratch, so use the training command without the weights in the arguments and run it, then you do the partial after training from scratch

On darknet I can't do training without giving pre-trained weights in the training command. If I want to train from scratch, can you help me?

renjiraND commented 3 years ago

@hussienWehbi yes, you cannot train without a weights file (but it doesn't have to be a pretrained one). It's been a long time since I worked on this, but I think there is a way to train using weights that are effectively from scratch, not the pretrained ones from darknet's person db, even though you still need to pass one.

So the baseline is: you freeze the pretrained weights and turn them into untrained weights by removing the head (which is the most important part for the object you'd want to detect).

Sorry I couldn't explain more clearly, hope it helps!

hussienWehbi commented 3 years ago

@renjiraND thank you very much, that helped me a lot. Just a little question: if I run this command

./darknet partial cfg/yolov3-tiny.cfg yolov3-tiny.weights yolov3-tiny_3l.conv.1 1

does this mean that I froze only the first conv layer and kept its weights untouched, and that training will train all the remaining layers?

renjiraND commented 3 years ago

@hussienWehbi yes, that is correct.

But I think you should just freeze up to layer 136, if I'm not mistaken, not freeze only up to layer 1.

I think that's for general object detection: they already trained the weights on preselected data for general object detection, so you only need to use the frozen 136 layers to train on a specific object, because it's effectively from scratch if you train for a new object.

Using the 1-layer weights will break the model unless you really know how to train darknet from scratch.

lol, I'm even confused by my own explanation

hussienWehbi commented 3 years ago

@renjiraND I am confused too, lol. You mean that it's better to freeze the layers up to the first yolo layer (15 in yolov3-tiny, for example) and train the remaining layers, to get the benefit of the trained weights that the authors had reached. Is that correct? Again, thanks a lot.

renjiraND commented 3 years ago

@hussienWehbi yes

https://github.com/AlexeyAB/darknet#how-to-train-tiny-yolo-to-detect-your-custom-objects

Here's the example for yolov4-tiny. Training from this model (29 layers / first convolutional layer) is actually training from scratch for a specific object.

hope it helps!

hussienWehbi commented 3 years ago

@renjiraND thank you very much you really helped me a lot

mgupta70 commented 1 year ago

Hey!! I have a doubt. If you use partial for layer 137, does it mean that the weights up to layer 137 are frozen, and the weights for all the following layers are still initialized from the pretrained weights (instead of randomly) but are trainable? Am I correct?
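
One way to reason about this question is a toy model of what a cut-down weights file can and cannot contain. The sketch below is not darknet code, and partial and init_for_training are hypothetical names. It assumes that partial with cutoff N keeps only the weights of layers 0 to N-1, and that a layer with no entry in the file starts from the framework's fresh initialization. Freezing is not modelled here, since in this thread that is tied to stopbackward=1 in the cfg rather than to partial.

```python
def partial(weights, cutoff):
    """Toy stand-in for the partial step: keep entries for layers 0 .. cutoff-1."""
    return weights[:cutoff]


def init_for_training(n_layers, partial_weights):
    """Label where each layer's starting weights would come from."""
    loaded = min(len(partial_weights), n_layers)
    return ["from file"] * loaded + ["fresh init"] * (n_layers - loaded)


if __name__ == "__main__":
    # Toy 5-layer network; strings stand in for real weight tensors.
    full = [f"layer{i}-pretrained" for i in range(5)]
    cut = partial(full, 3)
    for i, source in enumerate(init_for_training(len(full), cut)):
        print(i, source)
```

Under these assumptions, the layers after the cutoff cannot start from the pretrained values, because those values are simply not in the cut file.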