yangulei opened this issue 6 years ago
Did you get mAP for the validation dataset, i.e. images that weren't used during training? If yes, then you can take whichever weights-file from ~54 000 iterations onward has the highest mAP (or the highest mAP, Precision and Recall).

> which shows that the metrics don't go down like this plot:

Overfitting is rare for Yolo v3/v2. It can only be:
Thank you for your timely reply! Yes, I randomly split my ~8000 image samples into training and validation datasets with a ratio of 8:2, and validate the models with the validation dataset after training. It looks like I could reduce the training iterations to save some training time. Thank you again, I can proceed without worrying about overfitting now.
@yangulei Looking at your learning strategy... What's the purpose of scales and steps? Thx
@EscVM It's a learning rate (LR) schedule. The parameter "steps" lists the iterations at which the LR is changed, and the parameter "scales" lists the multipliers applied at those iterations; see the answer on stackoverflow. In my opinion, the schedule aims to balance computation time against convergence accuracy. You can find more detail about this in CS231n and the YOLOv1 paper.
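To make that concrete, here is a minimal Python sketch of how a "steps" policy maps an iteration to a learning rate, as I understand darknet's behaviour. The values learning_rate=0.001, steps=40000,45000 and scales=.1,.1 are just a hypothetical schedule for illustration, not the cfg used above:

```python
def lr_at(iteration, base_lr, steps, scales):
    """Learning rate under a 'steps' policy: each time the iteration
    passes a value in `steps`, the LR is multiplied by the matching
    entry in `scales` (this is a sketch of the idea, not darknet's code)."""
    lr = base_lr
    for step, scale in zip(steps, scales):
        if iteration >= step:
            lr *= scale
    return lr

# Hypothetical schedule: learning_rate=0.001, steps=40000,45000, scales=.1,.1
for it in (30000, 42000, 50000):
    print(it, lr_at(it, 0.001, [40000, 45000], [0.1, 0.1]))
# -> 0.001 before 40000, ~1e-4 after 40000, ~1e-5 after 45000
```

So the LR stays at its base value until the first step is reached, then drops by the corresponding scale at each later step.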
@yangulei Thank you. Gotcha!
Instead, I've tried your code, but I got this error:
```
---------------------------------------
KeyError                        Traceback (most recent call last)
```
@EscVM Sorry for replying so late. Did you change the darknet executable, the data path, the config path and the weights path to your own in the get_metrics() function? They're on lines 11 to 17 of the script.
I train my model on Ubuntu 16.04 with the command below:
```
darknet detector train <data_file> <cfg_file> darknet19_448.conv.23 | tee log.txt
```
Here is my learning rate strategy in the cfg_file:
And here is a chart during the training:
After the training, I use my Python script to validate the models saved at different training iterations. The script runs the command:
```
darknet detector map <data_file> <cfg_file> <weight_file> 1>log_file
```
and parses the output to get the metrics, then plots them out. Here is a plot I got:
which shows that the metrics don't go down like in this plot: So, when should I stop training, or which weights file should I choose?
BTW: my Python script is ugly, but it does work.
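For reference, a rough sketch of what such a validation loop could look like is below. This is not the actual script; the paths, the weights-file naming scheme, and the mAP-parsing regex are assumptions that would need adapting to your own setup:

```python
import glob
import re
import subprocess

import matplotlib.pyplot as plt

# Placeholder paths -- point these at your own darknet build and files.
DARKNET = "./darknet"
DATA_FILE = "obj.data"
CFG_FILE = "yolo-obj.cfg"

def get_map(weights_file):
    """Run `darknet detector map` on one weights file and pull mAP out of its output.
    The regex assumes a line like 'mean average precision (mAP@0.50) = 0.6543'."""
    result = subprocess.run(
        [DARKNET, "detector", "map", DATA_FILE, CFG_FILE, weights_file],
        capture_output=True, text=True)
    match = re.search(r"mean average precision.*?=\s*([0-9.]+)", result.stdout)
    return float(match.group(1)) if match else None

# Collect snapshots saved as e.g. backup/yolo-obj_10000.weights,
# keyed by their iteration number.
weights = []
for path in glob.glob("backup/*.weights"):
    m = re.search(r"_(\d+)\.weights$", path)
    if m:
        weights.append((int(m.group(1)), path))

iterations, maps = [], []
for iteration, path in sorted(weights):
    value = get_map(path)
    if value is not None:
        iterations.append(iteration)
        maps.append(value)

# Plot mAP against training iteration to see where it peaks.
plt.plot(iterations, maps, marker="o")
plt.xlabel("training iteration")
plt.ylabel("mAP@0.50")
plt.show()
```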