Closed kulkarnikeerti closed 2 years ago
@kulkarnikeerti That's all right. If you have any problems with this project, do not hesitate to ask.
--resume is for continuing training when the training stage is interrupted unexpectedly.
After the evaluation stage, the model weights are saved whenever the current mAP is higher than the best mAP so far (the default is -1.).
In train.py, path_to_save is the path where the model weights are saved.
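The save-on-improvement rule described above can be sketched roughly as follows. This is only an illustration and not the code in train.py: the helper name maybe_save_best, the save_fn callback, and the file naming are hypothetical, and only path_to_save and the initial best mAP of -1. come from the description above.

```python
import os


def maybe_save_best(current_map, best_map, path_to_save, epoch, save_fn):
    """Save weights only when the current mAP beats the best mAP so far.

    Returns the (possibly updated) best mAP and the path written, or None
    when nothing was saved.
    """
    if current_map <= best_map:
        # No improvement: keep the previous best and skip saving.
        return best_map, None

    os.makedirs(path_to_save, exist_ok=True)
    # Hypothetical file name; the real project may name files differently.
    weight_path = os.path.join(path_to_save, f"best_epoch_{epoch}.pth")
    # In a real training script this could be something like
    # torch.save(model.state_dict(), weight_path).
    save_fn(weight_path)
    return current_map, weight_path
```

Starting from best_map = -1.0, the first evaluation always triggers a save, and later evaluations save only when the mAP improves.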
@yjh0410 I understand that completely. What I don't understand is the --resume value. By default it's set to None. If I want to resume the model from where I left off, what should this value be? Is it the path of the model weights saved after the evaluation stage?
If yes, would it also restore the epoch, optimizer, and other parameters from the previous training run? From the code this is quite confusing to me, since it doesn't seem to save those other details.
@kulkarnikeerti You can pass the path to a .pth file to --resume, for example, --resume weight/coco/yolo_nano/yolo_nano.pth.
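A minimal sketch of this weights-only resume, assuming PyTorch and a .pth file that holds a plain state dict. The helper names parse_args and load_weights_if_requested are hypothetical and may not match how train.py handles the flag; only the --resume flag and its None default come from this thread.

```python
import argparse

import torch


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # None means "train from scratch"; otherwise a path to a saved .pth file.
    parser.add_argument("--resume", default=None, type=str)
    return parser.parse_args(argv)


def load_weights_if_requested(model, resume_path):
    """Load model weights from resume_path; return True if anything was loaded."""
    if resume_path is None:
        return False
    # Assumes the file contains a state dict saved with torch.save(model.state_dict(), ...).
    state_dict = torch.load(resume_path, map_location="cpu")
    model.load_state_dict(state_dict)
    return True
```

With this shape, leaving --resume unset keeps the freshly initialized model, and passing a path restores only the weights.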
For now, it does not consider the epoch, optimizer, or other training parameters, so resuming is incomplete. It only saves the model weights.
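If a full resume is needed, one common PyTorch pattern is to save the optimizer state and epoch together with the weights. The sketch below is an assumption about how that could be added, not something this project does today; save_checkpoint, load_checkpoint, and the dictionary keys are hypothetical names.

```python
import torch


def save_checkpoint(path, model, optimizer, epoch, best_map):
    """Save everything needed to continue training, not just the weights."""
    torch.save(
        {
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "epoch": epoch,        # last completed epoch
            "best_map": best_map,  # best mAP seen so far
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer in place; return (start_epoch, best_map)."""
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])
    # Continue with the epoch after the one that was saved.
    start_epoch = checkpoint["epoch"] + 1
    return start_epoch, checkpoint["best_map"]
```

A learning-rate scheduler, if one is used, would need its state_dict saved and restored in the same way.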
@yjh0410 Okay, thanks. I will modify that according to my requirements. Thanks a lot for clarifying that :)
@kulkarnikeerti You are welcome.
To be honest, when I built this project, some of my knowledge of YOLO was incomplete, so there may be errors in some details, which may lead to ambiguity. However, I don't have the spare energy to refactor this project right now, so while some models perform well, they still need to be refined.
@yjh0410 Sorry for so many clarifications. I was wondering what the input to --resume needs to be in order to resume model training. Is it the weights we save at the end of eval_epoch? I ask because I don't see any checkpoints being saved during training.