VYRION-Ai closed this issue 1 year ago
@VYRION-Ai
You can use the following command:
python train.py --model
Be sure to pass more than 20 epochs this time. The last weights file records the number of epochs it has already been trained for, so training won't resume if you give 20 epochs or fewer.
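To see why the epoch count matters, here is a minimal sketch. It assumes the training loop runs from the saved epoch up to the value passed with `--epochs`; the helper name `epochs_left` is hypothetical and not taken from train.py.

```python
def epochs_left(start_epoch, total_epochs):
    """Return the epoch indices that would still run when resuming."""
    # Assumption: the loop is equivalent to range(start_epoch, total_epochs).
    return list(range(start_epoch, total_epochs))


# A checkpoint trained for 20 epochs, resumed with --epochs 20: nothing left to run.
print(epochs_left(20, 20))       # []

# The same checkpoint resumed with --epochs 30: ten more epochs.
print(len(epochs_left(20, 30)))  # 10
```

Under that assumption, any `--epochs` value at or below the saved epoch leaves an empty range, which matches the behaviour described above.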
@sovit-123 I got this error:
python train.py --model fasterrcnn_resnet50_fpn --weights run_normal2/last_model_state.pth --data data.yaml --resume --epochs 30
`device cuda
Creating data loaders
Number of training samples: 41316
Number of validation samples: 3904
Loading pretrained weights...
RESUMING TRAINING...
Traceback (most recent call last):
File "train.py", line 550, in <module>
main(args)
File "train.py", line 322, in main
if checkpoint['epoch']:
KeyError: 'epoch'`
This is line 320 in the original file; the line number differs because I added some lines:

    if checkpoint['epoch']:
        start_epochs = checkpoint['epoch']
        print(f"Resuming from epoch {start_epochs}...")
Which `.pth` file did you use? Please use `last_model.pth`.
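A guard like the following could turn that `KeyError` into a clearer message. This is only a sketch that uses plain dicts in place of loaded checkpoints; `get_start_epoch` is a hypothetical helper, not part of train.py, and every key name other than `'epoch'` is an assumption.

```python
def get_start_epoch(checkpoint):
    """Return the epoch stored in a checkpoint dict, or raise a clear error."""
    if 'epoch' not in checkpoint:
        raise ValueError(
            "Checkpoint has no 'epoch' entry, so it cannot be used with --resume. "
            "Use last_model.pth instead of a weights-only file."
        )
    return checkpoint['epoch']


# Stand-ins for the two kinds of file (key names besides 'epoch' are assumed).
full_checkpoint = {'epoch': 20, 'model_state_dict': {}, 'optimizer_state_dict': {}}
weights_only = {'model_state_dict': {}}

print(get_start_epoch(full_checkpoint))  # 20

try:
    get_start_epoch(weights_only)
except ValueError as err:
    print(f"Cannot resume: {err}")
```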
@sovit-123 What is the difference between last_model_state.pth and last_model.pth?
The code saves three weights files:
- `last_model.pth`: This is saved after every epoch and contains all the information, including the epoch count and the optimizer state dictionary. Ideal for resuming training.
- `last_model_state.pth`: This is also saved after every epoch but contains only the model state dictionary (weights). Ideal for running inference with the latest model. Note that these may not be the best weights.
- `best_model.pth`: This is saved only when an epoch's validation mAP surpasses the previous highest mAP. It also contains only the model weights and is the most suitable for inference when you want good results.

@sovit-123 Thank you very much. I have one more question: what is a good number of epochs to start with? I have 22k images in the training folder for two classes (mask and no mask), and this is map.jpg. The results do not seem good.
@VYRION-Ai I would say, the results are not too bad. You are getting more than 85% mAP at 0.50 IoU and around 47% mAP at 0.50:0.95 IoU. However, I can suggest a few things:
- Try `--no-mosaic`. It may improve performance.
- Try `--use-train-aug`.

I would suggest starting with the above two. If you get better graphs, please post them here. I would also like to know how the model performs on various datasets out of the box so that I can improve the code further.
@sovit-123 I was running training on my PC and the power went off, so I need to resume training.
Also, if I train for, say, 20 epochs and training finishes but the results are not good, how do I resume from the last weights, i.e. start from epoch 21?