Closed care55 closed 3 years ago
Hello @care55, thank you for your interest in YOLOv5! Please visit our Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.
If this is a custom training Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install, run:
$ pip install -r requirements.txt
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.
@glenn-jocher can you help me please?
Models are automatically saved in runs/expXX/weights/. If you want to use a model somewhere else, you can load it with model = torch.hub.load('ultralytics/yolov5', 'custom', path_or_model='your.pt')
@care55 maybe you can try !python train.py --resume
Thanks @wudashuo, I tried it but I got this error
@pravastacaraka I tried !python train.py --resume runs/train/yolov5s_results/weights/last.pt, but when I train for many epochs (2000 or more), all cells shut down after a while for no apparent reason, so I have to restart all cells to get the runs directory back, and all of my work is lost.
How can I fix this issue? Can I fix it by using !python train.py --resume without specifying the path?
@care55 if your training was interrupted for any reason, you may continue where you left off using the --resume argument. If your training fully completed, you can start a new training from a fully trained model using the --weights argument. Examples:
You may not change settings when resuming, and no additional arguments other than --resume should be passed:
python train.py --resume # automatically find latest checkpoint (searches yolov5/ directory)
python train.py --resume path/to/last.pt # specify resume checkpoint
Multi-GPU DDP trainings must be resumed with the same GPUs and DDP command, for example, assuming 8 GPUs:
python -m torch.distributed.launch --nproc_per_node 8 train.py --resume # resume latest checkpoint
python -m torch.distributed.launch --nproc_per_node 8 train.py --resume path/to/last.pt # specify resume checkpoint
If you would like to start training from a fully trained model, use the --weights argument, not the --resume argument:
python train.py --weights path/to/best.pt # start from pretrained model
Good luck and let us know if you have any other questions!
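The rule in the reply above (resume takes at most a checkpoint path, while a fresh run from trained weights uses --weights) can be sketched as a small helper that builds the argument list for train.py. This is only an illustration: build_train_args and its parameters are hypothetical names and are not part of YOLOv5.

```python
def build_train_args(checkpoint=None, resume=False, extra_args=()):
    """Return an argument list for train.py that follows the resume/weights rule."""
    extra_args = list(extra_args)
    if resume:
        # Resuming: settings cannot be changed, so reject any other arguments.
        if extra_args:
            raise ValueError("no other arguments should be passed together with --resume")
        # The checkpoint path is optional when resuming.
        return ["train.py", "--resume"] + ([checkpoint] if checkpoint else [])
    # Starting a new training from trained weights requires an explicit path.
    if not checkpoint:
        raise ValueError("a checkpoint path is required with --weights")
    return ["train.py", "--weights", checkpoint] + extra_args
```

For example, build_train_args("path/to/best.pt", extra_args=["--epochs", "100"]) yields a list that can be passed to subprocess.run after the Python executable.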
@glenn-jocher Thanks, so now I have to replace the empty quotes in --weights '' with the path to my trained weights, like this: !python train.py --img 416 --batch 16 --epochs 1 --data '../data.yaml' --cfg ./models/custom_yolov5s.yaml --weights path/to/best.pt --name yolov5s_results --cache Right?
@care55 yes, though note that if you specify --weights then --cfg is not needed.
I appreciate your help, that works. Thanks @glenn-jocher
@glenn-jocher I have more questions, if you can help please. Can I train a YOLOv5 model with COCO weights? If so, how do I do this? Can you please suggest the best weights for training a YOLOv5 model for object detection?
@care55 yes, you can start training from any pretrained YOLOv5 weights using the --weights argument. I would recommend you start from the Train Custom Data tutorial, which answers many of these questions:
So @glenn-jocher, can I train yolov5x for 100 epochs first, save the weights to Drive, then pass those weights to --weights and train for another 100 epochs, so that it adds up to 200 epochs? Every 50 epochs of yolov5x takes almost two hours, I can't keep my laptop open all day, and the GPU will be full. Is that possible, or must the training continue without interruption?
My dataset has about 4000 images and I tried this with yolov5x. I have reached 400 epochs with this technique, but when I look at my images I don't see much improvement over 50 epochs. How many epochs do I have to train to reach good accuracy? And is there any technique to see during training whether my model is improving, or can I only judge from the result images? Thanks.
@john8822 if your training was interrupted for any reason, you may continue where you left off using the --resume argument. If your training fully completed, you can start a new training from any model using the --weights argument. Examples:
You may not change settings when resuming, and no additional arguments other than --resume should be passed, with an optional path to the checkpoint you'd like to resume from. If no checkpoint is passed, the most recently updated last.pt in your yolov5/ directory is automatically found and used:
python train.py --resume # automatically find latest checkpoint (searches yolov5/ directory)
python train.py --resume path/to/last.pt # specify resume checkpoint
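The automatic search described above can be pictured as a recursive glob for last*.pt that keeps the newest match. The following is a minimal sketch of that idea and not necessarily how train.py implements it; the function name find_latest_checkpoint is hypothetical.

```python
import glob
import os


def find_latest_checkpoint(search_dir="."):
    """Return the most recently modified last*.pt under search_dir, or '' if none exists."""
    # '**' with recursive=True matches the search directory and all of its subdirectories.
    pattern = os.path.join(search_dir, "**", "last*.pt")
    candidates = glob.glob(pattern, recursive=True)
    # Pick the file with the newest modification time.
    return max(candidates, key=os.path.getmtime) if candidates else ""
```

Calling find_latest_checkpoint("yolov5") would return the newest last.pt below yolov5/, which could then be passed explicitly to --resume.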
Multi-GPU DDP trainings must be resumed with the same GPUs and DDP command, for example, assuming 8 GPUs:
python -m torch.distributed.launch --nproc_per_node 8 train.py --resume # resume latest checkpoint
python -m torch.distributed.launch --nproc_per_node 8 train.py --resume path/to/last.pt # specify resume checkpoint
If you would like to start training from a fully trained model, use the --weights argument, not the --resume argument:
python train.py --weights path/to/best.pt # start from pretrained model
Good luck and let us know if you have any other questions!
@glenn-jocher Is there any technique to see during training whether my model is improving, or can I only see the result from the images once training is completed?
@john8822 see W&B Logging tutorial:
Question
Hi, I'm trying to save my trained YOLOv5 model so that I can load it in another session and continue training from the epoch where it stopped. How can I get the model produced by this command:
!python train.py --img 416 --batch 16 --epochs 1 --data '../data.yaml' --cfg ./models/custom_yolov5s.yaml --weights '' --name yolov5s_results --cache
so that I can pass it to torch.save(model.state_dict(), path)? And after saving it, how can I load it?
Additional context
Note: I want to save my work to Drive and load it from there too. I used this code to train on my data: https://colab.research.google.com/drive/1gDZ2xcTOgR39tGGs-EZ6i3RTs16wmzZQ#scrollTo=wbvMlHd_QwMG Thanks.
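One way to keep checkpoints across sessions, as asked in this question, is to copy the run folder to persistent storage after (or during) training. The sketch below uses only the Python standard library. The function name backup_run is hypothetical, and the Drive path in the usage note is an assumption about where Drive is mounted in a Colab session. Copying the whole run folder rather than only last.pt keeps any files saved next to the weights together, which resuming may rely on.

```python
import shutil
from pathlib import Path


def backup_run(run_dir, backup_root):
    """Copy a whole run folder into backup_root and return the backup path."""
    run_dir = Path(run_dir)
    target = Path(backup_root) / run_dir.name
    # dirs_exist_ok (Python 3.8+) lets a repeated backup overwrite the older copy.
    shutil.copytree(run_dir, target, dirs_exist_ok=True)
    return target
```

For example, backup_run("runs/train/yolov5s_results", "/content/drive/MyDrive/yolov5_backups") would copy the run used in this thread. In a later session the folder can be copied back the same way before passing its weights/last.pt to --resume.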