Closed: twangnh closed this issue 2 years ago
👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. We've created a few short guidelines below to help users provide what we need in order to start investigating a possible problem.
When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:
For Ultralytics to provide assistance, your code should also be up to date: git pull or git clone a new copy to ensure your problem has not already been solved in master. If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.
Thank you! 😃
Hi there! I had the same problem. It's normal for YOLOv5 results to not be identical across runs, even with the same weights.
You might need to round off the decimal values when comparing validation results.
👋 Hello! Thanks for asking about training reproducibility. YOLOv5 🚀 uses a single training seed, which is set here using the init_seeds() function. CPU and Single-GPU trainings should be fully reproducible with torch>=1.12.0. Multi-GPU DDP trainings are unfortunately still not reproducible. This is an open issue for us and we could use any help in tracking down this problem.
https://github.com/ultralytics/yolov5/blob/7215a0fb41a90d8a0bf259fa708dff608a1f0262/train.py#L104
This function sets the python, numpy and torch seeds and updates the cudnn settings:
https://github.com/ultralytics/yolov5/blob/7215a0fb41a90d8a0bf259fa708dff608a1f0262/utils/general.py#L198-L214
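For reference, here is a minimal sketch of what a seed-setting helper along these lines generally does. The function name, the `deterministic` flag and the exact switches below are illustrative, not the verbatim YOLOv5 implementation; see the linked source for the real code.

```python
import random

import numpy as np
import torch


def init_seeds_sketch(seed: int = 0, deterministic: bool = False):
    """Seed Python, NumPy and PyTorch RNGs so runs start from the same random state."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines, safe to call anyway

    if deterministic:
        # Trade speed for reproducibility on CUDA backends
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True
        torch.use_deterministic_algorithms(True, warn_only=True)  # warn_only needs a recent torch
```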
For example, to set a new training seed:
python train.py --seed 3 # default seed=0
Note that even when using the same seed trainings may produce different results, especially when using CUDA backends. See https://pytorch.org/docs/stable/notes/randomness.html for details on factors affecting PyTorch training reproducibility and sources of randomness.
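If you want to push determinism further than the training seed, the sketch below collects the extra PyTorch-level switches described in that randomness guide: the cuBLAS workspace variable, deterministic algorithms, and seeded DataLoader workers. The dataset here is a toy placeholder just to make the example runnable; wiring this into a real training loop is up to you.

```python
import os
import random

import numpy as np
import torch

# cuBLAS needs this set before CUDA work starts when deterministic algorithms are enforced
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

torch.use_deterministic_algorithms(True)  # raise on ops with no deterministic implementation
torch.backends.cudnn.benchmark = False    # autotuning may pick different kernels per run
torch.manual_seed(0)


def seed_worker(worker_id):
    # Derive a repeatable seed for each DataLoader worker process
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


if __name__ == "__main__":
    g = torch.Generator()
    g.manual_seed(0)

    # Toy dataset standing in for the real training data
    dataset = torch.utils.data.TensorDataset(torch.arange(8, dtype=torch.float32).unsqueeze(1))
    loader = torch.utils.data.DataLoader(dataset, batch_size=2, num_workers=2,
                                         worker_init_fn=seed_worker, generator=g)

    for (batch,) in loader:
        print(batch.squeeze(1).tolist())
```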
Good luck 🍀 and let us know if you have any other questions!
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
I had a similar problem. On my machine, this command produced slightly DIFFERENT result lines on an NVIDIA GPU:
for i in {1..15}; do python val.py --weights yolov5x.pt --data coco.yaml --img 640 --verbose --save-txt > 1.txt 2>&1 && cat 1.txt | grep person ; done
After two weeks of trials I changed the following on my computer:
Now everything is working properly - yeah! The lines are exactly the same!
I am not sure, but I suspect there may be reproducibility problems with CUDA operations if the compiled kernel-module version of the driver does not match the CUDA version, as a side effect of using the "runtime" CUDA installer (which I used before).
Hope this helps.
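For anyone who wants to repeat the check above without shell scripting, here is a rough Python equivalent of that loop. It assumes val.py, yolov5x.pt and coco.yaml are reachable from the working directory, and it hashes the "person" lines from each run so any drift stands out immediately.

```python
import hashlib
import subprocess

CMD = ["python", "val.py", "--weights", "yolov5x.pt", "--data", "coco.yaml",
       "--img", "640", "--verbose", "--save-txt"]

digests = []
for run in range(1, 6):
    proc = subprocess.run(CMD, capture_output=True, text=True)
    # Mirror the shell loop: combine stdout/stderr and keep only the per-class 'person' lines
    person_lines = "\n".join(line for line in (proc.stdout + proc.stderr).splitlines()
                             if "person" in line)
    digest = hashlib.sha256(person_lines.encode()).hexdigest()[:12]
    digests.append(digest)
    print(f"run {run}: {digest}")

print("all runs identical" if len(set(digests)) == 1 else "results differ between runs")
```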
@vlesu got it, thanks for your feedback! Yes, some CUDA ops in torch may have reproducibility issues that CPU ops do not, but in general operations should be reproducible.
Search before asking
Question
Hi, I tried evaluating the same model multiple times. I did not modify any part of the code, but the results are not the same across different runs. Could anyone help give a hint?
Additional
No response