ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

inconsistent validation result #9319

Closed twangnh closed 2 years ago

twangnh commented 2 years ago

Search before asking

Question

Hi, I tried evaluating the same model multiple times. I did not modify any part of the code, but the results are not the same across different runs. Could anyone help give a hint?

Additional

No response

glenn-jocher commented 2 years ago

👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. We've created a few short guidelines below to help users provide what we need in order to start investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:

For Ultralytics to provide assistance your code should also be:

If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! 😃

ZoellaUber commented 2 years ago

Hi there! I had the same problem. It's normal for YOLOv5 validation results to vary slightly between runs, even with the same weights.

You might need to round off the decimal values before comparing them.
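Building on the rounding suggestion, a comparison between two runs can tolerate small numeric noise like this. Note this is only a sketch: the mAP values and the `metrics_match` helper are hypothetical, not part of YOLOv5.

```python
# Compare per-run validation metrics with a rounding tolerance rather than
# exact equality. The metric values below are hypothetical placeholders for
# two runs' mAP@0.5 on the same weights.
map50_run1 = 0.50412
map50_run2 = 0.50396

def metrics_match(a, b, decimals=3):
    """Return True if two metric values agree after rounding to `decimals` places."""
    return round(a, decimals) == round(b, decimals)

print(metrics_match(map50_run1, map50_run2))  # True at 3 decimals
```

Whether 3 decimals is an appropriate tolerance depends on your dataset size; larger validation sets usually show smaller run-to-run variance.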

glenn-jocher commented 2 years ago

👋 Hello! Thanks for asking about training reproducibility. YOLOv5 🚀 uses a single training seed which is set here using the init_seeds() function. CPU and Single-GPU trainings should be fully reproducible with torch>=1.12.0. Multi-GPU DDP trainings are still not reproducible unfortunately. This is an open issue for us and we could use any help in tracking down this problem. https://github.com/ultralytics/yolov5/blob/7215a0fb41a90d8a0bf259fa708dff608a1f0262/train.py#L104

This function sets python, numpy and torch seeds and updates cudnn settings: https://github.com/ultralytics/yolov5/blob/7215a0fb41a90d8a0bf259fa708dff608a1f0262/utils/general.py#L198-L214
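For reference, the seeding logic at the linked lines is roughly equivalent to the following sketch (names mirror the YOLOv5 source; exact behavior depends on your torch version, and the deterministic branch requires torch>=1.12):

```python
import os
import random

import numpy as np
import torch

def init_seeds(seed=0, deterministic=False):
    # Seed the Python, NumPy and PyTorch RNGs
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # for Multi-GPU setups
    if deterministic:
        # Prefer deterministic kernels; warn instead of raising when an op
        # has no deterministic implementation (torch>=1.12)
        torch.use_deterministic_algorithms(True, warn_only=True)
        torch.backends.cudnn.deterministic = True
        os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'
        os.environ['PYTHONHASHSEED'] = str(seed)
```

Calling `init_seeds(0)` before each run should make CPU and single-GPU results repeatable, while DDP runs may still diverge as noted above.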

To set a new training seed for example:

python train.py --seed 3  # default seed=0

Note that even when using the same seed trainings may produce different results, especially when using CUDA backends. See https://pytorch.org/docs/stable/notes/randomness.html for details on factors affecting PyTorch training reproducibility and sources of randomness.
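As those PyTorch randomness notes describe, you can also ask torch to prefer deterministic kernels globally; a minimal sketch of the relevant flags (API as of torch>=1.12):

```python
import torch

# Request deterministic implementations; warn_only=True logs a warning
# instead of raising when an op has no deterministic CUDA kernel.
torch.use_deterministic_algorithms(True, warn_only=True)

# The cudnn autotuner benchmarks kernels at runtime, which is itself a
# source of nondeterminism, so disable it for reproducible runs.
torch.backends.cudnn.benchmark = False

print(torch.are_deterministic_algorithms_enabled())  # True
```

Even with these flags set, results can still differ across hardware, driver and library versions, as the linked notes explain.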


Good luck 🍀 and let us know if you have any other questions!

github-actions[bot] commented 2 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Access additional Ultralytics ⚡ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

vlesu commented 2 years ago

I had a similar problem. This command on my machine produced slightly DIFFERENT result lines on my NVIDIA GPU:

for i in {1..15}; do python val.py --weights yolov5x.pt --data coco.yaml --img 640 --verbose --save-txt > 1.txt 2>&1 && cat 1.txt | grep person ; done
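The same repeatability check can be scripted in Python. This is a sketch: the `class_lines` and `runs_identical` helpers are hypothetical, and in practice you would fill `outputs` by capturing `subprocess.run([...val.py...], capture_output=True, text=True).stdout` for each run, mirroring the shell loop above.

```python
def class_lines(stdout, name='person'):
    """Extract the per-class result lines mentioning `name` from val.py output."""
    return [line for line in stdout.splitlines() if name in line]

def runs_identical(outputs, name='person'):
    """True if the `name` class lines are byte-identical across all run outputs."""
    lines = [class_lines(o, name) for o in outputs]
    return all(l == lines[0] for l in lines)

# Hypothetical captured stdout snippets from two val.py runs:
run_a = "all 5000 36335 0.743\nperson 5000 10777 0.857"
run_b = "all 5000 36335 0.743\nperson 5000 10777 0.857"
print(runs_identical([run_a, run_b]))  # True on a reproducible setup
```

On a fully reproducible setup every run should print identical class lines; any difference points to a nondeterminism source such as the CUDA issues discussed below.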

After two weeks of trials I changed the following on my computer:

Now everything is working properly, yeah! The lines are exactly the same!

I am not sure, but I suspect there may be reproducibility problems in CUDA operations when the compiled kernel-module driver version mismatches the CUDA version, as a side effect of using the "runtime" CUDA installer (which I used before).

Hope this helps.

glenn-jocher commented 2 years ago

@vlesu got it, thanks for your feedback! Yes, some CUDA ops in torch may have reproducibility issues that CPU ops do not, but in general operations should be reproducible.