ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
50.41k stars, 16.27k forks

Multi-GPU training run got stuck here. #13350

Closed. SpiderJack0516 closed this issue 1 week ago.

SpiderJack0516 commented 2 weeks ago

Question

Specific logs: 1728559285621

UltralyticsAssistant commented 2 weeks ago

👋 Hello @SpiderJack0516, thank you for reaching out and using YOLOv5 🚀! This is an automated response to help guide you through solving your issue, and an Ultralytics engineer will be with you shortly.

If this is a 🐛 Bug Report, please share a minimum reproducible example so we can debug it efficiently. This will help us identify any potential issues in your environment or setup.

For optimal assistance with your question, ensure that you've followed our Tips for Best Training Results. Additionally, provide any relevant details such as logs or dataset examples to support your inquiry.

Requirements

Make sure that your environment meets the following conditions:

Python>=3.8.0 with all packages from requirements.txt installed, including PyTorch>=1.8.

To set up YOLOv5:

git clone https://github.com/ultralytics/yolov5  # clone the repo
cd yolov5
pip install -r requirements.txt  # install dependencies
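
The version requirements listed above can be checked with a short script before training. This is only a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helpers written for this example and are not part of YOLOv5. The sketch compares only the leading numeric part of a version string and ignores build suffixes such as `+cu118`.

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    # Local build suffixes such as "+cu118" are ignored.
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (both version strings)."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so "3.8" and "3.8.0" compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    print("Python", python_version, "ok:", meets_minimum(python_version, "3.8.0"))
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch", torch.__version__, "ok:", meets_minimum(torch.__version__, "1.8"))
```

Comparing integer tuples rather than raw strings avoids the common mistake of treating "1.10" as older than "1.8".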

Environments

YOLOv5 supports various environments for seamless operation.

Status

Check the CI status for the latest pass results: YOLOv5 CI.

If green, all YOLOv5 GitHub Actions are passing, confirming smooth operation of our training, validation, inference, export, and benchmarks across all platforms.

Introducing YOLOv8 🚀

Don't miss out on our latest state-of-the-art model, YOLOv8 🚀. Optimized for speed, accuracy, and user-friendliness, it's perfect for your object detection, segmentation, and classification projects.

Explore the YOLOv8 Docs to get started:

pip install ultralytics

Feel free to reach out with more details if needed! 😊

SpiderJack0516 commented 2 weeks ago

To add to this: normal training worked fine for the first few days, but now training with DDP no longer works.

pderrenger commented 2 weeks ago

@SpiderJack0516 please ensure your environment is up-to-date with the latest YOLOv5 version and dependencies. If the issue persists, try re-cloning the repository and reinstalling the requirements. If you need further assistance, feel free to provide more details.

SpiderJack0516 commented 1 week ago

@pderrenger I've set up a fresh environment and it still gets stuck at the same point. I suspect it's because my server has no network access.
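
One way to test the suspicion about network access is to try opening an outbound connection from the server with a short timeout. The following is a minimal sketch, not a diagnosis: `can_reach` is a hypothetical helper, and `github.com` on port 443 is only an example target chosen for illustration. Nothing in this thread establishes that connectivity was the actual cause of the hang.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, unreachable networks and timeouts.
        return False


if __name__ == "__main__":
    # Example target only; substitute whichever host the training script needs.
    print("outbound HTTPS reachable:", can_reach("github.com", 443))
```

A False result only shows that this one host and port could not be reached from the machine. It does not by itself explain why a training run stalls.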

SpiderJack0516 commented 1 week ago

@pderrenger I've now run git pull to get the latest code, and the problem is gone.

pderrenger commented 1 week ago

Great to hear that updating the code resolved the issue! If you have any more questions, feel free to ask.