Closed jackfaubshner closed 7 months ago
👋 Hello @jackfaubshner, thank you for your interest in YOLOv3 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
git clone https://github.com/ultralytics/yolov3 # clone
cd yolov3
pip install -r requirements.txt # install
YOLOv3 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv3 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv3 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!
Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.
Check out our YOLOv8 Docs for details and get started with:
pip install ultralytics
@jackfaubshner hello!
Thank you for the detailed issue description. It seems like you're encountering two main problems when training YOLOv3-tiny: segmentation faults on powerful equipment and the "Killed" message on your CPU-only setup.
Segmentation Faults: This issue frequently relates to environment-specific constraints rather than the model itself. Ensure your PyTorch and CUDA versions are compatible. Also, try reducing the batch size to see if it alleviates the problem.
"Killed" Message: This typically happens due to an out-of-memory error, especially on systems with limited resources like your CPU-only laptop. Training requires a considerable amount of RAM, and a large batch size can push the system past what it has available, at which point the OS terminates the process. Try reducing the --batch-size (e.g., to 16 or 32) and see if it solves the issue.
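To make the batch-size advice concrete, here is a rough back-of-the-envelope sketch of how much memory the input images alone occupy per batch. It assumes 640 x 640 RGB images stored as float32 (the image size is an assumption, not something stated in this thread), and the helper name `batch_input_bytes` is made up for illustration. The real footprint is higher, since activations, gradients, optimizer state and dataloader workers all add to it, so treat these numbers as a lower bound only.

```python
def batch_input_bytes(batch_size, img_size=640, channels=3, bytes_per_value=4):
    """Bytes needed to hold one batch of input images as float32 tensors.

    img_size=640 is an assumed training resolution; adjust it to your run.
    """
    return batch_size * channels * img_size * img_size * bytes_per_value


if __name__ == "__main__":
    for bs in (128, 32, 16):
        mib = batch_input_bytes(bs) / 2**20
        print(f"batch {bs}: {mib:.0f} MiB for inputs alone")
```

Under these assumptions a batch of 128 needs 600 MiB just for the inputs, against 75 MiB for a batch of 16, which is why dropping the batch size is the first thing to try on an 8GB machine.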
Lastly, it's essential to keep your repository up to date, as mentioned in your logs. Though the message points towards cloning YOLOv5, it's just about ensuring your YOLOv3 version is current. For detailed investigations and advanced troubleshooting, consult the documentation at https://docs.ultralytics.com.
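For the PyTorch/CUDA compatibility check mentioned above, a minimal sketch along these lines may help. It only reports what the installed PyTorch build says about itself; the function name `torch_cuda_report` is hypothetical. Note that PyTorch wheels bundle their own CUDA runtime, and the "CUDA Version" printed by nvidia-smi is the highest version the driver supports, so the two numbers do not have to match exactly.

```python
import importlib.util


def torch_cuda_report():
    """Describe the installed PyTorch build, or report that it is missing."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this interpreter.
        return {
            "torch": None,
            "built_with_cuda": None,
            "cuda_available": False,
            "gpu_count": 0,
        }
    import torch

    return {
        "torch": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False on a machine with GPUs, or `built_with_cuda` is newer than what the driver supports, the environment rather than the model is the likely culprit.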
Keep in mind, the YOLO community and we at Ultralytics are here to help, and we appreciate your contribution to making YOLOv3 better! 🚀
Thank you kind sir, I was able to get it to work on my old laptop, not that I am going to train on it, just wanna check if the code works. It would probably take more time to train on that laptop than the heat death of the universe
Also, yes, the issue with the workstation is probably with CUDA. It has CUDA 12.0 which no version of PyTorch supports
I'm gonna close this issue but is there any parameter I can add to the command below to train it on CPU only? Cause I don't think I can change the CUDA version on this workstation as other people are using it.
python3 train.py --data coco.yaml --epochs 300 --weight '' --cfg yolov3-tiny.yaml --batch-size 128
@jackfaubshner, great to hear you got it working on your laptop, even if just for a test! Regarding training on the CPU, you can indeed run your training on a CPU by specifying the device. Just add --device cpu to your command like so:
python3 train.py --data coco.yaml --epochs 300 --weights '' --cfg yolov3-tiny.yaml --batch-size 128 --device cpu
This tells the script to ignore any GPUs and run the training process on the CPU only. Keep in mind, as you've probably guessed, training on a CPU is significantly slower than on GPUs. 😊
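On a shared workstation where the CUDA setup cannot be changed, it can also help to hide the GPUs from the process entirely. The sketch below is one possible way to do that, not part of the repository: it builds the same command as above and pairs it with an environment in which `CUDA_VISIBLE_DEVICES` is empty, a general CUDA mechanism that makes no devices visible to the process. The helper name `cpu_only_command` and its `batch_size` parameter are hypothetical.

```python
import os
import shlex
import sys


def cpu_only_command(batch_size=128):
    """Return (command, environment) for a CPU-only training run.

    Nothing is executed here; the caller decides whether to launch it.
    """
    cmd = [
        sys.executable, "train.py",
        "--data", "coco.yaml",
        "--epochs", "300",
        "--weights", "",
        "--cfg", "yolov3-tiny.yaml",
        "--batch-size", str(batch_size),
        "--device", "cpu",
    ]
    # An empty CUDA_VISIBLE_DEVICES hides every GPU from this process only,
    # so other users of the machine are unaffected.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="")
    return cmd, env


if __name__ == "__main__":
    cmd, env = cpu_only_command()
    print("CUDA_VISIBLE_DEVICES='' " + shlex.join(cmd))
```

To actually start training from the repository root, pass both values to `subprocess.run(cmd, env=env, check=True)`.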
Should you have any more questions or run into issues, feel free to ask. Happy training! 🚀
Bug
Hello,
I am trying to train YOLOv3-tiny from scratch for a small research project but I seem to have run into some weird issue
I have made no changes to the code whatsoever
I first tried it on a training workstation I have, which has the following specs:
CPU: AMD Ryzen Threadripper PRO 3955WX 16-Cores
GPU: 3 x NVIDIA RTX A6000 49140MiB
RAM: 256GB
OS: Ubuntu 20.04.4 LTS
Python: 3.8.10
CUDA Version: 12.0
torch: 2.2.1
I made sure to run requirements.txt to make sure all packages are updated
I cloned the repository, made no changes to it and I directly ran the following command and this is its subsequent output:
So I looked around the issues section and saw a few people mention I should try running train.py without any parameters to see how it runs by default, so below is what happened:
Then I thought it might be an issue with the workstation, so I fresh installed Ubuntu 22.04.4 LTS on a laptop (CPU only, no GPU), cloned the repo and this time, I first ran train.py without any parameters. It started training the model (don't have output of this one as I did not save it). I cancelled it with Ctrl + C after 10 minutes
Laptop Specs:
CPU: Intel i5-5200U
GPU: None
RAM: 8GB
OS: Ubuntu 22.04.4 LTS
Python: 3.10.12
torch: 2.2.1
I made sure to run requirements.txt to make sure all packages are updated
Then, I ran "python3 train.py --data coco.yaml --epochs 300 --weight '' --cfg yolov3-tiny.yaml --batch-size 128" and again, it crashed but this time it just says "Killed". Below is the output
Then I ran train.py without parameters (which automatically defaulted to YOLOv3-tiny) and the following happened:
I'm not sure what is wrong but it feels like this is a YOLOv3-tiny training issue?
Apologies for not making a pull request, I don't want to mess things up