Closed — tjasmin111 closed 2 days ago
Hey there!
It seems like your training session is hitting a snag after the checks pass. Given the dataset size, the halt could be related to memory constraints: YOLOv8 Nano is designed to be lightweight and efficient, but 350K images may still push your system's limits. A couple of suggestions that might help:
If the issue persists, share the logs from right before the halt for a deeper dive. Additionally, running the training on a smaller subset of your dataset could provide some insights into whether the halt is data- or memory-related.
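To try the smaller-subset suggestion above, here is a minimal sketch (the paths and the `sample_subset` helper are hypothetical, and it assumes the YOLOv8-cls layout of one sub-folder per class) that copies a random fraction of images per class into a new directory you can point training at:

```python
import random
import shutil
from pathlib import Path

def sample_subset(src: str, dst: str, fraction: float = 0.1, seed: int = 0) -> int:
    """Copy a random fraction of images from each class folder in src to dst.

    Assumes the YOLOv8-cls dataset layout: src/<class_name>/<image files>.
    Returns the total number of images copied.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    copied = 0
    for class_dir in sorted(Path(src).iterdir()):
        if not class_dir.is_dir():
            continue
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        if not images:
            continue
        k = max(1, int(len(images) * fraction))  # keep at least one image per class
        out_dir = Path(dst) / class_dir.name
        out_dir.mkdir(parents=True, exist_ok=True)
        for img in rng.sample(images, k):
            shutil.copy2(img, out_dir / img.name)
            copied += 1
    return copied
```

If training on the subset completes normally, that points toward dataset size (memory or caching) rather than a code or configuration problem.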
Happy to assist further if needed!
But it worked for the Small model before. And I'm already passing imgsz=320. Also, my system's RAM and GPU are large enough and shouldn't be the problem.
Where are the logs stored? How can I get them?
Hi there!
Good to hear that it worked with the Small model and that you're already using imgsz=320. If your system's resources are sufficient, let's look into the logs for more clues.
Logs for YOLOv8 training sessions, including any errors or warnings, are typically stored in the `runs/train/exp*` directories, with detailed TensorBoard logs in `runs/train/exp*/events.out.tfevents.*`.
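If several `runs/train/exp*` folders have accumulated, a quick way to find the run that just halted (a small sketch using only the standard library; `latest_run_dir` is a hypothetical helper, and the `runs/train` path is taken from the note above) is to sort them by modification time:

```python
from pathlib import Path
from typing import Optional

def latest_run_dir(runs_root: str = "runs/train") -> Optional[Path]:
    """Return the most recently modified exp* directory under runs_root, or None."""
    exps = [p for p in Path(runs_root).glob("exp*") if p.is_dir()]
    if not exps:
        return None
    # st_mtime is the last-modification timestamp of each directory
    return max(exps, key=lambda p: p.stat().st_mtime)
```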
To access the TensorBoard logs and visualize your training progress, run `tensorboard --logdir runs/train` and then open http://localhost:6006/ in your browser.
If you encounter any specific errors in those logs, feel free to share them here for further assistance!
Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO and Vision AI!
Search before asking
Question
I'm trying to train a YOLOv8-cls Nano classifier on a dataset of 350K images. I don't know why the training halts and doesn't proceed after the `checks passed` message. I'm not sure whether this is an issue with the machine, or whether Nano maybe can't handle that many images / the memory requirements. Any advice on this?
Additional
No response