Closed chun92 closed 1 year ago
👋 Hello @chun92, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more, and see our ⭐️ HUB Guidelines to quickly get started uploading datasets and training YOLO models.
If this is a 🐛 Bug Report, please provide screenshots and steps to recreate your problem to help us get started working on a fix.
If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.
We try to respond to all issues as promptly as possible. Thank you for your patience!
@chun92 thanks for the bug report! Is this reproducible every time you train the same model on this same dataset or did it just happen this once?
It seems like final model upload may have been interrupted, while at the same time leaving nothing to resume since all epochs completed successfully.
@chun92 ok I've taken a look at trainer.py and attempted a fix in https://github.com/ultralytics/ultralytics/pull/2200/commits/07662cba6c73ffe55977ee5e96273a86afb7cea8
I can't be sure that your resume will work, but I think this particular bug should now be resolved in ultralytics 8.0.86. Please update your package with pip install -U ultralytics and try again, and let us know if this resolves your issue.
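After upgrading, it may help to confirm that the installed package is at least the version mentioned above before resuming. The following is a minimal sketch using only the Python standard library; `parse_release` and `is_at_least` are hypothetical helper names introduced here for illustration, and the comparison handles only plain numeric versions such as 8.0.86.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v):
    """Turn '8.0.86' into (8, 0, 86); stops at the first non-numeric part."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(installed, required="8.0.86"):
    """Return True if `installed` is the same as or newer than `required`."""
    return parse_release(installed) >= parse_release(required)


if __name__ == "__main__":
    try:
        installed = version("ultralytics")
    except PackageNotFoundError:
        print("ultralytics is not installed")
    else:
        status = "ok" if is_at_least(installed) else "needs upgrade"
        print(installed, status)
```

In Colab, the runtime usually needs to be restarted after the upgrade so that the new version is the one actually imported.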
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐
HUB Component
Training
Bug
I'm training my model in Colab. Because of an unstable server, training stopped after the last epoch had finished. When I tried to resume it, Colab gave me the following error.
The HUB page shows "100% Optimizing weights", but soon afterwards it shows as disconnected.
I think HUB tried to resume with start_epoch 100, so the loop at trainer.py line 294 never creates the epoch variable, which causes the error.
for epoch in range(self.start_epoch, self.epochs):
What can I do in HUB to get past this bug and complete the training with Colab? Could you release an updated version that fixes it?
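The suspected failure mode can be reproduced without Ultralytics at all. The following is a minimal sketch, not the actual trainer.py code: `run_epochs` and `run_epochs_guarded` are hypothetical names, and the guard shown is only one possible way to handle the case, not necessarily what the maintainers' fix does. It illustrates that when start_epoch equals epochs the range is empty, the loop body never runs, and any later use of the loop variable fails.

```python
def run_epochs(start_epoch, epochs):
    """Hypothetical stand-in for a training loop that reads `epoch` after the loop."""
    for epoch in range(start_epoch, epochs):
        pass  # one epoch of training would happen here
    # If the range was empty, `epoch` was never assigned, so this line raises
    # UnboundLocalError (a subclass of NameError).
    return epoch


def run_epochs_guarded(start_epoch, epochs):
    """Same loop, but with a default value so an empty range is harmless."""
    # Assumption: the last finished epoch is a sensible default.
    epoch = start_epoch - 1
    for epoch in range(start_epoch, epochs):
        pass
    return epoch


if __name__ == "__main__":
    print(run_epochs(0, 100))            # normal run from scratch
    print(run_epochs_guarded(100, 100))  # resume after all epochs finished
    try:
        run_epochs(100, 100)             # resume without the guard
    except UnboundLocalError as err:
        print("resume failed:", err)
```

The guarded version only shows why a default value avoids the crash; whether skipping straight to the final steps is the right behavior on resume depends on the trainer.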
Environment
Colab Plus
Minimal Reproducible Example
No response
Additional
No response