ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Celery logger stops working using torch.hub.load #6060

Closed arjitkatare closed 2 years ago

arjitkatare commented 2 years ago

Search before asking

YOLOv5 Component

PyTorch Hub

Bug

For some reason, when torch.hub.load is used in a Celery worker, the logger gets shut down. Specifying v6.0 in repo_or_dir seems to resolve the issue.

Environment

No response

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

github-actions[bot] commented 2 years ago

👋 Hello @arjitkatare, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 2 years ago

@arjitkatare 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:

For Ultralytics to provide assistance your code should also be:

If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! 😃

github-actions[bot] commented 2 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

JonathanSamelson commented 2 years ago

I ran into the same issue with my own loggers after updating PyTorch to 1.10 (from 1.7.1).

The problem comes from this line: self.net = torch.hub.load('ultralytics/yolov5', 'custom', path=self.model_path, verbose=False)

In this case, the output of my script is as follows:

2022-03-11 17:42:03 pluton aptitude-toolbox[22744] INFO Model type YOLO selected.
YOLOv5  2022-1-12 torch 1.10.2+cu113 CUDA:0 (GeForce GTX 1080 Ti, 11264MiB)

Fusing layers...
Model Summary: 213 layers, 7039792 parameters, 0 gradients
Adding AutoShape...
Detector init duration = 4.6338949000000005s
Model type SORT selected.
Tracker init duration = 0.019383399999999718s

Then, I don't have any output from my loggers anymore.

Whereas when adding a tag such as v6.0: self.net = torch.hub.load('ultralytics/yolov5:v6.0', 'custom', path=self.model_path, verbose=False)

My loggers are working and don't stop:

2022-03-11 17:46:15 pluton aptitude-toolbox[12680] INFO Model type YOLO selected.
2022-03-11 17:46:19 pluton aptitude-toolbox[12680] INFO Detector init duration = 4.442983099999999s
2022-03-11 17:46:19 pluton aptitude-toolbox[12680] INFO Model type SORT selected.
2022-03-11 17:46:19 pluton aptitude-toolbox[12680] INFO Tracker init duration = 0.019554799999999872s
.... [After the process ends] ...
2022-03-11 17:46:21 pluton aptitude-toolbox[12680] INFO Average FPS: 12.473039525066566
2022-03-11 17:46:21 pluton aptitude-toolbox[12680] INFO Average FPS w/o read time: 13.190384737142026

As you can see, I still have the details provided by my logger in the latter case.

Also, using the latest tag (v6.1) gives me another error, which I think is unrelated: Exception: path is on mount 'C:', start on mount 'E:'. Cache may be out of date, try force_reload=True or see https://docs.ultralytics.com/yolov5/tutorials/pytorch_hub_model_loading for help. I tried using force_reload, but the result is the same.

glenn-jocher commented 2 years ago

@JonathanSamelson regarding the loggers, not sure what the problem could be. The current logging code is here, with LOGGER imported in various places: https://github.com/ultralytics/yolov5/blob/b94b59e199047aa8bf2cdd4401ae9f5f42b929e6/utils/general.py#L77-L88

Regarding YOLOv5 v6.1 PyTorch Hub usage, this works correctly for me; I'm not able to reproduce any errors:

[Screenshot 2022-03-11 at 18 05 36]

JonathanSamelson commented 2 years ago

@glenn-jocher I think the difference in the format is caused by this line:

logging.basicConfig(format="%(message)s", level=logging.INFO if (verbose and rank in (-1, 0)) else logging.WARNING)

I think it configures the root logger, so the format changes for all loggers instead of only the yolov5 logger.
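A minimal sketch of this effect, using only the standard library. The function `library_style_setup` and the logger name `my_app` are hypothetical stand-ins, not YOLOv5 code. `force=True` (Python 3.8+) is an assumption added so that the second `basicConfig` call takes effect even though the root logger already has a handler; the quoted line does not show how existing handlers are dealt with.

```python
import logging


def library_style_setup(verbose=True, rank=-1):
    """Hypothetical stand-in for a library that configures logging on load."""
    level = logging.INFO if (verbose and rank in (-1, 0)) else logging.WARNING
    # Assumption: force=True replaces handlers already attached to the root logger.
    logging.basicConfig(format="%(message)s", level=level, force=True)


def demo():
    # Application-side setup: DEBUG level and a timestamped format on the root logger.
    logging.basicConfig(
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        level=logging.DEBUG,
        force=True,
    )
    app_logger = logging.getLogger("my_app")  # hypothetical application logger
    debug_before = app_logger.isEnabledFor(logging.DEBUG)

    library_style_setup()  # the library reconfigures the *root* logger

    debug_after = app_logger.isEnabledFor(logging.DEBUG)
    # _fmt is a private attribute, read here only to inspect the active format.
    root_format = logging.getLogger().handlers[0].formatter._fmt
    return debug_before, debug_after, root_format
```

Because `my_app` has no level or handler of its own, it inherits both from the root logger, so a root-level `basicConfig` call in any imported package changes what the application logs and how it is formatted.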

And actually I was wrong in my previous message: the logger does not stop working, but its level changes to INFO, so the DEBUG messages are filtered out.
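As a caller-side workaround, one option is to snapshot the root logger's level and handlers before loading the model and restore them afterwards. This is a sketch under the assumption that the loading code only touches the root logger. The helper name `preserve_root_logging` is hypothetical, and a file handler that the wrapped code closed may need extra care.

```python
import contextlib
import logging


@contextlib.contextmanager
def preserve_root_logging():
    """Restore the root logger's level and handlers when the block exits."""
    root = logging.getLogger()
    saved_level = root.level
    saved_handlers = list(root.handlers)
    try:
        yield
    finally:
        # Drop handlers installed inside the block.
        for handler in list(root.handlers):
            if handler not in saved_handlers:
                root.removeHandler(handler)
        # Re-attach the original handlers that were removed inside the block.
        for handler in saved_handlers:
            if handler not in root.handlers:
                root.addHandler(handler)
        root.setLevel(saved_level)


# Usage (the model path is a placeholder):
# with preserve_root_logging():
#     net = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt', verbose=False)
```

This keeps the application's own configuration in charge without pinning an older tag, at the cost of also discarding any logging setup the loaded code intended to keep.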

glenn-jocher commented 2 years ago

@JonathanSamelson ok got it. If you have a fix in mind can you please submit a PR? Thanks!

JonathanSamelson commented 2 years ago

@glenn-jocher At the moment, I don't. I'm using more or less the same lines in my own project... I understand it does not matter until the project is used as a dependency of another one.
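For a project that may be imported by others, one library-friendly pattern is to configure a named logger and leave the root logger alone. The sketch below is an illustration under assumptions, not the change made in any YOLOv5 PR; `my_library` and `get_library_logger` are hypothetical names.

```python
import logging

LOGGER_NAME = "my_library"  # hypothetical package name


def get_library_logger(verbose=True):
    """Configure and return a package-level logger without touching the root logger."""
    logger = logging.getLogger(LOGGER_NAME)
    logger.setLevel(logging.INFO if verbose else logging.WARNING)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    # Keep records from also reaching the application's root handlers.
    logger.propagate = False
    return logger
```

With this layout the application's level and format stay intact, and the caller can still adjust the library's verbosity through `logging.getLogger("my_library")`. The Python logging documentation goes further and suggests that libraries attach only a `NullHandler`, leaving all output decisions to the application.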

glenn-jocher commented 2 years ago

@arjitkatare @JonathanSamelson good news 😃! Your original issue may now be fixed ✅ in PR #7296 by @maxstrobel.

To receive this update:

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!