ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Loading model from Torch Hub causes unexpected module collision #2414

Closed Lymkwi closed 3 years ago

Lymkwi commented 3 years ago

🐛 Bug

Using torch.hub.load to load any YOLOv5 model in an environment where either utils or models (and I assume data or weights) are top-level modules causes an undetected conflict resulting in a crash.

To Reproduce

Take the following minimal directory structure:

 - main.py
 + utils/
 | - __init__.py

with the following content for main.py:

import utils as just_another_name_to_avoid_conflict
import torch
torch.hub.load('ultralytics/yolov5', 'yolov5s')

(you can keep __init__.py empty)

Output:

Traceback (most recent call last):
  File "[....]/test2/main.py", line 4, in <module>
    torch.hub.load('ultralytics/yolov5', 'yolov5s')
  File "[....]/.local/lib/python3.9/site-packages/torch/hub.py", line 339, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "[....]/.local/lib/python3.9/site-packages/torch/hub.py", line 365, in _load_local
    hub_module = import_module(MODULE_HUBCONF, hubconf_path)
  File "[....]/.local/lib/python3.9/site-packages/torch/hub.py", line 74, in import_module
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "[....]/.cache/torch/hub/ultralytics_yolov5_master/hubconf.py", line 12, in <module>
    from models.yolo import Model
  File "[....]/.cache/torch/hub/ultralytics_yolov5_master/models/yolo.py", line 9, in <module>
    from models.common import *
  File "[....]/.cache/torch/hub/ultralytics_yolov5_master/models/common.py", line 12, in <module>
    from utils.datasets import letterbox
ModuleNotFoundError: No module named 'utils.datasets'

Expected behavior

The model should load without crashing, or #36 should at least mention that one must be mindful of module name conflicts (even when those modules are imported under other names) when using torch.hub.load.

Environment

Additional context

I have been running up against this issue for a couple of hours now, first trying to load a model from YOLO's code itself, then discovering Torch Hub. With Torch Hub, however, I kept hitting import crashes in my image captioning project that made no sense, since torch.hub.load should keep its own imports separate from the context in which it is run (and even if it does not, importing modules under a different name should let me avoid the issue). At first I suspected the cause was that my code lived in a subdirectory of the top level. I modified the entry point of my project to run only torch.hub.load, and it still crashed. Finally, after carving huge chunks out of a copy of the code base, I discovered that torch.hub.load worked once I commented out import utils and the other references to my utils module (which is where I keep utilities like configuration and logging).

Considering how hacky Python's module import system is, I can see why this is happening (i.e. there is a conflict with YOLO's modules when torch imports them), but I do not think this is normal behaviour, since it basically forbids using utils, models or data as top-level module names in any project that loads yolov5 from Torch Hub.
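
The conflict described above can be reproduced without torch at all. The sketch below is only an illustration built on assumptions: the `project` and `hub_repo` directories are hypothetical stand-ins for the user's code and for the cached hub checkout, and the `sys.path.insert` call is assumed to mirror what torch.hub does with the repository directory. It shows that the alias in `import utils as ...` only renames the local variable, while `sys.modules` still caches the package under the key `utils`, so a later `utils.datasets` import is resolved against the wrong package.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def _is_utils(name):
    """True for the 'utils' package itself and for any of its submodules."""
    return name == "utils" or name.startswith("utils.")


def demo_collision():
    """Reproduce the failure with two unrelated packages that are both named 'utils'."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"    # hypothetical user project
        hub_repo = Path(tmp) / "hub_repo"  # hypothetical stand-in for the hub checkout

        # The user's package: an empty 'utils', as in the bug report.
        (project / "utils").mkdir(parents=True)
        (project / "utils" / "__init__.py").write_text("")

        # The repository's package: a 'utils' that does contain 'datasets'.
        (hub_repo / "utils").mkdir(parents=True)
        (hub_repo / "utils" / "__init__.py").write_text("")
        (hub_repo / "utils" / "datasets.py").write_text(
            "def letterbox():\n    return 'ok'\n"
        )

        # Remember the interpreter state so the demo leaves no trace behind.
        saved_path = list(sys.path)
        saved_modules = {n: m for n, m in sys.modules.items() if _is_utils(n)}
        for n in saved_modules:
            del sys.modules[n]
        try:
            sys.path.insert(0, str(project))
            importlib.invalidate_caches()
            # The alias is a local name only; sys.modules still keys it as 'utils'.
            import utils as just_another_name

            # Assumption: torch.hub puts the repository directory first on sys.path.
            sys.path.insert(0, str(hub_repo))
            try:
                # 'utils' is already cached, so only the project's package is searched.
                importlib.import_module("utils.datasets")
            except ModuleNotFoundError as exc:
                return str(exc)
            return "no error"
        finally:
            sys.path[:] = saved_path
            for n in [n for n in sys.modules if _is_utils(n)]:
                del sys.modules[n]
            sys.modules.update(saved_modules)


print(demo_collision())
```

If the assumptions hold, the printed message should name `utils.datasets` as the missing module, as in the last line of the traceback above, even though the repository's own `utils/datasets.py` exists and sits first on `sys.path`.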

github-actions[bot] commented 3 years ago

👋 Hello @Lymkwi, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 3 years ago

@Lymkwi thanks for the bug report! I tried to reproduce your issue in a Colab notebook but everything ran correctly for me. The code I used is here. Can you please provide a link to a notebook that reproduces the issue you are seeing?

!mkdir utils
!touch utils/__init__.py

import torch
import utils as utils_temp

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Images
dir = 'https://github.com/ultralytics/yolov5/raw/master/data/images/'
imgs = [dir + f for f in ('zidane.jpg', 'bus.jpg')]  # batched list of images

# Inference
results = model(imgs)
results.print()  # or .show(), .save()
[Screenshot, 2021-03-09 12:29:40 PM]
glenn-jocher commented 3 years ago

@Lymkwi if I place the code in main.py I am now able to reproduce. Yes, it seems there is a package naming conflict with the torch hub operations.

The easiest workaround seems to be loading the model first before any other potentially conflicting imports (see main.py in screenshot below). Not sure if there is a general solution to this. What do you think?

[Screenshot, 2021-03-09 12:36:05 PM]
Lymkwi commented 3 years ago

Not sure if there is a general solution to this. What do you think ?

Loading the torch model before anything else can help in some situations, but in my project it broke the way I had designed everything (so I ended up changing names instead).

I'm not familiar enough with torch hub to begin investigating a proper fix (and I still believe this is a bug), but it at least deserves proper documentation so that people do not lose time trying to understand why a conflict happens (if it does). Somebody who can edit the PyTorch Hub wiki issue on this repo (which you can, I believe?) could at least add a warning about this known problem for now, IMO.

glenn-jocher commented 3 years ago

@Lymkwi yes this is a good idea, we will be updating our documentation in the future and this is one of the changes we can make. I think the following guidelines should work for all users right?

# Model
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Imports
# place added imports after model is loaded to avoid conflicts

# Images
dir = 'https://github.com/ultralytics/yolov5/raw/master/data/images/'
imgs = [dir + f for f in ('zidane.jpg', 'bus.jpg')]  # batch of images

# Inference
results = model(imgs)
results.print()  # or .show(), .save()
Lymkwi commented 3 years ago

I think the following guidelines should work for all users right?

According to your experiments, yes, this should cover most situations.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

carloalbertobono commented 3 years ago

Hi, I think I have a similar issue that prevents loading more than one model using torch.hub.

If I'm not mistaken from reading the thread, loading the model shadows some module names, which then become unusable in torch.

I'm using torch '1.9.1+cu102' on an Ubuntu 20.04 machine, and to reproduce I do:

import torch
model = torch.hub.load('facebookresearch/detr', 'detr_resnet50', pretrained=True)
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

that ends up in

~/.cache/torch/hub/ultralytics_yolov5_master/hubconf.py in _create(name, pretrained, channels, classes, autoshape, verbose, device)
     28     from pathlib import Path
     29 
---> 30     from models.yolo import Model
     31     from models.experimental import attempt_load
     32     from utils.general import check_requirements, set_logging

ModuleNotFoundError: No module named 'models.yolo'

Reversing the load order, as expected, ends up with:

~/.cache/torch/hub/facebookresearch_detr_master/hubconf.py in <module>
      2 import torch
      3 
----> 4 from models.backbone import Backbone, Joiner
      5 from models.detr import DETR, PostProcess
      6 from models.position_encoding import PositionEmbeddingSine

ModuleNotFoundError: No module named 'models.backbone'

Is there some workaround which I'm not seeing? Thank you very much cb

glenn-jocher commented 3 years ago

@carloalbertobono thanks for the code, I am able to reproduce this in Colab. I'm not sure there's a simple fix, and this might be slightly outside of our scope. You should definitely raise this on the pytorch and/or hub repos as well:

https://github.com/pytorch/pytorch https://github.com/pytorch/hub
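
For the two-repository case above, one possible workaround is to drop the clashing packages from `sys.modules` between the two `torch.hub.load` calls, so that the second repository imports its own `models` from scratch. This is a sketch under assumptions, not a fix confirmed in this thread: `forget_modules` and `load_two_hub_models` are hypothetical helpers, the list of package names to purge is a guess based on the tracebacks above, and the approach has not been checked against the real repositories. Code in the first model that imports from its repository lazily may also break once the cache entries are gone.

```python
import sys


def forget_modules(*top_level_names):
    """Remove the named top-level packages and their submodules from sys.modules.

    Returns the sorted list of module names that were removed.
    """
    removed = []
    for name in list(sys.modules):
        if any(name == top or name.startswith(top + ".") for top in top_level_names):
            del sys.modules[name]
            removed.append(name)
    return sorted(removed)


def load_two_hub_models():
    """Hypothetical usage; requires torch and network access, so it is not called here."""
    import torch

    detr = torch.hub.load('facebookresearch/detr', 'detr_resnet50', pretrained=True)
    # Assumption: 'models' and 'utils' are the names that clash between the two repos.
    forget_modules('models', 'utils')
    yolo = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
    return detr, yolo
```

Matching on `top + "."` rather than on a bare prefix keeps unrelated packages whose names merely start with the same letters (for example a package called `models_extra`) in the cache.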