Training inside Python 3.12 miniconda environment currently fails with ModuleNotFoundError

michael-mayo commented 4 days ago

Search before asking

[X] I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Training, Detection

Bug

$ python train.py --epochs 1
train: weights=yolov5s.pt, cfg=, data=data/coco128.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=1, batch_size=16, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, bbox_interval=-1, artifact_alias=latest, neptune_token=None, neptune_project=None, neptune_resume_id=None, s3_upload_dir=None, upload_dataset=False, hf_model_id=None, hf_token=None, hf_private=False, hf_dataset_id=None, roboflow_token=None, roboflow_upload=False
requirements: /home/michael/miniconda3/envs/test/lib/python3.12/site-packages/requirements.txt not found, check failed.
YOLOv5 🚀 2024-10-22 Python-3.12.7 torch-2.5.0+cu124 CPU

hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0

Dataset not found ⚠️, missing paths ['/home/michael/miniconda3/envs/test/lib/python3.12/site-packages/datasets/coco128/images/train2017']
Downloading https://ultralytics.com/assets/coco128.zip to coco128.zip...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6.66M/6.66M [00:00<00:00, 32.9MB/s]
Dataset download success ✅ (2.4s), saved to datasets
ClearML: run 'pip install clearml' to automatically track, visualize and remotely train YOLOv5 🚀 in ClearML
Comet: run 'pip install comet_ml' to automatically track and visualize YOLOv5 🚀 runs in Comet
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
Traceback (most recent call last):
  File "/home/michael/miniconda3/envs/test/lib/python3.12/site-packages/yolov5/train.py", line 735, in <module>
    main(opt)
  File "/home/michael/miniconda3/envs/test/lib/python3.12/site-packages/yolov5/train.py", line 615, in main
    train(opt.hyp, opt, device, callbacks)
  File "/home/michael/miniconda3/envs/test/lib/python3.12/site-packages/yolov5/train.py", line 132, in train
    result = attempt_download_from_hub(weights, hf_token=None)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/michael/miniconda3/envs/test/lib/python3.12/site-packages/yolov5/utils/downloads.py", line 150, in attempt_download_from_hub
    from huggingface_hub.utils._errors import RepositoryNotFoundError
ModuleNotFoundError: No module named 'huggingface_hub.utils._errors'

Environment

YOLO: 2024-10-22 Python-3.12.7 torch-2.5.0+cu124 CPU
OS: Ubuntu 20.04.6 LTS running in WSL on Windows 11
Python: Miniconda env with 3.12

Minimal Reproducible Example

# Install miniconda on linux then do the following:
conda create -n test python=3.12
conda activate test
pip install yolov5
cd `python -c "import yolov5; print(yolov5.__path__[0])"`
python train.py --epochs 1

Additional

The error occurs both when training a model and when loading a custom local model with torch.hub.load. It also occurs when running directly on Windows and after downgrading PyTorch to 2.4. It appears to be caused by the PyPI version of yolov5 referencing Hugging Face modules that no longer exist.
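To check whether a given environment is affected, a small probe along these lines may help. It is only a sketch: `module_available` is a hypothetical helper name, and the only module path it relies on is the one shown in the traceback above.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if `dotted_name` can be located.

    find_spec imports parent packages but not the target module itself.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or is not a package.
        return False


if __name__ == "__main__":
    # The second name is the private module from the traceback.
    for name in ("huggingface_hub", "huggingface_hub.utils._errors"):
        print(f"{name}: {'found' if module_available(name) else 'missing'}")
```

If the first name is reported as found and the second as missing, the installed huggingface_hub no longer ships the private module that the PyPI yolov5 package imports.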

Are you willing to submit a PR?

[x] Yes I'd like to help by submitting a PR!

UltralyticsAssistant commented 4 days ago

👋 Hello @michael-mayo, thank you for reaching out and for your interest in YOLOv5 🚀! This is an automated response to let you know that an Ultralytics engineer will assist you soon.

It looks like you're encountering a ModuleNotFoundError in your Python 3.12 environment when trying to train YOLOv5. For 🐛 Bug Reports like this one, it's crucial to have a clear minimum reproducible example to diagnose the issue accurately, which you have provided.

Meanwhile, you might want to consider:

Ensuring all dependencies specified in requirements.txt are installed. You can do this with pip install -r requirements.txt.
Verifying your Python environment has all necessary packages for YOLOv5, including huggingface_hub, as indicated by your error message. Installing missing modules with pip install huggingface_hub may help.
Checking compatibility with your specific setup, as some modules might have different support depending on the Python version.

For custom training ❓ Questions, providing additional context such as specific dataset details and any relevant logs can be beneficial. Also, make sure you're following our best practices for optimal training results.

If you're looking for additional guidance on setups or training environments, consider using our verified environments like Notebooks, Google Cloud, or Docker, which offer dependencies pre-installed.

Stay tuned for further assistance from our team! 😊

michael-mayo commented 4 days ago

Following the bot's suggestion, I added pip install huggingface_hub to the reproducible example above, but this does not resolve the issue. However, running python train.py --epochs 1 inside the yolov5 Docker image or from a GitHub clone of the repo works, so I think the version on PyPI needs to be updated.
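For illustration, an import that depends on a private module path can be made more tolerant with a fallback lookup. The sketch below is not how yolov5 handles this. `import_first_available` is a hypothetical helper, and the candidate module paths other than the one in the traceback are assumptions about where huggingface_hub exposes the exception class.

```python
import importlib


def import_first_available(attr: str, module_names: list[str]):
    """Return `attr` from the first module in `module_names` that provides it."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # this module path does not exist in the installed version
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of: {', '.join(module_names)}")


# Candidate locations. Only the last entry comes from the traceback;
# the first two are assumptions and should be checked against the installed version.
HF_ERROR_MODULES = [
    "huggingface_hub.utils",
    "huggingface_hub.errors",
    "huggingface_hub.utils._errors",
]
```

With huggingface_hub installed, the failing import could then be written as RepositoryNotFoundError = import_first_available("RepositoryNotFoundError", HF_ERROR_MODULES).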

pderrenger commented 3 days ago

@michael-mayo thanks for the update. It seems like the PyPI package might be outdated. Please try cloning the latest version from GitHub directly, as it should resolve the issue. If the problem persists, let us know.

michael-mayo commented 3 days ago

> @michael-mayo thanks for the update. It seems like the PyPI package might be outdated. Please try cloning the latest version from GitHub directly, as it should resolve the issue. If the problem persists, let us know.

Yes, that works, but in a production environment I cannot rely on having git available, so I currently need to install everything using pip. The latest commit can be installed directly via pip with the command pip install git+https://github.com/ultralytics/yolov5.git, but this fails with a different error:

pip install git+https://github.com/ultralytics/yolov5.git
Collecting git+https://github.com/ultralytics/yolov5.git
  Cloning https://github.com/ultralytics/yolov5.git to /tmp/pip-req-build-akqy6tao
  Running command git clone --filter=blob:none --quiet https://github.com/ultralytics/yolov5.git /tmp/pip-req-build-akqy6tao
  Resolved https://github.com/ultralytics/yolov5.git to commit 2f74455adc74a587c9e9d5a6e45df880fce8ea3e
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      error: Multiple top-level packages discovered in a flat-layout: ['data', 'models', 'segment', 'classify'].

      To avoid accidental inclusion of unwanted files or directories,
      setuptools will not proceed with this build.

      If you are trying to create a single distribution with multiple packages
      on purpose, you should not rely on automatic discovery.
      Instead, consider the following options:

      1. set up custom discovery (`find` directive with `include` or `exclude`)
      2. use a `src-layout`
      3. explicitly set `py_modules` or `packages` with a list of names

      To find more information, look for "package discovery" on setuptools docs.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

I have not been able to find a fix for this yet, so if you have one, that would be great. Otherwise, an updated PyPI release would be much appreciated.

pderrenger commented 2 days ago

To resolve the installation issue via pip, try using the following command to install directly from the GitHub repo:

pip install git+https://github.com/ultralytics/yolov5.git#egg=yolov5

If this doesn't work, please consider using a stable release from GitHub until the PyPI package is updated. We appreciate your patience.

michael-mayo commented 1 day ago

> pip install git+https://github.com/ultralytics/yolov5.git#egg=yolov5

That command also fails with the same error. Does it work on your end?

pderrenger commented 1 day ago

Please ensure you have the latest version of setuptools and wheel installed with pip install --upgrade setuptools wheel, then try the command again. If the issue persists, using a GitHub clone is recommended until the PyPI package is updated.

nenb commented 1 day ago

@pderrenger I can also replicate these issues. My understanding is that installing from the git repo requires updating your pyproject.toml; see, e.g., https://stackoverflow.com/questions/72294299/multiple-top-level-packages-discovered-in-a-flat-layout.

@michael-mayo My current workaround is to pin huggingface-hub<0.25.0. I don't know if this will be possible for you though.

pderrenger commented 1 day ago

Thank you for the suggestion. We'll look into updating the pyproject.toml to address this. Meanwhile, pinning huggingface-hub<0.25.0 is a helpful workaround. If you have further issues, please ensure you're using the latest version from GitHub.
