roboflow / notebooks

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.
https://roboflow.com/models

Can't execute training on new version of dataset #47

Closed mikearney closed 1 year ago

mikearney commented 1 year ago

Search before asking

Notebook name

train-yolov8-object-detection-on-custom-dataset.ipynb

Bug

/content
Ultralytics YOLOv8.0.14 🚀 Python-3.8.10 torch-1.13.1+cu116 CUDA:0 (A100-SXM4-40GB, 40536MiB)
yolo/engine/trainer: task=detect, mode=train, model=yolov8s.pt, data=/content/pb-and-players-4/data.yaml, epochs=25, patience=50, batch=16, imgsz=800, save=True, cache=False, device=, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=False, seed=0, deterministic=True, single_cls=False, image_weights=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, hide_labels=False, hide_conf=False, vid_stride=1, line_thickness=3, visualize=False, augment=False, agnostic_nms=False, retina_masks=False, classes=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=17, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, fl_gamma=0.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, v5loader=False, save_dir=runs/detect/train2
Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/cfg/__init__.py", line 218, in entrypoint
    func(cfg)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/v8/detect/train.py", line 205, in train
    model.train(**vars(cfg))
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/engine/model.py", line 199, in train
    self.trainer = self.TrainerClass(overrides=overrides)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/engine/trainer.py", line 122, in __init__
    self.data = check_dataset_yaml(self.data)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/data/utils.py", line 190, in check_dataset_yaml
    data = check_file(data)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/utils/checks.py", line 226, in check_file
    assert len(files), f'File not found: {file}'  # assert file was found
AssertionError: File not found: /content/pb-and-players-4/data.yaml

Environment

- Google Colab

Minimal Reproducible Example

download snippet:

!pip install roboflow

from roboflow import Roboflow
rf = Roboflow(api_key="######")
project = rf.workspace("halftone-digital").project("pb-and-players")
dataset = project.version(2).download("yolov8")

Additional

This worked just fine with the original version of my dataset, but subsequent versions throw this error:

link: https://app.roboflow.com/halftone-digital/pb-and-players/2#

Are you willing to submit a PR?

github-actions[bot] commented 1 year ago

👋 Hello @mikearney, thank you for leaving an issue on Roboflow Notebooks.

🐞 Bug reports

If you are filing a bug report, please be as detailed as possible. This will help us more easily diagnose and resolve the problem you are facing. To learn more about contributing, check out our Contributing Guidelines.

If you require support with custom code that is not part of Roboflow Notebooks, please reach out on the Roboflow Forum or on the GitHub Discussions page associated with this repository.

💬 Get in touch

Do you have more questions about Roboflow that we haven't responded to yet? Feel free to ask them on the Roboflow Discuss forum. Our developer advocates and community team actively respond to questions there.

To ask questions about Notebooks, head over to the GitHub Discussions section of this repository.

SkalskiP commented 1 year ago

Hi, @mikearney 👋🏻! I tried to replicate your problem, and I think I know what is causing it. When I execute this part of the code:

from roboflow import Roboflow
rf = Roboflow(api_key="####")
project = rf.workspace("halftone-digital").project("pb-and-players")
dataset = project.version(2).download("yolov8")

It results in:

RuntimeError                              Traceback (most recent call last)
<ipython-input-16-1cfee0e1a18d> in <module>
      7 rf = Roboflow(api_key="90KQSJx4Nj8Oy8BYWxyS")
      8 project = rf.workspace("halftone-digital").project("pb-and-players")
----> 9 dataset = project.version(2).download("yolov8")

/usr/local/lib/python3.8/dist-packages/roboflow/core/project.py in version(self, version_number, local)
    257                 return vers
    258 
--> 259         raise RuntimeError("Version number {} is not found.".format(version_number))
    260 
    261     def __image_upload(

RuntimeError: Version number 2 is not found.

And that makes sense, because when I visit your project there is no 2nd version of your dataset: https://universe.roboflow.com/halftone-digital/pb-and-players/dataset/7. Most likely you removed it at some point. You only have v3 and v7. Take a look here:

Screenshot 2023-01-23 at 15 51 44

Because the download fails, the dataset never lands on disk, and that in turn leads to: AssertionError: File not found: /content/pb-and-players-4/data.yaml.
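The failure chain described here (no dataset on disk, so the trainer cannot find data.yaml) can be caught earlier with a small guard run between the download cell and the training cell. This is only a minimal sketch, not part of the notebook: `require_data_yaml` is a hypothetical helper, and the directory passed to it is assumed to be wherever the download step was expected to place the dataset.

```python
from pathlib import Path


def require_data_yaml(dataset_dir: str) -> Path:
    """Return the path to data.yaml inside dataset_dir, or fail with a clear message.

    Hypothetical helper: it only inspects the filesystem and knows
    nothing about Roboflow or YOLOv8.
    """
    data_yaml = Path(dataset_dir) / "data.yaml"
    if not data_yaml.is_file():
        raise FileNotFoundError(
            f"{data_yaml} does not exist. Did the dataset download step succeed?"
        )
    return data_yaml
```

Calling `require_data_yaml("/content/pb-and-players-4")` before training would report the missing dataset directly, instead of leaving it to the trainer's own assertion.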

When I change the dataset version in your code to:

from roboflow import Roboflow
rf = Roboflow(api_key="####")
project = rf.workspace("halftone-digital").project("pb-and-players")
dataset = project.version(3).download("yolov8")

The download works.

I'm closing the issue for now, but if you still face any issues feel free to reopen it.

mikearney commented 1 year ago

@SkalskiP The download portion was always working. It's the Custom Training step which is still throwing the error. It's finding the correct dataset (Note that I just created V8)

/content
Downloading https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt to yolov8s.pt...
100% 21.5M/21.5M [00:00<00:00, 259MB/s]

Ultralytics YOLOv8.0.11 🚀 Python-3.8.10 torch-1.13.1+cu116 CUDA:0 (Tesla T4, 15110MiB)
yolo/engine/trainer: task=detect, mode=train, model=yolov8s.pt, data=/content/pb-and-players-8/data.yaml, epochs=5, patience=50, batch=16, imgsz=800, save=True, cache=False, device=, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=False, seed=0, deterministic=True, single_cls=False, image_weights=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, hide_labels=False, hide_conf=False, vid_stride=1, line_thickness=3, visualize=False, augment=False, agnostic_nms=False, retina_masks=False, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=17, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, fl_gamma=0.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, hydra={'output_subdir': None, 'run': {'dir': '.'}}, v5loader=False, save_dir=runs/detect/train

Dataset not found ⚠️, missing paths ['/content/datasets/pb-and-players-8/valid/images']
Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/cli.py", line 148, in entrypoint
    cli(cfg)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/cli.py", line 84, in cli
    func(cfg)
  File "/usr/local/lib/python3.8/dist-packages/hydra/main.py", line 79, in decorated_main
    return task_function(cfg_passthrough)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/v8/detect/train.py", line 207, in train
    model.train(**cfg)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/engine/model.py", line 199, in train
    self.trainer = self.TrainerClass(overrides=overrides)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/engine/trainer.py", line 126, in __init__
    self.data = check_dataset_yaml(self.data)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/data/utils.py", line 232, in check_dataset_yaml
    raise FileNotFoundError('Dataset not found ❌')
FileNotFoundError: Dataset not found ❌

SkalskiP commented 1 year ago

@mikearney could you take a look right now? Make sure to create a new Google Colab copy, as we added a few changes in the meantime. I just checked the training with your dataset and it works.

mikearney commented 1 year ago

Still getting the same error @SkalskiP :(

/content
Ultralytics YOLOv8.0.11 🚀 Python-3.8.10 torch-1.13.1+cu116 CUDA:0 (Tesla T4, 15110MiB)
yolo/engine/trainer: task=detect, mode=train, model=yolov8s.pt, data=/content/pb-and-players-8/data.yaml, epochs=100, patience=50, batch=16, imgsz=800, save=True, cache=False, device=, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=False, seed=0, deterministic=True, single_cls=False, image_weights=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, hide_labels=False, hide_conf=False, vid_stride=1, line_thickness=3, visualize=False, augment=False, agnostic_nms=False, retina_masks=False, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=17, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, fl_gamma=0.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, hydra={'output_subdir': None, 'run': {'dir': '.'}}, v5loader=False, save_dir=runs/detect/train2

Dataset not found ⚠️, missing paths ['/content/datasets/pb-and-players-8/valid/images']
Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/cli.py", line 148, in entrypoint
    cli(cfg)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/cli.py", line 84, in cli
    func(cfg)
  File "/usr/local/lib/python3.8/dist-packages/hydra/main.py", line 79, in decorated_main
    return task_function(cfg_passthrough)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/v8/detect/train.py", line 207, in train
    model.train(**cfg)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/engine/model.py", line 199, in train
    self.trainer = self.TrainerClass(overrides=overrides)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/engine/trainer.py", line 126, in __init__
    self.data = check_dataset_yaml(self.data)
  File "/usr/local/lib/python3.8/dist-packages/ultralytics/yolo/data/utils.py", line 232, in check_dataset_yaml
    raise FileNotFoundError('Dataset not found ❌')
FileNotFoundError: Dataset not found ❌

SkalskiP commented 1 year ago

@mikearney can you send me the link to your Google Colab copy of our notebook? Something must be different on your side, and I need to examine it.

mikearney commented 1 year ago

@SkalskiP https://colab.research.google.com/drive/1uRQlW7C06VgBVKFUvQtbNZ6evQumEDg0?usp=sharing

SkalskiP commented 1 year ago

@mikearney, and suddenly everything makes sense. :) You removed innocent-looking but critical lines from the notebook.

!mkdir {HOME}/datasets
%cd {HOME}/datasets

!pip install roboflow --quiet

from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("roboflow-jvuqo").project("football-players-detection-3zvbc")
dataset = project.version(1).download("yolov5")

👇🏻

!pip install roboflow

from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("halftone-digital").project("pb-and-players")
dataset = project.version(8).download("yolov8")

You see, the mighty YOLOv8 library really, and I mean really, wants you to have your datasets in the datasets directory. If you don't obey, you are cooked. 🧑‍🍳

I checked your notebook. When you add back those lines:

!mkdir {HOME}/datasets
%cd {HOME}/datasets

Training magically works once again.
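For anyone running the download step outside a notebook, where the `!` and `%` cell magics are unavailable, the same two lines can be approximated in plain Python. This is a sketch under assumptions: `enter_datasets_dir` is a hypothetical helper, and its `home` argument stands in for the notebook's `HOME` variable, whose definition is not shown in this thread.

```python
import os


def enter_datasets_dir(home: str) -> str:
    """Create <home>/datasets if needed and make it the working directory.

    Mirrors `!mkdir {HOME}/datasets` followed by `%cd {HOME}/datasets`.
    """
    datasets_dir = os.path.join(home, "datasets")
    # Unlike a plain mkdir, this does not fail when the cell is re-run.
    os.makedirs(datasets_dir, exist_ok=True)
    # Assumption: the subsequent download writes into the working directory.
    os.chdir(datasets_dir)
    return datasets_dir
```

Running `enter_datasets_dir("/content")` just before the Roboflow download cell should put the dataset under `/content/datasets`, which is the location the trainer reported as missing in the errors above.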

mikearney commented 1 year ago

Wow.. I see it now. I was copying from your snippet and neglected to notice the missing 2 lines when I pasted ughhhhhh.

mikearney commented 1 year ago

So yeah, for UX purposes, I'd recommend adding that code to your snippet:

!mkdir {HOME}/datasets
%cd {HOME}/datasets

SkalskiP commented 1 year ago

Yup, looks like this part is unclear. We are campaigning hard for that restriction to be removed because it makes no sense, but that logic sits in the YOLOv8 code, not ours. I guess I'll add another warning to our notebook so that users will be more cautious here.
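A warning of this kind could also be expressed as a quick check in code. The following is a minimal sketch, assuming (from the "missing paths" messages earlier in the thread) that the trainer looks for the dataset under a directory named `datasets`. `is_under_datasets_dir` is a hypothetical helper and does not reproduce YOLOv8's actual path resolution.

```python
from pathlib import Path


def is_under_datasets_dir(dataset_location: str) -> bool:
    """Return True if some parent directory of dataset_location is named 'datasets'."""
    return "datasets" in Path(dataset_location).parent.parts


# The two locations that appear in this thread:
for location in ("/content/pb-and-players-8", "/content/datasets/pb-and-players-8"):
    if not is_under_datasets_dir(location):
        print(f"Warning: {location} is not inside a 'datasets' directory")
```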