roboflow / notebooks

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.
https://roboflow.com/models

Missing dataset name in test split path in data.yaml #82

Open maxsitt opened 1 year ago

maxsitt commented 1 year ago

Notebook name

YOLOv5 PyTorch Object Detection

Bug

When using the Roboflow data import, the dataset name is not written to the path of the dataset test split in the data.yaml file.

Example:

test: ../test/images
train: dataset_name/train/images
val: dataset_name/valid/images

This can lead to problems, e.g. when trying to validate on the dataset test split (with --task test):

FileNotFoundError: test: /content/yolov5/test/images does not exist

Adding the dataset name to the test path in the data.yaml file by hand is easy, but an inexperienced user might not know how to do this.

I don't know whether there is a reason for leaving the dataset name out of the test path. If there isn't, this might be a bug.
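For readers unsure how to make that manual edit, here is a minimal sketch of the workaround. It rewrites the `test:` entry so that it carries the dataset folder name, matching the `train:` and `val:` entries. The helper name `fix_test_path` and the line-based text edit are assumptions for illustration only, not part of the Roboflow package or YOLOv5, and the sketch assumes `test:` sits on its own top-level line as in the example above.

```python
from pathlib import Path


def fix_test_path(yaml_path, dataset_name):
    """Rewrite a top-level 'test: ../test/images' line to include the dataset name.

    Hypothetical helper: a plain text edit, not a Roboflow or YOLOv5 API.
    """
    path = Path(yaml_path)
    fixed = []
    for line in path.read_text().splitlines():
        if line.startswith("test:"):
            value = line.split(":", 1)[1].strip()
            if value.startswith("../"):
                # "../test/images" -> "<dataset_name>/test/images"
                value = f"{dataset_name}/{value[3:]}"
            line = f"test: {value}"
        fixed.append(line)
    path.write_text("\n".join(fixed) + "\n")
```

Calling `fix_test_path("dataset_name/data.yaml", "dataset_name")` from the directory that contains the dataset folder would turn `test: ../test/images` into `test: dataset_name/test/images` and leave every other line unchanged.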

github-actions[bot] commented 1 year ago

👋 Hello @maxsitt, thank you for leaving an issue on Roboflow Notebooks.

🐞 Bug reports

If you are filing a bug report, please be as detailed as possible. This will help us more easily diagnose and resolve the problem you are facing. To learn more about contributing, check out our Contributing Guidelines.

If you require support with custom code that is not part of Roboflow Notebooks, please reach out on the Roboflow Forum or on the GitHub Discussions page associated with this repository.

💬 Get in touch

Do you have more questions about Roboflow that we haven't responded to yet? Feel free to ask them on the Roboflow Discuss forum. Our developer advocates and community team actively respond to questions there.

To ask questions about Notebooks, head over to the GitHub Discussions section of this repository.

SkalskiP commented 1 year ago

Hi, @maxsitt! Could you send me the link to your dataset on Roboflow?

maxsitt commented 1 year ago

Hi @SkalskiP!

Sure, here is the link: Roboflow dataset

Thanks for looking into it!

SkalskiP commented 1 year ago

Hi, @maxsitt I just tested, and the download works as expected. Training as well.

I used this code to download your dataset:

%cd /content/yolov5
from roboflow import Roboflow
rf = Roboflow(api_key="API_KEY")
project = rf.workspace("maximilian-sittinger").project("insect_detect_detection")
dataset = project.version(7).download("yolov5")

When I take a look at data.yaml it looks like this:

names:
- insect
nc: 1
roboflow:
  license: CC BY 4.0
  project: insect_detect_detection
  url: https://universe.roboflow.com/maximilian-sittinger/insect_detect_detection/dataset/7
  version: 7
  workspace: maximilian-sittinger
test: ../test/images
train: Insect_Detect_detection-7/train/images
val: Insect_Detect_detection-7/valid/images

maxsitt commented 1 year ago

Yes, as described the only problem occurs when you need to get the path to the dataset test split from the data.yaml, e.g. for validating on it.

If you try to run:

%cd /content/yolov5
!python val.py --weights runs/train/exp/weights/best.pt  --data {dataset.location}/data.yaml --img 320 --task test

The dataset test split is not found because of the wrong path (missing dataset name) in the data.yaml file:

FileNotFoundError: test: /content/yolov5/test/images does not exist

arijitde92 commented 1 year ago

Hi @SkalskiP ,

I think this is happening because the "__reformat_yaml" method of version.py reformats only the train and val locations when the model_format is "yolov5" or "yolov5pytorch" (screenshot of the method attached).

I think adding content["test"] = location + content["test"].lstrip(".") after line 728 will solve the issue.

Shall I proceed with making the changes in roboflow-python repository and raise a PR?
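To make the effect of the proposed line concrete, here is a small standalone sketch that applies the same expression to the values from the data.yaml shown earlier in this thread. The value of `location` is an assumption for illustration (the dataset folder name); what `location` actually holds inside roboflow-python is not confirmed here.

```python
# Sample values taken from the data.yaml shown earlier in this thread.
# Assumption: `location` is the dataset folder name.
location = "Insect_Detect_detection-7"
content = {"test": "../test/images"}

# str.lstrip(".") removes only the leading dots, leaving "/test/images".
content["test"] = location + content["test"].lstrip(".")

print(content["test"])  # Insect_Detect_detection-7/test/images
```

Under that assumption, the test entry would follow the same pattern as the existing train and val entries.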

SkalskiP commented 1 year ago

Hi, @arijitde92! 👋🏻 Let me ask around internally on Slack first.

arijitde92 commented 1 year ago

Hi @xabierr , did you use your own custom dataset or is "Noosa_2-1" available in the roboflow datasets?

SkalskiP commented 1 year ago

@Jacobsolawetz / @yeldarby / @mo-traor3-ai, is that intentional behavior?

maxsitt commented 1 year ago

Hi @SkalskiP @arijitde92,

any updates on this? Why is the path to the test folder only reformatted for YOLOv6?

Same problem in this issue.

Thanks!

arijitde92 commented 1 year ago

Hi @SkalskiP, did you find out whether this is intentional behavior?

SkalskiP commented 1 year ago

Hi @arijitde92 👋🏻 Unfortunately, I didn't. Let me try to ping the dev team once again.

Ryandran commented 10 months ago

I have the same problem: FileNotFoundError: Dataset 'data.yaml' not found ⚠️, missing paths ['/content/gdrive/MyDrive/yolov8/datasets/valid/images']

fewmysteriessolved commented 8 months ago

I have the same problem: FileNotFoundError: Dataset 'data.yaml' not found ⚠️, missing paths ['/content/gdrive/MyDrive/yolov8/datasets/valid/images']

Did you make sure that all three necessary folders, with their subfolders and files, are available in your runtime session or file location? The layout should look like this:

-- /content
   -- datasets
      -- NAME_OF_YOUR_DATASET
         -- train
         -- valid
         -- test

It works for me even when the 'test' folder is missing, but not when only one of the three is present (e.g. only 'train').
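A quick way to check which of these folders are present is sketched below. The helper name `find_splits` is hypothetical, and the path in the example reuses the `NAME_OF_YOUR_DATASET` placeholder from the layout above; the sketch only reports which folders exist and does not say which ones a given trainer requires.

```python
from pathlib import Path


def find_splits(dataset_dir, splits=("train", "valid", "test")):
    """Return a dict mapping each split name to whether its folder exists."""
    root = Path(dataset_dir)
    return {name: (root / name).is_dir() for name in splits}


if __name__ == "__main__":
    # NAME_OF_YOUR_DATASET is the placeholder from the layout above.
    status = find_splits("/content/datasets/NAME_OF_YOUR_DATASET")
    for name, present in status.items():
        print(f"{name}: {'found' if present else 'missing'}")
```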