ultralytics / hub

Ultralytics HUB tutorials and support
https://hub.ultralytics.com
GNU Affero General Public License v3.0

unable to process the dataset #730

Open BrianChen0405 opened 4 months ago

BrianChen0405 commented 4 months ago

Question

I don't know why Ultralytics HUB is unable to process my dataset. I have train, valid, and test folders and a YAML file in my dataset folder.

Additional

https://drive.google.com/file/d/19HDxwDcc6Jo_ZAU5hWsymhIFAYuvsqKB/view?usp=drive_link

github-actions[bot] commented 4 months ago

👋 Hello @BrianChen0405, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more.

If this is a 🐛 Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.

If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.

We try to respond to all issues as promptly as possible. Thank you for your patience!

pderrenger commented 4 months ago

@BrianChen0405 hello,

Thank you for reaching out and providing the details about your issue. Let's work together to resolve this!

First, please ensure that your dataset is structured correctly. Your dataset directory should contain the train, val, and test folders, along with the YAML file. The YAML file should be placed inside the root directory of your dataset, and all names (YAML, directory, and ZIP) should match. For example, if your dataset is named mydataset, you should have:

mydataset/
  ├── mydataset.yaml
  ├── train/
  ├── val/
  └── test/
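The layout above can be checked locally before zipping. The following is only a minimal sketch using the Python standard library; the helper name find_layout_problems is hypothetical and not part of ultralytics, and it tests nothing beyond the folder and file names shown in the tree above:

```python
from pathlib import Path


def find_layout_problems(root: Path) -> list[str]:
    """Return human-readable problems with the dataset layout (hypothetical helper)."""
    root = root.resolve()
    problems = []
    # The YAML file is expected in the dataset root, named after the directory.
    yaml_path = root / f"{root.name}.yaml"
    if not yaml_path.is_file():
        problems.append(f"missing {yaml_path.name} in {root.name}/")
    # Each split is expected as a subdirectory of the dataset root.
    for split in ("train", "val", "test"):
        if not (root / split).is_dir():
            problems.append(f"missing {split}/ directory")
    return problems


if __name__ == "__main__":
    for problem in find_layout_problems(Path("mydataset")):
        print(problem)
```

Note that this sketch looks for a folder named val exactly, so a folder named valid would be reported as a missing val/ directory.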

Next, zip your dataset directory:

zip -r mydataset.zip mydataset
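If the zip command is not available (for example on Windows), a rough Python equivalent is sketched below. It assumes the dataset directory sits in a writable parent folder; the function name zip_dataset is hypothetical. It relies on shutil.make_archive with root_dir and base_dir so that the archive keeps the top-level mydataset/ folder, as zip -r does:

```python
import shutil
from pathlib import Path


def zip_dataset(dataset_dir: Path) -> Path:
    """Create <name>.zip next to the dataset directory and return its path."""
    dataset_dir = dataset_dir.resolve()
    # root_dir is the parent and base_dir the folder name, so every entry
    # in the archive is stored under "<name>/".
    archive = shutil.make_archive(
        base_name=str(dataset_dir.parent / dataset_dir.name),
        format="zip",
        root_dir=dataset_dir.parent,
        base_dir=dataset_dir.name,
    )
    return Path(archive)


if __name__ == "__main__":
    print(zip_dataset(Path("mydataset")))
```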

Before uploading, it's a good idea to validate your dataset to ensure there are no formatting issues. You can use the following code snippet to validate your dataset:

from ultralytics.hub import check_dataset

check_dataset("path/to/mydataset.zip", task="detect")

If everything is in order, you can proceed to upload your dataset to Ultralytics HUB. Navigate to the Datasets page, click on the Upload Dataset button, and follow the prompts.

If you have already followed these steps and are still encountering issues, please ensure you are using the latest versions of torch, ultralytics, and hub-sdk. You can update your packages using:

pip install --upgrade torch ultralytics hub-sdk
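To see which versions are currently installed (useful to include in a bug report), something like the following should work. It is a small sketch built on the standard library's importlib.metadata; the helper name installed_versions is hypothetical:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["torch", "ultralytics", "hub-sdk"]).items():
        print(f"{name}: {ver or 'not installed'}")
```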

If the problem persists, could you please provide a minimum reproducible example? This will help us investigate the issue more effectively. You can find more details on creating a minimum reproducible example here.

Thank you for your patience, and we look forward to resolving this issue for you. If you have any further questions, feel free to ask!

sergiuwaxmann commented 4 months ago

@BrianChen0405 Hello!

I just checked our system and everything works fine.

If you are having issues uploading your dataset:

  1. Follow the steps in our documentation.
  2. Validate your dataset locally before uploading it to Ultralytics HUB (as @pderrenger suggested).
    from ultralytics.hub import check_dataset
    check_dataset("path/to/dataset.zip", task="detect")
  3. Make sure you have a stable internet connection.

I suggest trying to upload a small dataset (such as COCO8) to confirm that upload works correctly.

github-actions[bot] commented 3 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

zwong91 commented 3 months ago

Why does the dataset check pass before upload, while Ultralytics HUB shows "Loading..." and then fails with the error "Timeout: No response from the server."?

zwong91 commented 3 months ago

The data source is below. It is a YOLOv8-format ZIP file of almost 2 GB, containing about 20K JPG images.

roboflow:
  workspace: augmented-startups
  project: playing-cards-ow27d
  version: 4
  license: Public Domain
  url: https://universe.roboflow.com/augmented-startups/playing-cards-ow27d/dataset/4

sergiuwaxmann commented 3 months ago

@zwong91 Do you have a stable internet connection? Unfortunately, the ZIP upload isn't chunked, so you need a stable internet connection for the entire upload.
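To get a feel for how long the connection has to stay up, here is a back-of-the-envelope sketch. It only divides the file size in bits by the uplink speed, so it is a best case that ignores protocol overhead and VPN slowdowns; the function name upload_seconds and the speeds used are hypothetical examples, not measurements:

```python
def upload_seconds(size_bytes: int, uplink_mbps: float) -> float:
    """Ideal transfer time: bits to send divided by bits per second."""
    if uplink_mbps <= 0:
        raise ValueError("uplink_mbps must be positive")
    return size_bytes * 8 / (uplink_mbps * 1_000_000)


if __name__ == "__main__":
    size = 2 * 1000**3  # roughly the 2 GB ZIP mentioned earlier in this thread
    for mbps in (5, 20, 100):  # hypothetical uplink speeds in Mbit/s
        minutes = upload_seconds(size, mbps) / 60
        print(f"{mbps} Mbit/s: about {minutes:.0f} min")
```

Any interruption during that window means the upload has to start over, since it is sent as a single request.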

zwong91 commented 3 months ago

@sergiuwaxmann Thank you for your answer. I am using a VPN and my network is only average. It might be better if HUB supported cloud storage providers such as S3 or Cloudflare R2.