ultralytics / hub

Ultralytics HUB tutorials and support
https://hub.ultralytics.com
GNU Affero General Public License v3.0

COCO format #798

Closed zivp100 closed 2 months ago

zivp100 commented 2 months ago

Question

Hey, I have an object detection dataset in COCO format, but it differs from the format you are looking for. My JSON file looks like the attached one. Is there a way for me to load my data? custom_train (4).json

github-actions[bot] commented 2 months ago

👋 Hello @zivp100, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more.

If this is a 🐛 Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.

If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.

We try to respond to all issues as promptly as possible. Thank you for your patience!

pderrenger commented 2 months ago

@zivp100 hello,

Thank you for reaching out! If your dataset is in COCO format but differs from the expected structure for Ultralytics HUB, you can convert it to the required format. Ultralytics HUB datasets follow the same structure and label formats as YOLOv5 and YOLOv8 datasets.

Here are the steps to convert your COCO format dataset to the required format:

  1. Extract Annotations and Images: Ensure your images and annotations are organized in a directory structure similar to the example below:

    dataset/
    ├── images/
    │   ├── train/
    │   └── val/
    └── labels/
        ├── train/
        └── val/
  2. Convert COCO JSON to YOLO Format: You can use a script to convert your COCO annotations to YOLO format. Here is an example script to help you get started:

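    import json
    import os
    from collections import defaultdict
    
    def convert_coco_to_yolo(coco_json_path, output_dir):
        """Write one YOLO label file per annotated image in a COCO JSON file."""
        with open(coco_json_path) as f:
            data = json.load(f)
    
        # Create the label directory if it does not exist yet
        os.makedirs(output_dir, exist_ok=True)
    
        images = {image['id']: image for image in data['images']}
    
        # COCO category ids can start at 1 or have gaps; YOLO expects zero-based,
        # contiguous class indices. This assumes the JSON has a standard 'categories' list.
        categories = sorted(data['categories'], key=lambda cat: cat['id'])
        class_index = {cat['id']: index for index, cat in enumerate(categories)}
    
        # Collect all lines per image first, so each label file is written once
        # and re-running the script does not append duplicate annotations
        lines_per_image = defaultdict(list)
        for ann in data['annotations']:
            image_info = images[ann['image_id']]
            image_width = image_info['width']
            image_height = image_info['height']
    
            # COCO bbox is [x_min, y_min, width, height] in pixels
            x_min, y_min, box_width, box_height = ann['bbox']
            x_center = (x_min + box_width / 2) / image_width
            y_center = (y_min + box_height / 2) / image_height
            width = box_width / image_width
            height = box_height / image_height
    
            class_id = class_index[ann['category_id']]
            lines_per_image[ann['image_id']].append(
                f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
            )
    
        for image_id, lines in lines_per_image.items():
            # Drop any sub-directory in file_name; the label shares the image's base name
            image_filename = os.path.basename(images[image_id]['file_name'])
            label_filename = os.path.splitext(image_filename)[0] + '.txt'
            label_filepath = os.path.join(output_dir, label_filename)
    
            with open(label_filepath, 'w') as label_file:
                label_file.write('\n'.join(lines) + '\n')
    
    # Example usage
    convert_coco_to_yolo('path/to/custom_train.json', 'path/to/output/labels/train')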
  3. Create Dataset YAML File: Ensure you have a dataset YAML file in the root directory of your dataset. It should look something like this:

    path: ../datasets/your_dataset_name
    train: images/train
    val: images/val
    test:  # optional
    names:
      0: class_name_1
      1: class_name_2
      # Add all your class names here; the indices must match the class IDs in your label files
  4. Zip and Upload: Once your dataset is structured correctly, zip the directory and upload it to Ultralytics HUB. You can follow the detailed instructions in the Ultralytics HUB Datasets documentation.
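Before zipping, it can help to sanity-check the label files produced in step 2. The snippet below is only a rough sketch, not an official HUB check: `find_label_problems` is a hypothetical helper name, and it assumes detection labels with five space-separated values per line (class index, then normalized `x_center y_center width height`).

```python
from pathlib import Path


def find_label_problems(labels_dir, num_classes):
    """Return a list of human-readable problems found in YOLO label files."""
    problems = []
    for label_path in sorted(Path(labels_dir).glob("*.txt")):
        lines = label_path.read_text().splitlines()
        for line_no, line in enumerate(lines, start=1):
            if not line.strip():
                continue  # ignore blank lines
            fields = line.split()
            where = f"{label_path.name}:{line_no}"
            if len(fields) != 5:
                problems.append(f"{where}: expected 5 fields, found {len(fields)}")
                continue
            try:
                class_id = int(fields[0])
                coords = [float(value) for value in fields[1:]]
            except ValueError:
                problems.append(f"{where}: non-numeric value")
                continue
            # Class indices must be zero-based and match the YAML `names` entries
            if not 0 <= class_id < num_classes:
                problems.append(f"{where}: class index {class_id} outside 0..{num_classes - 1}")
            # Box values are fractions of the image size, so they must lie in [0, 1]
            if any(not 0.0 <= value <= 1.0 for value in coords):
                problems.append(f"{where}: coordinate outside [0, 1]")
    return problems
```

Calling `find_label_problems('path/to/output/labels/train', num_classes=2)` returns an empty list when nothing looks wrong. Set `num_classes` to the number of entries under `names` in your YAML file.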

If you encounter any issues during the conversion or upload process, please let us know, and we'll be happy to assist further. 😊

sergiuwaxmann commented 2 months ago

@zivp100 Unfortunately, you need to use the YOLO format we support.
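
For reference, a YOLO detection label file has one line per object, `class x_center y_center width height`, with the four box values normalized by the image size. Here is a minimal sketch of the conversion for a single box, assuming the standard COCO `[x_min, y_min, width, height]` pixel layout. The function name `coco_bbox_to_yolo` and the example numbers are made up for illustration.

```python
def coco_bbox_to_yolo(bbox, image_width, image_height):
    """Convert a COCO [x_min, y_min, width, height] pixel box to normalized YOLO values."""
    x_min, y_min, box_width, box_height = bbox
    return (
        (x_min + box_width / 2) / image_width,    # x_center
        (y_min + box_height / 2) / image_height,  # y_center
        box_width / image_width,                  # width
        box_height / image_height,                # height
    )


# Hypothetical example: a 200x100 px box with its top-left corner at (100, 50)
# in a 640x480 image, labelled as class 0
x_center, y_center, width, height = coco_bbox_to_yolo([100, 50, 200, 100], 640, 480)
print(f"0 {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")
# prints: 0 0.312500 0.208333 0.312500 0.208333
```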