voxel51 / fiftyone

Refine high-quality datasets and visual AI models
https://fiftyone.ai
Apache License 2.0

[?] How can I use FiftyOne to convert OpenImage segmentations to COCO format #1959

Closed santhoshkelathodi closed 2 years ago

santhoshkelathodi commented 2 years ago

Proposal Summary

Because datasets come in many different formats, it would be useful to have a feature that converts one format to another, for example Open Images to COCO, or COCO to YOLO. As I understand it, FiftyOne already supports loading and visualizing these formats, so conversion between them should be easier to provide.

Motivation

What areas of FiftyOne does this feature affect?

Details

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "open-images-v6",
    split="test",
    label_types=["segmentations"],
    classes=["Cattle"],
    max_samples=200,
    seed=51,
    shuffle=True,
    dataset_name="open-images-cattle",
)

export_dir = "exports"
label_field = "ground_truth"  # for example
dataset_type = fo.types.COCODetectionDataset
# tagged_view would be a view of the dataset where some edits were made; the full dataset is exported here
dataset.export(
    export_dir=export_dir,
    dataset_type=dataset_type,
    label_field=label_field,
)

Willingness to contribute

The FiftyOne Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?

jasoncorso commented 2 years ago

The CLI already provides a dataset format conversion command, convert, which converts datasets directly on disk. https://voxel51.com/docs/fiftyone/cli/index.html

santhoshkelathodi commented 2 years ago

Thank you for the quick reply...

I managed to find the conversion functions. However, I could not find a way to convert from the Google Open Images segmentation format to the COCO instance segmentation format. Is it that I am not identifying the format correctly?

https://voxel51.com/docs/fiftyone/recipes/convert_datasets.html

brimoor commented 2 years ago

@santhoshkelathodi what's the issue here? The code you shared should work, provided you pass a valid field name to the label_field argument of export(). You can also omit label_field altogether if your dataset contains only one Detections field, because FiftyOne will then infer the correct field automatically.

For example, this works for me:

import fiftyone as fo
import fiftyone.zoo as foz

# Load Open Images dataset
dataset = foz.load_zoo_dataset(
    "open-images-v6",
    split="validation",
    label_types=["segmentations"],
    classes=["Cattle"],
    max_samples=10,
)

session = fo.launch_app(dataset)

# Export in COCO format
dataset.export(
    export_dir="/tmp/coco",
    dataset_type=fo.types.COCODetectionDataset,
    label_field="segmentations",  # can be omitted because the dataset only contains one `Detections` field
)

# Verify that we can load the COCO dataset
dataset2 = fo.Dataset.from_dir(
    dataset_dir="/tmp/coco",
    dataset_type=fo.types.COCODetectionDataset,
    label_types="segmentations",  # required in order to load masks (otherwise only bboxes are loaded)
)

session.dataset = dataset2

brimoor commented 2 years ago

Just for fun, here's how this would look using only the CLI:

# Download subset of Open Images dataset
fiftyone zoo datasets download open-images-v6 \
    --splits validation \
    --kwargs \
        label_types=segmentations \
        classes=Cattle \
        max_samples=10

# Get location where dataset is stored
INPUT_DIR=$(fiftyone zoo datasets find open-images-v6 --split validation)

# Destination to write the COCO-formatted data
OUTPUT_DIR=/tmp/coco

# Convert segmentations from Open Images dataset to COCO format
fiftyone convert \
    --input-dir "${INPUT_DIR}" \
    --input-type fiftyone.types.OpenImagesV6Dataset \
    --input-kwargs \
        label_types=segmentations \
        include_id=False \
    --output-dir "${OUTPUT_DIR}" \
    --output-type fiftyone.types.COCODetectionDataset

# Check out results
ls -lah "${OUTPUT_DIR}"
python -m json.tool "${OUTPUT_DIR}/labels.json"
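
If pretty-printing the whole file is too noisy, a quick way to confirm that masks made it into the export is to count the annotations that carry a segmentation entry. The following is only a minimal sketch using the Python standard library: it assumes the export wrote a standard COCO-style labels.json (with top-level images, annotations, and categories lists) to /tmp/coco as in the commands above, and summarize_coco_labels is a hypothetical helper name, not part of FiftyOne.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_coco_labels(labels_path):
    """Return basic counts for a COCO-style labels.json file.

    Hypothetical helper for illustration; not a FiftyOne function.
    """
    with open(labels_path, "r", encoding="utf-8") as f:
        coco = json.load(f)

    images = coco.get("images", [])
    annotations = coco.get("annotations", [])
    categories = coco.get("categories", [])

    # Map category IDs to names so the per-class counts are readable
    id_to_name = {c["id"]: c["name"] for c in categories}

    # An annotation is counted as having a mask if its "segmentation"
    # entry is present and non-empty (polygons or an RLE dict)
    with_masks = sum(1 for a in annotations if a.get("segmentation"))

    per_class = Counter(
        id_to_name.get(a.get("category_id"), "unknown") for a in annotations
    )

    return {
        "num_images": len(images),
        "num_annotations": len(annotations),
        "num_with_segmentation": with_masks,
        "per_class": dict(per_class),
    }


if __name__ == "__main__":
    # Assumed location, matching OUTPUT_DIR in the commands above
    labels_path = Path("/tmp/coco/labels.json")
    if labels_path.exists():
        print(summarize_coco_labels(labels_path))
```

If num_with_segmentation is zero while num_annotations is not, the export most likely contains only bounding boxes, and the label types used when loading the source dataset are worth rechecking.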

santhoshkelathodi commented 2 years ago

Thank you very much... I am able to do the conversion...