Closed santhoshkelathodi closed 2 years ago
The CLI already supports a dataset format conversion command, convert, which performs the conversion directly on disk: https://voxel51.com/docs/fiftyone/cli/index.html
Thank you for the quick reply...
I managed to find the conversion functions. However, I could not find a way to convert from the Google Open Images segmentation format to the COCO instance segmentation format. Am I failing to identify the correct format?
https://voxel51.com/docs/fiftyone/recipes/convert_datasets.html
@santhoshkelathodi what's the issue here? The code you shared should work, assuming you provide a valid field name for the label_field argument of export() (or just omit it altogether if your dataset only contains one Detections field, because FiftyOne can automatically infer the correct field).
For example, this works for me:
import fiftyone as fo
import fiftyone.zoo as foz

# Load Open Images dataset
dataset = foz.load_zoo_dataset(
    "open-images-v6",
    split="validation",
    label_types=["segmentations"],
    classes=["Cattle"],
    max_samples=10,
)

session = fo.launch_app(dataset)

# Export in COCO format
dataset.export(
    export_dir="/tmp/coco",
    dataset_type=fo.types.COCODetectionDataset,
    label_field="segmentations",  # this can be omitted bc dataset only contains one `Detections` field
)

# Verify that we can load the COCO dataset
dataset2 = fo.Dataset.from_dir(
    dataset_dir="/tmp/coco",
    dataset_type=fo.types.COCODetectionDataset,
    label_types="segmentations",  # required in order to load masks (otherwise only bboxes are loaded)
)

session.dataset = dataset2
Just for fun, here's how this would look using only the CLI:
# Download subset of Open Images dataset
fiftyone zoo datasets download open-images-v6 \
    --splits validation \
    --kwargs \
        label_types=segmentations \
        classes=Cattle \
        max_samples=10

# Get location where dataset is stored
INPUT_DIR=$(fiftyone zoo datasets find open-images-v6 --split validation)

# Destination to write the COCO-formatted data
OUTPUT_DIR=/tmp/coco

# Convert segmentations from Open Images dataset to COCO format
fiftyone convert \
    --input-dir ${INPUT_DIR} \
    --input-type fiftyone.types.OpenImagesV6Dataset \
    --input-kwargs \
        label_types=segmentations \
        include_id=False \
    --output-dir ${OUTPUT_DIR} \
    --output-type fiftyone.types.COCODetectionDataset

# Check out results
ls -lah ${OUTPUT_DIR}
python -m json.tool "${OUTPUT_DIR}/labels.json"
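If pretty-printing the whole labels.json is too noisy, a quick sanity check of the export could look something like the sketch below. It assumes the standard COCO layout (top-level "images", "annotations", and "categories" lists); summarize_coco is a hypothetical helper written for illustration, not part of FiftyOne, and the in-memory example stands in for the real file.

```python
import json
from collections import Counter


def summarize_coco(labels):
    """Return basic counts for a COCO-style labels dict (hypothetical helper)."""
    annotations = labels.get("annotations", [])
    # Map category IDs to human-readable names
    id_to_name = {c["id"]: c["name"] for c in labels.get("categories", [])}
    per_class = Counter(
        id_to_name.get(a["category_id"], "unknown") for a in annotations
    )
    return {
        "num_images": len(labels.get("images", [])),
        "num_annotations": len(annotations),
        # Count annotations that carry a non-empty mask (polygon or RLE)
        "num_with_segmentation": sum(
            1 for a in annotations if a.get("segmentation")
        ),
        "per_class": dict(per_class),
    }


# Tiny in-memory stand-in for an exported labels.json
example = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "Cattle"}],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [10, 20, 100, 50],
            "segmentation": [[10, 20, 110, 20, 110, 70, 10, 70]],
        },
        # An annotation with a box but no mask
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [0, 0, 5, 5]},
    ],
}

summary = summarize_coco(example)
print(json.dumps(summary, indent=2))

# To run against the real export instead (path taken from the commands above):
# with open("/tmp/coco/labels.json") as f:
#     print(json.dumps(summarize_coco(json.load(f)), indent=2))
```

If num_with_segmentation comes back as 0 on a real export, the masks were probably not written, which is worth checking before loading the data elsewhere.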
Thank you very much... I am able to do the conversion...
Proposal Summary
Because datasets can be created in many different formats, it would be useful to have a feature that converts one format to another, for example Open Images to COCO, or COCO to YOLO. As I understand it, FiftyOne already supports visualizing datasets in these formats, so this should be relatively easy to add.
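To make the kind of conversion being proposed concrete: at the bounding-box level, COCO stores boxes as [x_min, y_min, width, height] in pixels, while YOLO uses [x_center, y_center, width, height] normalized by the image size. A minimal sketch of that one step follows; coco_bbox_to_yolo is a hypothetical helper for illustration only, and a full converter would also need to handle class IDs, file layout, and masks.

```python
def coco_bbox_to_yolo(bbox, img_width, img_height):
    """Convert a COCO pixel box [x_min, y_min, w, h] to a normalized
    YOLO box [x_center, y_center, w, h]. Hypothetical helper."""
    x, y, w, h = bbox
    return [
        (x + w / 2) / img_width,   # normalized center x
        (y + h / 2) / img_height,  # normalized center y
        w / img_width,             # normalized width
        h / img_height,            # normalized height
    ]


# A 200x100 box centered in a 400x200 image
yolo_box = coco_bbox_to_yolo([100, 50, 200, 100], img_width=400, img_height=200)
print(yolo_box)  # [0.5, 0.5, 0.5, 0.5]
```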
Motivation
What areas of FiftyOne does this feature affect?
fiftyone Python library
Details
Willingness to contribute
The FiftyOne Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?