voxel51 / fiftyone

The open-source tool for building high-quality datasets and computer vision models
https://fiftyone.ai
Apache License 2.0

[FR] Support dicom images in COCODetectionDataset #1932

Open Gaurangkarwande opened 2 years ago

Gaurangkarwande commented 2 years ago

Proposal Summary

Support DICOM images in datasets of type COCODetectionDataset.

Motivation

Loading images with fo.Dataset.from_dir() into a dataset of type COCODetectionDataset does not support DICOM images. Object detection datasets may contain images with different extensions (.png, .jpg, .dcm, etc.). It would be nice to add support for this.

What areas of FiftyOne does this feature affect?

Willingness to contribute

The FiftyOne Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?

brimoor commented 2 years ago

Hi @Gaurangkarwande 👋

I would recommend first loading the DICOM images and then adding the COCO labels to the dataset via add_coco_labels():

import fiftyone as fo
import fiftyone.utils.coco as fouc

dataset = fo.Dataset.from_dir(
    dataset_dir="/path/to/dicom/images",
    dataset_type=fo.types.DICOMDataset,
)

labels_path = "/path/to/coco.json"
classes = [...]

fouc.add_coco_labels(dataset, "predictions", labels_path, classes)

Gaurangkarwande commented 2 years ago

Hi @brimoor, thanks for your reply. My dataset consists of both .png and .dcm images. Doing it this way, I would first have to create one dataset for the .png images and a separate one for the .dcm images, then merge the two, and finally add the COCO labels. Is this the correct workflow? Also, would dataset.evaluate_detections() work on this type of dataset?

brimoor commented 2 years ago

FYI: FiftyOne's fo.types.DICOMDataset format internally converts the DICOM images to PNG or JPG images (depending on the value of fo.config.default_image_ext) so that FiftyOne can visualize them. It doesn't natively render the DICOM images at App load time.

If you're saying that you have .dcm and .png versions of the same images, then just use the .png ones.

But, in general, FiftyOne datasets can have any mix of valid image formats in them (they don't need to be homogeneous).

If you can view the dataset in the FiftyOne App, then you're good to go for any API methods (and evaluate_detections() doesn't need access to image pixels at all, so that's definitely not a concern).
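To illustrate why pixel access isn't needed: detection matching is typically based on box overlap (IoU), which is computed from coordinates alone. The sketch below is not FiftyOne's implementation. It is a minimal standalone example that assumes boxes in the [top-left-x, top-left-y, width, height] relative format used by FiftyOne detections, and box_iou is a hypothetical helper name.

```python
def box_iou(box_a, box_b):
    """Intersection-over-union of two boxes given as [x, y, width, height].

    Illustrative helper (not part of FiftyOne): only coordinates are used,
    so no image pixels ever need to be read.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h

    # Union = sum of the two areas minus the overlap counted twice
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    ground_truth = [0.0, 0.0, 0.5, 0.5]
    prediction = [0.25, 0.25, 0.5, 0.5]
    print(box_iou(ground_truth, prediction))
```

A predicted box would then count as a match when its IoU with a ground truth box of the same class exceeds a chosen threshold.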

Gaurangkarwande commented 2 years ago

I have .dcm and .png images in the same directory. They are not the same images.
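For a directory that mixes distinct .dcm and .png files, one possible workflow that follows the suggestions above is sketched here. It is a rough sketch, not a confirmed recipe: split_by_extension and build_mixed_dataset are hypothetical helper names, the set of non-DICOM extensions is an assumption, and it assumes that the DICOMDataset importer only picks up the .dcm files in the directory, which should be verified against the FiftyOne docs.

```python
from pathlib import Path


def split_by_extension(image_dir):
    """Partition the files in image_dir into DICOM paths and other image paths."""
    dicom_paths, other_paths = [], []
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".dcm":
            dicom_paths.append(str(path))
        elif suffix in {".png", ".jpg", ".jpeg"}:  # assumed set of image extensions
            other_paths.append(str(path))
    return dicom_paths, other_paths


def build_mixed_dataset(image_dir):
    """Load the .dcm files via DICOMDataset, then add the remaining images."""
    # Imported lazily so that split_by_extension() works without FiftyOne installed
    import fiftyone as fo

    # List the non-DICOM images first, in case the importer writes its
    # converted images into the same directory
    _, other_paths = split_by_extension(image_dir)

    # Assumption: the importer only picks up the .dcm files in image_dir
    dataset = fo.Dataset.from_dir(
        dataset_dir=image_dir,
        dataset_type=fo.types.DICOMDataset,
    )

    # Add the non-DICOM images as plain samples that point at their files
    dataset.add_samples([fo.Sample(filepath=p) for p in other_paths])
    return dataset
```

The COCO labels could then be attached with add_coco_labels() as in the earlier reply, provided the samples can be matched to the image entries in the COCO JSON.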