DIAGNijmegen / pathology-whole-slide-data

A package for working with whole-slide data including a fast batch iterator that can be used to train deep learning models.
https://diagnijmegen.github.io/pathology-whole-slide-data/
Apache License 2.0
86 stars 24 forks

Exemplary JSON Annotation File #60

Closed by FraukeWilm 1 month ago

FraukeWilm commented 1 month ago

Hi Mart,

Thanks for your open-source contributions for WSI processing! I am currently experimenting with your group's adaptations to nnUNet, which will hopefully make life a lot easier when working with whole-slide images.

I was wondering if you could provide an example JSON file showing how the annotations should be stored when working with your batch iterator? I have the annotations of all my WSIs stored in a single JSON file using the COCO format, and I am currently trying to find out how best to convert them into a format that is compatible with your dataloaders.

Best, Frauke

martvanrijthoven commented 1 month ago

Dear Frauke,

The internal JSON representation is as follows:

[
    # first annotation
    {
        "index": 0,
        "coordinates": [[x1, y1], [x2, y2], [x3, y3], [..., ...]],
        "label": {
            "name": "label1",
            "value": 1
        }
    },
    # second annotation
    {
        "index": 1,
        "coordinates": [[x1, y1], [x2, y2], [x3, y3], [..., ...]],
        "label": {
            "name": "label2",
            "value": 2
        }
    }
    # etc
]
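
Because the listing above contains comments and placeholder coordinates, it is not valid JSON as written. As a minimal sketch, a concrete instance of the same layout could be built and serialized in Python like this; the coordinate values and label names are made up for illustration.

```python
import json

# Two polygon annotations in the layout described above.
# All coordinates and label names here are illustrative, not real data.
annotations = [
    {
        "index": 0,
        "coordinates": [[100, 100], [200, 100], [200, 200], [100, 200]],
        "label": {"name": "label1", "value": 1},
    },
    {
        "index": 1,
        "coordinates": [[300, 300], [400, 300], [350, 400]],
        "label": {"name": "label2", "value": 2},
    },
]

# Serialize to a JSON string; write this string to a file to get an annotation file.
text = json.dumps(annotations, indent=4)
print(text)
```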

However, wholeslidedata also supports other annotation formats, such as QuPath or ASAP. Alternatively, we could write a parser for COCO so that your annotation files work without conversion.

Best wishes, Mart

FraukeWilm commented 1 month ago

Dear Mart,

Thanks for the quick reply! I think it should be straightforward to convert my annotations into the required format without a custom parser. I will let you know if I run into any trouble.
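
One way such a conversion might look, as a rough sketch rather than a tested recipe: the function name `coco_to_wsd` is hypothetical, and the code assumes polygon-style COCO segmentations (not RLE masks), uses the COCO category id as the label value, and groups the output per image file name. Whether wholeslidedata expects one annotation file per slide is also an assumption here.

```python
from collections import defaultdict


def coco_to_wsd(coco):
    """Group COCO polygon annotations per image in the list-of-dicts layout.

    Hypothetical helper: assumes polygon segmentations and reuses the
    COCO category id as the label value.
    """
    names = {c["id"]: c["name"] for c in coco["categories"]}
    files = {im["id"]: im["file_name"] for im in coco["images"]}
    per_image = defaultdict(list)
    for ann in coco["annotations"]:
        segmentation = ann.get("segmentation")
        if not isinstance(segmentation, list):
            continue  # RLE masks are stored as dicts and are skipped here
        out = per_image[files[ann["image_id"]]]
        for flat in segmentation:
            # COCO polygon: [x1, y1, x2, y2, ...] -> [[x1, y1], [x2, y2], ...]
            coords = [[flat[i], flat[i + 1]] for i in range(0, len(flat) - 1, 2)]
            out.append({
                "index": len(out),
                "coordinates": coords,
                "label": {
                    "name": names[ann["category_id"]],
                    "value": ann["category_id"],
                },
            })
    return dict(per_image)


# Tiny made-up COCO dictionary to exercise the function.
coco = {
    "images": [{"id": 1, "file_name": "slide_a.tif"}],
    "categories": [{"id": 1, "name": "tumor"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1,
         "segmentation": [[0, 0, 10, 0, 10, 10]]},
    ],
}
converted = coco_to_wsd(coco)
print(converted["slide_a.tif"])
```

Each value in `converted` is a list that can be dumped with `json.dump` to get one annotation file per slide.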

Best, Frauke