DIAGNijmegen / pathology-whole-slide-data

A package for working with whole-slide data including a fast batch iterator that can be used to train deep learning models.
https://diagnijmegen.github.io/pathology-whole-slide-data/
Apache License 2.0

KeyError: 'classification' in qupath annotation parser #51

Closed Mat-Po closed 6 months ago

Mat-Po commented 6 months ago

Hi all and thanks for this amazing library. 😊 TL;DR: Solution for handling cases in which the "Classification" key is not present for all the annotations in the GeoJSON from QuPath.

My process: I am exploring QuPath for different annotation tasks and I am using this package to process the results. One possible scenario involves identifying a ROI that may contain two different types of cells. The cells can then be detected and measurements computed for them, which are used to train a classifier. The result of this operation is typically an object of type Annotation which contains several hundred or thousand objects of type Cell.


The Annotation usually does not have a "Classification" key, and (to my current understanding) it would not be useful for it to have one, since it contains a heterogeneous mix of cell classes. The Cells, on the other hand, do have a "Classification" key.

I believe that the current implementation of the QuPath annotation parser assumes that the "Classification" key is always present in the GeoJSON file. https://github.com/DIAGNijmegen/pathology-whole-slide-data/blob/97ed381d9e7a32cad391025b48b18856fe335e38/wholeslidedata/interoperability/qupath/parser.py#L13

and raises a KeyError if it does not find it.
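To illustrate the failure, here is a minimal sketch. It assumes the parser reads the key by direct indexing; the `strict_labels` helper and the feature dicts are hypothetical, hand-written stand-ins and not the library's actual code or a real QuPath export.

```python
# Hypothetical, hand-written features loosely mimicking a QuPath GeoJSON export.
# The parent annotation has no "classification"; the cell does.
features = [
    {"type": "Feature", "properties": {"objectType": "annotation"}},
    {
        "type": "Feature",
        "properties": {"objectType": "cell", "classification": {"name": "Tumor"}},
    },
]


def strict_labels(features):
    # Assumed access pattern (not copied from the library): direct indexing,
    # which fails as soon as one feature lacks the "classification" key.
    return {f["properties"]["classification"]["name"] for f in features}


try:
    strict_labels(features)
    missing_key = None
except KeyError as exc:
    # The KeyError carries the name of the missing key.
    missing_key = exc.args[0]

print(missing_key)
```

With the unclassified annotation in the list, the lookup stops at the first feature and reports the missing key.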

Proposed solution: I solved it as follows:

        # Collect the distinct classification names, skipping annotations
        # without a "classification" entry (or without a "name" inside it).
        labels = {
            annotation["properties"]["classification"]["name"]
            for annotation in opened_annotation
            if (annotation.get("properties", {}).get("classification") or {}).get("name")
        }
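The same idea can be written as a small standalone function, which may be easier to test in isolation. This is only a sketch: `collect_labels` is a hypothetical name, and the sample features are made up for illustration rather than taken from a real QuPath file.

```python
def collect_labels(features):
    """Return the set of classification names, skipping unclassified features."""
    labels = set()
    for feature in features:
        # "properties" or "classification" may be missing or null, so fall
        # back to an empty dict at each level before looking up "name".
        properties = feature.get("properties") or {}
        classification = properties.get("classification") or {}
        name = classification.get("name")
        if name:
            labels.add(name)
    return labels


# Made-up sample data: one unclassified parent annotation and three cells.
sample_features = [
    {"properties": {"objectType": "annotation"}},
    {"properties": {"objectType": "cell", "classification": {"name": "Tumor"}}},
    {"properties": {"objectType": "cell", "classification": {"name": "Stroma"}}},
    {"properties": {"objectType": "cell", "classification": None}},
]

print(sorted(collect_labels(sample_features)))
```

Falling back to an empty dict at each level means that features with a missing key and features with an explicit null value are both skipped instead of raising.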

Does this sound reasonable, or am I missing something? If you find it useful, I can open a pull request.

martvanrijthoven commented 6 months ago

Dear Matteo Pozzi,

Thank you for your interest in this package!

The parsing of QuPath GeoJSON files is still a bit experimental, and any improvements or generalisations are very welcome. A PR with your change would be much appreciated!

Best wishes, Mart