KeyError: 'classification' in qupath annotation parser

Hi all and thanks for this amazing library. 😊 TL;DR: Solution for handling cases in which the "Classification" key is not present for all the annotations in the GeoJSON from QuPath.

My process: I am exploring QuPath for different annotation tasks, and I am using this package to process the results. One possible scenario involves identifying a ROI that may contain two different types of cells. The cells can then be detected and measured, and the measurements used to train a classifier. The result of this operation is typically an object of type Annotation, which contains several hundred or thousand objects of type Cell.

The Annotation usually does not have a "Classification" key, and (to my current understanding) it would not be useful for it to have one, since it contains a heterogeneous mix of cell classes. The Cells, on the other hand, do have a "Classification" key.

I believe that the current implementation of the QuPath annotation parser assumes that the "Classification" key is always present in the GeoJSON file (https://github.com/DIAGNijmegen/pathology-whole-slide-data/blob/97ed381d9e7a32cad391025b48b18856fe335e38/wholeslidedata/interoperability/qupath/parser.py#L13) and raises a KeyError if it does not find it.
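To make the failure concrete, here is a minimal sketch that reproduces the same kind of error on made-up data. The `features` list and the `collect_labels_strict` function are hypothetical. The function only assumes that the parser indexes the keys directly; it is not copied from the library.

```python
# Hypothetical minimal data: a parent annotation without a classification
# and a cell with one. Key names mirror the ones discussed in this issue.
features = [
    {"type": "Feature", "properties": {"objectType": "annotation"}},
    {
        "type": "Feature",
        "properties": {"objectType": "cell", "classification": {"name": "Tumor"}},
    },
]


def collect_labels_strict(features):
    # Direct indexing: assumed to resemble the parser's behavior, not copied from it.
    return {f["properties"]["classification"]["name"] for f in features}


try:
    collect_labels_strict(features)
    error = None
except KeyError as exc:
    # The first feature has no "classification" key, so indexing fails here.
    error = exc

print(repr(error))
```

The first feature is the one that triggers the exception, which matches the case of a parent Annotation holding classified Cells.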

Proposed solution: I solved it successfully as follows:

        labels = set(
            [
                annotation.get("properties", {}).get("classification", {}).get("name")
                for annotation in opened_annotation
                if annotation.get("properties", {}).get("classification")
                and annotation.get("properties", {}).get("classification", {}).get("name")
            ]
        )
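The same idea could also be written as a small standalone helper, which may be easier to read and test. This is only a sketch: `collect_labels` and the `sample` data are hypothetical names for illustration and are not part of the library.

```python
def collect_labels(features):
    """Return the set of classification names, skipping unclassified objects."""
    labels = set()
    for feature in features:
        # `or {}` also covers keys that are present but set to None.
        properties = feature.get("properties") or {}
        classification = properties.get("classification") or {}
        name = classification.get("name")
        if name:
            labels.add(name)
    return labels


# Hypothetical data: one unclassified annotation and three classified cells.
sample = [
    {"properties": {"objectType": "annotation"}},
    {"properties": {"objectType": "cell", "classification": {"name": "Tumor"}}},
    {"properties": {"objectType": "cell", "classification": {"name": "Stroma"}}},
    {"properties": {"objectType": "cell", "classification": {"name": "Tumor"}}},
]

labels = collect_labels(sample)
print(sorted(labels))
```

Objects without a classification are skipped rather than given a placeholder label, which is the same behavior as the snippet above.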

Does this sound reasonable, or am I missing something? If you find it useful, I can open a pull request.
