NKI-AI / slidescore-api

Python utilities to interface with SlideScore
Apache License 2.0

Downloading of labels crashes when Answer row is not a dictionary / JSON-like object #25

Open Tkootstra opened 2 years ago

Tkootstra commented 2 years ago

Describe the bug
When downloading labels, every answer is parsed as a JSON-encoded string. However, this assumption does not hold for all annotations / scoring sheets. For example, the scoring sheet of study 1286 (PD-L1) has a question called "Brown Channel Intensity" whose answer is a plain string rather than a JSON-encoded, dictionary-like object.

To Reproduce
Try to download the annotations from SlideScore study 1286 using the CLI.

Expected behavior
The download should handle plain strings, probably by type-checking the answer before parsing it.
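As a rough illustration of the type checking suggested here, the sketch below tries to decode an answer as JSON and falls back to the raw string when decoding fails. The helper name `parse_answer` is hypothetical and not part of slidescore-api; only `json.loads` and `json.JSONDecodeError` from the standard library are used.

```python
import json
from typing import Any


def parse_answer(raw: Any) -> Any:
    """Hypothetical helper: decode `raw` as JSON if possible, else return it unchanged."""
    if not isinstance(raw, str):
        # Already decoded (or missing); nothing to parse.
        return raw
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Plain-text answers such as "Strong" are not valid JSON; keep them as-is.
        return raw


print(parse_answer('{"points": [1, 2]}'))  # decoded into a dict
print(parse_answer("Strong"))              # returned unchanged as a string
```

One caveat with this fallback approach: a text answer that happens to be valid JSON (for example `"3"` or `"true"`) would be decoded into a number or boolean, so checking the question type up front may be the more robust option.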

Environment

Package             Version
certifi             2022.6.15
cffi                1.15.1
charset-normalizer  2.1.0
dlup                0.3.0.dev0
idna                3.3
imageio             2.21.1
networkx            2.8.5
numpy               1.23.1
opencv-python       4.6.0.66
openslide-python    1.2.0
packaging           21.3
Pillow              9.2.0
pip                 22.2.2
pkgconfig           1.5.5
pycparser           2.21
pyparsing           3.0.9
pyvips              2.2.1
PyWavelets          1.3.0
PyYAML              6.0
requests            2.28.1
scikit-image        0.19.3
scipy               1.9.0
setuptools          56.0.0
shapely             2.0a1+2.g97df157
slidescore-api      0.1.0
tifffile            2022.8.8
tifftools           1.3.5
tqdm                4.64.0
urllib3             1.26.11

Additional context
Stacktrace:

Traceback (most recent call last):
  File "/homes/tkootstra/datasets_manifests/create_versioned_dataset.py", line 154, in <module>
    download_labels(slidescore_url="https://slidescore.nki.nl/",
  File "/homes/tkootstra/venvs/datasets_manifests/lib/python3.8/site-packages/slidescore_api/cli.py", line 310, in download_labels
    for curr_annotation in annotation_parser.from_iterable(
  File "/homes/tkootstra/venvs/datasets_manifests/lib/python3.8/site-packages/slidescore_api/utils/annotations.py", line 433, in from_iterable
    answers = json.loads(_row["Answer"])
  File "/homes/tkootstra/miniconda3/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/homes/tkootstra/miniconda3/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/homes/tkootstra/miniconda3/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

AjeyPaiK commented 2 years ago

Is this bug still present in slidescore-api, @jonasteuwen and @Tkootstra?

Tkootstra commented 2 years ago

Currently, the download labels call has no proper way of filtering bulk questions by type (JSON or text question; the types can be requested with client.get_questions()). IMO, this should still be added to the API before you close this issue.
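A minimal sketch of what such filtering might look like, under stated assumptions: the set of JSON-type question names is taken as given (for instance, derived from whatever client.get_questions() returns, whose format is not shown in this thread), and each row is assumed to carry a "Question" key next to the "Answer" key seen in the stacktrace. The function name `iter_parsed_rows` is hypothetical and not part of slidescore-api.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Set


def iter_parsed_rows(
    rows: Iterable[Dict[str, Any]],
    json_questions: Set[str],
) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper: decode "Answer" only for questions known to hold JSON."""
    for row in rows:
        answer = row["Answer"]
        # Assumption: rows name their question under a "Question" key.
        if row["Question"] in json_questions:
            answer = json.loads(answer)
        # Yield a copy so the caller's original rows are left untouched.
        yield {**row, "Answer": answer}


# Example rows with made-up content, for illustration only.
rows = [
    {"Question": "Annotate tumor", "Answer": '[{"x": 1, "y": 2}]'},
    {"Question": "Brown Channel Intensity", "Answer": "Strong"},
]
for parsed in iter_parsed_rows(rows, json_questions={"Annotate tumor"}):
    print(parsed["Question"], type(parsed["Answer"]).__name__)
```

Deciding by question type rather than by trial decoding keeps text answers as strings even when they happen to look like JSON.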