Project-MONAI / tutorials

MONAI Tutorials
https://monai.io/started.html
Apache License 2.0

using collection names with spaces in them breaks tcia_dataset.ipynb #876

Open kirbyju opened 2 years ago

kirbyju commented 2 years ago

Describe the bug

In the "Create TciaDataset" step of the notebook, entering a collection name that contains spaces (e.g. "Lung Phantom") in the collection variable produces an error. Here is the relevant part of the code:

# Let's take the "QIN-PROSTATE-Repeatability" collection for example

collection, seg_type = "QIN-PROSTATE-Repeatability", "SEG"

Here is the error:

KeyError Traceback (most recent call last) in
      5 transform = Compose(
      6     [
----> 7         LoadImaged(reader="PydicomReader", keys=["image", "seg"], label_dict=TCIA_LABEL_DICT[collection]),
      8     ]
      9 )

KeyError: 'Lung Phantom'

Environment (please complete the following information):

I was working on Google Colab.

wyli commented 2 years ago

Could you please have a look, @yiheng-wang-nv?

yiheng-wang-nv commented 2 years ago

Hi @kirbyju, this is not a "names with spaces" issue. First, Lung Phantom does not seem to be a collection name: I searched https://www.cancerimagingarchive.net/collections/ and found that the corresponding collection name is Phantom FDA. In addition, the tutorial and the docstrings of the source code (monai.apps.tcia.TciaDataset) mention that, so far, only the SEG and RTSTRUCT image types are supported, whereas Phantom FDA has the type "CT". Therefore, this collection may not be supported. Thanks!
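The traceback points at the dictionary lookup TCIA_LABEL_DICT[collection], so one way to see that spaces are not the cause is to look at how a plain dict lookup behaves. The sketch below is only an illustration: LABEL_DICTS is a hypothetical stand-in for TCIA_LABEL_DICT, and its collection entries and label values are placeholders, not MONAI's real contents.

```python
# Hypothetical stand-in for monai.apps.tcia.TCIA_LABEL_DICT. The keys and the
# label names/indices are placeholders, not the values that ship with MONAI.
LABEL_DICTS = {
    "QIN-PROSTATE-Repeatability": {"placeholder-structure": 0},
    "Name With Spaces": {"placeholder-structure": 0},
}


def lookup_label_dict(collection, label_dicts=LABEL_DICTS):
    """Return the prepared label dict for a collection, or None if absent.

    Using .get() avoids the KeyError that label_dicts[collection] raises
    when the collection has no entry.
    """
    return label_dicts.get(collection)


def describe(collection, label_dicts=LABEL_DICTS):
    """Report whether a collection has a prepared label dict."""
    if lookup_label_dict(collection, label_dicts) is None:
        known = ", ".join(sorted(label_dicts))
        return f"no label dict for {collection!r}; known collections: {known}"
    return f"label dict found for {collection!r}"


# A key containing spaces is found as long as it exists in the mapping.
print(describe("Name With Spaces"))
# A key that is missing from the mapping is reported instead of raising.
print(describe("Lung Phantom"))
```

In this sketch, a None result means the collection has no prepared entry, which is a different condition from the name containing spaces.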

kirbyju commented 2 years ago

Hi @yiheng-wang-nv, Phantom FDA is a collection, but Lung Phantom is also one. It contains many segmentations (SEG) of a single CT scan. This API call will show you a full inventory of the scans in this collection: https://services.cancerimagingarchive.net/nbia-api/services/v1/getSeries?Collection=Lung%20Phantom
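The getSeries URL above encodes the space in the collection name as %20. As a minimal sketch of how such a URL could be built from a collection name, the snippet below uses only Python's standard library; the helper name build_series_url is hypothetical, and the snippet builds the string without sending any request.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URL quoted in this thread.
GET_SERIES_ENDPOINT = (
    "https://services.cancerimagingarchive.net/nbia-api/services/v1/getSeries"
)


def build_series_url(collection):
    """Build a getSeries URL for a collection name (hypothetical helper).

    quote_via=quote percent-encodes spaces as %20; urlencode's default
    (quote_plus) would encode them as '+' instead.
    """
    query = urlencode({"Collection": collection}, quote_via=quote)
    return f"{GET_SERIES_ENDPOINT}?{query}"


print(build_series_url("Lung Phantom"))
```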