oncoray / mirp

Medical Image Radiomics Processor
https://oncoray.github.io/mirp/
European Union Public License 1.2
38 stars 11 forks source link

JOSS Review: Functionality Comments #81

Closed surajpaib closed 3 weeks ago

surajpaib commented 2 months ago

Hi @alexzwanenburg. Thank you for updating the tutorial, this allows testing real world functionality much better.

While testing out various functionality mentioned in the docs and the software report, I've made the following observations.

  1. I notice that several parameters have been provided as kwargs for the first tutorial. Especially in the extract_features function call.

    features = extract_features(
        image=os.path.join(save_dir, "sts_images"),
        image_sub_folder=os.path.join("mr_t1", "image"),
        mask=os.path.join(save_dir, "sts_images"),
        mask_sub_folder=os.path.join("mr_t1", "mask"),
        roi_name="GTV_Mass",
        by_slice=True,
        intensity_normalisation="standardisation",
        new_spacing=1.0,
        base_discretisation_method="fixed_bin_number",
        base_discretisation_n_bins=16
    )
    

    It would be useful to link the tutorial section that mentions the overriding to the https://oncoray.github.io/mirp/configuration.html section so that the user has an idea of how and what parameters can be overridden.

  2. Given the large number of parameters, it seems best to manage these with a configuration file which I see is provided via the means of an XML file. However, several entries don't correspond with their names in the python settings classes. Take the case of vol_adapt and ImagePerturbationSettingsClass that seem to map to each other. I recommend these be consistent for the best user experience.

  3. I see that a lot of parameters are dynamic keyword arguments. For instance, image parameter of extract_features. Because this is not defined explicitly in the function signature, I cannot find this in the help(extract_features) check which would make it very difficult for the user to understand what is expected in these arguments.

    I also do not see this in https://oncoray.github.io/mirp/quantitative_image_analysis.html#mirp.extract_features_and_images.extract_features but I see that you have mentioned other keyword arguments here: https://oncoray.github.io/mirp/quantitative_image_analysis.html#mirp.extract_features_and_images.extract_features_and_images_generator. In this one the import_image_and_mask() hyperlink seems to be broken.

    I would make these dynamic keyword arguments very clear in each function call that uses them so as to avoid hidden/suprising behaviour for the end user.

  4. The default argument for image_export_format seems to be a dict. Is there a reason this is preferred over the native type? As a user, I would expect the default behaviour to be native especially given that there is a concrete use-case for this type as seen in the tutorial. Are the users somehow expected to interact with the dict type in their standard workflows?

  5. For the extract_mask_labels, I've tested this with different formats (DICOM-SEG & RTSTRUCT) and it looks great. One suggestion I would have is to also show the mask_modality in the table.

  6. From a data perspective, I wanted to point out that I'm unable to process (with extract_images func) certain DICOM objects available on the Imaging Data Commons which I am able to parse with pydicom_seg.

    Using the code below,

    # Import the necessary libraries
    from idc_index import index
    from pathlib import Path
    
    # Create an IDCClient object
    client = index.IDCClient()
    
    Path("./data").mkdir(exist_ok=True)
    
    client.download_from_selection(
        collection_id="nsclc_radiomics",
        patientId="LUNG1-001",
        downloadDir="./data",
    )

    And then using dicomsort, I get several DICOM-SEG objects, one of them is attached to this issue. It gives me a ValueError: negative dimensions are not allowed error. It would also be useful if this error pointed out the exact file that causes it (when a dir is provided).
    42_2d-tta_nnU-Net_Segmentation-1.dcm.zip

    For the nsclc_radiogenomics cohort fetched as,

    # Import the necessary libraries
    from idc_index import index
    from pathlib import Path
    
    # Create an IDCClient object
    client = index.IDCClient()
    
    Path("./data").mkdir(exist_ok=True)
    
    client.download_from_selection(
        collection_id="nsclc_radiogenomics",
        patientId="AMC-001",
        downloadDir="./data",
    )

    when passing the data directory to extract_images func, Certain PT modality files throw this error: NotImplementedError: Conversion factor for converting PROPCPS to BQML is not implemented

Overall, the package functionality works seamlessly and is easy to use and interpret for the user.

Thank you for your great work with this package and the ease with which DICOM files are parsed and processed.

surajpaib commented 2 months ago

https://github.com/openjournals/joss-reviews/issues/6413

alexzwanenburg commented 2 months ago

Thanks for assessing MIRP's functionality.

I have created a task list so that I can keep track of the various things I need to look into:

Concerning some of the points:

The default argument for image_export_format seems to be a dict. Is there a reason this is preferred over the native type? As a user, I would expect the default behaviour to be native especially given that there is a concrete use-case for this type as seen in the tutorial. Are the users somehow expected to interact with the dict type in their standard workflows?

Yes, extract_images fulfils two main roles:

  1. Help read and process images as part of a deep learning workflow (e.g. within a DataLoader object.). The dict export format provides easy access to the processed image and its sample name.
  2. Visually inspect processed images, as shown in the current tutorials. Visual inspection is not a strict requirement in radiomics studies, hence the default value is dict and not native. I still recommend that people inspect their imaging, and hence included it in the tutorials.

Certain PT modality files throw this error: NotImplementedError: Conversion factor for converting PROPCPS to BQML is not implemented

MIRP tries to convert PET images to SUV by default, which requires a conversion from whatever PET units are present to BQML first. I will look into this particular case to see if there are metadata available to go from "proportional to counts per second" to BQML. Generally, I will try to make the error a bit more clear, because its only an issue when SUV conversion is required.

alexzwanenburg commented 2 months ago

Additional stuff that might be useful:

This may help stage specific images and masks for studies.

alexzwanenburg commented 2 months ago
  1. From a data perspective, I wanted to point out that I'm unable to process (with extract_images func) certain DICOM objects available on the Imaging Data Commons which I am able to parse with pydicom_seg. And then using dicomsort, I get several DICOM-SEG objects, one of them is attached to this issue. It gives me a ValueError: negative dimensions are not allowed error. It would also be useful if this error pointed out the exact file that causes it (when a dir is provided).

I was not able to reproduce the issue. I added a test, not only did the SEG file parse without issue, but the mask also exactly matches that of the RTSTRUCT original: https://github.com/oncoray/mirp/blob/834fce2900f9772764d6a1801b98eba04495864e/test/dicom_seg_test.py

alexzwanenburg commented 2 months ago

Reopened for JOSS review.