MHub / GC - Add PICAI baseline model/algorithm

silvandeleemput commented 1 year ago

This PR adds the PICAI baseline model (from the PI-CAI challenge) to MHub. GitHub Repo: https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm GC page: https://grand-challenge.org/algorithms/pi-cai-baseline-nnu-net-semi-supervised/

Algorithm I/O

Input:
- Transverse T2 MRI of the Prostate (DICOM)
- Transverse High B-Value Prostate MRI of the Prostate (DICOM)
- Transverse Apparent Diffusion Coefficient Prostate MRI of the Prostate (DICOM)
Output:
- Case-level prostate cancer probability value (JSON report)
- Transverse cancer detection map of the prostate (MHA)
Test images for this algorithm can be downloaded from here:
- example input/output: https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm/tree/master/test/ (MHA)
- open-source DICOM data sets on IDC:
  - Prostate-MRI-US-Biopsy
  - ProstateX
  - TCGA-PRAD

Caveats

~The PR target is main, but should be something like m-gc-picai-baseline~
~In the Dockerfile the MHub model repo integration is currently marked TODO since it requires integration of this PR in the main branch~
A Mha pipeline config file was added for testing with MHA data directly.

silvandeleemput commented 11 months ago

Update

This model currently works for DICOM, but the input comes from three different MRI acquisitions. For now, the pipeline is configured using a FileStructureImporter and expects the data in three different folders. This will require some data preparation on the part of the user. However, I think this should be preferred over the alternative of trying to read the DICOM metadata, since the SeriesDescription tag (required for distinguishing the different types of MRI scans) varies per study and often even per patient for all relevant IDC cases.

fedorov commented 11 months ago

Yes, I agree with @silvandeleemput on this one. In the general case, it may be difficult to have a reliable way to identify individual acquisitions for prostate MRI, and this should be done prior to processing with mhub. It makes sense to support this for passing inputs to mhub models.

fedorov commented 11 months ago

@silvandeleemput can you please point to the specific sample from IDC that can be used to confirm the function of this model?

silvandeleemput commented 11 months ago

@silvandeleemput can you please point to the specific sample from IDC that can be used to confirm the function of this model?

Yes, I will prepare and test an example of this next week on Monday. I will do this after I change the interface. It will probably be an example from one of these IDC datasets:

Prostate-MRI-US-Biopsy
ProstateX
TCGA-PRAD

silvandeleemput commented 11 months ago

@silvandeleemput can you please point to the specific sample from IDC that can be used to confirm the function of this model?

You can use any case from the Prostate-MRI-US-Biopsy images:

The expected input structure is:

input_directory
└── case1
├── adc
│   ├── <adc_dicom_files.dcm>
│   └── ...
├── hbv
│   ├── <hbv_dicom_files.dcm>
│   └── ...
└── t2w
    ├── <t2w_dicom_files.dcm>
    └── ...

The corresponding DICOM data inside the Prostate-MRI-US-Biopsy can be identified using the SeriesDescription (0008,103e) tag and look something like this (can vary per case):

* adc: ep2d-advdiff-3Scan-4bval_spair_std_ADC
* hbv: ep2d-advdiff-3Scan-4bval_spair_std_CALC_BVAL
* t2w: t2_spc_rst_axial obl_Prostate

fedorov commented 11 months ago

Can you please test with a specific sample from that collection, and once you confirm it works as expected, point us to the sample that should be used for testing.

silvandeleemput commented 11 months ago

Can you please test with a specific sample from that collection, and once you confirm it works as expected, point us to the sample that should be used for testing.

I did, and it works as expected:

DICOM case from Prostate-MRI-US-Biopsy: 
Prostate-MRI-US-Biopsy-0002
Output:
ProstateCancerLikelihood 0.0171966552734375

LennyN95 commented 11 months ago

@silvandeleemput I now deployed automation checks to all PR's (they will trigger once a new commit is added or we can close/open the PR). The checks currently fail as Dockerfiles now must have the FROM mhubai/base:latest command as the very first line.

In addition, I created a new build-utility which simplifies the sparse checkout.

I suggest adding the following lines to all Dockerfiles, replacing the former sparse checkout for MHub models:

# Import the MHub model definiton
ARG MHUB_MODELS_REPO
RUN buildutils/import_mhub_model.sh gc_picai_baseline ${MHUB_MODELS_REPO}

Where gc_picai_baseline is the name of the model. The script takes a repository as the second argument. If you specify an ARG as in the example, you can now simply specify the fork repository and the branch for local tests at build time. The script will pull from the fork instead of our repository, but that should be identical to what is pushed through the PR, so it's effectively the same. So no more need to keep some copy commands in the Dockerfile for testing or manual mounting of the models folder :)

E.g., during build you command would look something like this (note the repo is passed as $repo::$branch to override the MHub default models repository): docker build --build-arg MHUB_MODELS_REPO="https://github.com/DIAGNijmegen/MHubAI_models::m-gc-picai-baseline" -t dev/gc_picai_baseline .

fedorov commented 11 months ago

GC page: https://grand-challenge.org/algorithms/pi-cai-baseline-nnu-net-semi-supervised/

I do not see this link included in the model metadata. I think it would be very valuable to have it included somewhere - this would establish direct connection between GC and MHub.

fedorov commented 11 months ago

@silvandeleemput do you know or can you ask - how is the repo you use for this model https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm related to the earlier (published, but now archived) repo here: https://github.com/DIAGNijmegen/prostateMR_3D-CAD-csPCa.

Also, can you tag Henkjan on this issue? He mentioned he uses GitHub - he might be a better person to ask this question.

silvandeleemput commented 11 months ago

@silvandeleemput I now deployed automation checks to all PR's (they will trigger once a new commit is added or we can close/open the PR). The checks currently fail as Dockerfiles now must have the FROM mhubai/base:latest command as the very first line.

That's great, it will be a great help!

In addition, I created a new build-utility which simplifies the sparse checkout.

I suggest adding the following lines to all Dockerfiles, replacing the former sparse checkout for MHub models:

This seems like a nice feature to have, but I find it troublesome that this feature comes after I already have iterated on the established quality checklist from two months ago without being aware that such a feature was in the make. I suggest you replace these parts while you test and integrate the models.

silvandeleemput commented 11 months ago

GC page: https://grand-challenge.org/algorithms/pi-cai-baseline-nnu-net-semi-supervised/

I do not see this link included in the model metadata. I think it would be very valuable to have it included somewhere - this would establish direct connection between GC and MHub.

Agreed, I'll add it.

silvandeleemput commented 11 months ago

@silvandeleemput I now deployed automation checks to all PR's (they will trigger once a new commit is added or we can close/open the PR). The checks currently fail as Dockerfiles now must have the FROM mhubai/base:latest command as the very first line.

FROM mhubai/base:latest Is still the first uncommented line. Are comments not allowed above the FROM statement? You may delete the comment if you want.

silvandeleemput commented 11 months ago

@silvandeleemput do you know or can you ask - how is the repo you use for this model https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm related to the earlier (published, but now archived) repo here: https://github.com/DIAGNijmegen/prostateMR_3D-CAD-csPCa.

Both algorithms are prostate cancer detection algorithms, but they are different:

https://github.com/DIAGNijmegen/prostateMR_3D-CAD-csPCa is the 3D U-Net architecture Anindo Saha developed for this paper. It's a U-Net with attention mechanisms.

https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm is one of the PI-CAI baseline algorithms, which used semi-supervised learning. It's a mostly standard nnU-Net with a different loss function. It includes the trained model weights on the PI-CAI Public Training and Development dataset.

LennyN95 commented 11 months ago

This seems like a nice feature to have, but I find it troublesome that this feature comes after I already have iterated on the established quality checklist from two months ago without being aware that such a feature was in the make. I suggest you replace these parts while you test and integrate the models.

MHub is making progress. This feature was only recently added to solve an important problem (importing from a non-existent source). As this improves readability and drastically simplifies testing of unpublished models, it is required for all models.

Let's stay reasonable, this change is minor and only adds two lines to each Dockerfile. If this is a huge workload for you, please bring this up at our next meeting so we can find a solution.

FROM mhubai/base:latest Is still the first uncommented line. Are comments not allowed above the FROM statement? You may delete the comment if you want.

Correct, you can find that this is consistent with our documentation from two months ago: [...] the first line of your Dockerfile must be: FROM mhubai/base:latest [...] here. Unfortunately I've no control over the contributed code, I'll leave this to you ;)

silvandeleemput commented 10 months ago

Let's stay reasonable, this change is minor and only adds two lines to each Dockerfile. If this is a huge workload for you, please bring this up at our next meeting so we can find a solution.

It is indeed not that much work and I understand the nature of software in development, but don't forget that the work for each of your requests gets multiplied by ten for each open model. Also, the contract ended quite a while ago, and I have been lenient enough to provide aftercare for the open model/algorithms. For this, I asked for a quality checklist two months ago and delivered on that promise. At some point, I cannot reasonably be expected to keep adding new features unless we come to some form of agreement.

FROM mhubai/base:latest Is still the first uncommented line. Are comments not allowed above the FROM statement? You may delete the comment if you want.

Correct, you can find that this is consistent with our documentation from two months ago: [...] the first line of your Dockerfile must be: FROM mhubai/base:latest [...] here. Unfortunately I've no control over the contributed code, I'll leave this to you ;)

I am fine with adding this and the previously mentioned features. But as I mentioned above we have to draw a boundary somewhere.

silvandeleemput commented 10 months ago

I updated the Dockerfile, and the checks have passed you should be able to continue testing.

LennyN95 commented 10 months ago

but don't forget that the work for each of your requests gets multiplied by ten for each open model

I did not. But at the same time, it took me no more than 5 minutes to update all the eight models we currently have. Considering that I wrote a detailed description of how to update the Dockerfile, it's a simple "replace this by that" task, that makes the testing of all the remaining 9 models from the original contract drastically faster and easier, I found it reasonable.

Also, the contract ended quite a while ago, and I have been lenient enough to provide aftercare for the open model/algorithms

Wasn't integrating the models into MHub part of the contract? We're at a 0.5/10 here. We should also not forget that the list of models was updated in between without our knowledge ;)

But as I mentioned above we have to draw a boundary somewhere.

If this is an actual problem on your side, I suggest we discuss this privately in our next meeting with the NIH. I'll bring it up.

miriam-groeneveld commented 10 months ago

I'll set up a meeting, indeed better to discuss in person.

silvandeleemput commented 10 months ago

As discussed, let's make finalizing this algorithm/model the priority. @LennyN95 would it be a good idea to create and add an additional priority GitHub label for this PR?

We will attempt to involve the original author of the model/algorithm to help update the model card data as accurately as possible, with the comments above in mind. We expect to roll out an update for this on Thursday.

As a technical note, the DICOM data on IDC for the input (t2w, hbv, adc) can be quite heterogeneous. For the proposed test dataset (Prostate-MRI-US-Biopsy) it works fine as all data are represented as 3D volumes. However, for other datasets the HBV input has to be derived from 4D data, which likely requires some additional processing. We will investigate if there is a standardized solution for this.

LennyN95 commented 10 months ago

We will attempt to involve the original author of the model/algorithm to help update the model card data as accurately as possible, with the comments above in mind. We expect to roll out an update for this on Thursday.

Sounds great!

As a technical note, the DICOM data on IDC for the input (t2w, hbv, adc) can be quite heterogeneous. For the proposed test dataset (Prostate-MRI-US-Biopsy) it works fine as all data are represented as 3D volumes. However, for other datasets the HBV input has to be derived from 4D data, which likely requires some additional processing. We will investigate if there is a standardized solution for this.

@fedorov might be able to help with this ;)

LennyN95 commented 10 months ago

Thank you @joeranbosma for your support! If you agree, we can have you added as a 2nd co-contributor of the MHub model.

joeranbosma commented 10 months ago

@LennyN95 you're welcome! Yeah, that would be great!

joeranbosma commented 10 months ago

@silvandeleemput For a biparametric MRI scan, multiple diffusion scans are acquired with different field strengths (b values) and stored as individual 3D volumes. Depending on the archive, this can be in individual DICOM folders, all DICOM files in a single folder, or the different b value measurements stored as vectors per voxel (instead of the typical one value / greyscale value). Often a low b value is acquired (e.g., 0 or 50) and a mid-high b value (e.g. 800 or 1000), and depending on the center also a mid range b value (e.g., 400 or 500), and sometimes also a high b value (e.g. 1400 or 2000). The acquired high b value diffusion scan is typically very noisy, so most centers calculate this instead of a direct acquisition. To calculate the high b value scan, different vendors have different methods. The most common method is a mono-exponential fit, often with some outlier / physically implausible value removal. Depending on the center, different acquired b value scans can be included in this calculation. A typical choice is e.g. using the low and mid-high b value scans (e.g. 50 and 800).

In the PI-CAI challenge we only included high b value scans calculated by the vendor, to reflect the scans used during clinical routine. This was most often a calculated b=1400 map, but also included e.g. acquired high b value scans.

The mono-exponential model is very straightforward to implement when using two diffusion scans, which I did as a sanity check for an earlier project. If this is useful, I can share this code.

LennyN95 commented 10 months ago

@joeranbosma I came up with another question (sorry if I overlooked it somewhere, just link it in that case): Are you performing any resampling inside your pipeline / are all three input files expected to share the same spacing / it doesn't matter for your specific algorithm (if so, why)?

silvandeleemput commented 10 months ago

Update

Miriam and I, with big help from @joeranbosma, have addressed the open issues regarding the meta.json file and made some of the requested changes to the meta.json file. These changes are now ready to be evaluated.

Some open issues

We would like to add a link to the associated grand-challenge algorithm page: i.e. https://grand-challenge.org/algorithms/pi-cai-baseline-nnu-net-semi-supervised/ . What would be the best place for this in the meta.json file?
Regarding the high b-value input (HBV) what would be the preferred approach here? We could just mention that it is required to provide either a calculated or an acquired scan and leave it up to the user to provide one. This might be the best approach as an automated approach involving the mono-exponential model might not work for all inputs and could be quite tricky to get right for the rest.

joeranbosma commented 10 months ago

Are you performing any resampling inside your pipeline / are all three input files expected to share the same spacing / it doesn't matter for your specific algorithm (if so, why)?

Yes, the algorithm will perform resampling inside the pipeline. This is handled by the nnU-Net framework, which results in all scans to be resampled to 0.5 x 0.5 x 3.0 mm (which is the median spacing in the training dataset, the choice nnU-Net makes for anisotropic datasets like prostate biparametric MRI scans). Even though the image is internally resampled, the detection map (i.e., model prediction at voxel-level) is at the same spatial resolution and physical dimensions as the input axial T2-weighted image. This is achieved by resampling the heatmap (so before extraction of lesion candidates to form the detection map) back to the original T2-weighted physical image space and resolution.

This wasn't specified here yet, only subtly on the GC page: While the input images (T2W, HBV, ADC) can be of different spatial resolutions, the algorithm assumes that they are co-registered or aligned reasonably well and that the prostate gland is localized within a volume of 460 cm³ from the centre coordinate.

LennyN95 commented 10 months ago

Thx for the swift reply @joeranbosma 👍

This wasn't specified here yet, only subtly on the GC page: While the input images (T2W, HBV, ADC) can be of different spatial resolutions, the algorithm assumes that they are co-registered or aligned reasonably well and that the prostate gland is localized within a volume of 460 cm³ from the centre coordinate.

@silvandeleemput if we could clarify this a bit in the Model Card (what is pre-requirements and what is done by the pipeline).

This is handled by the nnU-Net framework, which results in all scans to be resampled to 0.5 x 0.5 x 3.0 mm (which is the median spacing in the training dataset, the choice nnU-Net makes for anisotropic datasets like prostate biparametric MRI scans)

Then 0.5 x 0.5 x 3.0 mm can be set as spacing in the meta.json.

We would like to add a link to the associated grand-challenge algorithm page: i.e. https://grand-challenge.org/algorithms/pi-cai-baseline-nnu-net-semi-supervised/ . What would be the best place for this in the meta.json file?

You could just reference it under intended use for now (as this acts somewhat as a long description). Long-term we might add a general "links" section to the Model Card but for now I'd like to keep the number of optional fields low as the model card is already pretty huge.

LennyN95 commented 10 months ago

NOTE: Test not reproducible with the provided information.

Test Instance / Expected Result:

CASE ID: Prostate-MRI-US-Biopsy-0002
ProstateCancerLikelihood 0.0171966552734375

Since ADC / HBV could not be assigned based on the Series Description (N/A), we tried both combinations:

Bildschirmfoto 2023-12-01 um 09 59 28

Bildschirmfoto 2023-12-01 um 09 59 19

T2:  1.3.6.1.4.1.14519.5.2.1.186749128823666050588710146976747922803
ADC: 1.3.6.1.4.1.14519.5.2.1.9782151183163602689533965679912348574
HBV: 1.3.6.1.4.1.14519.5.2.1.200776032537717955457115457169439698734

ProstateCancerLikelihood 0.066162109375

T2:  1.3.6.1.4.1.14519.5.2.1.186749128823666050588710146976747922803
ADC: 1.3.6.1.4.1.14519.5.2.1.200776032537717955457115457169439698734
HBV: 1.3.6.1.4.1.14519.5.2.1.9782151183163602689533965679912348574

ProstateCancerLikelihood 0.083984375

Neither matches the expected result. Test result: not reproducible.

joeranbosma commented 10 months ago

Then 0.5 x 0.5 x 3.0 mm can be set as spacing in the meta.json.

Just to clarify: this is only an internal spacing used by the nnU-Net framework. The user can provide any input spacing and will receive that spacing back from the algorithm.

LennyN95 commented 10 months ago

Just to clarify: this is only an internal spacing used by the nnU-Net framework. The user can provide any input spacing and will receive that spacing back from the algorithm.

Thank you for the clarification, that's in line with my idea. To add some context: We specify it such that the users know what the internal resolution is (e.g., providing data of higher resolution like .5x.5x1 into a segmentation will still generate (upsampled) delineations of the internal resolution). I see your point that this might not be clear to the reader as of now. However, we generally expect all MHub implementations to handle the resampling internally (either as part of the AI pipeline or within the MHub workflow).

silvandeleemput commented 10 months ago

NOTE: Test not reproducible with the provided information.

Test Instance / Expected Result:
CASE ID: Prostate-MRI-US-Biopsy-0002
ProstateCancerLikelihood 0.0171966552734375
Since ADC / HBV could not be assigned based on the Series Description (N/A), we tried both combinations:
T2:  1.3.6.1.4.1.14519.5.2.1.186749128823666050588710146976747922803
ADC: 1.3.6.1.4.1.14519.5.2.1.9782151183163602689533965679912348574
HBV: 1.3.6.1.4.1.14519.5.2.1.200776032537717955457115457169439698734

ProstateCancerLikelihood 0.066162109375
T2:  1.3.6.1.4.1.14519.5.2.1.186749128823666050588710146976747922803
ADC: 1.3.6.1.4.1.14519.5.2.1.200776032537717955457115457169439698734
HBV: 1.3.6.1.4.1.14519.5.2.1.9782151183163602689533965679912348574

ProstateCancerLikelihood 0.083984375
Neither matches the expected result. Test result: not reproducible.

That don't seem to be the correct cases for the ADC and HBV, I got the following SeriesInstanceUIDs:

T2: 1.3.6.1.4.1.14519.5.2.1.186749128823666050588710146976747922803 ADC: 1.3.6.1.4.1.14519.5.2.1.37233816211175376152145472039799247527 HBV: 1.3.6.1.4.1.14519.5.2.1.189210247425913540085870724026007775300

To get the data I used the NBIA Data Retriever and the manifest from here:

Note that the cases that you used for the ADC and HBV belong to different studies for the Prostate-MRI-US-Biopsy-0002 patient ID, the images should all belong to Study Instance UID : 1.3.6.1.4.1.14519.5.2.1.146363230255268207566333837392591620097

I don't know why the other two SeriesInstanceUIDs aren't listed on the IDC webpage.

fedorov commented 10 months ago

@silvandeleemput NBIA Data Retriever is a great tool to download data from TCIA. IDC is not TCIA. NBIA Data Retriever does not download from TCIA. The content of IDC may be different from TCIA, since updates to the files in TCIA are not propagated in real-time to IDC.

To download from IDC, we use s5cmd, which you can install following these instructions: https://github.com/peak/s5cmd#installation. Once this tool is installed, you can define the manifest and download from IDC following the steps in the quick video below.

2023-12-01_17-18-57

You can also install SlicerIDCBrowser extension, and download without having to use the terminal.

https://github.com/ImagingDataCommons/SlicerIDCBrowser

MHub-GC collaboration is about integration of MHub and GC with IDC. Not about integration with TCIA. In spirit of this collaboration, I encourage you to learn at least a little bit about IDC @silvandeleemput.

silvandeleemput commented 10 months ago

Update

I have updated the meta.json to address the open issues. I think it should be practically complete by now.

The inputs and outputs fields have been updated
- slicethickness fields has been set to "0.5 x 0.5 x 3.0 mm". Is this alright? it seems more to indicate spacing a.t.m.
the intended use section has been updated
- the GC algorithm reference is under intended use
- some of the requirements and assumptions on the input data have been added/updated.

@LennyN95 Please have a look, especially at the intended use section, i.e. if it is sufficiently clear how to present the input data.

LennyN95 commented 10 months ago

Thx for the patch @joeranbosma, the nnUNet output is now captured correctly.

However, we still get some outputs that need to be captured (see screenshot). Bildschirmfoto 2023-12-07 um 13 39 17 These are caused by print statements inside __init__.py files, executed during import. However, we only capture print-outs inside of the Method.task() function (as this is the only place we're expecting code execution inside MHub). One solution would be moving these import statements (nnUnet, picai_prep) inside the task method. However, that greatly impacts transparency, so the CLI entrypoint (as discussed earlier) is our best option.

NOTE: I acknowledge the need (or requirement) for some disclaimer/licensing print outs at run-time of any MHub algorithm. However, the print-out should be controlled by MHub ( we could e.g. fetch that from the meta.json). @silvandeleemput , do you a separate field would be required for that or would it be sufficient to use the cite field from the meta.json?

Once that's fixed, all we need is the test-data (are we waiting for the next IDC release here?) to enter the final phase ;)

silvandeleemput commented 10 months ago

@silvandeleemput , do you a separate field would be required for that or would it be sufficient to use the cite field from the meta.json?

I think the cite field should be sufficient for most models. But I think most developers/users would look at the model cards if they want to know such information anyway, so I don't know if it is really necessary.

LennyN95 commented 10 months ago

@silvandeleemput Can you also provide us with three images to be shown on the algorithm page?

miriam-groeneveld commented 10 months ago

@silvandeleemput Can you also provide us with three images to be shown on the algorithm page?

Providing images is a bit difficult; the created output image is not a true heatmap, but rather a binarized map of most-confident lesions (see https://pypi.org/project/report-guided-annotation/0.2.7/). Resulting in something like this: .

The images from our viewer are not the nicest (as you can tell from the screenshot above) for this particular image type, so creating nicer images is something we would prefer not to spend time on at the moment. Unless you find images like the one above acceptable. Also because the main output of the model is the prostate cancer likelihood. Two things we noticed on the MHub model:

The Inputs & outputs lists a single segmentation output. As stated above, and described in the meta.json, the image output is not a segmentation.
The Prostate Cancer Likelihood is not shown as an output.

LennyN95 commented 10 months ago

Thank you for the feedback @miriam-groeneveld. The input / output are indeed static atm (I've an dynamic implementation prepared that I'll push soon. This will deserve some major update to a) display input and output files of the default pipeline b) allow the same for alternative workflows c) allow insights into the available data such as a likelihood value. But that's for the future).

For the images, I know that non-segmentation models have a disadvantage here. However, I still think images are very valuable not only to give a good first impression and to make the model page look less technical but also to memorize model pages. Therefore, if you have some other images, maybe one you designed for the challenge, that'd be an option too. I'd also include the one you suggested. Maybe just add a red number annotation to the label (e.g., 0.74 on the green area) to indicate the nature of the area-based-probability-scores. The quality is indeed not the best but it gives the user an idea of what to expect from the model.

I made up the following for now (please let me know if you do NOT agree with the usage of any of these images): Bildschirmfoto 2023-12-14 um 15 46 36

fedorov commented 10 months ago

What is the difference between "binarized map" and "segmentation"? How is "binarized map" not a segmentation?

I understand that you do not assign 100% confidence, but it does not change the semantics that the result of the model is binary segmentation of the most probable region.

Also, if there is probability threshold used for binarization, it makes sense to provide it as output of the model. Is it provided?

joeranbosma commented 10 months ago

The model outputs a detection map, which is described here: a 3D detection map of non-overlapping, non-connected csPCa lesions (with the same dimensions and resolution as the T2W image). For each predicted lesion, all voxels must comprise a single floating point value between 0-1, representing that lesion’s likelihood of harboring csPCa.

It's NOT a binarized map, but rather assigns a uniform likelihood to specific regions. I think there was some confusion caused by the difference between a "true heatmap" and a "segmentation". The produced detection map can be described as a segmentation, but not as a heatmap.

There is not a threshold for binarization. The voxel-level heatmap (softmax output) is transformed into the detection map using a dynamic, iterative process. This process is described in the associated paper (https://pubs.rsna.org/doi/10.1148/ryai.230031).

Regarding the images, you can use the first (PI-CAI logo) as model example. The second image is taken from another paper that hasn't been referenced on the model card and was created for an algorithm that's not used here either. We can ask the author (Anindo Saha), but I don't think it makes sense for this algorithm. The third image would make sense. Can I ask where the case that's displayed is coming from?

fedorov commented 10 months ago

So to make sure I understand correctly. The output is probabilistic map, and the discussion above in the screenshot below was in the context of creating a picture for the model page by thresholding that map and rendering the binarized map manually, right?

If that is the case, I suggest we show the heatmap overlay for the purposes of model page. I do not think it is a good idea to show on the picture something that is not generated by the model.

fedorov commented 10 months ago

Also, for the model page, I suggest you show model results as overlay over T2 modality - not DWI or ADC. This will make the results a lot more appealing. And it has more relevance clinically anyway.

LennyN95 commented 9 months ago

I hope you all had a great start to the new year!!

I like what @fedorov suggested. @silvandeleemput do you have such images at hand or is that doable within a minor or reasonable time investment? It would definitely improve the appearance of the algorithm's side a lot.

On a separate note @silvandeleemput, did you set-up a backreference on the GC page of the challenge / algorithm?

silvandeleemput commented 9 months ago

I hope you all had a great start to the new year!!

Thank you. I hope you had a great start as well.

I like what @fedorov suggested. @silvandeleemput do you have such images at hand or is that doable within a minor or reasonable time investment? It would definitely improve the appearance of the algorithm's side a lot.

I agree that it would improve the appearance of the algorithm. However, it is currently not one of the outputs of the algorithm and it would require changing the code base to output the heatmap as an option (which is a bit of work). As it cannot be generated by the algorithm at the moment, presenting such a heatmap image to the user might be confusing as it will differ from the current output.

On a separate note @silvandeleemput, did you set-up a backreference on the GC page of the challenge / algorithm?

That seems like a good idea, I'll ask @joeranbosma here (as he is the algorithm owner) to add a reference. Would a link to this page be sufficient: https://mhub.ai/models/gc_picai_baseline ?

fedorov commented 9 months ago

[heatmap] is currently not one of the outputs of the algorithm

Never mind then - I was confused by the earlier comment from @joeranbosma https://github.com/MHubAI/models/pull/60#issuecomment-1856100405.

LennyN95 commented 9 months ago

That seems like a good idea, I'll ask @joeranbosma here (as he is the algorithm owner) to add a reference. Would a link to this page be sufficient: https://mhub.ai/models/gc_picai_baseline ?

Sure, that's perfect!

LennyN95 commented 9 months ago

We're still waiting for this to be completed.

Technically, we shouldn't have proceeded; please resolve this requested change ASAP so we can merge this PR and build the project.

joeranbosma commented 9 months ago

Hi all!

I can add a backreference to the model on MHub (https://mhub.ai/models/gc_picai_baseline), I'll do that once the pending updates are complete.

Regarding the visualization of the model output, it's a bit tricky as it's neither a heatmap, nor a binary output map, but somewhat in between. My earlier comment (https://github.com/MHubAI/models/pull/60#issuecomment-1856100405) describes this in detail. The third image represents the output of the model correctly. The second image is from another publication and incorrect for this algorithm and should be replaced.

The prediction can also be shown as an overlay on the T2-weighted image (as suggested by @fedorov in https://github.com/MHubAI/models/issues/60#issuecomment-1856139796). The T2-image provides more anatomical detail compared to the ADC map. However, I would not ascribe it more diagnostic value. For cancers in the peripheral zone (accounting for about 70% of prostate cancers), the diffusion (ADC) scan is the leading modality according to PI-RADS. Since the T2-scan is also integral for the diagnosis, it's completely fine to choose it for visualization.

The model weight license is not yet correct, this should be "CC-BY-NC-SA-4.0" instead of "CC-BY-NC-4.0" (check out the corresponding license file: https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm/blob/master/results/LICENSE)

I don't have premade visualizations of this model or detection maps, unfortunately. However, the algorithm is public, so creating these figures can be done by anyone. I'm currently not able to spend the time to make pretty figures.

Best, Joeran

fedorov commented 9 months ago

The prediction can also be shown as an overlay on the T2-weighted image (as suggested by @fedorov in https://github.com/MHubAI/models/pull/60#issuecomment-1856139796). The T2-image provides more anatomical detail compared to the ADC map. However, I would not ascribe it more diagnostic value.

I suggested that not because of different diagnostic value, but because prostate cancer (from what I know) often needs to be evaluated in the anatomic context. I do not think ADC is usable for assessing the location of the capsule margin, or surrounding anatomic structures. And since most likely the algorithm targeted screening applications, it would be followed by targeted biopsy, and I don't think one would use ADC for planning such biopsy. Anyway, this is really the choice of the contributor of the algorithm. It was just a suggestion, and I am completely fine with any screenshot, or with no screenshot at all.

The model weight license is not yet correct, this should be "CC-BY-NC-SA-4.0" instead of "CC-BY-NC-4.0" (check out the corresponding license file: https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm/blob/master/results/LICENSE)

This is potentially very problematic. What are the implications of a license SA clause: "If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original."?

Arguably, one could say MHub "builds upon" this model. We absolutely cannot accept a similar license applied to MHub.

Further, one could argue perhaps that outputs produced by such model are covered by that clause. And then would it mean that any repository that uses those artifacts "builds upon"?

The safest bet would be to not include this model into MHub. And I would not feel comfortable including any artifacts produced by a model with such license into IDC.

MHubAI / models