rcuocolo / PROSTATEx_masks

Lesion and prostate masks for the PROSTATEx training dataset, after a lesion-by-lesion quality check.
https://rcuocolo.github.io/PROSTATEx_masks/
Creative Commons Attribution 4.0 International
78 stars 17 forks source link

Errors in several zone segmentation masks #20

Open meglaficus opened 2 years ago

meglaficus commented 2 years ago

Hi, we are working with your label set and I have come across some errors in the zonal masks. Here is a list of whole prostate masks with more than one connected component (cc):

ProstateX-0000.nii.gz has 3 ccs
ProstateX-0014.nii.gz has 2 ccs
ProstateX-0027.nii.gz has 2 ccs
ProstateX-0064.nii.gz has 25 ccs
ProstateX-0112.nii.gz has 2 ccs
ProstateX-0126.nii.gz has 2 ccs
ProstateX-0141.nii.gz has 2 ccs
ProstateX-0142.nii.gz has 2 ccs
ProstateX-0168.nii.gz has 5 ccs
ProstateX-0170.nii.gz has 2 ccs
ProstateX-0172.nii.gz has 2 ccs
ProstateX-0174.nii.gz has 2 ccs
ProstateX-0179.nii.gz has 2 ccs
ProstateX-0181.nii.gz has 2 ccs
ProstateX-0182.nii.gz has 4 ccs
ProstateX-0183.nii.gz has 5 ccs
ProstateX-0190.nii.gz has 3 ccs
ProstateX-0194.nii.gz has 2 ccs
ProstateX-0198.nii.gz has 4 ccs
ProstateX-0203.nii.gz has 3 ccs
ProstateX-087.nii.gz has 8 ccs
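For anyone who wants to reproduce a check like this, here is a minimal sketch of counting connected components in a binary mask. It is not the script used for the list above: it assumes the mask is already loaded as a NumPy array (for example from the .nii.gz file), uses `scipy.ndimage.label`, and the helper name `count_components` and the connectivity choice are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage


def count_components(mask: np.ndarray, full_connectivity: bool = True) -> int:
    """Count connected foreground components in a binary mask.

    full_connectivity=True treats voxels touching by face, edge or corner
    as connected (26-connectivity in 3D); False uses faces only
    (6-connectivity). Which one the original check used is not stated.
    """
    rank = mask.ndim
    structure = ndimage.generate_binary_structure(
        rank, rank if full_connectivity else 1
    )
    _, n_components = ndimage.label(mask > 0, structure=structure)
    return n_components


# Synthetic example: one 3x3x3 block plus a single stray voxel far away.
mask = np.zeros((8, 8, 8), dtype=np.uint8)
mask[2:5, 2:5, 2:5] = 1
mask[7, 7, 7] = 1
n_components = count_components(mask)

# Two voxels that touch only at a corner: the count depends on connectivity.
diagonal = np.zeros((3, 3, 3), dtype=np.uint8)
diagonal[0, 0, 0] = 1
diagonal[1, 1, 1] = 1
n_diag_full = count_components(diagonal, full_connectivity=True)
n_diag_faces = count_components(diagonal, full_connectivity=False)
```

A correctly segmented whole-prostate mask would be expected to give a count of 1. The connectivity setting matters for voxels that touch only diagonally, so counts can differ slightly between tools.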

Most of these have similar errors in the separate peripheral and non-peripheral zone masks. I have merged the peripheral and non-peripheral masks for each patient, and here are the cases where these errors persist:

ProstateX-0000.nii.gz has 3 ccs
ProstateX-0014.nii.gz has 2 ccs
ProstateX-0027.nii.gz has 2 ccs
ProstateX-0064.nii.gz has 25 ccs
ProstateX-0087.nii.gz has 8 ccs
ProstateX-0112.nii.gz has 2 ccs
ProstateX-0168.nii.gz has 2 ccs
ProstateX-0170.nii.gz has 2 ccs
ProstateX-0172.nii.gz has 2 ccs
ProstateX-0179.nii.gz has 2 ccs
ProstateX-0183.nii.gz has 3 ccs

The errors range from small, single-voxel aberrations close to the actual mask to larger errors that lie very far from the mask and have a large impact on distance-based metrics.
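To illustrate why a distant stray voxel matters for distance-based metrics, here is a small synthetic sketch (not data from this dataset). It compares the Dice score with the symmetric Hausdorff distance, computed in voxel units with `scipy.spatial.distance.directed_hausdorff`. The helper names and the toy volume are assumptions for illustration; voxel spacing is ignored and both masks are assumed non-empty.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff


def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice overlap of two non-empty binary masks."""
    a, b = a > 0, b > 0
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())


def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between foreground voxels (voxel units)."""
    pts_a, pts_b = np.argwhere(a > 0), np.argwhere(b > 0)
    return max(
        directed_hausdorff(pts_a, pts_b)[0],
        directed_hausdorff(pts_b, pts_a)[0],
    )


# A clean 3x3x3 block, and the same block with one stray voxel in a far corner.
clean = np.zeros((16, 16, 16), dtype=np.uint8)
clean[3:6, 3:6, 3:6] = 1
with_stray = clean.copy()
with_stray[15, 15, 15] = 1

dice_score = dice(clean, with_stray)   # 54/55, about 0.98
hd = hausdorff(clean, with_stray)      # sqrt(300), about 17.3 voxels
```

The overlap score barely moves, while the Hausdorff distance jumps from 0 to roughly 17 voxels because it is driven entirely by the single farthest voxel.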

Thank you for an otherwise terrific dataset, Jakob

rcuocolo commented 2 years ago

Thank you for reporting the problem. I am out of office but will look into it when I am back at work. I did run a script to clean up this issue in the past, so this is somewhat surprising. It used FSL functions to identify clusters and remove small ones, as misclicks can happen in a manual segmentation process. If you have fixed masks as well (i.e., you have already run a script to remove the inappropriate clusters), feel free to make a request to replace the problematic files with the corrected ones, as it would speed up the process.
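As a rough illustration of the cluster clean-up idea described above, here is a minimal sketch that labels connected components and drops those below a size threshold. Note that it swaps in `scipy.ndimage` rather than the FSL functions mentioned in the comment, and that the function name `remove_small_components` and the `min_voxels` threshold are hypothetical choices, not values from the original script.

```python
import numpy as np
from scipy import ndimage


def remove_small_components(mask: np.ndarray, min_voxels: int = 10) -> np.ndarray:
    """Return a copy of a binary mask without components smaller than min_voxels.

    Uses scipy's default labelling structure (face connectivity only).
    The threshold is an illustrative assumption.
    """
    labels, n_components = ndimage.label(mask > 0)
    if n_components == 0:
        return np.zeros_like(mask)
    # sizes[i] is the voxel count of label i; index 0 is the background.
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_voxels
    keep[0] = False  # never keep the background
    return keep[labels].astype(mask.dtype)


# Synthetic example: a 27-voxel block plus a single-voxel "misclick".
mask = np.zeros((8, 8, 8), dtype=np.uint8)
mask[2:5, 2:5, 2:5] = 1
mask[7, 7, 7] = 1
cleaned = remove_small_components(mask, min_voxels=10)
```

A fixed size threshold will not catch a large misplaced cluster, so keeping only the largest component is a common alternative when exactly one structure is expected.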

(Email reply to meglaficus's message of Thu, 7 Jul 2022, 11:28.)


meglaficus commented 2 years ago

Thank you for the quick response. I have made a program to fix these errors: https://github.com/meglaficus/ProstateSeg_QC

In the process I have discovered some further errors: (The images below are composites of pz (grey) and tz (white) masks.)

The program takes care of most of these issues (the exception is pt. 0064, which has a unique error that could not be completely fixed). I will make a request to upload the corrected images.

meglaficus commented 2 years ago

I have made a pull request with all the changes. Here is the link (https://github.com/meglaficus/PROSTATEx_masks/blob/master/Files/prostate/modification_log.csv) to a .csv file logging all the proposed changes. 'perif' = pz, 'central' = tz

The .csv has entries for each patient but only the masks that had errors were updated in the request.

rcuocolo commented 2 years ago

Thank you very much for your help! I'll be back next week from this business trip and will look into the pull request and hopefully update the repository with it ASAP.

On Tue, 12 Jul 2022 at 11:10, meglaficus wrote:

I have made a pull request with all the changes. Here is the link https://github.com/meglaficus/PROSTATEx_masks/blob/master/Files/prostate/modification_log.csv to a .csv file logging all the proposed changes. 'perif' = pz, 'central' = tz

  • The 'filtered' columns mean that a small connected volume was removed
  • The 'patched' columns mean that a hole was filled in
  • 'whole_missmatch' means that the whole prostate mask was replaced with the sum of the other two masks, as it did not match
  • 'strays converted' denotes that the 'strays' as described in the previous comment were converted.

The .csv has entries for each patient but only the masks that had errors were updated in the request.
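The 'patched' and 'whole_missmatch' operations listed above could look roughly like the following sketch. This is not the ProstateSeg_QC implementation: it assumes binary NumPy masks and uses `scipy.ndimage.binary_fill_holes`, and all function names are hypothetical.

```python
import numpy as np
from scipy import ndimage


def fill_holes(mask: np.ndarray) -> np.ndarray:
    """Fill fully enclosed background cavities in a binary mask ('patched')."""
    return ndimage.binary_fill_holes(mask > 0).astype(np.uint8)


def whole_matches_zones(whole: np.ndarray, pz: np.ndarray, tz: np.ndarray) -> bool:
    """Check whether the whole-gland mask equals the union of the zone masks."""
    return bool(np.array_equal(whole > 0, (pz > 0) | (tz > 0)))


def rebuild_whole(pz: np.ndarray, tz: np.ndarray) -> np.ndarray:
    """Rebuild the whole-gland mask as the union of pz and tz."""
    return ((pz > 0) | (tz > 0)).astype(np.uint8)


# Synthetic example: a tz block with a one-voxel hole and an adjacent pz slab.
tz = np.zeros((10, 10, 10), dtype=np.uint8)
tz[2:7, 2:7, 2:7] = 1
tz[4, 4, 4] = 0                      # enclosed hole
pz = np.zeros_like(tz)
pz[7:9, 2:7, 2:7] = 1
whole = rebuild_whole(pz, tz)        # whole mask still contains the hole

tz_fixed = fill_holes(tz)
mismatch_before = not whole_matches_zones(whole, pz, tz_fixed)
whole_fixed = rebuild_whole(pz, tz_fixed)
mismatch_after = not whole_matches_zones(whole_fixed, pz, tz_fixed)
```

Rebuilding the whole mask from the zones only makes sense once the zone masks themselves have been cleaned, which is why the hole is filled first in this example.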
