rcuocolo / PROSTATEx_masks

Lesion and prostate masks for the PROSTATEx training dataset, after a lesion-by-lesion quality check.
https://rcuocolo.github.io/PROSTATEx_masks/
Creative Commons Attribution 4.0 International

[Q] ProstateX-0025 & ProstateX-0113 #1

Closed by mibaumgartner 4 years ago

mibaumgartner commented 4 years ago

Hi, thank you for this really nice repository with masks for the ProstateX dataset! :D While I was preparing the data I noticed two small things inside the image_list.csv:

Thanks for your help!

rcuocolo commented 4 years ago

Thank you for your positive feedback. I will check both reported cases ASAP (probably tomorrow) to see whether there are any issues and, if so, whether they lie in the csv file or in the data. If you are interested in using this dataset, it may be useful to know that I aim to upload whole gland and zonal segmentations for all 204 training cases in the next few days. I also hope to have a citation available in the near future to help give visibility and recognition to this work. Please let me know if you find any other issues, and feel free to make pull requests if required.

rcuocolo commented 4 years ago

I have checked both cases. Both issues arose because the DICOM files were converted to the NIfTI format before the analysis and segmentation. Patient ProstateX-0025 actually has 2 exams, and the dcm2niix conversion script added an "a" to distinguish one from the other. The mask works better on the second exam but seems usable for both. The suffix has been removed from the file, to avoid any confusion. For patient ProstateX-0113, I don't know how the series number discrepancy happened, but I rechecked my files and the mask was indeed drawn on the ADC map (confirmed both visually and by file size). I have updated the series number in the csv to better match the DICOM dataset. I have also removed the ".nii.gz" file extension, as it was a byproduct of how the masks were drawn and how the information for the csv file was retrieved, and could cause some confusion. Please let me know if the issue is solved.
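For anyone matching csv entries against NIfTI files on disk, the extension handling described above could be sketched roughly as follows. This is only an illustration: the helper name `strip_nifti_ext` and the example file names are hypothetical, and nothing here assumes the actual column layout of image_list.csv.

```python
def strip_nifti_ext(name: str) -> str:
    """Return `name` without a trailing .nii.gz or .nii extension.

    Hypothetical helper: names that carry no NIfTI extension are
    returned unchanged.
    """
    # Check the longer extension first so ".nii.gz" is removed as a whole.
    for ext in (".nii.gz", ".nii"):
        if name.lower().endswith(ext):
            return name[: -len(ext)]
    return name


# Made-up example names, for illustration only.
print(strip_nifti_ext("example_series.nii.gz"))
print(strip_nifti_ext("example_series"))
```

Applying the same normalization to both the csv entries and the file names before comparing them avoids mismatches caused only by the extension.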

mibaumgartner commented 4 years ago

Great, thanks for the update :D Looks good now 👍 A small follow-up: the dataset consists of 204 subjects, while there are only 200 masks. Are the cases without a corresponding mask empty after your recheck? Edit: e.g. case ProstateX-0052
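A check of this kind (which subjects have no mask) can be sketched in a few lines of Python. This is a minimal sketch under assumptions: it supposes that each mask file name contains the case ID in the form `ProstateX-NNNN`, and the function name `cases_without_masks` and the example file names are hypothetical rather than taken from the repository.

```python
import re

# Assumption: case IDs look like "ProstateX-0052" and appear in mask file names.
CASE_ID = re.compile(r"ProstateX-\d{4}")


def cases_without_masks(case_ids, mask_filenames):
    """Return the sorted case IDs that have no matching mask file."""
    masked = set()
    for fname in mask_filenames:
        match = CASE_ID.search(fname)
        if match:
            masked.add(match.group(0))
    return sorted(set(case_ids) - masked)


# Made-up inputs, for illustration only.
all_cases = ["ProstateX-0051", "ProstateX-0052", "ProstateX-0053"]
masks = ["ProstateX-0051-mask.nii.gz", "ProstateX-0053-mask.nii.gz"]
print(cases_without_masks(all_cases, masks))
```

Several masks for the same case count only once, since the comparison is done on sets of case IDs.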

rcuocolo commented 4 years ago

Yes, some cases were considered not to have clearly definable lesions based on the provided markers, exclusively among the PI-RADS 2 subgroup (i.e. those without biopsy to confirm diagnosis). The paper with additional details has not been published yet but will be linked here ASAP. I will try to add further details without infringing on possible future copyrighted material. EDIT: Overall there should be 299 remaining lesions included in the new annotations.