Systematic review of binary ground truth quality

jcohenadad commented 11 months ago

In anticipation of #24, I would like the team to review all GT segmentations for this project. To ease the review, I suggest we make a single QC report for the entire dataset and post it here so we can discuss. Make sure to add the SHA of the data used to create the report.

rohanbanerjee commented 11 months ago

Adding the QC of all the datasets as a single QC report. These QCs contain the binary ground truth. epi_qc_all_data.zip

(Didn't add the SHA of the dataset because we have decided to not upload the dataset to git-annex but to openneuro directly. A list of the datasets can be found at: https://docs.google.com/spreadsheets/d/1xZMuR5OLRRIRWyIJicIqr6znDAaaJUqaOhgLX8qWgJI/edit#gid=0)

jcohenadad commented 11 months ago

Thank you @rohanbanerjee. I will review ASAP, and I think it would be good if @MerveKaptan also had a look, so that we can then reach a consensus.

jcohenadad commented 11 months ago

I suggest we all generate a qc_fail.yml and upload it here.

If the mask is correct, use ✅
If the mask is incorrect in at least one slice, use ❌
If the data is of insufficient quality (>50% of slices of insufficient quality), use ⚠️

Then, revisit cases with ❌ and fix the mask in the problematic slices. If the data is of insufficient quality in some slices, then do not segment those slices.
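As a rough illustration of the triage described above, here is a minimal sketch that groups subjects by their QC label. The subject IDs, the plain-dict input, and the group names are assumptions made for this example; they are not the actual structure of `qc_fail.yml`.

```python
from collections import defaultdict

# QC labels as defined in this thread.
PASS = "✅"          # mask is correct
FAIL = "❌"          # mask is incorrect in at least one slice
LOW_QUALITY = "⚠️"   # data of insufficient quality

# Hypothetical group names, chosen only for this sketch.
GROUP_NAMES = {
    PASS: "mask correct",
    FAIL: "fix mask in problematic slices",
    LOW_QUALITY: "insufficient data quality",
}


def triage(labels):
    """Group subject IDs by the QC label they received."""
    groups = defaultdict(list)
    for subject, label in sorted(labels.items()):
        groups[GROUP_NAMES[label]].append(subject)
    return dict(groups)


# Made-up subjects and labels, for illustration only.
qc_labels = {
    "sub-example01": PASS,
    "sub-example02": FAIL,
    "sub-example03": LOW_QUALITY,
    "sub-example04": FAIL,
}

for group, subjects in triage(qc_labels).items():
    print(f"{group}: {', '.join(subjects)}")
```

The subjects in the "fix mask in problematic slices" group are the ones to revisit for manual correction.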

jcohenadad commented 11 months ago

Number of experts per subject:

One doing the QC (e.g. @jcohenadad)
One doing the manual correction (e.g. @rohanbanerjee, @MerveKaptan)

jcohenadad commented 11 months ago

Started doing the QC from sub-hamburgP01 --> sub-leipzigR48. ~Here are the reports: Archive.zip~

see https://github.com/sct-pipeline/fmri-segmentation/issues/25#issuecomment-1828376229 for updated report

jcohenadad commented 11 months ago

Additional comments:

`sub-nwM*`: Under-segmentation and missing first/last slices

![Screenshot 2023-11-27 at 1 15 01 PM](https://github.com/sct-pipeline/fmri-segmentation/assets/2482071/339b7dcd-756e-41eb-b5ce-43918077e2b1)

`sub-nwMW*`, `sub-nwT*`: Over-segmentation

![Screenshot 2023-11-27 at 1 15 06 PM](https://github.com/sct-pipeline/fmri-segmentation/assets/2482071/5af045e5-5e8c-40f0-9f4b-d7cc3b78cad2)

sub-ouhscCSMS03: first/last slices missing
sub-ouhscCSMS09: first/last slices missing, some slices in the middle undersegmented
sub-ouhscCSMS10: first/last slices missing
sub-ouhscCSMS11: first/last slices missing
sub-ouhscCSMS14: first/last slices missing
sub-ouhscHCS16: first/last slices missing
sub-ouhscHCS18: first/last slices missing
sub-ouhscHCS20: first/last slices missing
sub-ouhscHCS21: first/last slices missing; undersegmentation
sub-ouhscHCS22: first/last slices missing
sub-ouhscHCS24: first/last slices missing
sub-stanfordR*: first/last slices missing, the segmentation shape could be better. It seems like the same shape was used across subjects. Example:
sub-stanfordRMFM*: large oversegmentation
sub-stanfordRMHC*: large oversegmentation
sub-ZurichCervC*: slight oversegmentation. Might be able to fix with a single-voxel erosion. @rohanbanerjee to try
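
For reference, a single-voxel erosion can be sketched as follows. This is a dependency-free toy version on one 2D slice, assuming 4-connectivity and treating voxels outside the image as background. It is not the tool or the settings used in this project, and a real pipeline would more likely rely on an image-processing library.

```python
def erode_once(mask):
    """Erode a 2D binary mask by one voxel using 4-connectivity.

    A voxel stays 1 only if it and its four in-plane neighbours are all 1.
    Voxels outside the image are treated as background (0).
    """
    rows, cols = len(mask), len(mask[0])
    offsets = ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1))

    def is_set(r, c):
        return 0 <= r < rows and 0 <= c < cols and mask[r][c] == 1

    return [
        [
            1 if all(is_set(r + dr, c + dc) for dr, dc in offsets) else 0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 5x5 slice containing a 3x3 mask (made-up data).
slice_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

eroded = erode_once(slice_mask)
for row in eroded:
    print(row)
```

On this toy mask only the centre voxel survives, which shows how aggressive even one erosion step is on a small structure, so the result would need to go through QC again.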

Here is my report on the full dataset: qc_JulienCohen-Adad_20231127_172353.zip

rohanbanerjee commented 8 months ago

There are two more datasets that need to be reviewed and were not included in the QC provided above: Geneva and Leipzig Pain. I missed including Geneva before, and Leipzig Pain wasn't included because the ground truth was obtained using deepseg -- which we initially decided on including in the training set. The ✅ subjects from this QC will also be included along with the other ✅ images for training the first model.

qc_geneva_leipzigpain_20240229.zip

jcohenadad commented 8 months ago

sub-leipzigP* --> all GT can be used for training
sub-genevaR* --> no GT can be used (oversegmentation)
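
A minimal sketch of how these wildcard decisions could be applied to a subject list when assembling the training set. The subject IDs and the helper name `gt_usable` are hypothetical; only the two patterns come from the comment above.

```python
from fnmatch import fnmatchcase

# Patterns taken from the review decision above.
USABLE_PATTERNS = ["sub-leipzigP*"]    # all GT can be used for training
REJECTED_PATTERNS = ["sub-genevaR*"]   # no GT can be used (oversegmentation)


def gt_usable(subject):
    """Return True if the subject's GT was accepted by the review."""
    if any(fnmatchcase(subject, p) for p in REJECTED_PATTERNS):
        return False
    return any(fnmatchcase(subject, p) for p in USABLE_PATTERNS)


# Made-up subject IDs, for illustration only.
subjects = ["sub-leipzigP01", "sub-genevaR01", "sub-leipzigP02"]
training = [s for s in subjects if gt_usable(s)]
print(training)
```

Subjects that match neither pattern are left out here, on the assumption that they are covered by the other QC reports in this thread.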

rohanbanerjee commented 7 months ago

Closing this issue as all the ground truths have been reviewed. The predictions/manual correction for each active learning training iteration would be discussed in separate issues.

rohanbanerjee commented 6 months ago

Keeping track of all the subjects with artifacts in the yml file below: exclude.yml.zip

Cross-ref the comments:

jcohenadad commented 6 months ago

> Keeping track of all the subjects with artifacts in the yml file below:

This is not the right location for this. This issue is called "Systematic review of binary ground truth quality". This tracking needs to go in a specific issue, eg: "Tracking images with artifacts". Moreover, the exclude.yml file should be part of the latest version of the dataset, not out-of-sync from it.

rohanbanerjee commented 5 months ago

Done in issue #46

rohanbanerjee commented 4 months ago

Closing this issue, as its purpose has now been fulfilled.
