QC-related question - GitHub Issues

oliver-xie commented 4 months ago

Hi Jordan,

Thanks for providing the QC folder. I am wondering how I should evaluate it. Most of the QC files are self-explanatory, but some are harder to interpret. For example, in "from-T1w_to-CITI168_regqc.png", how should I evaluate the registration between the T1w data and the CITI168 template? I assume that the red contours should separate different tissue types, but with this many lines over the brain, it is a bit hard to tell. Would you say the registration is good or bad? It would be nice to have some examples of what an ideal run looks like, e.g., https://github.com/edickie/ciftify/tree/master/docs/demo/qc_fmri

Also, for unetf3d_dice.tsv, is there a range of Dice scores that we should expect from a good segmentation? For example, is above 0.8 good and below 0.7 unacceptable?

Thanks,
Oliver

jordandekraker commented 4 months ago

Great question! It seems we forgot to cover this in the Documentation. I'll keep this issue open until we can add a section there discussing it.

The registration to CITI168 is one possible source of failure in the pipeline, but I think we have it tuned well now, so it almost never fails. Ideally, the red lines should perfectly overlay the different tissue boundaries in the image (e.g. skull, pial-grey-white-csf transitions, etc). Since this is a linear registration, the sulcal/gyral patterns don't perfectly line up, but we can see it's correct overall. This registration is really only used to generate a bounding box and crop around the left and right hippocampi, so as long as it's not badly offset, it works well, and there's certainly nothing to worry about in your case.

As for the Dice scores, a range of 0.7-0.9 is good. The score will never be close to 1.0, and closer isn't necessarily better; we just want it to be >0.7.

Explanation: When hippunfold fails, it tends to be a gross error in the UNet tissue segmentation (e.g. it sometimes includes large swaths of collateral sulcus, or a large chunk of hippocampus is not segmented at all). These are the "catastrophic" failures we are trying to catch here. The Dice score compares the UNet segmentation to a less precise but more conventional way of getting a whole-hippocampus segmentation: nonlinear registration to a template (similar to FSL's hippocampal segmentation method). We expect the less precise method to overlap roughly (e.g. 0.7-0.9 Dice), but not perfectly, since UNet will always pick out more details (or fail catastrophically).

A good practice for a large number of subjects is to simply automatically discard any subjects with a Dice <0.7 (this is typically around 1% of subjects).
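For anyone who wants to apply the 0.7 cutoff in a script, here is a minimal sketch of the idea in plain Python. It is not hippunfold code: the function names `dice` and `flag_failures` are made up for illustration, the masks are assumed to be flat sequences of 0/1 values, and the subject-to-score mapping is assumed to have been loaded already (the layout of unetf3d_dice.tsv is not described in this thread, so no parsing is shown).

```python
def dice(mask_a, mask_b):
    """Dice overlap 2*|A and B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 (or False/True) values of equal length.
    """
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    if total == 0:
        # Assumption: two empty masks are treated as a failed segmentation.
        return 0.0
    return 2.0 * intersection / total


def flag_failures(scores, threshold=0.7):
    """Return the sorted subject IDs whose Dice score is below the threshold.

    `scores` is a hypothetical {subject_id: dice_score} mapping.
    """
    return sorted(subject for subject, score in scores.items() if score < threshold)


if __name__ == "__main__":
    # Toy example with invented subject IDs and scores.
    example_scores = {"sub-01": 0.85, "sub-02": 0.42, "sub-03": 0.78}
    print(flag_failures(example_scores))
```

Note that the comparison is strict, so a subject with a score of exactly 0.7 is kept. Whether to keep or drop that borderline case is a judgment call that the thread does not settle.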

I hope that helps! I'll try to add this to the Documentation soon.

oliver-xie commented 4 months ago

Thank you so much! This is very helpful! Oliver

jordandekraker commented 4 months ago

Finally got to addressing this here: https://github.com/khanlab/hippunfold/pull/279

khanlab / hippunfold

QC-related question #274