nimh-dsst / dsst-defacing-pipeline

Defacing algorithm to improve and evaluate accuracy for large datasets.

Improve summary output information #16

Closed Arshitha closed 1 year ago

Arshitha commented 1 year ago

To run the defacing pipeline on a dataset, we need the following:

Currently, the generate_mappings.py script doesn't account for the case where a dataset has, say, 20 subjects but only 10 of them have 'anat' directories. We want to capture this information because:

  1. Missing 'anat' directories could point to an undetected failure in the DICOM-to-BIDS conversion pipeline
  2. If 'anat' data wasn't collected for a subject/session, it'd be more accurate to account for that while generating the list of sessions that lack a primary scan (typically a T1w)
  3. Subjects/sessions without anat data should be excluded from the mapping file.
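The detection step described above could be sketched as follows. This is an illustrative sketch only, not the actual generate_mappings.py implementation; the function name and structure are assumptions.

```python
from pathlib import Path


def find_missing_anat(bids_root):
    """Return subject/session directories that lack an 'anat' subdirectory.

    Illustrative sketch; handles both session-level layouts
    (sub-*/ses-*/anat) and subject-level layouts (sub-*/anat).
    """
    bids_root = Path(bids_root)
    missing = []
    for subj in sorted(bids_root.glob("sub-*")):
        if not subj.is_dir():
            continue
        sessions = sorted(d for d in subj.glob("ses-*") if d.is_dir())
        # Datasets without a session level keep anat directly under the subject.
        targets = sessions if sessions else [subj]
        for target in targets:
            if not (target / "anat").is_dir():
                missing.append(target.relative_to(bids_root))
    return missing
```

A list like this would let the pipeline both report the count of missing 'anat' directories and drop those sessions from the mapping file.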

Discovered this while trying to run the pipeline on the MyConnectome dataset, which has 102 sessions in total but only 24 with anatomical data.

Arshitha commented 1 year ago

generate_mappings.py

  1. & 2. The stdout now reports the total number of 'anat' directories found in the input BIDS dataset; it's up to the user to judge whether that count is accurate for their dataset.
  3. The mapping file no longer contains subjects and sessions without anat data.
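The reporting and filtering behavior described above could look something like the sketch below. The helper name and the dict-shaped input are hypothetical, assuming sessions are mapped to their 'anat' path (or None when absent).

```python
def summarize_and_filter(anat_by_session):
    """Print a summary of 'anat' coverage and return only sessions
    that have anat data, for inclusion in the mapping file.

    anat_by_session: dict mapping a session label to its anat
    directory path, or None if no anat data exists. Hypothetical
    helper, not the actual generate_mappings.py code.
    """
    with_anat = {s: p for s, p in anat_by_session.items() if p is not None}
    total = len(anat_by_session)
    print(f"Total sessions scanned: {total}")
    print(f"Sessions with 'anat' directories: {len(with_anat)}")
    print(f"Sessions excluded from mapping: {total - len(with_anat)}")
    return with_anat
```

For a MyConnectome-like dataset this would report 102 sessions scanned, 24 with 'anat' directories, and 78 excluded from the mapping.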