PennLINC / qsiprep

Preprocessing of diffusion MRI
http://qsiprep.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Some thoughts on QC evaluation of qsiprep processed DWI data #551

Open WilliamFCB opened 1 year ago

WilliamFCB commented 1 year ago

Hi Matt,

Some thoughts on, and questions regarding, the QC evaluation of qsiprep-processed data. I was wondering if the carpet plot provided by qsiprep could be updated to the color scheme/format used in the qc.pdf generated by eddy_quad, which gives a clearer picture of the problem volumes/slices. Alternatively, one could maybe use the carpet plot provided by dmriprep_viewer (see the pictures below; I noticed that the latter orders the slices differently than qsiprep/eddy_quad, which both start with 0 at the top).

At the moment, I extract the qsiprep QC data (for which I also calculate the median, std, min, max, and CV of the measures) as well as the eddy_quad QC measures. For the latter, I save/prevent deletion of the eddy-related data using the nipype.cfg approach (see my qsiprep run script below). I wrote a bash script to extract the eddy QC measures from the qc.json generated by eddy_quad (see at the end). Not very pretty, but it does the job ;-) There, I use SynthStrip on topup_imain_corrected_avg_brain.nii.gz to get a better-fitting brain mask for use in eddy_quad.
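For illustration, a minimal sketch of how such summary statistics can be computed, here for framewise displacement from a qsiprep confounds TSV (the file name and the framewise_displacement column name are only examples and may differ between qsiprep versions):

# sketch: median, std, min, max and CV of framewise displacement from a confounds TSV
confounds="sub-01_acq-hardi_run-01_desc-confounds_timeseries.tsv"   # example file name
col=$(head -1 ${confounds} | tr '\t' '\n' | grep -nx 'framewise_displacement' | cut -d: -f1)
tail -n +2 ${confounds} | cut -f${col} | grep -v 'n/a' | sort -n | awk '
    { v[NR] = $1; sum += $1; sumsq += $1 * $1 }
    END {
        n = NR; mean = sum / n; sd = sqrt(sumsq / n - mean * mean)
        median = (n % 2) ? v[(n + 1) / 2] : (v[n / 2] + v[n / 2 + 1]) / 2
        printf "median=%.3f sd=%.3f min=%.3f max=%.3f cv=%.3f\n", median, sd, v[1], v[n], sd / mean
    }'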

I tried dmriprep_viewer. However, the motion- and distortion-corrected images are way too small (at least in Firefox on our cluster) to evaluate whether the corrections were successful. Also, the middle slices that are shown do not allow one to discern more localized problem areas such as orbitofrontal regions.

I now loop through the data using fsleyes (not ideal). First, I go through the datasets with >0 raw_num_bad_slices. Next, I go through the datasets with an fd_max larger than the voxel size. For the remaining data, I go through the carpet/eddy plots and only visually check the images if there are potentially problematic movement patterns; for this, I created an HTML page with only the carpet plots of all subjects. If I can find the time, I will look in more detail at how the visual evaluation compares to the extracted QC measures. At first glance, the latter do not always seem straightforward to interpret when problems (bad slices, outliers, large fd_mean/fd_max, etc.) are not very pronounced.
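As an illustration of that triage, a rough sketch that flags subjects from the qsiprep image QC csv files; the glob pattern and the column names (raw_num_bad_slices, fd_max) are assumptions and may need adjusting to your qsiprep version and derivatives layout:

# sketch: list subjects that need visual review (paths and column names are examples)
DERIV_DIR=/path/to/qsiprep/derivatives
voxel_size=1.8
for qc in ${DERIV_DIR}/sub-*/dwi/*desc-ImageQC_dwi.csv
do
    awk -F, -v vox=${voxel_size} '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        $(col["raw_num_bad_slices"]) > 0 || $(col["fd_max"]) > vox {
            print FILENAME ": raw_num_bad_slices=" $(col["raw_num_bad_slices"]) " fd_max=" $(col["fd_max"])
        }' ${qc}
done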

Anyway, I was wondering what your preferred QC procedure is.

[Screenshots (2023-04-20, 13:41 and 13:49): carpet plot examples referenced above]
running qsiprep while preserving eddy output

#!/bin/bash
#
# Purpose: run qsiprep while saving the eddy-related output,
# to get all eddy output for QUAD and SQUAD qc measures.
#
# Generate a nipype.cfg file with the following two lines (excluding the #):
# [execution]
# remove_unnecessary_outputs=false
#
# Place the cfg file in your home directory in: ~/.nipype
# Create the directory if necessary.

#SBATCH -J qsi_GPU
#SBATCH --cpus-per-task=3
#SBATCH --mem-per-cpu=8GB
#SBATCH --time=24:00:00
#SBATCH --output logs/slurm-%j.txt
#SBATCH --partition=HPC
#SBATCH --gres=gpu:1

slurmid=${SLURM_ARRAY_TASK_ID}
sub=${1}
ses=${2}
BIDS_DIR=${3}
RAWDATA_DIR=${BIDS_DIR}/rawdata
SCRIPTS_DIR=${4}
site=${5}

qsiprep_sif="/mnt/depot64/singularity/site/qsiprep-0.16.1.sif"
DERIV_DIR="/mnt/scratch/VIA_BIDS/qsiprep/${site}_derivatives_EDDY_output"
LICENSE_DIR="/mnt/depot64/freesurfer/freesurfer.6.0.0"
WORK_DIR="/mnt/scratch/VIA_BIDS/qsiprep/qsiprep_workEDDY${site}"

if [ ! -e ${WORK_DIR} ]; then
    mkdir -p ${WORK_DIR}
fi

if [ ${site} == "DRCMR" ]; then
    res="1.8"
elif [ ${site} == "CFIN" ]; then
    res="2"
fi

BIDS_DIR_TMP="/mnt/scratch/VIA_BIDS/qsiprep/bidsqsiprep${site}_tmp"

echo $sub $ses $RAWDATA_DIR $BIDS_DIR_TMP $SCRIPTS_DIR $WORK_DIR ${slurmid} ${site}

echo module purge
echo module load singularity/3.11.1-ce
echo module load cuda/11.2
module purge
module load singularity/3.11.1-ce
module load cuda/11.2

log=logs/slurm.qsiprep.${sub}.${ses}-gpu_EDDY_output.log
mv logs/slurm-${SLURM_JOBID}.txt $log

# RUN QSIPREP AT ORIGINAL RESOLUTION

export SINGULARITYENV_ANTS_RANDOM_SEED=999

DERIV_DIR_ORIGRES="${DERIV_DIR}/qsiprep_origres"

if [ ! -e ${DERIV_DIR_ORIGRES} ]; then
    mkdir -p ${DERIV_DIR_ORIGRES}
fi

time singularity run --cleanenv --contain --nv \
    -B ${LICENSE_DIR} \
    -B ${BIDS_DIR_TMP}/:/data_in \
    -B ${RAWDATA_DIR}/:${RAWDATA_DIR} \
    -B ${SCRIPTS_DIR}/:${SCRIPTS_DIR} \
    -B ${DERIV_DIR_ORIGRES}/:/data_out \
    -B ${WORK_DIR}/:/work \
    -B /tmp:/tmp \
    -B /lib:/drcmrlib \
    -B /mrhome/wimb/.nipype/ \
    --env "LD_LIBRARY_PATH=/drcmrlib:\$LD_LIBRARY_PATH" \
    ${qsiprep_sif} \
    /data_in \
    /data_out \
    participant \
    -w /work \
    --participant-label ${sub}${ses/-/} \
    --skip_bids_validation \
    --use-plugin ${SCRIPTS_DIR}/tmp/qsiprepplugin/${sub}${ses}_qsiprep_plugin.yml \
    --fs-license-file ${LICENSE_DIR}/license.txt \
    --unringing-method mrdegibbs \
    --dwi-denoise-window 5 \
    --output-resolution ${res} \
    --hmc-model eddy \
    --eddy_config ${SCRIPTS_DIR}/tmp/eddyparams/${sub}${ses}_eddy_params.json \
    --output-space T1w \
    --nthreads ${SLURM_CPUS_PER_TASK} \
    -vv

# RUN QSIPREP AT HIGH RESOLUTION

export SINGULARITYENV_ANTS_RANDOM_SEED=999

DERIV_DIR_HIGHRES="${DERIV_DIR}/qsiprep_highres"

if [ ! -e ${DERIV_DIR_HIGHRES} ]; then
    mkdir -p ${DERIV_DIR_HIGHRES}
fi

time singularity run --cleanenv --contain --nv \
    -B ${LICENSE_DIR} \
    -B ${BIDS_DIR_TMP}/:/data_in \
    -B ${RAWDATA_DIR}/:${RAWDATA_DIR} \
    -B ${SCRIPTS_DIR}/:${SCRIPTS_DIR} \
    -B ${DERIV_DIR_HIGHRES}/:/data_out \
    -B ${WORK_DIR}/:/work \
    -B /tmp:/tmp \
    -B /lib:/drcmrlib \
    -B /mrhome/wimb/.nipype/ \
    --env "LD_LIBRARY_PATH=/drcmrlib:\$LD_LIBRARY_PATH" \
    ${qsiprep_sif} \
    /data_in \
    /data_out \
    participant \
    -w /work \
    --participant-label ${sub}${ses/-/} \
    --skip_bids_validation \
    --use-plugin ${SCRIPTS_DIR}/tmp/qsiprepplugin/${sub}${ses}_qsiprep_plugin.yml \
    --fs-license-file ${LICENSE_DIR}/license.txt \
    --unringing-method mrdegibbs \
    --dwi-denoise-window 5 \
    --output-resolution 1 \
    --hmc-model eddy \
    --eddy_config ${SCRIPTS_DIR}/tmp/eddyparams/${sub}${ses}_eddy_params.json \
    --output-space T1w \
    --nthreads ${SLURM_CPUS_PER_TASK} \
    -vv
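For completeness, a sketch of the one-time nipype.cfg setup mentioned in the comments at the top of this script, plus an example submission; the script file name and arguments are placeholders:

# one-time setup: keep intermediate (eddy) outputs
# (this overwrites any existing ~/.nipype/nipype.cfg)
mkdir -p ~/.nipype
cat > ~/.nipype/nipype.cfg << 'EOF'
[execution]
remove_unnecessary_outputs=false
EOF

# example submission (script name is hypothetical)
sbatch run_qsiprep_eddy.sh <sub> <ses> <BIDS_DIR> <SCRIPTS_DIR> <site>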

bash script for extracting eddy qc measures

#!/bin/bash
#
# Script to derive eddy_quad qc measures from qsiprep processed data.
#
# This script uses synthstrip to create a more optimal brain mask for extracting the
# eddy snr and cnr measures (see https://surfer.nmr.mgh.harvard.edu/docs/synthstrip/).
#
# The section extracting outlier data below needs to be adjusted in case you have more
# than one b-shell.

[ "$1" == "" ] && echo "Please specify a site, e.g. DRCMR or CFIN" && exit
[ "$2" == "" ] && echo "Please specify a qsiprep output resolution, e.g. highres or origres" && exit

site=${1}
res=${2}
BIDS_DIR="/mnt/projects/VIA_BIDS/BIDSMRI${site}"
SCRIPTS_DIR="/mnt/projects/VIA_BIDS/scripts/qsiprep"
DERIV_DIR="/mnt/scratch/VIA_BIDS/qsiprep/${site}_derivatives_EDDYoutput/qsiprep${res}/qsiprep"
WORK_DIR="/mnt/scratch/VIA_BIDS/qsiprep/qsiprep_workEDDY${site}"

module load fsl/6.0.5
module load singularity/3.11.1-ce

if [ ${site} == "DRCMR" ]; then
    ses="ses-01DRCMRprisma"
elif [ ${site} == "CFIN" ]; then
    ses="ses-01CFINskyra"
fi

# for i in ${DERIV_DIR}/sub-via504*${ses/-}   # single-subject test loop

for i in ${DERIV_DIR}/sub*${ses/-}
do

subid=${i##*/}
echo ${subid}

# copy relevant eddy files to OUT_DATA_DIR

OUT_DATA_DIR=${i}/eddy_qc/data
if [ ! -e ${OUT_DATA_DIR} ]; then
    mkdir -p ${OUT_DATA_DIR}
fi

OUT_QC_DIR=${i}/eddy_qc/qc # this directory is made on the fly by eddy_quad; if it already exists, eddy_quad will fail
if [ -e ${OUT_QC_DIR} ]; then
    echo "${OUT_QC_DIR} exists"
    rm -rf  ${OUT_QC_DIR}
fi

WORK_SUB="${WORK_DIR}/qsiprep_wf/single_subject_${subid/sub-}_wf/dwi_preproc_acq_hardi_run_01_wf/hmc_sdc_wf"

rsync -ruvt ${WORK_SUB}/eddy/* ${OUT_DATA_DIR}
rsync -ruvt ${WORK_SUB}/gather_inputs/eddy_* ${OUT_DATA_DIR}
rsync -ruvt ${WORK_SUB}/pre_eddy_b0_ref_wf/enhance_and_mask_b0/topup_imain_corrected_avg_brain.nii.gz ${OUT_DATA_DIR}
rsync -ruvt ${WORK_SUB}/topup_to_eddy_reg/topup_reg_image_flirt.mat ${OUT_DATA_DIR}
rsync -ruvt ${WORK_SUB}/topup/fieldmap_HZ.nii.gz ${OUT_DATA_DIR}
rsync -ruvt ${DERIV_DIR}/${subid}/dwi/${subid}_acq-hardi_run-01_space-T1w_desc-preproc_dwi.bv* ${OUT_DATA_DIR}
rsync -ruvt ${SCRIPTS_DIR}/tmp/slspec_files/${subid/ses*}*_slspec.txt ${OUT_DATA_DIR}

#run eddy_quad
cd ${OUT_DATA_DIR}

# we use a brain mask generated with synthstrip here to ensure any non-brain voxels are excluded: this affects the snr and cnr measures
synthstrip-singularity -i topup_imain_corrected_avg_brain.nii.gz -o topup_imain_corrected_avg_brain_mask.nii.gz

mask=$(ls topup_imain_corrected_avg_brain_mask.nii.gz)
bval=$(ls *.bval)
bvec=$(ls *.bvec)
fieldmap=$(ls fieldmap_HZ.nii.gz)
slspec=$(ls *slspec.txt)

eddy_quad eddy_corrected -idx eddy_index.txt -par eddy_acqp.txt -m ${mask} -b ${bval} -g ${bvec} -f ${fieldmap} -s ${slspec} -o ${OUT_QC_DIR} -v

cd ${OUT_QC_DIR}

# define output csv filename
qc_csv="${OUT_QC_DIR}/qc_measures.csv"

# write header line to output csv
# no indent on the continuation lines, as otherwise echo produces spaces after the commas
echo subid,sessionid,b0_mean_snr,b1_mean_cnr,b0_std_snr,b1_std_cnr,\
perc_outliers_tot,perc_outliers_pe,perc_outliers_b1,\
mean_v2v_move_mm_abs,mean_v2v_move_mm_rel,vox_displ_stdv,\
mean_v2v_trans_mm_x,mean_v2v_trans_mm_y,mean_v2v_trans_mm_z,\
mean_v2v_rot_deg_x,mean_v2v_rot_deg_y,mean_v2v_rot_deg_z,\
std_v2v_EC_lin_x,std_v2v_EC_lin_y,std_v2v_EC_lin_z,\
mean_wv_std_trans_mm_x,mean_wv_std_trans_mm_y,mean_wv_std_trans_mm_z,\
mean_wv_std_rot_deg_x,mean_wv_std_rot_deg_y,mean_wv_std_rot_deg_z,\
qsiprep_dir,work_dir > ${qc_csv}

# extract eddy_qc_measures
# the awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}' code is used to convert potential scientific notation to decimal, and defines the number of decimals in the output
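# illustrative example: echo "1.234E-02" | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}'  ->  0.012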

bs_cnr=$(cat qc.json | grep -A 2 qc_cnr_avg | tr '\n' ' ' | tr ',' ' ' | tr -s ' ' | cut -d" " -f4-8)
b0_mean_snr=$(echo ${bs_cnr} | cut -d" " -f1 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')
b1_mean_cnr=$(echo ${bs_cnr} | cut -d" " -f2 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')

echo "mean_snr/cnr: b0, b1: $b0_mean_snr $b1_mean_cnr"

bs_std=$(cat qc.json | grep -A 2 qc_cnr_std | tr '\n' ' ' | tr ',' ' ' | tr -s ' ' | cut -d" " -f4-8)
b0_std_snr=$(echo ${bs_std} | cut -d" " -f1 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')
b1_std_cnr=$(echo ${bs_std} | cut -d" " -f2 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')

echo "std_snr/cnr: b0, b1: $b0_std_snr $b1_std_cnr"

#total percentage of outliers. THIS PART NEEDS TO BE ADJUSTED IN CASE YOU HAVE MORE THAN ONE B-SHELL
perc_outliers_tot=$( cat qc.json | grep qc_outliers_tot | tr -s ' ' | cut -d" " -f3 | cut -d, -f1 | awk -F"E" 'BEGIN{OFMT="%10.4f"} {print $1 * (10 ^ $2)}')
perc_outliers_pe=$(cat qc.json | sed -n '/qc_outliers_pe/{n;p}' | tr -s ' ' | cut -d" " -f2 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')
perc_outliers_b1=$(cat qc.json | sed -n '/qc_outliers_b/{n;p}'  | tr -s ' ' | cut -d" " -f2 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')

echo "perc_outliers: tot, pe, b1: ${perc_outliers_tot} ${perc_outliers_pe}; ${perc_outliers_b1}"

# movement in mm: 
mean_v2v_move_mm_abs=$( cat qc.json | grep qc_mot_abs | tr -s ' ' | cut -d" " -f3 | cut -d, -f1 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')
mean_v2v_move_mm_rel=$( cat qc.json | grep qc_mot_rel | tr -s ' ' | cut -d" " -f3 | cut -d, -f1 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')

echo "mean_v2v_movement: absolute, relative: ${mean_v2v_move_mm_abs} ${mean_v2v_move_mm_rel}"

# Susceptibility-induced distortions: standard deviation of the voxel displacement values within the brain mask
vox_displ_std=$( cat qc.json | grep qc_vox_displ_std | tr -s ' ' | cut -d" " -f3 | cut -d, -f1 | awk -F"E" 'BEGIN{OFMT="%10.3f"} {print $1 * (10 ^ $2)}')

echo "vox_displ_std: ${vox_displ_std}"

# translations and rotations
param=$(cat qc.json | grep qc_params_avg -A 9 | paste -s -d ' ' | tr ',' ' ' | tr -s ' ' | cut -d" " -f4-12)
mean_v2v_trans_mm_x=$(echo ${param} | cut -d" " -f1 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
mean_v2v_trans_mm_y=$(echo ${param} | cut -d" " -f2 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
mean_v2v_trans_mm_z=$(echo ${param} | cut -d" " -f3 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')

# rotations are stored in radians. Convert to degrees by multiplying with 57.2957795
rad2deg="57.2957795"
mean_v2v_rot_rad_x=$(echo ${param}   | cut -d" " -f4 | awk -F"E" 'BEGIN{OFMT="%10.10f"} {print $1 * (10 ^ $2)}')
mean_v2v_rot_deg_x=$(echo "${mean_v2v_rot_rad_x}*${rad2deg}" | bc -l | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}') 
mean_v2v_rot_rad_y=$(echo ${param}   | cut -d" " -f5 | awk -F"E" 'BEGIN{OFMT="%10.10f"} {print $1 * (10 ^ $2)}')
mean_v2v_rot_deg_y=$(echo "${mean_v2v_rot_rad_y}*${rad2deg}" | bc -l | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}') 
mean_v2v_rot_rad_z=$(echo ${param}   | cut -d" " -f6 | awk -F"E" 'BEGIN{OFMT="%10.10f"} {print $1 * (10 ^ $2)}')
mean_v2v_rot_deg_z=$(echo "${mean_v2v_rot_rad_z}*${rad2deg}" | bc -l | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}') 

#standard deviations of the eddy current linear terms 
std_v2v_EC_lin_x=$(echo ${param}   | cut -d" " -f7 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
std_v2v_EC_lin_y=$(echo ${param}   | cut -d" " -f8 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
std_v2v_EC_lin_z=$(echo ${param}   | cut -d" " -f9 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')

echo "mean_v2v_translation:          x, y, z:  ${mean_v2v_trans_mm_x} ${mean_v2v_trans_mm_y} ${mean_v2v_trans_mm_z}"
echo "mean_v2v_rotation:             x, y, z:  ${mean_v2v_rot_deg_x} ${mean_v2v_rot_deg_y} ${mean_v2v_rot_deg_z}"
echo "std Eddy current linear term : x, y, z:  ${std_v2v_EC_lin_x} ${std_v2v_EC_lin_y} ${std_v2v_EC_lin_z}"

#mean standard deviations of the translations (x3) and rotations (x3) that eddy has estimated for each excitation (i.e., when correcting for within-volume motion).
param_std=$(cat qc.json | grep qc_s2v_params_avg_std -A 6 | paste -s -d ' ' | tr ',' ' ' | tr -s ' ' | cut -d" " -f4-12)
mean_wv_std_trans_mm_x=$(echo ${param_std} | cut -d" " -f1 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
mean_wv_std_trans_mm_y=$(echo ${param_std} | cut -d" " -f2 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
mean_wv_std_trans_mm_z=$(echo ${param_std} | cut -d" " -f3 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
mean_wv_std_rot_deg_x=$(echo ${param_std}  | cut -d" " -f4 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
mean_wv_std_rot_deg_y=$(echo ${param_std}  | cut -d" " -f5 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')
mean_wv_std_rot_deg_z=$(echo ${param_std}  | cut -d" " -f6 | awk -F"E" 'BEGIN{OFMT="%10.5f"} {print $1 * (10 ^ $2)}')

echo "mean_wv_std_translation: x, y, z: ${mean_wv_std_trans_mm_x} ${mean_wv_std_trans_mm_y} ${mean_wv_std_trans_mm_z}"
echo "mean_wv_std_rotation:    x, y, z: ${mean_wv_std_rot_deg_x} ${mean_wv_std_rot_deg_y} ${mean_wv_std_rot_deg_z}"

# write extracted qc measures to csv

echo ${subid/ses*},${ses},$b0_mean_snr,$b1_mean_cnr,$b0_std_snr,$b1_std_cnr,\
${perc_outliers_tot},${perc_outliers_pe},${perc_outliers_b1},\
${mean_v2v_move_mm_abs},${mean_v2v_move_mm_rel},${vox_displ_std},\
${mean_v2v_trans_mm_x},${mean_v2v_trans_mm_y},${mean_v2v_trans_mm_z},\
${mean_v2v_rot_deg_x},${mean_v2v_rot_deg_y},${mean_v2v_rot_deg_z},\
${std_v2v_EC_lin_x},${std_v2v_EC_lin_y},${std_v2v_EC_lin_z},\
${mean_wv_std_trans_mm_x},${mean_wv_std_trans_mm_y},${mean_wv_std_trans_mm_z},\
${mean_wv_std_rot_deg_x},${mean_wv_std_rot_deg_y},${mean_wv_std_rot_deg_z},\
${i},${WORK_SUB} >> ${qc_csv}

done

script to concatenate individual qc_measures.csv files

#!/bin/bash
#
# Script to concatenate the individual qc_measures.csv files.

[ "$1" == "" ] && echo "Please specify a site, e.g. DRCMR or CFIN" && exit
[ "$2" == "" ] && echo "Please specify a qsiprep output resolution, e.g. highres or origres" && exit

site=${1}
res=${2}
BIDS_DIR="/mnt/projects/VIA_BIDS/BIDSMRI${site}"
SCRIPTS_DIR="/mnt/projects/VIA_BIDS/scripts/qsiprep"
DERIV_DIR="/mnt/scratch/VIA_BIDS/qsiprep/${site}_derivatives_EDDYoutput/qsiprep${res}/qsiprep"
OUT_DIR="/mnt/scratch/VIA_BIDS/qsiprep/${site}_derivatives_EDDY_output"

if [ ${site} == "DRCMR" ]; then
    ses="ses-01DRCMRprisma"
elif [ ${site} == "CFIN" ]; then
    ses="ses-01CFINskyra"
fi

outfile="${OUT_DIR}/VIA11${site}qsiprep${res}_eddy_qc_measures.csv"
echo creating ${outfile}
echo ""

counter=0
list=$(ls ${DERIV_DIR}/sub*${ses/-}/eddy_qc/qc/qc_measures.csv)
for i in ${list}
do
    echo ${i}
    echo ""
    echo "counter: $counter"
    echo ""
    if [[ ${counter} == 0 ]]; then
        echo in
        rm -f ${outfile}
        cp -v ${i} ${outfile}
        chmod 750 ${outfile}
        let counter++
    else
        tail -n +2 $i >> ${outfile}
        let counter++
    fi
done
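For reference, hypothetical invocations of the two helper scripts above (the file names are placeholders; each script takes a site and a qsiprep output resolution):

bash extract_eddy_qc_measures.sh DRCMR origres
bash concat_eddy_qc_measures.sh DRCMR origres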

mattcieslak commented 1 year ago

Hi @WilliamFCB Thanks for bringing this up, it's something that could use some discussion.

We typically rsync all the imageqc.json files to a laptop and use dmriprep_viewer on those files. It only needs those json files to function and web browsers on clusters are painful to use.
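Roughly, something like the sketch below; the json file name pattern and paths are illustrative, and it assumes dmriprep_viewer is installed on the laptop:

# copy only the small QC json files from the cluster, preserving the directory structure
rsync -avm --include='*/' --include='*ImageQC*.json' --exclude='*' \
    cluster:/path/to/qsiprep/derivatives/ ./qsiprep_qc_json/
# then point dmriprep_viewer at ./qsiprep_qc_json locally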

WilliamFCB commented 1 year ago

Indeed, I would be interested to know what criteria you employ when evaluating the quality of the processed data. Cheers