PennLINC / xcpEngine

Official public repository for the XCP Engine. This tool is deprecated in favor of XCP-D and ASLPrep.
MIT License

Scrubbing produces errors in regress module #417

Closed jacobtfisher closed 2 years ago

jacobtfisher commented 3 years ago

Describe the bug
Attempting to use censoring with AROMA.

Cohort file

    id0 id1 img
1   sub-03  run-1   /data/sub-03/func/sub-03_task-game_run-1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz

Design File

#!/usr/bin/env bash

###################################################################
#  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  ⊗  #
###################################################################

###################################################################
# This design file stores the values of all variables required to
# execute a complete neuroimage processing pipeline. You may
# execute the analysis specified in this design file by calling
# (in any v4 or higher bash terminal):
#
# xcpEngine -d <design> -c <cohort> -o <output> <options>
#
# Variables fall into five general categories:
# * ANALYSIS VARIABLES are used at all stages of this analysis.
# * PIPELINE specifies the modules that comprise the analysis.
# * MODULE VARIABLES are used during one stage of the analysis.
#                  These are typically array variables with array
#                  indices equal to the index of the module that
#                  calls them.
# * OUTPUT VARIABLES may be used at all stages of the analysis.
#                  These are typically array variables with array
#                  indices equal to the value of the primary
#                  subject identifier. They will appear only in
#                  localised design files.
###################################################################

###################################################################
# ANALYSIS VARIABLES
###################################################################

analysis=xcp_jtf
design=/design/fc-aroma-new.dsn
sequence=anatomical
standard=MNI%2x2x2

###################################################################
# PIPELINE
###################################################################

pipeline=prestats,confound2,regress,fcon,roiquant,qcfc

###################################################################
# 1 PRESTATS
###################################################################

prestats_rerun[1]=1
prestats_cleanup[1]=1
prestats_process[1]=FMP

###################################################################
# 2 CONFOUND2
###################################################################

confound2_rps[2]=0
confound2_rms[2]=0
confound2_wm[2]=1
confound2_csf[2]=1
confound2_gsr[2]=1
confound2_acompcor[2]=0
confound2_tcompcor[2]=0
confound2_aroma[2]=1
confound2_past[2]=0
confound2_dx[2]=0
confound2_sq[2]=1
confound2_censor[2]=1
confound2_censor_contig[2]=4
confound2_framewise[2]=fds:1.25,dv:2
confound2_rerun[2]=1
confound2_cleanup[2]=1

###################################################################
# 3  REGRESS
###################################################################

regress_tmpf[3]=butterworth
regress_hipass[3]=0.01
regress_lopass[3]=0.2
regress_tmpf_order[3]=1
regress_tmpf_pass[3]=2
regress_tmpf_ripple[3]=0.5
regress_tmpf_ripple2[3]=20
regress_dmdt[3]=2
regress_1ddt[3]=1
regress_smo[3]=6
regress_sptf[3]=susan
regress_usan[3]=default
regress_usan_space[3]=
regress_rerun[3]=1
regress_cleanup[3]=1
regress_process[3]=DSP-DMT-TMP-REG

###################################################################
# 4 FCON
###################################################################

fcon_atlas[4]=power264
fcon_metric[4]=corrcoef
fcon_thr[4]=N
fcon_window[4]=10
fcon_pad[4]=FALSE
fcon_rerun[4]=1
fcon_cleanup[4]=1

###################################################################
# 5 REHO
###################################################################

reho_nhood[5]=vertices
reho_roikw[5]=0 # does nothing at the moment
reho_sptf[5]=susan
reho_smo[5]=6
reho_rerun[5]=0
reho_cleanup[5]=1

###################################################################
# 6 ALFF
###################################################################

alff_hipass[6]=0.01
alff_lopass[6]=0.25
alff_sptf[6]=susan
alff_smo[6]=6
alff_rerun[6]=0
alff_cleanup[6]=1

###################################################################
# 7 ROIQUANT
###################################################################

roiquant_atlas[7]=power264
roiquant_globals[7]=1
roiquant_vol[7]=0
roiquant_rerun[7]=0
roiquant_cleanup[7]=1

###################################################################
# 8 NORM
###################################################################
norm_primary[8]=1
norm_rerun[8]=1
norm_cleanup[8]=1

##################################################################
# 9 QCFC
###################################################################
qcfc_atlas[9]=power264
qcfc_sig[9]=fdr
qcfc_rerun[9]=1
qcfc_cleanup[9]=1

Error message

Current processing step:
Temporally filtering image and confounds
····································································
· [butterworth filter]
· [High pass frequency: 0.01]
· [Low pass frequency: 0.2]
· Interpolating over masked-out epochs...
· This will be slow
/tmp/sub-03_run-1-regress-914592166~TEMP~_DSP_DMT.nii.gz
Voxel bin 1 out of 75
Traceback (most recent call last):
  File "/xcpEngine/utils/interpolate.py", line 172, in <module>
    n_features))
MemoryError: Unable to allocate 36.3 GiB for an array with shape (2548, 637, 3000) and data type float64
Error in asNifti.default(image, internal = TRUE) : 
  Failed to read image from path /tmp/sub-03_run-1-regress-914592166~TEMP~_DSP_DMT_TMP_interpol.nii.gz
Calls: dumpNifti -> asNifti -> asNifti.default -> .Call
In addition: Warning message:
In asNifti.default(image, internal = TRUE) :
  nifti_image_read: failed to find header file for '/tmp/sub-03_run-1-regress-914592166~TEMP~_DSP_DMT_TMP_interpol.nii.gz'
Execution halted

· [A major error has occurred.]
· [The processing stream will now abort.]
· [Preparing diagnostics]
····································································

Module Workflow Map
····································································
· START
· @0.1
· @1
· @1.1
· @1.2
· @1.3
· @1.4
· @7
· @7.2
· @7.3
· @7.5a
· @7.5b
· @7.6
· @2
· @2.2
· @2.3
· @2.4.1
· @2.5
· @2.6
· @2.8
· @2.12a
· @2.12b
· @2.12c
· ERROR
····································································

····································································
· [An error occurred while processing module regress.]
· [The error was detected at signpost @2.12c.]
· [The most recent command logged was]
· [tfilter -i /tmp/sub-03_run-1-regress-914592166~TEMP~_DSP_DMT.nii.gz -o /tmp/sub-03_run-1-regress-914592166~TEMP~_DSP_DMT_TMP.nii.gz -a 0.400000 -f butterworth -h 0.01 -l 0.2 -m /out/sub-03/run-1/prestats/sub-03_run-1_mask.nii.gz -n /out/sub-03/run-1/confound2/mc/sub-03_run-1_tmask.1D -r 1 -d 2 -1 confmat:/tmp/sub-03_run-1-regress-914592166~TEMP~_DSP_DMT_confmat.1D]
· [For additional details, view the processing log. For improved]
· [diagnostics, increment verbosity using the -t option.]
· []
· [Processing failed to produce required output: /tmp/sub-03_run-1-regress-914592166~TEMP~_DSP_DMT_TMP.nii.gz]
· []
· [stream abort]
····································································

Runtime Information Running the latest Docker image (pulled today) on Ubuntu 18.

Additional context The pipeline runs without issue if I remove the censoring step. This is multiband data (400 ms TR), so the censoring thresholds differ from the example values to account for the multiplication by the < 1 s TR that produces the final threshold.
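As an aside, the 36.3 GiB figure in the MemoryError above is exactly the size of a float64 array with the reported shape, which confirms the failure is the allocation itself rather than anything downstream. A quick back-of-the-envelope check:

```python
# Sanity check: the MemoryError reports an array of shape (2548, 637, 3000)
# in float64 (8 bytes per element).
shape = (2548, 637, 3000)
nbytes = 8
for dim in shape:
    nbytes *= dim
gib = nbytes / 2**30
print(f"{gib:.1f} GiB")  # 36.3 GiB, matching the MemoryError
```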

jacobtfisher commented 3 years ago

Update: I attempted to run the analysis on a cluster node with a bit more memory to see if that could get around the MemoryError, and that produced a new error:

Current processing step:
Temporally filtering image and confounds
····································································
· [butterworth filter]
· [High pass frequency: 0.01]
· [Low pass frequency: 0.2]
· Interpolating over masked-out epochs...
· This will be slow
/tmp/sub-03_run-1-regress-836821185~TEMP~_DSP_DMT.nii.gz
Voxel bin 1 out of 75
Traceback (most recent call last):
  File "/xcpEngine/utils/interpolate.py", line 197, in <module>
    term_recon[i,:,:] = np.outer(term_prod[i,:],s[i,:])
ValueError: could not broadcast input array from shape (649,3000) into shape (648,3000)
Error in asNifti.default(image, internal = TRUE) : 
  Failed to read image from path /tmp/sub-03_run-1-regress-836821185~TEMP~_DSP_DMT_TMP_interpol.nii.gz
Calls: dumpNifti -> asNifti -> asNifti.default -> .Call
In addition: Warning message:
In asNifti.default(image, internal = TRUE) :
  nifti_image_read: failed to find header file for '/tmp/sub-03_run-1-regress-836821185~TEMP~_DSP_DMT_TMP_interpol.nii.gz'
Execution halted

· [A major error has occurred.]
· [The processing stream will now abort.]
· [Preparing diagnostics]
····································································
jacobtfisher commented 3 years ago

On further examination, it looks like the error occurs whether or not I use AROMA, so it might just be a problem with interpolate.py that is independent of AROMA.

a3sha2 commented 3 years ago

Sorry for the late response, Jacob. With *contig[2]=4 and *censor[2]=1 you are doing scrubbing. The interpolation over the masked-out timepoints is not completing; the issue is likely memory, and this may be due to a high number of flagged volumes. You can check *-nFlags.1D to see the flagged volumes (marked with 1) and whether there are many of them. You can control this by raising the threshold (although fds:1.25 means 0.5 mm with a 0.4 s TR) or reducing the number of contiguous volumes (from 4).
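The unit conversion described above (the fds value is multiplied by the TR to obtain the final displacement threshold in mm) can be checked directly; a minimal sketch using the values from this report:

```python
# The fds threshold is multiplied by the TR to obtain the final
# displacement threshold in mm (per the explanation above).
tr = 0.4    # seconds (this report's multiband acquisition)
fds = 1.25  # from confound2_framewise[2]=fds:1.25,dv:2
threshold_mm = fds * tr
print(threshold_mm)  # 0.5 (mm), as noted above
```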

jacobtfisher commented 3 years ago

No problem, Azeez -- thanks for taking a look. I was able to get around the memory error by using a more powerful machine, but then I get the ValueError that I mentioned above.

ValueError: could not broadcast input array from shape (649,3000) into shape (648,3000)

I did also try running with confound2_censor_contig[2]=0 and I get the same error. The number of flagged volumes is pretty low (7 total for this run), so I would hope that isn't too many to interpolate over. I've tried a number of things and I'm pretty stumped at this point. I suppose I could just avoid censoring/scrubbing, but there are some notable spikes for a few subjects so I'd like to be able to remove those volumes if possible.
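For context, the broadcast failure above is the generic NumPy error raised when an assignment target and source disagree along an axis; the off-by-one between 648 and 649 suggests an index-bookkeeping bug in interpolate.py rather than a data problem. A minimal reproduction of the same error class (illustrative only, not the actual interpolate.py logic):

```python
import numpy as np

# The target has one fewer row than the source, mirroring the off-by-one
# between (648, 3000) and (649, 3000) in the traceback above.
target = np.zeros((648, 3000))
source = np.ones((649, 3000))

try:
    target[:, :] = source  # assignment cannot broadcast across mismatched rows
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("ValueError:", err)
```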

jacobtfisher commented 3 years ago

Just checking in here — is there anything else that you suggest that I take a look at? I'm glad to dig in a bit more into interpolate.py to see if there is anything I may be able to spot, but I didn't want to do that if there was something simpler that I might be overlooking. Thanks again!

a3sha2 commented 3 years ago

Jacob, can you please help with an example of data with TR < 1 s?

jacobtfisher commented 3 years ago

Hey Azeez, thanks for taking another look. I don't have the dataset I'm working with uploaded to a public repository yet, but I did find a dataset that was collected on the same scanner at UCSB. The TR of this one is 720ms.

https://openneuro.org/datasets/ds002674/versions/1.0.3

dhasegan commented 2 years ago

Was there any resolution to this problem?

dhasegan commented 2 years ago

I have a similar problem: I get an OOM exception when running on a cohort with 4 sessions with these counts of flagged volumes: 78, 0, 7, 0. I am using the HCP dataset with TR = 0.72 s at 0.2 mm, so the final threshold is fds:0.278.

dhasegan commented 2 years ago

I investigated the code a little, and it looks like it is trying to allocate a ~129 GB matrix (4500 x 1200 x 3000 of float64), which is more than what I have on my HPC node. The last dimension of the matrix is the number of voxels processed simultaneously. From https://github.com/PennLINC/xcpEngine/blob/master/utils/interpolate.py#L56:

    parser.add_argument(
        '-v', '--voxbin', action='store', default=3000, type=int,

I just have to pass the flag --voxbin 1000 in utils/tfilter. Is there a simple way to patch my Singularity image with a fix for this?

Would adding this as a param in the regress module make sense?
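Since voxbin is the last axis of the allocated array, peak memory scales linearly with it, so shrinking the bin size shrinks the allocation proportionally. A rough estimate using the shape reported above (interp_array_gb is a hypothetical helper for illustration, not part of xcpEngine):

```python
def interp_array_gb(n_rows, n_cols, voxbin, itemsize=8):
    """Approximate size in GB of an (n_rows, n_cols, voxbin) float64 array."""
    return n_rows * n_cols * voxbin * itemsize / 1e9

# Shape from the report above: 4500 x 1200 x 3000 float64
print(interp_array_gb(4500, 1200, 3000))  # ~129.6 GB, matching the report
print(interp_array_gb(4500, 1200, 1000))  # ~43.2 GB with --voxbin 1000
```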

dhasegan commented 2 years ago

Feel free to use this docker image until a better fix comes along: https://hub.docker.com/repository/docker/dhasegan/xcpengine

created with this Dockerfile:

FROM pennbbl/xcpengine:1.2.4
RUN sed -i 's/interpolate.py /interpolate.py -v 1000 /g' /xcpEngine/utils/tfilter
ENTRYPOINT ["/xcpEngine/xcpEngine"]
a3sha2 commented 2 years ago

It is better to check your data; maybe you have more volumes flagged as outliers.

dhasegan commented 2 years ago

Setting that voxbin to 500 fixes the problem for me, and now I can run with 64 GB of memory.

a3sha2 commented 2 years ago

@dhasegan thank you! We will set the voxbin based on the size of the volume.