🐛 fmriprep-ingress pipeline nuisance regression

klemens-egger commented 3 months ago

Describe the bug

I am trying to run the fmriprep-ingress pipeline, but the nuisance regressor options do not seem to work properly in my case.

When running the command (see below) I get an error saying (see screenshot):

`LookupError: When trying to connect node block 'ingress_regressors' to workflow 'cpac_sub-01_ses-01' after node block 'nuisance_regressors_generation_T1w':

[!] C-PAC says: None of the listed resources in the node block being connected exist in the resource pool.`

I don't really understand what this workflow is trying to achieve. It connects nuisance regression with an anatomical T1w image, which seems not intuitive to me. Is this intended or am I missing something?

Thanks for your help!

To reproduce

No response

Preconfig

[ ] default
[ ] abcd-options
[ ] anat-only
[ ] blank
[ ] ccs-options
[X] fmriprep-options
[ ] fx-options
[ ] monkey
[ ] monkey-ABCD
[ ] ndmg
[ ] nhp-macaque
[ ] preproc
[ ] rbc-options
[ ] rodent

Custom pipeline configuration

fmriprep_ingress_pipeline.txt

Run command

docker run -i --rm -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/BIDS:/bids_dir -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/cpac_output:/outputs -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/tmp:/tmp -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/cpac_output/config:/configs -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/resources:/resources fcpindi/c-pac:latest /bids_dir /outputs participant --pipeline-file /configs/pipeline_config_fmriprep-ingress.yml

Expected behavior

That the ingress_regressors function takes the right confounds file in the fmriprep folders and adds it correctly to the pipeline.

Acceptance criteria

More detailed documentation of the nuisance regression function would be helpful
ingress_regressors accepts input correctly

Screenshots

Bildschirmfoto 2024-03-25 um 14 53 54

C-PAC version

v.1.8.6

Container platform

No response

Docker and/or Singularity version(s)

No response

Additional context

No response

e-kenneally commented 3 months ago

Hi! Thank you for reaching out. There are a couple of things I want to check.

Could you attach the nuisance regression section of your pipeline file? It looks like generate_regressors and ingress_regressors might both be switched on, which could cause a crash like this.
Could you post one subject of your data config file, so I can verify the layout?

I don't really understand what this workflow is trying to achieve. It connects nuisance regression with an anatomical T1w image, which seems not intuitive to me. Is this intended or am I missing something?

The screenshot you posted indicates that nuisance_regressors_generation_T1w and ingress_regressors are both being connected to the workflow, when we just want ingress_regressors. The T1w suffix just refers to the template used in functional registration (T1 or EPI), not to the anatomical image itself.

e-kenneally commented 3 months ago

Oops, I just saw that you did attach your pipeline config. I think I know what the issue is. There is a sub-category under ingress_regressors where you can specify the name of the regressor strategy. For some reason, that isn't showing up by default in the pipeline files on GitHub. I'll open an issue to fix that! The name in this section has to match one of the strategies above under Regressors (only one strategy can run at a time). It's not intuitive, so I am working on a new workflow for ingressed regressors as well as documentation for the current implementation!

ingress_regressors:
      run: On
      Regressors:
        Name: HMP24_acompcorr_with_gsr
        Columns: [global_signal, motion, a_comp_cor, white_matter, csf]
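To make the name-matching rule concrete, here is a minimal sketch of a pre-flight check in Python. It is not part of C-PAC: the function `check_ingress_name` is hypothetical, and it assumes the nuisance regression section has already been loaded into a dict in which the `Regressors` list and `ingress_regressors` are siblings, with `run: On` loaded as `True`.

```python
def check_ingress_name(nuisance_cfg):
    """Return the ingress strategy name if it matches a listed strategy.

    nuisance_cfg is assumed (not confirmed by C-PAC docs here) to look like:
    {"Regressors": [{"Name": ...}, ...],
     "ingress_regressors": {"run": bool, "Regressors": {"Name": ..., "Columns": [...]}}}
    """
    listed = [strategy["Name"] for strategy in nuisance_cfg.get("Regressors", [])]
    ingress = nuisance_cfg.get("ingress_regressors", {})
    if not ingress.get("run"):
        return None  # ingress is switched off, nothing to check
    name = ingress.get("Regressors", {}).get("Name")
    if name not in listed:
        raise ValueError(
            f"ingress_regressors Name {name!r} does not match any strategy in {listed}"
        )
    return name


example = {
    "Regressors": [{"Name": "HMP24_acompcorr_with_gsr"}],
    "ingress_regressors": {
        "run": True,
        "Regressors": {
            "Name": "HMP24_acompcorr_with_gsr",
            "Columns": ["global_signal", "motion", "a_comp_cor", "white_matter", "csf"],
        },
    },
}
print(check_ingress_name(example))
```

A check like this only catches a missing or mismatched name; it says nothing about whether the listed columns exist in the confounds file.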

Let me know if that fixes your problem, or if you have any other questions!

klemens-egger commented 3 months ago

Hi, thanks for the quick response and thanks for adding some more documentation, that is very much appreciated!

I have now added the name of the regressor strategy and commented out the second regressor as you suggested, but unfortunately that did not change anything: the run still fails with the same error.

Looking at my data_config_participants.yml file, I am also not sure whether it was generated properly. Could it be responsible for the error?

I will also attach the data_settings.yml file. Could you scan through those files and let me know whether I have already done something incorrect there that would explain the issue?

For example, I also tried to add the path to the fmriprep freesurfer folders, but somehow this does not show up in the data_config file. I don't know if that would be necessary for the fmriprep_ingress pipeline though.

And just to be sure, I'll also attach a screenshot of my BIDS and fmriprep directory to double-check.

data_config_participants.txt data_settings.txt

e-kenneally commented 3 months ago

Hi, thank you for providing more information! It looks like the issue is from the data config file. Unfortunately, the data config builder tool is not yet compatible with the derivatives_dir option. That feature is currently in progress. Here is the documentation on how to format it with fmriprep ingress, and specific documentation for running nuisance regression on fMRIPrep outputs will be released in the next few weeks!

For example, based on the information you've given me, the data config should look like this for one subject:

- site: site-1
  subject_id: 01
  unique_id: 02
  derivatives_dir: /bids_dir/derivatives/fmriprep/sub-01/ses-01

Let me know if you have any questions!

klemens-egger commented 3 months ago

Hi again, thanks for the further suggestions and the link. I completely missed that section.

I tried to use this minimal config file now but it still gives me the same error.

- 
  site: site-1
  subject_id: 01
  unique_id: 01
  derivatives_dir: /bids_dir/derivatives/fmriprep/sub-01/ses-01
- 
  site: site-1
  subject_id: 01
  unique_id: 02
  derivatives_dir: /bids_dir/derivatives/fmriprep/sub-01/ses-02
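Since the data config builder does not handle derivatives_dir yet, entries like the ones above can be generated from the directory layout. The following is a rough Python sketch, not a C-PAC tool: `build_entries` and `to_yaml` are hypothetical helpers, the site label is hard-coded, and it assumes unique_id should carry the session label and that the fMRIPrep output is organised as sub-*/ses-*. The IDs are quoted in the output so that a YAML loader keeps them as strings; that is a choice made here, whereas the configs in this thread leave them unquoted.

```python
from pathlib import Path


def build_entries(fmriprep_dir, container_root, site="site-1"):
    """Return one data-config entry per sub-*/ses-* directory found."""
    entries = []
    for ses_dir in sorted(Path(fmriprep_dir).glob("sub-*/ses-*")):
        if not ses_dir.is_dir():
            continue
        sub_name, ses_name = ses_dir.parent.name, ses_dir.name
        entries.append({
            "site": site,
            "subject_id": sub_name.split("-", 1)[1],
            # Assumption: unique_id is the session label.
            "unique_id": ses_name.split("-", 1)[1],
            # Path as seen inside the container, not on the host.
            "derivatives_dir": f"{container_root}/{sub_name}/{ses_name}",
        })
    return entries


def to_yaml(entries):
    """Render the entries as YAML text without needing a YAML library."""
    lines = []
    for entry in entries:
        lines.append(f"- site: {entry['site']}")
        lines.append(f"  subject_id: '{entry['subject_id']}'")
        lines.append(f"  unique_id: '{entry['unique_id']}'")
        lines.append(f"  derivatives_dir: {entry['derivatives_dir']}")
    return "\n".join(lines) + "\n"
```

Calling `to_yaml(build_entries(host_fmriprep_dir, "/bids_dir/derivatives/fmriprep"))` would produce text that can be saved into the mounted /configs folder, where `host_fmriprep_dir` is the fMRIPrep output directory on the host.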

Is it correct that the pipeline file accesses the data config file and assumes it is located in the /configs folder from the docker command I am running? Do you have some further ideas on where the error might be rooted? And another question regarding this data_config file: How would the pipeline now know to only take the resting-state scans from the derivatives_dir? My full dataset has 3 scans per session and only one is resting state and should be further processed here with cpac.

Thanks for your help!

e-kenneally commented 3 months ago

Is it correct that the pipeline file accesses the data config file and assumes it is located in the /configs folder from the docker command I am running?

Oh I completely missed that!! You need to add a flag to indicate that you are using a data config file, otherwise C-PAC will just auto-generate one without the fMRIPrep directory. Your run command would look like this:

docker run -i --rm -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/BIDS:/bids_dir -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/cpac_output:/outputs -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/tmp:/tmp -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/cpac_output/config:/configs -v /nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac/resources:/resources fcpindi/c-pac:latest /bids_dir /outputs participant --pipeline-file /configs/pipeline_config_fmriprep-ingress.yml --data_config_file /configs/<data_config_name>
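For scripting runs, the same command can be assembled programmatically so the data config flag is not forgotten. This is only a sketch: `cpac_command` is a hypothetical helper, it uses just the flags that appear in this thread, and the file name `data_config.yml` in the example is a placeholder.

```python
import shlex


def cpac_command(mounts, pipeline_file, data_config_file=None,
                 image="fcpindi/c-pac:latest"):
    """Build the docker argv list for a participant-level C-PAC run."""
    cmd = ["docker", "run", "-i", "--rm"]
    for host_path, container_path in mounts:
        cmd += ["-v", f"{host_path}:{container_path}"]
    cmd += [image, "/bids_dir", "/outputs", "participant",
            "--pipeline-file", pipeline_file]
    if data_config_file:
        # Without this flag C-PAC auto-generates a data config,
        # which will not include the fMRIPrep directory.
        cmd += ["--data_config_file", data_config_file]
    return cmd


base = "/nfs/DMT_HAR_MED_mri_data/BIDS/TEST_cpac"
cmd = cpac_command(
    mounts=[
        (f"{base}/BIDS", "/bids_dir"),
        (f"{base}/cpac_output", "/outputs"),
        (f"{base}/tmp", "/tmp"),
        (f"{base}/cpac_output/config", "/configs"),
        (f"{base}/resources", "/resources"),
    ],
    pipeline_file="/configs/pipeline_config_fmriprep-ingress.yml",
    data_config_file="/configs/data_config.yml",  # placeholder name
)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd)`, which avoids shell quoting problems with paths.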

My full dataset has 3 scans per session and only one is resting state and should be further processed here with cpac.

Unfortunately C-PAC doesn't filter out non-resting state data when pulling in an fMRIPrep directory. I'm sorry for the inconvenience and I will look into adding this functionality, since other users probably have the same concern!

klemens-egger commented 3 months ago

Hi, I got the command working now thanks to your help!

I still have a few questions you probably have the answers for before I start the pipeline for the whole dataset.

1) Since you said that C-PAC does not filter out non-resting state data, I was wondering if the flag --bold-label could be used to tell the pipeline to only look for task-rest, for example. I tried the command with this flag, but the pipeline would not start. Maybe I did not use it correctly?

2) I ran the timeseries extraction without specifying an atlas, and it appears to have run for every atlas that is built into C-PAC. Can I specify ROIs so that it does not create so much output that I probably don't need?

3) For nuisance regression, the output files contain three columns, but I am not quite sure what those columns represent. Is there any documentation on what is stored in these files? The accompanying JSON files did not relate exclusively to the desc_confounds_timeseries.1D files that were created. Here are the first few lines for reference:

# C-PAC 1.8.6
# Ingressed nuisance regressors:
1053.563041422457217777 1026.012326040201969590 1791.467862074797494643
1050.515425125942329032 1023.296357298717225603 1789.194473524351678861
1053.647751243849597813 1025.503930312144575510 1780.198806144095669879
-0.252886093174172877   -0.650269953506046772   -7.174069838742070715
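For inspecting a file like this, a small Python sketch that skips the `#` header lines and parses the whitespace-separated values is shown here. `read_1d` is a hypothetical helper, and it assumes every non-comment line holds one row of floats, as in the excerpt above.

```python
def read_1d(text):
    """Parse .1D text into a list of rows, ignoring '#' comment lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append([float(value) for value in line.split()])
    return rows


sample = """# C-PAC 1.8.6
# Ingressed nuisance regressors:
1053.563041422457217777 1026.012326040201969590 1791.467862074797494643
-0.252886093174172877   -0.650269953506046772   -7.174069838742070715
"""
rows = read_1d(sample)
print(len(rows), "rows,", len(rows[0]), "columns")
```

Because the header carries no column names, the parsed columns can only be identified by comparing them against the source confounds file.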

Also, I was not sure whether this part of the pipeline actually ingresses the regressors as intended. Does Columns refer to the column names in the fMRIPrep desc_confounds_timeseries.tsv file, so that every column with one of the listed strings in its name is ingressed? If that is not the behavior, how does it work? I also tried to run the pipeline with Censoring turned on, but that did not work. When I ran fMRIPrep, I used the flag --fd-spike-threshold 0.5, and I would like to use the associated columns in the desc_confounds_timeseries.tsv for scrubbing. Do you know how I would specify this in the pipeline config file?

ingress_regressors:
      run: On
      Regressors:
        Name: HMP24_acompcorr_with_gsr
        Columns: [global_signal, motion, a_comp_cor, white_matter, csf]

4) Can I also run, for example, FSL's dual_regression directly on fMRIPrep-ingressed data? This analysis was not included in the preconfigured pipeline template, and I was wondering whether that means it wouldn't work or it was just not shown in the template.

5) One last thing I am not quite sure about is how and when the pipeline smooths the data. fMRIPrep does not do spatial smoothing as a last step, but it is often recommended before further analyses. So if I run this (last sections in the pipeline config file), will smoothing only be done as the very last step, or already before something else happens? And what would the implications be?

post_processing:
  spatial_smoothing:

    # Smooth the derivative outputs.
    # Set as ['nonsmoothed'] to disable smoothing. Set as ['smoothed', 'nonsmoothed'] to get both.
    #
    # Options:
    #     ['smoothed', 'nonsmoothed']
    output: ['smoothed', 'nonsmoothed']

Sorry for the very long post, but answers to these questions would be very useful to understand my outputs in the end. Thanks a lot for your work!

e-kenneally commented 3 months ago

Awesome, glad to hear it's working! To answer your questions:

1) The C-PAC workflow is very different when pulling in raw data vs. pulling in preprocessed data. The --bold-label flag is specifically for raw data.
2) These are the atlases used by default. You can paste this list under timeseries_extraction > tse_roi_paths in your pipeline config and modify the list however you want.
3) That file is just the pipeline-specific regressors pulled from the regressors TSV file produced by fMRIPrep. C-PAC parses the confounds TSV from fMRIPrep to just include the regressor columns you selected in your pipeline config. The names have to match exactly. Those regressors are then applied when nuisance regression runs. The columns should be labeled, though, which I will look into. This file is basically the same as the C-PAC generated regressors file described here. Also, scrubbing is not yet compatible with fMRIPrep output directory ingress. I apologize for any inconvenience.
4) Yes, you can run dual regression on fMRIPrep outputs! Here is more information on how to configure it in the pipeline config.
5) Thank you for pointing this out! There was a bug preventing spatial smoothing and z-scoring from running with fMRIPrep outputs. I have fixed this and it will be in the next software update, which will be released very soon. Smoothing and z-scoring run after all of the derivatives have been generated. Here is more information on spatial smoothing in C-PAC.
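The exact-match column selection described above can be illustrated with a short Python sketch. This is not C-PAC's implementation, only an illustration of the stated rule: `select_columns` is a hypothetical helper and the TSV content is made up. It shows why a name such as `a_comp_cor` would select nothing under exact matching if the file only contains `a_comp_cor_00`.

```python
import csv
import io


def select_columns(tsv_text, wanted):
    """Keep only columns whose header exactly equals a requested name."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    kept = [name for name in wanted if name in header]
    missing = [name for name in wanted if name not in header]
    rows = [[row[name] for name in kept] for row in reader]
    return kept, missing, rows


# Made-up confounds excerpt for illustration only.
confounds_tsv = (
    "global_signal\tcsf\twhite_matter\ta_comp_cor_00\n"
    "1053.5\t1026.0\t1791.4\t0.01\n"
    "1050.5\t1023.2\t1789.1\t-0.02\n"
)
kept, missing, rows = select_columns(
    confounds_tsv, ["global_signal", "csf", "a_comp_cor"]
)
print("kept:", kept)
print("not found:", missing)
```

Listing the header of the real confounds file first and copying the names from it is a simple way to avoid mismatches.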

Let me know how it goes!

klemens-egger commented 3 months ago

Thanks for clarifying!

The process works fine now, the only thing I would need still is the implementation of scrubbing into the pipeline.

I guess I will wait until this feature is integrated in the future and thanks for your work!

Best wishes, Klemens

FCP-INDI / C-PAC