pyAFQ within QSIprep does not read --recon_spec json file correctly?

jk619 commented 9 months ago

Summary

When running reconstruction using qsiprep for pyafq_tractography it seems that the custom config (given by --recon-spec) is not read correctly.

Additional details

QSIPrep version: QSIprep 0.20.1.dev0+geda84d5.d20240112
Singularity version: 3.11.0

What were you trying to do?

Modify the pyafq_tarctography.json to change random_seeds parameter to true and n_seeds to 1500000. I created a new pyafq_tarctography_NYU.json config (below) based on pyafq_tarctography.json and mounted it inside the singularity image so it can be read using --recon-spec

{
    "description": "Use pyAFQ to perform the full Tractometry pipeline",
    "space": "T1w",
    "name": "pyafq_tractometry_NYU",
    "atlases": [],
    "nodes": [
        {
            "name": "pyafq_tractometry_NYU",
            "software": "pyAFQ",
            "action": "pyafq_tractometry",
            "input": "qsiprep",
            "output_suffix": "PYAFQ_TRACTOMETRY",
            "parameters": {
                "use_external_tracking": false,
                "export": "all",
                "random_seeds": true,
                "n_seeds": 1500000,
                "odf_model": "CSD",
                "seg_algo": "AFQ",
                "parallel_segmentation": "{'n_jobs': -1, 'engine': 'joblib', 'backend': 'loky'}",
                "n_points": 100,
                "clean_rounds": 5,
                "distance_threshold": 3,
                "length_threshold": 4,
                "min_sl": 20,
                "stat": "mean",
                "scalars": "['dti_fa', 'dti_md']",
                "import_tract": ""
            }
        }
    ]
}

What did you expect to happen?

Tracking until 1500000 bundles are created (example below comes from running pyAFQ (1.2) natively on the same data)

INFO:AFQ:Loading Image... INFO:AFQ:Generating Seeds... INFO:AFQ:Getting Directions... INFO:AFQ:Tracking... 1%| | 16149/1500000 [01:31<2:17:09, 180.30it/s]

Code run:

BIDS_DIR = "/scratch/jk7127/qsiprep_run/BIDS"

tracking_params = {"random_seeds" : True,
                  "odf_model" : "CSD",
                  "n_seeds" : 1500000}

brain_mask_definition = ImageFile(
    suffix="mask",
    filters={'desc': 'brain',
             'space': 'T1w',
             'scope': 'qsiprep'})

myafq = GroupAFQ(bids_path=BIDS_DIR,
                 segmentation_params={"seg_algo": "afq"},
                 scalars=["dti_fa", "dti_md"],
         clean_params={"n_points":100, "clean_rounds":5, "distance_threshold":3,"length_threshold":4,"min_sl":20, "stat":"mean"},
                 tracking_params=tracking_params,
         preproc_pipeline='qsiprep',
         participant_labels=subjid
)

bundle_html = myafq.export_all(viz=False,afqbrowser=False, xforms=False,indiv=True)>

What actually happened?

INFO:AFQ:Loading Image... INFO:AFQ:Generating Seeds... INFO:AFQ:Getting Directions... INFO:AFQ:Tracking... 31%|███ | 62192/200885 [05:57<16:00, 144.46it/s]

Seems that the parameters are not read as max number of fibers is 200885

Code run:

sub=$1

# load the freesurfer module
module load freesurfer/6.0.0
source /share/apps/freesurfer/6.0.0/SetUpFreeSurfer.sh

singularity run --cleanenv --containall --writable-tmpfs \
    -B /scratch/jk7127/qsiprep_run/work:/work,/scratch/jk7127/qsiprep_run/BIDS/,${FREESURFER_HOME}/license.txt:/opt/freesurfer/license.txt,/scratch/jk7127/qsiprep_run/pyafq_tractometry_NYU.json:/pyafq.json \
    /scratch/jk7127/Singularity/qsiprep.sif \
    /scratch/jk7127/qsiprep_run/BIDS/ /scratch/jk7127/qsiprep_run/BIDS/derivatives/ participant \
    --fs-license-file /opt/freesurfer/license.txt \
    --participant_label $sub \
    --output-resolution 1.5 \
    -w /work \
    --recon_only \
    --recon_spec /pyafq.json \
    --recon_input /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/
exit 0

below is report.srt created by qsiprep before running reconstruction (seems all parameters are read correctly). report.srt comes from:

/scratch/jk7127/qsiprep_run/work/qsirecon_wf/sub-wlsubj121_pyafq_tractometry_NYU/sub_wlsubj121_ses_nyu3t02_acq_138vols_space_T1w_desc_preproc_recon_wf/pyafq_tractometry_NYU/run_afq/_report/report.rst

Original Inputs
---------------

* bval_file : /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/sub-wlsubj121/ses-nyu3t02/dwi/sub-wlsubj121_ses-nyu3t02_acq-138vols_space-T1w_desc-preproc_dwi.bval
* bvec_file : /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/sub-wlsubj121/ses-nyu3t02/dwi/sub-wlsubj121_ses-nyu3t02_acq-138vols_space-T1w_desc-preproc_dwi.bvec
* dwi_file : /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/sub-wlsubj121/ses-nyu3t02/dwi/sub-wlsubj121_ses-nyu3t02_acq-138vols_space-T1w_desc-preproc_dwi.nii.gz
* itk_file : /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/sub-wlsubj121/anat/sub-wlsubj121_from-MNI152NLin2009cAsym_to-T1w_mode-image_xfm.h5
* kwargs : {'n_seeds': 1500000, 'random_seeds': True, 'odf_model': 'CSD', 'seg_algo': 'AFQ', 'parallel_segmentation': {'n_jobs': -1, 'engine': 'joblib', 'backend': 'loky'}, 'n_points': 100, 'clean_rounds': 5, 'distance_threshold': 3, 'length_threshold': 4, 'min_sl': 20, 'stat': 'mean', 'scalars': ['dti_fa', 'dti_md'], 'import_tract': None, 'omp_nthreads': 8}
* mask_file : /work/qsirecon_wf/sub-wlsubj121_pyafq_tractometry_NYU/sub_wlsubj121_ses_nyu3t02_acq_138vols_space_T1w_desc_preproc_dwi_specific_anat_wf/resample_mask/sub-wlsubj121_desc-brain_mask_trans.nii.gz
* tck_file : <undefined>

jk619 commented 9 months ago

Seems n_seeds parameter is not read at all. Setting it to a string in the recon_spec file does not cause any error.

Original Inputs
---------------

* bval_file : /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/sub-wlsubj121/ses-nyu3t02/dwi/sub-wlsubj121_ses-nyu3t02_acq-138vols_space-T1w_desc-preproc_dwi.bval
* bvec_file : /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/sub-wlsubj121/ses-nyu3t02/dwi/sub-wlsubj121_ses-nyu3t02_acq-138vols_space-T1w_desc-preproc_dwi.bvec
* dwi_file : /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/sub-wlsubj121/ses-nyu3t02/dwi/sub-wlsubj121_ses-nyu3t02_acq-138vols_space-T1w_desc-preproc_dwi.nii.gz
* itk_file : /scratch/jk7127/qsiprep_run/BIDS/derivatives/qsiprep/sub-wlsubj121/anat/sub-wlsubj121_from-MNI152NLin2009cAsym_to-T1w_mode-image_xfm.h5
* kwargs : {'n_seeds': 'wrong_string', 'random_seeds': True, 'odf_model': 'CSD', 'seg_algo': 'AFQ', 'parallel_segmentation': {'n_jobs': 8, 'engine': 'joblib', 'backend': 'loky'}, 'n_points': 100, 'clean_rounds': 5, 'distance_threshold': 3, 'length_threshold': 4, 'min_sl': 20, 'stat': 'mean', 'scalars': ['dti_fa', 'dti_md'], 'omp_nthreads': 8}
* mask_file : /work/qsirecon_wf/sub-wlsubj121_pyafq_tractometry_NYU/sub_wlsubj121_ses_nyu3t02_acq_138vols_space_T1w_desc_preproc_dwi_specific_anat_wf/resample_mask/sub-wlsubj121_desc-brain_mask_trans.nii.gz
* tck_file : <undefined>

arokem commented 9 months ago

Thanks for reporting this! Just to see that I understand what's going on, could you share your recon_spec file and how you called it?

arokem commented 9 months ago

Oh - sorry - I only saw the last message and not the entire thread before it! I see your recon_spec is up there (and sorry for the slowness picking up this issue!).

arokem commented 9 months ago

OK - I think that I understand what's going on. The issue is that all the arguments in the recon_spec are being passed to ParticipantAFQ as one dict kwargs here: https://github.com/PennLINC/qsiprep/blob/bd788392cfe6182a581473d0c5685f6d06bcc8c3/qsiprep/interfaces/pyafq.py#L106, but tracking parameters (for example n_seeds) need to be in a separate nested dict called tracking_params, otherwise they are ignored. My suggestion would be to add the tracking parameters defaults as a dict in the same way that parallel_params are added: https://github.com/PennLINC/qsiprep/blob/bd788392cfe6182a581473d0c5685f6d06bcc8c3/qsiprep/data/pipelines/pyafq_tractometry.json#L36, but not 100% sure whether that would work.

@jk619 : I hope what I am saying makes sense to you. Would you mind running your processing with a recon_spec that looks like below? I've changed one line marked with a comment:

{
    "description": "Use pyAFQ to perform the full Tractometry pipeline",
    "space": "T1w",
    "name": "pyafq_tractometry_NYU",
    "atlases": [],
    "nodes": [
        {
            "name": "pyafq_tractometry_NYU",
            "software": "pyAFQ",
            "action": "pyafq_tractometry",
            "input": "qsiprep",
            "output_suffix": "PYAFQ_TRACTOMETRY",
            "parameters": {
                "use_external_tracking": false,
                "export": "all",
                "tracking_params: {"random_seeds": true, "n_seeds": 1500000},    #  <= This is what I've changed
                "odf_model": "CSD",
                "seg_algo": "AFQ",
                "parallel_segmentation": "{'n_jobs': -1, 'engine': 'joblib', 'backend': 'loky'}",
                "n_points": 100,
                "clean_rounds": 5,
                "distance_threshold": 3,
                "length_threshold": 4,
                "min_sl": 20,
                "stat": "mean",
                "scalars": "['dti_fa', 'dti_md']",
                "import_tract": ""
            }
        }
    ]
}

jk619 commented 9 months ago

Hi Ariel,

Thanks for taking a look!

I tried both but none worked

"tracking_params": {"random_seeds": true, "n_seeds": 1500000},  
"tracking_params": "{'random_seeds': true, 'n_seeds': 1500000}" - same  syntax as for parallel_segmentation

Introducing the tracking_params dict to the json file removed random_seeds and n_seeds from the report.rst (somewhat expected I think) but there is no sign of tracking_params there.

* kwargs : {'n_seeds': 'wrong_string', 'random_seeds': True, 'odf_model': 'CSD', 'seg_algo': 'AFQ', 'parallel_segmentation': {'n_jobs': 8, 'engine': 'joblib', 'backend': 'loky'}, 'n_points': 100, 'clean_rounds': 5, 'distance_threshold': 3, 'length_threshold': 4, 'min_sl': 20, 'stat': 'mean', 'scalars': ['dti_fa', 'dti_md'], 'omp_nthreads': 8}

current:

* kwargs : {'odf_model': 'CSD', 'seg_algo': 'AFQ', 'parallel_segmentation': {'n_jobs': -1, 'engine': 'joblib', 'backend': 'loky'}, 'n_points': 100, 'clean_rounds': 5, 'distance_threshold': 3, 'length_threshold': 4, 'min_sl': 20, 'stat': 'mean', 'scalars': ['dti_fa', 'dti_md'], 'import_tract': None, 'omp_nthreads': 8}

PS. I looked at printouts produced by the recon workflow and seems I never get this info:

https://github.com/PennLINC/qsiprep/blob/bd788392cfe6182a581473d0c5685f6d06bcc8c3/qsiprep/workflows/recon/pyafq.py#L97-L100

Would be easy to trace the kwargs error based on that

jk619 commented 9 months ago

I think the problem lies here:

https://github.com/PennLINC/qsiprep/blob/bd788392cfe6182a581473d0c5685f6d06bcc8c3/qsiprep/workflows/recon/pyafq.py#L22-L48

All parameters are assigned directly to kwargs (i.e kwargs[n_seeds] instead of kwargs[tracking_params][n_seeds]. I think the easiest is to find to which category n_seeds belongs (in this case [tracking_params]) and assign it there. This would avoid playing with the json file and adding tracking_params there.

Alternatively, when adding tracking_params to the json file (as you proposed)

"tracking_params": {"random_seeds": true, "n_seeds": 1500000},

line 36 is never reached as tracking_params is not a default parameter loaded in line 23.

This is my solution (hope it helps and makes sense) but parameters that are not a part of tracking_params, segmentation_params or cleaning_params are not currently assigned so probably it needs another fix.

import AFQ
import AFQ.utils.bin as afb
import pdb

def _parse_qsiprep_params_dict(params_dict):
    arg_dict = afb.func_dict_to_arg_dict()
    kwargs = {}
    special_args = {
        "CLEANING_PARAMS": "clean_params",
        "SEGMENTATION_PARAMS": "segmentation_params",
        "TRACTOGRAPHY_PARAMS": "tracking_params"}

    for key in params_dict.keys():
        for special_key in special_args.keys():
            if key in arg_dict[special_key].keys():
                if special_args[special_key] not in kwargs:
                    kwargs[special_args[special_key]] = {}

                kwargs[special_args[special_key]][key] = afb.toml_to_val(params_dict[key])

    for ignore_param in afb.qsi_prep_ignore_params:
        kwargs.pop(ignore_param, None)

    return kwargs

params_dict = {}
params_dict['n_seeds'] = 1500000
params_dict['random_seeds'] = True
params_dict['n_points'] = 200

kwargs = _parse_qsiprep_params_dict(params_dict)
print(kwargs)

OUTPUT

{'tracking_params': {'n_seeds': 1500000, 'random_seeds': True}, 'clean_params': {'n_points': 200}} PS. It seems that tracking_params are stored under TRACTOGRAPHY_PARAMS not in TRACTOGRAPHY (line 29) same for SEGMENTATION_PARAMS instead of SEGMENTATION (line 28)

PennLINC / qsirecon