nipreps / mriqc

Automated Quality Control and visual reports for Quality Assessment of structural (T1w, T2w) and functional MRI of the brain
http://mriqc.readthedocs.io
Apache License 2.0
294 stars 130 forks source link

Running QC independently of directory structure #898

Open ZviBaratz opened 3 years ago

ZviBaratz commented 3 years ago

(This issue also related to issue #797)

Hi,

I would like to propose trying to make the Pythonic interface with anat_qc_workflow() and fmri_qc_workflow() easier to use independently of the CLI and the BIDS data structure. This would both increase flexibility in applying mriqc to non-standardized datasets or databases containing a number of datasets with overlapping data, and reduce coupling between the configuration singleton and the workflows.

If the maintainers (@oesteban / @effigies ?) support this proposal, I would be happy to try and help come up with a PR, but I might need a bit of guidance to get things going without breaking anything.

If not, I would be very grateful for a minimal example of simply executing the anatomical workflow on a single T1 scan. Perhaps it could also be a good addition to the Running mriqc docs.

Thank you very much and all the best, Zvi

effigies commented 3 years ago

I would support that. Very little except the fetching and saving should care about BIDS.

ZviBaratz commented 3 years ago

@effigies, indeed.

Just to make sure I understand - each of the workflow functions relies on the init_mriqc_wf() function having been called, which creates a parent nipype.pipeline.engine.Workflow instance and adds the generated "sub"Workflow instances to run the anatomical and functional workflows as nodes to the parent instance. All of the workflows rely on the config singleton module to fetch data, which assumes a BIDS-compliant dataset directory, and is updated with the run/environment information by calling the from_dict() function on the CLI's parsed arguments.

If I understand correctly, all we need to do is enable passing an alternative configuration dictionary to the workflow execution functions (which could use the same config.from_dict() function to initialize ?). It seems like this used to be the general method in the past (the workflow function used to have a settings parameter), but changed in be0405dcaca3257a4f341a60b0ab5eb527c9d0d4 (last April) to conform with fMRIPrep's configuration format. Unfortunately, I am not familiar with it and therefore it's hard for me to assess the effects of bypassing the init_mriqc_wf() function and initializing the config module as part of the anatomical or functional workflows. What do you think? Does any of this make any sense?

effigies commented 3 years ago

Ah, right. I forgot about the config modules. They've definitely made it harder to factor these things out. @oesteban might have some thoughts here.

oesteban commented 3 years ago

Hi @ZviBaratz,

From a computer science point of view I agree that making init_mriqc_wf independent of BIDS makes for a more modular, less cohesive design that in the long term will reduce maintenance burden and hard-to-debug errors.

That said, for a long while there have been proposals to make MRIQC work on non-BIDS datasets, and this would be a solid move in support of that. TBH, I don't love the idea of making MRIQC non-BIDS capable - regardless of how easy that would be.

The (I admit, very opinionated) reason is that BIDS imposes some formal validity to the data that is worth checking. Quality control is not just about detecting intolerable head motion cases, it is also about the awareness of scientists that their data may become useless if some little piece of metadata is lost.

For this reason (and the fact that it is trivial to generate a BIDS-like structure with symlinks to a dataset that isn't BIDS valid), we have made very little effort towards this direction.

That all said, MRIQC badly needs new, motivated contributors. If you think this idea will make you more acquainted with the tool and likely lead to further contributions, then you have all my support.

If I understand correctly, all we need to do is enable passing an alternative configuration dictionary to the workflow execution functions (which could use the same config.from_dict() function to initialize ?). It seems like this used to be the general method in the past (the workflow function used to have a settings parameter), but changed in be0405d (last April) to conform with fMRIPrep's configuration format. Unfortunately, I am not familiar with it and therefore it's hard for me to assess the effects of bypassing the init_mriqc_wf() function and initializing the config module as part of the anatomical or functional workflows. What do you think? Does any of this make any sense?

We decided to operate that way because the config singleton would not traverse across package boundaries, and, although most of MRIQC's workflows are self-contained, some would come from niworkflows. If all the NiPreps embraced the possibility of accepting a config object as an argument, then that would not be a problem. But, at the moment NiPreps should be as simple as possible and should not rely on such an obscure information passing mechanism (among other reasons because allowing that would force us to impose some sort of API on the config singleton).

IMHO, I think the shortest path to what you want to do (run a single scan) would be:

  1. Copy the CLI codebase to a new script, remove the logic about BIDS
  2. Create a temporal folder, copy/link the input there - but following BIDS rules.
  3. Keep the rest of MRIQC as it is now.

HTH

ZviBaratz commented 3 years ago

After a quick discussion with @oesteban via email -

I managed to run mriqc through the CLI on my BIDS-ish NIfTI format archive. It's a little hacky but it works. I still absolutely think running workflows should be completely separate from data retrieval, but I also don't yet have a clear understanding of the config's module's function and how to best handle it so I'm waiting for further input / getting a better grasp of it.

Meanwhile, I created a PR with some general updates (I updated sphinx, rebuilt the docs, integrated black and flake8, and proposed some other minor changes). If the changes I'm proposing are welcome, I can try and continue in that line. I realize the current misc branch I created isn't ideal, as it addresses multiple issues in a single PR, but I was hoping it's good enough for the purposes of easing into the codebase.