courtois-neuromod / anat-processing

Pipeline to process anatomical data, including microstructure metrics from DWI and MT data
MIT License

What analysis pipeline to use with the Neuromod project? #15

Open jcohenadad opened 3 years ago

jcohenadad commented 3 years ago

Context

There are a few pipeline solutions out there:

Problem

There is no clear guideline about which pipeline technology to use for the neuromod project. Also, most of these pipelines take some time to learn. Consequently, if one neuromod sub-team uses technology X and another sub-team uses technology Y, this might create unnecessary complications:

Current situation

Based on Slack discussions that started around Dec 2019, Nextflow was put on the table by @arnaudbore as a possible pipeline solution. There was no opposition to this idea, so @agahkarakuzu went with Nextflow and created a pipeline for this repository.

In parallel, other sub-teams have been using fmriprep as a pipeline technology (based on @bpinsard's comments on Slack).

Proposed Solution

The two pipeline technologies supported for the neuromod project are:

Moving forward

If the core team agrees with the proposed solution, I suggest adding this information under the contributing section of the neuromod project. Related to https://github.com/courtois-neuromod/cneuromod.ca/issues/13

@pbellec @bpinsard @arnaudbore feedback needed

pbellec commented 3 years ago

So a couple of thoughts. First, I think it would be great to have a go-to pipeline system. It used to be PSOM before we moved to Python. This go-to has not materialized, for a number of reasons.

There are just a ridiculous number of pipeline engines out there (see https://github.com/pditommaso/awesome-pipeline). Solutions that have been discussed in simexp recently include:

Note that fMRIprep is not a pipeline tool. It is a particular pipeline built on top of nipype. And it's using nipype capabilities to a very limited degree (no inter-subject slurm parallelization). Nipype is primarily a collection of interfaces for neuroimaging tools, and its pipeline engine is not particularly good. My understanding is that this project is to be superseded by pydra.

Another alternative is to implement a library that takes advantage of multiple cores using joblib (used by scikit-learn and nilearn, amongst others) and then distribute individual subject analyses using a basic bash script and slurm. That's the approach we've used for dypac.
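To make that two-level approach concrete, here is a minimal sketch, not dypac's actual code. It uses the standard library's `ThreadPoolExecutor` as a stand-in for joblib so that it runs without extra dependencies. The names `process_run`, `process_subject`, `make_sbatch_script`, and the `process_subject.py` entry point are all hypothetical. Within one subject the work is spread over local workers, and a SLURM job array hands one subject to each array task.

```python
from concurrent.futures import ThreadPoolExecutor


def process_run(subject, run):
    """Hypothetical placeholder for the real per-run computation."""
    return f"{subject}_run-{run:02d}"


def process_subject(subject, runs, n_jobs=4):
    """Process all runs of one subject in parallel on a single node.

    With joblib, the equivalent would be roughly:
    Parallel(n_jobs=n_jobs)(delayed(process_run)(subject, r) for r in runs)
    """
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # map() returns results in the same order as the input runs.
        return list(pool.map(lambda r: process_run(subject, r), runs))


def make_sbatch_script(subjects, cpus_per_task=4):
    """Build a SLURM job-array script with one array task per subject."""
    subject_list = " ".join(subjects)
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --array=0-{len(subjects) - 1}",
        f"#SBATCH --cpus-per-task={cpus_per_task}",
        f"SUBJECTS=({subject_list})",
        # process_subject.py is an assumed entry point, not an existing script.
        'python process_subject.py "${SUBJECTS[$SLURM_ARRAY_TASK_ID]}" '
        '--n-jobs "$SLURM_CPUS_PER_TASK"',
    ])
```

Writing the string returned by `make_sbatch_script(["sub-01", "sub-02"])` to a file and submitting it with `sbatch` would start one array task per subject, each passing its allocated core count down to the within-subject pool. For CPU-bound Python code, joblib's process-based backend would be the more realistic choice than threads.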

Currently at simexp it looks like every project converges towards a different solution for its use case, which may be why there are so many different pipeline projects in the world. Also our general philosophy is to contribute to existing projects rather than start our own, so in effect we are constrained by choices made by others. Agah's qmrflow project is pretty much our only stand-alone neuroimaging pipeline project. Projects like dypac or load_confounds are designed as "seeds" to be merged down the road into larger projects, if successful.

So in conclusion, I am not sure I can make a recommendation at this stage. Happy to hear what others think.

jcohenadad commented 3 years ago

Thank you for chipping in, @pbellec!

> Note that fMRIprep is not a pipeline tool. It is a particular pipeline built on top of nipype.

Thank you for clarifying.

> Also our general philosophy is to contribute to existing projects rather than start our own, so in effect we are constrained by choices made by others. Agah's qmrflow project is pretty much our only stand-alone neuroimaging pipeline project.

IIUC @agahkarakuzu's pipeline is a pure Nextflow pipeline, so it uses an existing technology rather than starting a new one. Some elements are dockerized, but the backbone is still a pure Nextflow pipeline.

IMHO the strategy regarding pipeline(s) (and programming languages and software in general) at the scale of a lab/team is to minimize the number of technologies, software packages, and languages, because:

So, given that:

My suggestion would be to endorse it, make it official, and ask people to start learning it.

agahkarakuzu commented 3 years ago

@pbellec I wanted to add a few Nextflow improvements that came with DSL2:

The ability to orchestrate multiple containers (one container per process) was already there, but managing them has also become easier.

agahkarakuzu commented 2 years ago

The brain processing pipeline is now standalone and has fully switched to Nextflow DSL2 for modularity.