Reorganize reconstruction derivatives into workflow-specific datasets

tsalo commented 4 months ago

Summary

We've discussed ways to bring QSIRecon outputs in line with BEP016. One option was to add a wf entity to filenames to distinguish outputs from different workflows. Another was to split the outputs into separate workflow- or pipeline-specific derivative datasets. The latter is probably a better fit with BIDS-Derivatives, and with BEP016 in particular.

Here is a proposed organization for derivatives:

dset/
    derivatives/
        qsiprep/  # preprocessing derivatives
            derivatives/  # reconstruction derivatives organized as sub-datasets
                qsirecon-tortoise/  # reconstruction derivatives from TORTOISE pipeline
                    dataset_description.json
                    sub-01.html
                    sub-01/
                qsirecon-pyafq/  # reconstruction derivatives from PyAFQ pipeline
                    dataset_description.json
                    sub-01.html
                    sub-01/
            dataset_description.json
            sub-01.html
            sub-01/
    dataset_description.json
    sub-01/

The second level of derivatives could easily be moved up to the same level as the preprocessing derivatives if that structure seems too nested. Also, if QSIRecon generates shared files that are used by multiple reconstruction workflows, those could go in something like dset/derivatives/qsiprep/derivatives/qsirecon/.
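To make the proposed layout concrete, here is a minimal Python sketch that builds the nested structure in a temporary directory and writes a dataset_description.json for each derivative dataset. The function names, the dataset "Name" strings, the "GeneratedBy" values, and the BIDS version are all assumptions for illustration, not anything QSIRecon actually writes; only the directory names come from the tree above.

```python
import json
import tempfile
from pathlib import Path

BIDS_VERSION = "1.8.0"  # assumed value, for illustration only


def write_description(dataset_dir, name, pipeline):
    """Create dataset_dir and write a minimal derivative dataset_description.json."""
    dataset_dir.mkdir(parents=True, exist_ok=True)
    description = {
        "Name": name,  # hypothetical dataset name
        "BIDSVersion": BIDS_VERSION,
        "DatasetType": "derivative",
        "GeneratedBy": [{"Name": pipeline}],  # hypothetical pipeline label
    }
    path = dataset_dir / "dataset_description.json"
    path.write_text(json.dumps(description, indent=2))
    return path


def build_layout(root, subject="sub-01", recon_pipelines=("tortoise", "pyafq")):
    """Build the preprocessing dataset plus one nested dataset per reconstruction pipeline."""
    qsiprep_dir = Path(root) / "derivatives" / "qsiprep"
    write_description(qsiprep_dir, "QSIPrep preprocessing derivatives", "qsiprep")
    (qsiprep_dir / subject).mkdir(exist_ok=True)

    recon_dirs = []
    for pipeline in recon_pipelines:
        # Each reconstruction workflow gets its own sub-dataset under
        # qsiprep/derivatives/, named qsirecon-<pipeline> as in the proposal.
        recon_dir = qsiprep_dir / "derivatives" / f"qsirecon-{pipeline}"
        write_description(
            recon_dir,
            f"QSIRecon {pipeline} reconstruction derivatives",
            f"qsirecon-{pipeline}",
        )
        (recon_dir / subject).mkdir(exist_ok=True)
        recon_dirs.append(recon_dir)
    return recon_dirs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for recon_dir in build_layout(Path(tmp) / "dset"):
            print(recon_dir.relative_to(tmp))
```

Moving the reconstruction datasets up one level, as suggested above, would only change the `recon_dir` path in this sketch; each dataset would still carry its own dataset_description.json.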

I also think it would be nice to think of QSIRecon as a sort of QSIPost suite of tools, which would fit nicely with how fMRIPrep and sMRIPrep will operate in the future. The QSIPost tools could even ingest outputs from other DWI BIDS Apps, like dMRIPrep.

arokem commented 4 months ago

The example looks great to me! I like the nesting as a record of derivatives-of-derivatives, and I believe that's consistent with BIDS derivatives overall. I don't like the "wf" nomenclature (because what isn't a workflow?), but could be swayed if there are good reasons to adopt that (I don't see it, based on your example).

mattcieslak commented 4 months ago

Ok, let's not go down the wf entity route. The directory level is much more elegant. One question: suppose we compute tract profiles of RTOP from the tortoise workflow in the pyafq workflow. What would be a good way to keep track of that?

The challenge is becoming clearer to me now: some of these workflows produce anatomical results (like bundles) and scalar results. We may want to summarize scalars from one workflow on the bundles produced by one or more other workflows.

mattcieslak commented 2 months ago

This was added in 0.21.

PennLINC / qsiprep

Reorganize reconstruction derivatives into workflow-specific datasets #696
