Lestropie closed this issue 6 months ago.
The spec does allow derivatives to be stored in any format. (see https://bids-specification.readthedocs.io/en/stable/02-common-principles.html#non-compliant-derivatives).
But: I think that in our work here we would want to define a BIDS-compliant way of storing these derivatives. For example, a fully compliant BIDS app that runs bedpostx should take these outputs and rename them into whatever we agree upon as the compliant names for these files.
> The spec does allow derivatives to be stored in any format.
I seem to recall having gotten into a debate regarding that part of the spec at one point. One disadvantage I see there is that if you put all of your derivatives into a derivatives/ directory, then you are essentially disabling validation on all of your derivatives, across all pipelines. An advantage of the proposal here is that you would have the ability to mix conforming and non-conforming data, even within a single participant for a single pipeline.
> But: I think that in our work here we would want to define a BIDS-compliant way of storing these derivatives.
Obviously. It'd be a very short project if that wasn't the case...
My point is that there is the prospect of defining a BIDS-compliant way of storing non-BIDS-compliant derivative data, which would potentially enable cross-utilisation of data across BIDS Apps in a shorter time frame than that required for the robust definition of any BIDS derivatives (not just DWI). We don't necessarily have to say "if your data don't conform to BIDS derivatives, which don't yet exist, then you can't use them in BIDS Apps".
One other point came to mind here, which I think was also discussed in the context of the tractography TRX format: as an alternative to a sub-directory, one could use a tarball. So you could instead have, e.g.:
BIDS_Derivatives/
    bedpostx/
        sub-<participant_label>/
            dwi/
                sub-<participant_label>[_desc-<label>]_bs.tar[.gz]
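As a rough sketch of the tarball idea, the snippet below packs a tool's native output directory into a single archive placed and named following the layout above. It uses only the Python standard library. The function name `pack_native_output` and its parameters are hypothetical, and the `_bs` suffix is simply taken from the example tree; none of this is defined by the BIDS specification.

```python
import tarfile
from pathlib import Path
from typing import Optional


def pack_native_output(src_dir: Path, derivatives_root: Path, pipeline: str,
                       participant: str, desc: Optional[str] = None) -> Path:
    """Pack a native (non-BIDS) output directory into one BIDS-style-named tarball.

    Hypothetical helper: the target path mirrors the example tree above,
    <root>/<pipeline>/sub-<label>/dwi/sub-<label>[_desc-<label>]_bs.tar.gz
    """
    stem = f"sub-{participant}"
    if desc is not None:
        stem += f"_desc-{desc}"
    target_dir = derivatives_root / pipeline / f"sub-{participant}" / "dwi"
    target_dir.mkdir(parents=True, exist_ok=True)
    archive = target_dir / f"{stem}_bs.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        # Store every file under its path relative to the source directory,
        # so the tool's native layout is preserved untouched inside the archive.
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                tar.add(str(path), arcname=path.relative_to(src_dir).as_posix())
    return archive
```

A downstream BIDS App would then only need to know the archive name in order to unpack the native layout it already understands; a validator would see one file with a BIDS-style name rather than a tree of arbitrary names.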
> One disadvantage I see there is that if you put all of your derivatives into a derivatives/ directory, then you are essentially disabling validation on all of your derivatives, across all pipelines.
I don't think that validation is an all-or-none prospect, though. From the docstring of the pybids.BIDSLayout class:
If [validate=]True, all files are checked for BIDS compliance when first indexed,
and non-compliant files are ignored. This provides a convenient way to
restrict file indexing to only those files defined in the "core" BIDS
spec, as setting validate=True will lead files in supplementary folders
like derivatives/, code/, etc. to be ignored
Which I read to mean that compliant files are validated and indexed, and non-compliant files are not. I think that this would allow for incremental changes to happen, while still allowing BIDS apps to work with intermediate non-compliant derivatives of various software. In other words, I think that your desideratum "you would have the ability to mix conforming and non-conforming data, even within a single participant for a single pipeline" is already met. But I am not 100% sure.
From prior discussions, this proposal is counter to the intent of the project. So I'm going to close outright. Hopefully with a bit more work, there will be the capability to convert reasonably complex derivatives (bedpostx is the yardstick here) to something very BIDS-ey.
Don't know the extent to which this topic has been discussed elsewhere, so may need to either link to prior discussions or indeed post elsewhere if it warrants escalation.
Relates slightly to #32 in that the use of sub-directories is proposed, but decided it warranted its own Issue.
There will be many who consider the manipulation of diffusion model data from the standard format in which software packages have exported them for many years into a BIDS Derivatives specification to be unnecessary. FSL's bedpostx is a good example. All they want is to be able to pass those data into some downstream analysis. And they would like to be able to do that within the framework of BIDS Apps, even if the data stored as intermediary are not explicitly converted into something that is stringently BIDS compliant. So this is a discussion of the extent to which such data should be "supported", in the context of the BIDS derivatives specification and in the BIDS validator.
Take for example the FreeSurfer output that is generated by fmriprep:
This I would expect to be considered a cardinal violation by the validator. It's useful as far as getting that data from the software pipeline to the user, but it limits the extent to which the validator can be used on derivative data. And changing the entire suite of outputs of FreeSurfer into something that is compliant with some other BIDS Derivatives extension is likely some way off.
What if, instead, we were to say that anything stored within a BIDS-compliant sub-directory name is to be treated as non-conformant but permissible?
So the FreeSurfer example above would instead look something like:
So a non-conforming bedpostx output might look something like (guessing based on their documentation, don't have an example output at hand at time of writing):
So as before, the question is: to what extent should this be permitted in the specification, or indeed in the validator?
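To make the question concrete for the validator, here is a minimal sketch of the rule proposed above: files are checked against a naming pattern as usual, but anything sitting below a directory that itself has a BIDS-style name is reported as tolerated rather than invalid. This is not how the BIDS validator or pybids works; the `BIDS_NAME` pattern is a deliberately crude stand-in for real filename validation, and `classify` and the example paths are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Crude, hypothetical stand-in for BIDS filename rules:
# sub-<label>, optional _<key>-<label> entities, a _<suffix>, optional extension.
BIDS_NAME = re.compile(
    r"^sub-[A-Za-z0-9]+(_[a-z]+-[A-Za-z0-9]+)*_[A-Za-z0-9]+(\.[A-Za-z0-9.]+)?$"
)


def classify(relpath: str) -> str:
    """Classify a file path as 'valid', 'opaque' or 'invalid'.

    'opaque' means: stored inside a directory with a BIDS-style name, so the
    content is treated as non-conformant but permissible and is not inspected.
    """
    parts = PurePosixPath(relpath).parts
    # Any ancestor directory with a BIDS-style name shields its contents.
    for ancestor in parts[:-1]:
        if BIDS_NAME.match(ancestor):
            return "opaque"
    # Otherwise the file itself must carry a BIDS-style name.
    return "valid" if BIDS_NAME.match(parts[-1]) else "invalid"


if __name__ == "__main__":
    for example in (
        "sub-01/dwi/sub-01_dwi.nii.gz",                 # conforming file
        "sub-01/dwi/sub-01_desc-native_bs/any_name.x",  # inside a BIDS-named directory
        "sub-01/dwi/any_name.x",                        # loose non-conforming file
    ):
        print(example, "->", classify(example))
```

Under this rule a single participant directory for a single pipeline can hold validated files next to an untouched native output tree, and the validator still flags non-conforming files that are not wrapped in such a directory.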