bids-standard / bids-specification

Brain Imaging Data Structure (BIDS) Specification
https://bids-specification.readthedocs.io/
Creative Commons Attribution 4.0 International
265 stars 154 forks source link

[ENH] Optional alternative to bval/bvec #349

Open mattcieslak opened 4 years ago

mattcieslak commented 4 years ago

Trying to develop diffusion BIDS apps is made significantly more difficult by using the FSL bval/bvec format. A format that is dependent on the image axis (and even then only sometimes) means that a huge amount of effort is required to consistently handle orientation information. These files are ubiquitous and therefore should not be removed from BIDS, but they're a lot of trouble.

I propose adding an optional mrtrix-style ".b" file for each dwi file. These can be robustly generated from dicoms, don't depend on how an image is stored on disk (are in world coordinates, consistent with NIfTI convention), are text-based, and are easy to rotate with other software packages (eg ANTs).

For reference on how difficult these files can be:

DSI Studio, camino, and others have invented their own gradient file formats that are similar to the mrtrix .b format. It would be great if the community can agree on an image-axis-independent file format to be optionally included in BIDS. Even a tsv file with the b values and x, y, z in RAS+ world coordinates would be amazing.

effigies commented 4 years ago

Looking at the format, I think this in TSV would be a reasonable type of data to support. Our TSV files always have column headers, which will help make it more explicit that these are RASB columns. So it would be roughly:

R       A       S       B
0       0       0       0
0       0       0       0
-0.0509541      0.0617551       -0.99679        3000
...

We definitely need to continue to allow bval/bvec, since that's part of the standard, but I would personally be fine with permitting a more generic format that is more consistent with BIDS.

Given that it's dwi.bval and dwi.bvec, would dwi.tsv might make sense, or would some other suffix be clearer?

Tagging @bids-standard/raw-mri-dwi.

cookpa commented 4 years ago

I think FSL supports the 3xN bvec format, but it has not been widely promoted.

mattcieslak commented 4 years ago

@effigies that sounds great! It's also worth noting that world coordinate are typically available in dicoms headers, so going directly from dicom header to tsv removes a potentially incorrect transformation to image axis coordinates.

@cookpa do you know what coordinate system the 3xN bvec format uses?

cookpa commented 4 years ago

@effigies that sounds great! It's also worth noting that world coordinate are typically available in dicoms headers, so going directly from dicom header to tsv removes a potentially incorrect transformation to image axis coordinates.

@cookpa do you know what coordinate system the 3xN bvec format uses?

It's the same as the Nx3 format, ie radiological voxels

The bvecs use a radiological voxel convention, which is the voxel convention that FSL uses internally and originated before NIFTI (i.e., it is the old Analyze convention). If the image has a radiological storage orientation (negative determinant of qform/sform) then the NIFTI voxels and the radiological voxels are the same. If the image has a neurological storage orientation (positive determinant of qform/sform) then the NIFTI voxels need to be flipped in the x-axis (not the y-axis or z-axis) to obtain radiological voxels. Tools such as dcm2nii create bvecs files and images that have consistent conventions and are suitable for FSL. Applying fslreorient2std requires permuting and/or changing signs of the bvecs components as appropriate, since it changes the voxel axes.

https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FDT/FAQ#What_conventions_do_the_bvecs_use.3F

It's just a bit more readable to have

x y z x y z

rather than the transpose.

effigies commented 4 years ago

Tagging @bids-standard/raw-mri-dwi.

LOL. Nobody's actually in that group but me... I guess I'll abuse the teams a little and tag @bids-standard/derivatives-mri-dwi. If any of you (including @mattcieslak and @cookpa) know anybody else who should be in this discussion, tag them in.

mattcieslak commented 4 years ago

Maybe @Lestropie, @Garyfallidis too?

oesteban commented 4 years ago

@arokem

@dpys has been recently working on this.

@garikoitz might also be interested

On Sat, Oct 19, 2019, 08:04 Matt Cieslak notifications@github.com wrote:

Maybe @Lestropie https://github.com/Lestropie, @Garyfallidis https://github.com/Garyfallidis too?

— You are receiving this because you are on a team that was mentioned. Reply to this email directly, view it on GitHub https://github.com/bids-standard/bids-specification/issues/349?email_source=notifications&email_token=AAESDRVOHXFIWAYCTI6BQ4TQPMOYVA5CNFSM4JCIMVXKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBXSY4I#issuecomment-544156785, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAESDRSFOTYWGAVW7KG2VG3QPMOYVANCNFSM4JCIMVXA .

jhlegarreta commented 4 years ago

I ignore whether this is under discussion, but if the bval/bvec convention is to be re-visited, it may be worthwhile and the occasion to include other encoding methods (i.e. b-tensor encoding), and to consider how these can make into the specification.

mattcieslak commented 4 years ago

@jhlegarreta that is a good idea. I know software exists for reconstructing GTI scans, but is there a file format for b-tensors yet?

jhlegarreta commented 4 years ago

@mattcieslak not that I know. I'd dare to say that labs that use such encodings have implemented their own solutions to deal with the data. Also, neuroimaging toolkits may have started or have already started to think about incorporating the ability to read such data into their pipelines.

IMHO BIDS is the venue where these efforts should be put in common to gather consensus and define the format.

oesteban commented 4 years ago

@jhlegarreta, storing b-tensor encoding would be possible with a 7-column tsv file?

If so, I think the reinterpretation of mrtrix's format by @effigies here (https://github.com/bids-standard/bids-specification/issues/349#issuecomment-543788435) would generalize well to b-tensor encoding, we just need to define the right column names.

dPys commented 4 years ago

In the sense of dealing with raw vectors, I would also support the idea of a single tsv containing the b values and x, y, z in RAS+ per the suggestion from @mattcieslak.

In a derivatives sense (i.e. esp. within the DiPy universe), I’ve found it helpful to further save the gradient table object as pickle (where the suffix could even formally be called .gtab) since passing gtab object itself doesn’t play nice with workflows. In other words, vector metadata really needs its own format, too (e.g. so that B0 indices are not lost, and so that we can store tau, deltas, qvals, etc.). It would be nice though if this could generalize beyond python, perhaps finding a way to house vector metadata cleanly in .json? (Relatedly, a json read/write method for the gradient table class in DiPy could help with this)

@dPys

oesteban commented 4 years ago

my 2ct: Pickle is just a serialization of data specific to Python. I believe dipy or any other software could have simple readers that build the desired object directly from the tsv file. In addition, pickle is not human readable. Not sure it is really helpful in derivatives either.

arokem commented 4 years ago

+1 to the general idea. I also like the idea of a tsv file for this. A json sidecar could potentially be used to discriminate between b-matrix and b-vector representations. I agree that reading and writing these (e.g., in DIPY) should be straightforward to implement. The json sidecar could also be used to specify the units of the b-values (which could vary in their order of magnitude).

garikoitz commented 4 years ago

(thanks Oscar) +1 for me too, I like the tsv + json idea as well. Tsv with column names to avoid confusion and the json file with the extra required information.

Just thinking aloud, not sure if this makes sense but happy to hear opinions:

For example, files generated by dmriprep will be the input to other different analysis tools. Once you start with the analysis pipeline is always good to know what happened to your origin files (rotated by ..., flipped x by ..., etc. )

On Sat, Oct 19, 2019 at 3:20 PM Ariel Rokem notifications@github.com wrote:

+1 to the general idea. I also like the idea of a tsv file for this. A json sidecar could potentially be used to discriminate between b-matrix and b-vector representations. I agree that reading and writing these (e.g., in DIPY) should be straightforward to implement. The json sidecar could also be used to specify the units of the b-values (which could vary in their order of magnitude).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/bids-standard/bids-specification/issues/349?email_source=notifications&email_token=ABCZAVYMJ4UVRHSHARC4RCTQPOB4HA5CNFSM4JCIMVXKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBX52BY#issuecomment-544201991, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABCZAVY4JA3RGJV7Z6G57SLQPOB4HANCNFSM4JCIMVXA .

oesteban commented 4 years ago
  • could the json have information related to the diffusion file as well? I mean, for example, {LAS/RAS, [-1,2,3,4]/[1,2,3,4], neurological/radiological} information that could be used to check the b values against the file for extra security

Are we sure there's a need for neurological/radiological conventions given that there's no data array to be interpreted (the image)? What having several options buy you in this case?

I would enforce scanner coordinates. You can even enforce vectors to be normalized and that will be easy to check for the validator.

Also, I'm in favor of having an optional JSON sidecar, with sensible defaults that enable the tsv to be univocally interpreted when the JSON is not provided (i.e., it is not necessary).

  • and maybe history of manipulations of the b-values?

This does not apply to BIDS-raw (which is what we are discussing here). I would also be resistant to encode provenance in BIDS-Derivatives, but it could be arguable.

@mattcieslak - I think there's an appetite to explore your idea, would you mind drafting a PR to the bids specification proposing what you get from this discussion?

Thanks @effigies for the heads-up.

mattcieslak commented 4 years ago

It sounds like we're all in agreement with the format @effigies suggested. The json sidecar @arokem suggested with measurement info like big and small deltas would also be very useful. I will get the PR started, thanks everyone for your responses!

jhlegarreta commented 4 years ago

@jhlegarreta, storing b-tensor encoding would be possible with a 7-column tsv file?

@oesteban Not sure that would cover any possible diffusion encoding trajectory. I ignore whether there is consensus on how such a trajectory can be represented/described so that it can be formalized into a tabular format.

In any case, the TSV (with whichever columns that it takes to encode all possible trajectories) + JSON solution seems to me appropriate a priori.

chrisgorgo commented 4 years ago

I think it's great to rethink the best file format for this metadata from scratch. The limitations of bvec/bval justify it. Thanks for spearheading this @mattcieslak.

The only thing I would be worried about is increasing the number of methods the same piece of metadata can be represented. In other words, there might be a situation when someone develops an app that requires the optional .tsv format and ignores bvec/bval even if they are present. The opposite is also plausible. It might be worth discussing if the new format is aimed to replace the old one in some upcoming backward compatibility breaking release.

garikoitz commented 4 years ago

@oesteban I was thinking on BIDS derivatives when I was thinking aloud, you are right. And in that case I would argue for some form of provenance, yes!

But, +1 to tsv and json

On Sat, Oct 19, 2019 at 5:02 PM Oscar Esteban notifications@github.com wrote:

  • could the json have information related to the diffusion file as well? I mean, for example, {LAS/RAS, [-1,2,3,4]/[1,2,3,4], neurological/radiological} information that could be used to check the b values against the file for extra security

Are we sure there's a need for neurological/radiological conventions given that there's no data array to be interpreted (the image)? What having several options buy you in this case?

I would enforce scanner coordinates. You can even enforce vectors to be normalized and that will be easy to check for the validator.

Also, I'm in favor of having an optional JSON sidecar, with sensible defaults that enable the tsv to be univocally interpreted when the JSON is not provided (not necessary).

  • and maybe history of manipulations of the b-values?

This does not apply to BIDS-raw (which is what we are discussing here). I would also be resistant to encode provenance in BIDS-Derivatives, but it could be arguable.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/bids-standard/bids-specification/issues/349?email_source=notifications&email_token=ABCZAV7UQDIIPBQSZIBPVQDQPONYNA5CNFSM4JCIMVXKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBX7DTY#issuecomment-544207311, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABCZAV6RWYHROTDJSIDC6DDQPONYNANCNFSM4JCIMVXA .

effigies commented 4 years ago

@chrisgorgo:

there might be a situation when someone develops an app that requires the optional .tsv format and ignores bvec/bval even if they are present. The opposite is also plausible.

I was thinking of this earlier, and I was imagining the correct response to either accept both formats, or support one and, if only the non-accepted format is available, error with an explicit message.

I'm not sure how much this violates existing principles or just adds complexity to BIDS Apps.

chrisgorgo commented 4 years ago

As a side note - the question of how to introduce the new format can be separate from the question of what is the ideal new format. Admins might want to split the discussions for clarity.

Now back to logistics. I see the following options: 1) Require the metadata in either new or old format.

Option 3 could lead to 4 after some depreciation period. Tooling could be built to assist in updating datasets.

mattcieslak commented 4 years ago

If you have a RASB tsv file and a dwi nifti file it would be very easy to generate correct bval and bvec files based on the nifti header. It's the other direction that is tricky. If an RASB tsv were available I would use that first. Otherwise the app would fall back to using the bvals and bvecs at the user's own risk. Unless the BIDS app completely forgoes using FSL, bvals and bvecs will need to be created at some point anyways.

Also, is this thread suggesting creating a json sidecar for the dwi.tsv file or adding some additional fields to dwi.json? The PR #352 implements the latter at the moment

dPys commented 4 years ago

I was proposing a separate json to house derivative metadata for the vectors represented in .tsv. This could include tau, big/small delta, b0 indices, vector orientation relative to image orientation, the gradient table itself with b-matrix / b-vector representations, and units of the b-values (as @arokem suggested). I also agree with @oesteban 's reluctance to include provenance in the json itself, though I do think some type of change log would be useful.

oesteban commented 4 years ago

Thanks very much @mattcieslak for submitting #352. I'm going to lock this thread so that conversation continues there.

francopestilli commented 4 years ago

I just found this thread. I made an independent comment about the proposed tsv file. https://github.com/bids-standard/bids-specification/commit/1d50d14bed67e97892d29ee4edcfbc68a8388d59#r35716642

I see now you guys have been discussing this. Which is great. If we all agree. Please disregard my comment.