MRtrix3 / mrtrix3

MRtrix3 provides a set of tools to perform various advanced diffusion MRI analyses, including constrained spherical deconvolution (CSD), probabilistic tractography, track-density imaging, and apparent fibre density
http://www.mrtrix.org
Mozilla Public License 2.0
281 stars 176 forks source link

Supporting multiple .tck files with tractography-related commands #1567

Open sheljohn opened 5 years ago

sheljohn commented 5 years ago

This issue follows a discussion started on the community forum. I would like to submit a proposal for an enhancement, namely the support of multiple input .tck files (as opposed to a single one currently) wherever possible.


Problem

Tractography on large datasets is typically run as parallel jobs on computing clusters. There are a number of advantages to divide the computation of a given number of tracts into smaller batches:

Unfortunately, most of the commands in MRtrix (and indeed the source-code itself) do not support multiple .tck files in input.

Proposals

Wherever possible, tractography-related commands should support multiple .tck files in input. There are several ways this could be implemented in practice:

0. Variable number of arguments

As with the tckedit command for instance. I think this is a bad idea, because extending support for multiple command-line arguments disrupts the current interface of several commands; this implies a lot of replicated effort to modify each command individually, and potentially breaks backwards compatibility. I don't think this is a viable solution.

1. Built-in support for multiple tract files

This is what I would personally prefer, but it involves modifying/extending the existing source code. The idea is to introduce a new file-format with extension .lst, which contains one filename per line, and detect this extension internally in order to iterate over the files. The commands remain exactly the same, and the change does not necessarily apply only to .tck files.

Broadly, this involves a rewrite of the class MR::DWI::Tractography::Reader, to include a behaviour similar to the current implementation of MR:DWI::Tractography::Editing::Loader.

The difficult parts are:

I have started implementing this on a fork of the master branch, and should have a compiling version today. This mainly involves:

2. Piping for streamlines

This relates to the discussion in issue #480. I could not do a better job of summarising this idea than @jdtournier and @Lestropie, so perhaps they can elaborate on my short description; but as far as I understand, this leverages the ability of the host system to stream data, in order to virtually concatenate the streamlines at runtime. I am not sure:

jhadida commented 5 years ago

Wrong user account, sorry! I'll keep using the wrong one to avoid reposting...

Lestropie commented 5 years ago

Given I've recently been working on #1555, my first instinct is that this same syntax could be used for streamlines data. So one would rename the individual files to conform to the necessary requirements, and then simply use the square-bracket notation when specifying the input track file at the command-line. Tractography::Reader would then be responsible for parsing the headers of all input files in order to fill Tractography::Properties with the consensus contents, and moving from one file to the next as streamlines data are loaded.

This isn't actually incompatible with providing a text file with a list of file names; it's maybe a little more consistent with existing capabilities, but conversely it's maybe a little less flexible.

411 is also relevant; particularly for piping of streamlines data, but also I suppose that in merging this with approach 1, there could then be one particular "handler" for when the input is detected as being a text file with a list of filenames, and that would then be able to invoke the appropriate handler for each track file independently in turn.


As far as the piping is concerned, my thinking was as follows:

Don't see any reason why this wouldn't be constrained to Unix only. There wouldn't be any new filetype required: the - at the command-line for either read or write of track data would be sufficient.

sheljohn commented 5 years ago

@jdtournier @Lestropie

I have something working here.

There are 3 main commits:

This code:

Please let me know if that would be good for a PR or not :)

sheljohn commented 5 years ago

FYI, the version that was online this afternoon had a typo in it (which would have made compilation fail); I messed up something with the interactive rebase this morning, and didn't notice it until I pulled it somewhere else and tried to build it there. All should be in order now; 4 commits ahead of master, builds and tests fine.

jhadida commented 5 years ago

I went ahead and opened a PR #1569 : happy to extend / amend / retract.

Lestropie commented 5 years ago

properties_consensus function

I'm still in the process of catching up on this, but just wanted to comment on this bit specifically before I read the rest:

In #1555, I make more extensive use of the Header::merge() function to construct the "consensus" header (which is also modified in that PR in order to support its use in this way; previously that function was exclusively for managing the square-bracket notation). It might be preferable, given the functionality for handling multiple instances of Tractography::Properties is a very similar operation, to have the same functional interface.