image_data.rst - typo in strides for PIR orientation

I am trying to understand the definition of data strides in documentation. Quite confused me that strides of PIR data storage are specified as -2,-3,1. According to my understanding it should be 3,-1,-2. I have checked this also by running mrconvert and inspecting the result in freesurfer's mri_info which provides the info about the direction of image axes.

Also, maybe it should be more clearly described (ideally with some example) how the strides specification is translated into data ordering. There is only provided RAS example which is trivial.

My understanding for conversion is as follows:

From data ordering to strides: for each direction (RAS) which defines x,y,z axes of NIfTI coordinate system, find its order and direction in data ordering to be translated. For example PIR ordering: Take first letter from RAS, R (corresponding to x NIfTI axis), it is in 3rd position of PIR and correct direction, therefore first stride is 3. Take second letter, A, it is in 1st position of PIR and reverse direction, therefore second stride is -1. Take S, it is in 2nd position of PIR and reverse direction, therefore third stride is -2. From strides to data ordering: Pick stride number and pick the letter from RAS in the order and direction specified by number. Example: 3,-1,-2: Pick 3, therefore R is at 3rd position. Pick -1, therefore A is at 1st position reversed (->P). Pick -2, S is at 2nd position reversed (->I). Hence PIR. Generalization to 4rd dimension: From data ordering to strides: Specify where the 4th dimension data should be stored (its stride) and insert the number to the 4th position of the strides specification. I.e. for volume-contiguous PIR, insert 1 at 4th position and increment other strides by sign(strideNumber)*strideNumber, hence strides are 4,-2,-3,1. For 4th dimension data stored on the most outer stride in PIR, the strides are 3,-1,-2,4. From strides to data ordering: Find position of number 4, this is the order of stride of 4th dimension.

The understanding of symbolic strides could be also be inferred to imagine how the nested for cycles in hypothetical data reading code are organized. I.e. the index which increments fastest has stride 1 etc.

Is my understanding correct?

Antonin

Yes, this one is never easy to communicate, maybe I should try to explain that better. I think you're right, the PIR example in the docs is wrong, it should read 3,-1,-2 as you say - my apologies, I'll fix that up. I personally never think about strides in these terms, maybe that's why I got confused...

So the way I interpret strides, taking your PIR example, is by saying:

fastest stride (1) is anatomical A→P, i.e. scanner -y → -1 in second (y) position
next fastest stride (2) is anatomical S→I, i.e. scanner -z → -2 in third (z) position
next fastest stride (3) is anatomical L→R, i.e. scanner +x → 3 in first (x) position

giving the expected strides 3,-1,-2, as you say.

But really, the only real way to understand this is by thinking in terms of the strides needed to navigate the data properly - i.e. the mapping from voxel coordinate to index into the linear data array. If you imagine a 2×2×2 image, its actual strides in RAS would be 1,2,4. In other words, to obtain voxel value at position (i,j,k), you need to retrieve the value in the data array at offset n = i + 2 j + 4 k; i.e. the next voxel along k is a whole plane's worth of data away, in this case 4 voxels, etc. If these data were stored in PIR, the actual strides for these data would be 4,-1,-2.

But this is for a 2×2×2 image. If the image had been 5×5×5, its actual strides for a PIR image would have been 25,-1,-5. This gets more complicated when the dimensions differ along the different axes. For example, if the image dimensions were 3×4×5, and the image was stored RAS, the actual strides would be 1,3,12; for a PIR image, this would be 20,-1,-4.

As you can see, the actual strides are intimately dependent on the image dimensions. So to make the strides specification independent of the image dimensions, we express them as symbolic strides, where all that matters is their relative absolute magnitudes. In this way, we interpret 1,2,3 as equivalent to 2,4,120 or 1,10,11: if you rank them by absolute magnitude, and write down the list in terms of their (signed) ranking, they come out the same. For convenience and clarity, we generally reduce these strides to their simplest expression, which is that given by their signed ranking, so RAS is 1,2,3, PIR is 3,-1,-2, etc. rather than equivalent specifications like 1,5,25 or 4096,-1,-64, etc. so in practice you can pass these to mrconvert and it should produce the same outcome.

The above treatment extends naturally to arbitrary numbers of dimensions, which is probably the reason why we settled on this terminology. There are no additional labels for the 4th, 5th, etc dimensions in NIfTI. Besides, NIfTI can't store data that arbitrarily anyway, we can only modify the first 3 strides, but never have volume-contiguous storage (although I think there's a way to do it in the specs, but I don't think many software packages support that).

Does this help clarify things...? If so, I might have to think about how to update the docs with something like this. Feel free to tell me whether anything I've said actually helped or not, or suggest ways of improving what's in the docs - always appreciated.

Dear Donald,

thank you for the feedback and clarification. I was not used to think in strides, therefore it requires some time to get used to that. I had to think about your examples with differing dimension across axes, since it was not clear how to interpret to which axis the particular size belongs. If I say that image dimension for PIR image is 3x4x5, it means, that 1) size along NIfTI axis x (L->R) is 3, along y(P->A) is 4, and along z(I->S) is 5, or 2) the size along dimension with fastest stride (i.e. P) is 3, along I is 4, along R is 5? Your stride numbers 20,-1,4 for 3x4x5 image PIR refer to the interpretation 1).

If I print the image info by mrinfo -dim, or fslinfo (FSL), or mri_info (FreeSurfer) what is the interpretation of values? It seems that for fslinfo and mri_info the interpretation is 2) (i.e. the order of numbers changes when the strides via mrconvert -strides are changed), whereas for mrinfo the interpretation is 1) if the -norealign option was NOT provided. If the -norealign IS provided, then the interpretation is also 2), if I understand it correctly. This -realign / -norealign option complicates things even more, but it is described in the documentation, so hopefully everyone can grasp that.

However, it is so easy to get confused with this...

However, it is so easy to get confused with this...

Definitely... I'll try to clarify how I think about this stuff, hopefully I'll answer your actual question somewhere in there - just bear with me.

So the main thing to understand first off is that there is an interaction between the transformation matrix and the strides. The way I think about this (and is how these issues are represented within the MRtrix3 codebase itself), is via 3 different coordinate systems (note that this isn't necessarily how everyone conceptualises these things, it's just my way):

voxel space: a straight N-dimensional array, with voxels arranged on a discrete grid, and voxel centres at integer locations along its axes.
image space: this simply scales the voxel space by the voxel sizes, so that voxel centres are now the correct distance apart. Its axes point in the same direction as the voxel space.
scanner space (also referred to as the real, world, or patient coordinate system): this is your anatomical frame of reference, i.e. where you are relative to the isocentre of the scanner, etc.

Then:

the strides provide information that allow you map a voxel location in voxel space to that voxel's index (offset) in the data array.
the transform (embodied in NIfTI as the qform or sform) provides the rigid-body rotation/translation information to allow you to map from image to scanner space.

So the issue is that there is some redundancy between these two things: if I was to switch the strides so that we now store voxels contiguously along y, then x in voxel space instead of the standard x then y, that is essentially equivalent to using a modified transform where the image x & y axis are exchanged - without explicitly changing the strides as such. In both cases, the voxel values are stored in the same order on file, and the location of any given voxel is the same in scanner space. This is where things get a bit messy, since different packages will deal with this differently (well, to be honest I think it would be more appropriate to say that MRtrix3 deals with this differently from everything else...).

If we focus on NIfTI in particular, things are relatively simple since that format assumes fixed, standard strides: voxel intensities are always stored contiguously along the x-axis, then each row is stored contiguously along the y-axis, etc. (what we would refer to in MRtrix as strides 1,2,3,...). The issue here is that the x-axis we're talking about here is actually the voxel space x-axis, i.e. that specified in the transform (qform/sform), not the anatomical x (patient left-right) axis. So when you say:

size along NIfTI axis x (L->R) is 3, along y(P->A) is 4, and along z(I->S) is 5

there's a bit of confusion here with what is meant by the 'NIfTI axis'. Using your example, the size along the voxel space x-axis is indeed 3, but whether that corresponds with scanner space L->R depends on the transform. For a PIR image, that would actually map to A->P, which is a closer match to your interpretation 2... So, I think I can see where the confusion comes from, but before I go into that, I just want to clarify a few things about how MRtrix3 handles things:

When using mrinfo -norealign, you get a dump of exactly what was in the NIfTI header, with no further modification by the MRtrix3 backend. For a NIfTI image, the strides will always be 1,2,3,... - that's implicit in the format. The transform can however represent a very large 90° rotation that takes the voxel space x-axis and maps it to the y or z axis, etc. This manifests as a transform matrix whose upper left 3×3 (rotation) quadrant deviates strongly from the identity (e.g. you get something like [ 0 1 0; 1 0 0; 0 0 1], which corresponds to swapping the x & y axes around).

However, when you don't use the -norealign option, you allow MRtrix3 to modify this information in a way that maintains the consistency of the data, but also removes some of the ambiguities between the transform and the strides. The idea behind that operation is to bring the transform back to near-axial, i.e. to something that represents a small rotation (just to account for a slightly oblique acquisition plane, basically) by permutation of the axes, with corresponding adjustments to the strides. We discussed this operation and the rationale behind it to death in this issue on our previous GitLab repo [EDIT: I can't make it public, there's too many private branches on there - I'll repost the relevant issue content here]. Long story short: I really do think doing this is a good idea, it allows data stored using different storage conventions to be reasoned about, processed, and displayed as expected most of the time.

So having clarified this, I think the confusion probably arose due to my implicit assumption about what is meant by an image with dimensions 3×4×5. Since in MRtrix3 we always have these dimensions reported after the transform has been modified, that means that the dimension along the x-axis genuinely is that along L->R, etc. So when I talked about an image with dimensions 3×4×5, and then storing that image as RAS or PIR, those dimensions assumed your interpretation 1 - and if you think about it, it has to, anything else would mean the resulting images would no longer be equivalent. However, technically what is stored in the NIfTI header is quite different now, with a very different transform matrix, different dimensions along each axis, and voxel values stored in a different order on disk. Most NIfTI-handling tools would report exactly what was stated in the headers. In contrast, in MRtrix3, NIfTI is just one of a number of possible image formats, and is therefore treated in ways that simplify concurrent processing of these different types of images. In particular, MRtrix3 will modify the information to make it more consistent with any other images that might be processed alongside it. For instance, if you were to create a small RAS image with dimensions 3×4×5, e.g.:

mrconvert some_image.nii -axes 0:2 -stride 1:3 -coord 0 51:53 -coord 1 51:54 -coord 2 51:55 ras.nii

then convert that to PIR:

mrconvert ras.nii pir.nii -stride 3,-1,-2

and inspect with mrinfo ras.nii pir.nii, you'll see that the images are reported as being essentially identical in every respect other than their strides:

************************************************
Image:               "ras.nii"
************************************************
  Dimensions:        3 x 4 x 5
  Voxel size:        0.9 x 0.898438 x 0.898438
  Data strides:      [ 1 2 3 ]
  Format:            NIfTI-1.1
  Data type:         unsigned 16 bit integer (little endian)
  Intensity scaling: offset = 0, multiplier = 1
  Transform:               0.9992     0.04071   0.0004318      -41.47
                         -0.04071      0.9992     0.00267      -38.63
                       -0.0003228   -0.002686           1      -83.89
  comments:          TOURNIER DONALD (0
  mrtrix_version:    0.3.15-449-gb5eeb174
************************************************
Image:               "pir.nii"
************************************************
  Dimensions:        3 x 4 x 5
  Voxel size:        0.9 x 0.898438 x 0.898438
  Data strides:      [ 3 -1 -2 ]
  Format:            NIfTI-1.1
  Data type:         unsigned 16 bit integer (little endian)
  Intensity scaling: offset = 0, multiplier = 1
  Transform:               0.9992     0.04071   0.0004318      -41.47
                         -0.04071      0.9992     0.00267      -38.63
                       -0.0003228   -0.002686           1      -83.89
  comments:          TOURNIER DONALD (0
  mrtrix_version:    0.3.15-449-gb5eeb174

This is as it should be: the information content is identical between the two images. All we've done is order the voxels differently on file... So any MRtrix3 application that needs to process these two images together will work as expected. For example, subtracting the two images voxel-wise works exactly as expected, returning an image that contains zero values everywhere - even though the voxel values in the two images are ordered differently on file:

$ mrcalc ras.nii pir.nii -sub - | mrstats -
mrcalc: [100%] computing: (ras.nii - pir.nii)
         volume         mean       median    std. dev.          min          max       count
         [ 0 ]             0            0            0            0            0           60

However, it you look at the information in the headers without allowing MRtrix3 to modify anything, i.e. using mrinfo pir.nii -norealign, you'll see that the image dimensions, voxel sizes, and transform are indeed reported as different, with the transform now containing a very large rotation. In fact, the only thing that hasn't changed is the strides:

************************************************
Image:               "pir.nii"
************************************************
  Dimensions:        4 x 5 x 3
  Voxel size:        0.898438 x 0.898438 x 0.9
  Data strides:      [ 1 2 3 ]
  Format:            NIfTI-1.1
  Data type:         unsigned 16 bit integer (little endian)
  Intensity scaling: offset = 0, multiplier = 1
  Transform:             -0.04071  -0.0004318      0.9992      -41.36
                          -0.9992    -0.00267    -0.04071      -35.92
                         0.002686          -1  -0.0003228      -80.31
  comments:          TOURNIER DONALD (0
  mrtrix_version:    0.3.15-449-gb5eeb174

Clearly, if you were to subtract that from the original RAS without the modifications we normally do, the images would simply not match, and the application would either bail out with a message reporting the mismatch, or it would just crash out. This is why MRtrix3 performs these modifications.

Hopefully that clarifies what's going on here...?

Here is a dump of issue 74 on GitLab referenced above, just for completeness:

Should MRtrix reshuffle the transform on load by default?

J-Donald Tournier @jdtournier commented 2 years ago

The current default for MRtrix is to massage the image transform into as near an axial orientation as possible by switching columns and/or inverting them. This is all done in a self-consistent manner, of course: any column switch will be accompanied by the corresponding changes to the data strides, and any inversion of an image axis will be accompanied by a change in the origin component of the image transform, leaving the data themselves untouched.

I've always thought this was a nice feature since it allows images to be displayed in a much consistent manner in the viewer, regardless of the way they were acquired and/or stored. It also allows images that have been stored using different transforms / strides to be processed together sensibly - most of the time. Issues occur when operating on images, one of which has had its transform modified such that it has been rotated from its original orientation by an amount sufficient to trigger a transform reshuffle (or at least, to trigger a different transform reshuffle). The problem then is that trivial operations such as a straight copy will end up looping over different axes from what might have been intended - for example what was x in the original image might now be considered to be y after the reshuffle - which obviously isn't what the developer would have intended. For now, this issue has been fixed in mrtransform by introducing a flag to prevent the reshuffle (commit a09502fe), but that's not the prettiest hack I've ever come up with...

So the question is: should that behaviour be removed altogether, and perhaps limited to the viewer alone? It would be trivial to shift that code out of the default path in the Image::Info class, and added as an additional explicit member function where required. I'm trying to figure out whether this is a good idea or not. My worry is that odd things might happen if for example one image is converted to a different format that doesn't support the strides requested, so that a transform reshuffle has to be done for the output image, and a subsequent attempt is then made to process this image along with the original. Without doing the kind of reshuffling that MRtrix currently does by default, it would be almost impossible to guarantee the correct behaviour in such a case - the x axis would just be whatever the first axis was, regardless of whether it corresponds to the other image's x axis (which might have been AP for a sagittal acquisition, for example). I'm pretty sure these kinds of issues happen regularly with FSL for example (they certainly do when converting non-axial images to NIfTI using MRtrix 0.2, and trying to overlay them in FSLView on top of their original image).

Is this something worth worrying about, or is the current behaviour the safest way to proceed, with the option of disabling it in the few cases where it is not desired...? If the latter, suggestions are also welcome for a cleaner way of preventing the transform reshuffle...

Dave Raffelt @draffelt commented 2 years ago

Just a couple of questions to make sure I understand (I've always wanted to get my head around this):

What do you mean by near an axial orientation as possible? Do you mean x,y,z corresponds to axis 0,1,2 like most axially acquired images?
When you say this helps images viewed in a consistent manner, do you mean that a sagittal acquisition will load up and be correctly lined up with an axial acquisition? I thought that MRview should be able to work this out on the fly without jigging the transform and strides on image load.
When you say images can be processed together sensibly, do you mean that looping over axis 0,1,2 for both a sagittal and axial acquisition ensure that corresponding voxels are processed simultaneously? (something that would not happen very often in practise I guess)
What do you mean by a transform reshuffle? Do you mean that during a mrtransform, if you transform/rotate the image (by modifying the header transform) by a large amount, then the output axis get shuffled?

If I've understood correctly, then I'd probably vote for what you are doing now, just disabling it where it is a problem.

J-Donald Tournier @jdtournier commented 2 years ago

OK:

by near-axial, I mean that the first column of the transform is close to (1,0,0) (x-axis), the second to (0,1,0), and the third to (0,0,1). In other words, the rotation part of the transform is as close to identity as possible by swapping/inverting the axes. Or as you put it, axes 0,1,2 correspond as closely as possible with genuine scanner-space x,y,z.
this helps images viewed in a consistent manner: yes, basically when I request an axial orientation, I want to get an near-axial regardless of whether the image was acquired axial or sagittal or whatever. You're right that there is no reason why MRView couldn't do this by itself most of the time, and would do so without modification in every case apart for when the viewer is locked to the image axes - this is where MRView assumes that axes 0,1,2 do correspond to spatial x,y,z. No reason why we couldn't also work that out on the fly, but since that's always been a safe assumption in MRtrix, there was never any need to.
images can be processed together sensibly: yes, if an image is converted or processed and ends up with different strides for whatever reason (format conversion, or a specific process that enforces a specific set of strides on its output - not uncommon), then any further process that loops over both the original and processed image together will probably end up looping over different axes. You're right, probably wouldn't happen a lot in practice, but Murphy says it will happen. And when it does, massive confusion will ensue...
transform reshuffle: whenever an Image::Buffer is instantiated, the Image::Info::sanitise() method is invoked, and part of that will reshuffle the transform to make it near-axial. As I said before, this involves swapping axes around (along with the corresponding strides), and inverting axes (and their corresponding strides, as well as shifting the origin to the opposite corner) until the rotation part of the transform is near-identity. What that means is that if I load an image and get its header, then modify the transform of that header with a large rotation, then create a new output image based on that header, chances are the transform will be reshuffled for the output image. What this means is that while axes 0,1,2 for each buffer are as close to spatial x,y,z as possible, there is no longer any guarantee that axes 0,1,2 of the input image correspond to axes 0,1,2 of the output image. For instance a 90° rotation about the z-axis will probably mean that if the strides of the input image were 1,2,3, the output image will probably end up with strides 2,-1,3, and the loop will therefore iterate over x,y,z for the input image, and y,-x,z for the output. Not cool.

I must admit, as I was writing the original issue, I went from not too sure the reshuffle was necessary to pretty convinced it was absolutely required. Especially since MRtrix allows applications to specify the output strides, so there is no guarantee that an image processed by MRtrix will end up with the same strides as the original. This is in my opinion a good thing, but it does mean we have to take special precautions to ensure it doesn't produce weirdness down the track.

One other option that I hadn't thought of till now, is to only reshuffle the transform on image load, and leave it as-is on image create. This would ensure that any images read will align properly, but prevent situations like the one I've outlined above when creating a new rotated image. In 99% of cases, the input transform is used as-is to create the output image, which would imply that it's already been reshuffled - and any app creating the output transform from scratch will probably create a near-axial one anyway. That would prevent the issue in mrtransform, and remove the need for this ugly ProtectTransform hack - which I don't expect other developers to understand / know about until they've reached maximum frustration trying to figure out why their regridding app is screwing up every so often... Might start working on that when I have a minute, if nobody has any objections?

Dave Raffelt @draffelt commented 2 years ago

Thanks for explaining it all, makes good sense now. Sounds like a good solution to me. Maybe we could also add an option to mrinfo that allows you to display the unshuffled raw transformation/strides (assuming it currently displays the shuffled versions)? This might be helpful for people that are having issues with compatibility or trying to better understand the whole shuffling process.

J-Donald Tournier @jdtournier commented 2 years ago

Good idea re mrinfo. I'll work on that when I have a minute

Dear Donald, thank you for your time and patience. I am more and more surprised (but also frustrated) how the things so simple at first glance are so complicated and confusing in reality....

Introducing image space in context of mrtrix brought a confusion to me since it was not clear to me, how do you define this space. What direction the particular x,y,z axes of image space really point at? The data are usually 3D array and can be viewed in whatever orientation, so the definition of image space cannot be related to way the image is displayed. Do I grasp it correctly according to your statement

The issue here is that the x-axis we're talking about here is actually the voxel space x-axis, i.e. that specified in the transform (qform/sform), not the anatomical x (patient left-right) axis.

that the direction of x,y,z axes of voxel/image space in your understanding is defined by their corresponding columns in qform matrix (i.e. first column defines x axis, second column defines y axis, etc)?

In contrast, my previous understanding of axes in voxel and image space was primarily related to way how the voxel indexes in file are arranged and their position is mapped by qform, i.e. x axis defines most contiguous dimension in file (with stride 1), y axis is second etc.

That it is also in conformance with NIfTI standard, where

The i index varies most rapidly, j index next, k index slowest.

https://nifti.nimh.nih.gov/pub/dist/src/niftilib/nifti1.h where only x,y,z indexes/coordinates are labeled i,j,k.

The concept of "variable strides" introduces one more level of complexity above that. Maybe it would help to provide in the documentation extension of the original nifti1 formula (or point to particular piece of source code) how precisely the index from file data array is mapped to the mrtrix scanner space RAS coordinate.

size along NIfTI axis x (L->R) is 3, along y(P->A) is 4, and along z(I->S) is 5

By NIfTI axis I meant the axis in scanner space, which is defined as x pointing from left to right, y from posterior to anterior, z from inferior to superior (RAS).

On the other hand, I understand that by introducing concept of strides you can assure that orientation of x,y,z axes in coordinate system you use in mrtrix is constant, identical to NIfTI standard http://mrtrix.readthedocs.io/en/latest/getting_started/image_data.html#strides 0 (x) increasing from left to right 1 (y) increasing from posterior to anterior 2 (z) increasing from inferior to superior

This is in contrast to image space coordinate system (as I understand its definition) where each axis can point to whatever direction.

This approach has big advantages, for example (and here I am replicating your previous appended post):

If you for example specify that the image size is 3x4x5, you do not need to further specify what "orientation" the image is, i.e. what is principal direction of the axes.
The output of mrinfo is better human - readable since one usually is not interested what storage type the image is.
And, most importantly, the images with different order of storage/image space axis orientations do not get messed up in mrtrix internals.

But, it seems to me, that to grasp this concept it requires to abandon image space and think only in mrtrix/scanner space.

I have one more think to discuss which is only partially related to this topics (and maybe deserves stand-alone chapter): I found that when qform_code is set o 0 in NIfTI dataset, fslview does not display orientation labels. In contrast, mrview produces [WARNING] transform matrix contains invalid entries, resetting to sane defaults and shows labels. I think the more safe approach would be NOT to show labels, since they can be totally wrong.

And, this generally possibly brings complications to all image handling routines which make use of mrtrix coordinate system (which implies that image transform has to be valid). What is general behaviour in this situation? Does mrtrix allow to process these files?

It seems the more I try to explain this, the more confusion ensues...

OK, the first thing is to cleanly separate the NIfTI standard from MRtrix3's implementation and handling of images in general. Bear in mind that MRtrix started life before NIfTI came along, and supports a number of other image formats besides NIfTI - in fact NIfTI support came relatively late to MRtrix, well after its initial release (bearing in mind that it had been used in-house for quite a few years before then). So a lot of the image handling is designed to be agnostic to the actual format of the images, and to provide a coherent view of the data regardless of the format these data might be stored in.

Back to NIfTI: it is clearly described in the file you linked to (which is actually included wholesale in MRtrix3), and that provides the basis for interpreting these images. The NIfTI standard has no concept of (variable) strides: it states unambiguously (as you mentioned) that:

The i index varies most rapidly, j index next, k index slowest.

It also states that the axes along which i,j,k point in scanner space are provided by the qform/sform (although it's not clear which should be used when both are present). So that's pretty much all we need to know to unambiguously map a voxel value in the file to its position in real space.

Everything else after that is how MRtrix3 handles these images - and this applies not just to NIfTI format images, but any other format too, be it DICOM, mif/mih, mgh/mgz, Analyse, etc. But first, I'll try to clarify exactly what space is what, where strides come into it, and how that all relates to NIfTI.

scanner space is as described previously, and the MRtrix3 coordinate system matches exactly with the NIfTI standard (a happy coincidence for us given that NIfTI came along afterwards...).
voxel space is the grid spanned by the voxel centres, i.e. the i,j,k indices in NIfTI when the strides are 1,2,3 - this will not be true otherwise (more on that below). When the strides match the NIfTI standard's stated ordering, then the axes of the voxel space coordinate system will indeed point along the vectors given by the columns of the qform/sform matrix.
image space shares its axes with voxel space, but the axes are unit vectors. A unit displacement in image space is 1mm, same as in scanner space. In contrast, a unit displacement in voxel space will be the voxel size along that particular axis, and that will be direction-dependent if the voxels are not isotropic.

So all this says that when the strides of the image are 1,2,3, then everything is as you expect. This is what mrinfo will report if the image happens to be RAS, or if you provide the -norealign option. The MRtrix3 backend will in fact read this information from the NIfTI headers as-is, assuming strides 1,2,3 as per the standard. At that point, the information from the NIfTI headers has been interpreted as per the standard, and everything should match with what you understand the standard to be.

The bit that is causing confusion here is probably what happens next, the additional transform reshuffle, which happens later in a code path common to all image formats. This is nothing to do with NIfTI per se, it's what MRtrix3 will do to all images regardless of their original format. The reason it can happen at all is that MRtrix3 has a very flexible concept of strides: voxel values can be organised in any order, the data are just treated as a N-dimensional array, and there is no particular expectation that values along the first axis should be contiguous. In fact, for performance reasons MRtrix3 will often rearrange the values so that they are contiguous along the volume axis (i.e. axis 3, indexed from 0) - if the output image format supports it (although I think the only format that fully supports arbitrary strides is MRtrix3's own mif/mih format). To navigate this array, all we need to know are the strides. I've already explained how these are handled, and how a symbolic specification (e.g. 2,1,-3) can be expanded to the actual strides given the dimensions of the array, hopefully that was clear enough then.

It's important to realise at this stage that the strides are conceptually entirely independent of the image transform. The latter only relates to the first 3 (spatial) axes anyway. Now I can take a NIfTI image interpreted as per the standard (i.e. strides 1,2,3), with a potentially large rotation in the transform, and that is already a fully valid image. The reason for the further modifications that MRtrix3 makes is as explained previously: to allow sensible processing of the data alongside other images that might have been processed differently, and been regridded from e.g. PIR to RAS, etc. even though the voxel centres lie on the same exact grid. The only way to allow images regridded like this to be processed as expected alongside their original non-regridded version, in a way that is transparent to the user (or indeed the app developer) is to do the modifications I mentioned previously. We have the flexibility in MRtrix3 to modify the strides, so what we can do is take the axes defined by the incoming transform (i.e. qform/sform for a NIfTI image), and permute/flip them around so that they point close to their corresponding anatomical axes (i.e. scanner space), and adjust the strides accordingly. Note that this now means that image & voxel space no longer correspond with the original qform/sform axes in the NIfTI header, but with the transform matrix created by this realignment process. The voxel space (in MRtrix3 speak - not NIfTI) should now have its axes pointing close to the anatomical L->R, P->A, I->S axes, and you now need to take account of the strides to figure out the offset into the data array of a voxel at index i,j,k.

So a voxel at i,j,k in a PIR NIfTI image that has been imported via this process is actually at -j,-k,i in the original image (note that this means you need a non-zero base offset here, otherwise you will end up with negative offsets). What this translate to as far as MRtrix3 is concerned, is into a different formula for computing the offset. Assuming the image dimensions were 3×4×5 (as per the example above), where these dimensions relate to the scanner frame (i.e. the dimension along the axis closest to L->R is 3, etc), if the image was stored RAS, the actual strides would be 1,3,12, and the offset for a voxel at x,y,z would be x+3y+12z. If the image was PIR, its dimensions as stored in the NIfTI header would actually be 4×5×3 (as per one of my previous posts), its unmodified strides would be 1,4,20, and the offset for a voxel at i,j,k would be i+4j+20k - but note that in this case, i,j,k would not correspond to the x,y,z used in the RAS image. After the transform reshuffle, the new strides would be 20,-1,-4, and the offset for a voxel at x,y,z would be 19+20x-y-4z (note the additional non-zero base offset here), and in this case, the x,y,z indices used here are a direct match to the RAS case. You can verify all this by running e.g. mrstats -debug some_image. For example, using the images from my previous post:

 $ mrstats -debug ras.nii
...
mrstats: [DEBUG] image "ras.nii" initialised with strides = [ 1 3 12 ], start = 0, using indirect IO
...

 $ mrstats -debug pir.nii
...
mrstats: [DEBUG] image "pir.nii" initialised with strides = [ 20 -1 -4 ], start = 19, using indirect IO
...

If you don't do these kinds of modifications, you can end up with issues such as derived images (that may have been regridded) not matching with their original non-regridded image in the display (many reports of that on the various mailing lists and forums in relation to fslview), and other more subtle issues of this type. One of them that we had to deal with for the dwipreproc script was that topup used to produce its output field map with a unit transformation, regardless of the input transform (and the images were often LPS given that's how axial DWI images are typically stored in DICOM), whereas eddy did take account of the transform when applying the field. So the net result was that the distortion field that actually applied was flipped back-to-front and left-to-right compared to what it should have been, with the effect that the distortion correction didn't look right - but often only subtly so. It took a while to figure this out, and now we simply convert everything to LAS (what FSL historically expects given its Analyse legacy) just to make sure we avoid issues of this nature... I should add that this has been fixed in recent versions of FSL, as far as I know. But I'm just using this as an example of why getting this right is critically important, for users and developers alike.

I have one more think to discuss which is only partially related to this topics (and maybe deserves stand-alone chapter): I found that when qform_code is set o 0 in NIfTI dataset, fslview does not display orientation labels. In contrast, mrview produces [WARNING] transform matrix contains invalid entries, resetting to sane defaults and shows labels. I think the more safe approach would be NOT to show labels, since they can be totally wrong.

And, this generally possibly brings complications to all image handling routines which make use of mrtrix coordinate system (which implies that image transform has to be valid). What is general behaviour in this situation? Does mrtrix allow to process these files?

That's a sensible suggestion, but would be very problematic to do in practice, I think. First off, some image formats simply don't provide orientation information - Analyze is a case in point. More to the point, as you allude to, we need to allow processing of these images in general, and in many cases we just have to put something in there, otherwise a lot of the MRtrix3 commands will fall over. This is mostly a consequence of the explicit design decision I made over a decade ago that processing should happen in the scanner frame where possible. For example, even something as simple as importing the bvecs/bvals can't be done in MRtrix3 if the image doesn't have a valid transform, since MRtrix3 will internally convert the DW gradient table to scanner space. As least by resetting the transform to unity, the bvecs/bvals can be imported for that image. Same for computing tensors and FODs: the coefficients stored in both cases are provided with respect to the scanner frame, not the image axes. Tractography also won't work, since the input FODs or tensors need to be provided with respect to scanner space, and the streamlines are to be stored with respect to scanner space also. Also, any image registration needs a valid transform. So I think the simplest approach is simply to warn the user about this, but nonetheless insert some vaguely sensible default in there to allow subsequent commands to proceed.

One possible approach is to detect these cases in MRView, and not display the orientation labels in this case. But I'm not sure that's such a good idea. The point is, in any other command, the image will be interpreted as if it were pure axial (the user will get the warning, but that's it). So it makes sense to have the viewer also show that interpretation, if only so users can see that this is what would happen if they were to process their images with any other commands. Another issue is that the menu entries to change orientation are labelled axial / sagittal / coronal, so the very fact that we display the images and you can chose one of those orientations implies that we've made these very assumptions about the transform. I can see why fslview can afford to drop the labels, since they are simply displaying the image according to how it was stored - the transform information is used purely to display the orientation labels. In contrast, MRView internally displays everything in the scanner coordinate system, no matter which angle the images might be viewed from. So there is a fundamental reliance on the transformation information being available, and I don't see that hiding the labels is really going to help reduce any confusion. If anything, I think our explicit warning about the fact that the transform is invalid and has been reset is a much clearer indication of what is actually going on than hiding the orientation labels...

You also have to take into account that MRView indeed needs this information (since stuff is displayed in scanner space); so hiding the labels doesn't mean they aren't used and relied upon even on that very moment. Apart from the (maybe more formal/abstract) same arguments @jdtournier provides about this in the last paragraph above, maybe consider this simple example: assume you open such an image (where the information is missing) as an overlay to an image where it isn't missing. This alone already shows you need to assume something somehow; how else will you have a definition of how to overlay the problematic image onto the other one? Also, from a GUI point of view, the labels will already be present anyway (because the main image didn't have any issues). We don't show the labels for separate images (or overlays): we just show them for the space, and the images are shown in this space. In a way, even the main image already is "an overlay": the real/scanner/... space (labels and all) "is already there", and anything that is loaded is displayed in it. Note I'm inherently just repeating what @jdtournier has already said, but maybe this way of seeing it makes sense to understand why there is an unavoidable need to assume something, so any image that comes through the front door of MRtrix lives in scanner space by the time it has passed that very front door. 🙂

Dear @jdtournier ,

It seems the more I try to explain this, the more confusion ensues...

Thank you really for the patience, I think that finally you clarified all the aspects of the problem very well :-)

As for the issue how to handle the missing image orientation in mrview, I understand as well. When you want to allow to process images (and image formats) where the orientation information is missing, there is probably no better way to do that providing this internal MRtrix design (whose reasons and advantages you described very well). Maybe, to increase safety and information for the user, apart from producing warning to command line, there could be also popup window with warning message displayed (as it is for example when invalid filename is provided).

MRtrix3 / mrtrix3