dimensions of as_array - Githubissues

SyneRBI / SIRF

Main repository for the CCP SynerBI software

http://www.ccpsynerbi.ac.uk

dimensions of as_array #316

Open KrisThielemans opened 5 years ago

KrisThielemans commented 5 years ago

We have discussed what as_array should return multiple times, e.g. at our 18th Software Meeting. Amazingly, I cannot find an issue for it. So here it is...

This is getting some urgency because we want to support 3D Cartesian MR sequences #267 and TOF PET #315.

MR acquisition data dimensions (ISMRMRD, end of 2017; presumably the current list for 1.1 is here):

samples
coils
kspace_encoding_step_1
kspace_encoding_step_2
average
slice
contrast
phase
repetition
set
segment
user (8)

MR ImageData presumably has about the same number.

PET acquisition data dimensions (for roughly cylindrical systems), see also the STIR glossary

tangential position (sometimes called radial)
views
"sinograms" (squashing segments and axial positions)
TOF
detector layers
energy windows
gates (could be dual or more)
time frames

PET image data dimensions: (see related #205)

3 spatial dimensions
gates (could be dual or more)
time frames or kinetic parameters

(SPECT would add energy windows or isotopes)

Problems:

too many dimensions for most people, where a lot of them will be size 1 ("singleton" dimensions)
using all of them makes code ugly (in MATLAB, "leading" singleton dimensions can be dropped, but not in Python)
squashing dimensions makes it hard to know what's what, and will only make sense for the simplest cases (example: if PET data is TOF but we return it as 3D, any display will be weird).
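
To illustrate the singleton-dimension problem, here is a minimal NumPy sketch. The dimension names and sizes are made up for illustration and are not what SIRF returns:

```python
import numpy as np

# Hypothetical dimension names and sizes for a small TOF PET acquisition
# (illustrative only; not SIRF's actual layout).
dim_names = ("time_frame", "gate", "energy_window", "tof",
             "sinogram", "view", "tangential")
shape = (1, 1, 1, 5, 64, 32, 48)
data = np.zeros(shape, dtype=np.float32)

# squeeze() drops every size-1 axis, so the result alone no longer says
# which of the original axes survived.
squeezed = data.squeeze()
print(squeezed.shape)

# Keeping the names next to the shape makes the surviving axes explicit.
kept = [(name, size) for name, size in zip(dim_names, shape) if size > 1]
print(kept)
```

With a different acquisition (say, 5 gates and no TOF) the squeezed array would have the same number of dimensions but a different meaning for its first axis, which is the "hard to know what's what" problem.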

Related problems:

for large data as_array would have to return a very large array. A proposal to resolve this is #90.

Even harder problems:

MR data doesn't have to be "regular" or "rectangular".
Non-cylindrical PET doesn't fit in the above 3D space tang/view/sino (would make more sense to use 4D)

KrisThielemans commented 5 years ago

MR acquisition data comments during the meeting were:

leave unsorted acquisition data as-is (3D array, MATLAB order: sample x coil x acquisition)
sorted acquisition data. Possible solutions:
i. 19-dimensional array. People can use squeeze (if they don't want to be careful) and reshape.
ii. 3D array, but provide information on what's what such that people can reshape (I guess this is the current situation).
iii. options to as_array, as in acq_array = acq_data.as_array('slice', 'kspace_encode_step1', 'sample'), returning a 4D array: the second index is slice, the third is kspace_encode_step1, the fourth is sample, and the first dimension is the product of all other dimensions (probably better to have 4 options then, though).
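
A rough sketch of how the last option could behave, written with plain NumPy. The function as_array_by, the dimension names and the sizes are all hypothetical; this is not an existing SIRF call:

```python
import numpy as np

def as_array_by(full, dim_names, *wanted):
    """Collapse every axis not named in `wanted` into one leading axis.

    `full` is a fully sorted N-D array and `dim_names` lists its axes in
    order. Both are assumptions made for this sketch.
    """
    wanted_axes = [dim_names.index(name) for name in wanted]
    other_axes = [ax for ax in range(full.ndim) if ax not in wanted_axes]
    # Unwanted axes go to the front, then the wanted axes in the requested order.
    reordered = np.transpose(full, other_axes + wanted_axes)
    wanted_shape = tuple(full.shape[ax] for ax in wanted_axes)
    return reordered.reshape((-1,) + wanted_shape)

dim_names = ["repetition", "slice", "kspace_encode_step1", "coil", "sample"]
full = np.arange(2 * 3 * 4 * 5 * 6).reshape(2, 3, 4, 5, 6)

acq_array = as_array_by(full, dim_names, "slice", "kspace_encode_step1", "sample")
print(acq_array.shape)  # (10, 3, 4, 6): repetition and coil are merged in front
```

In this sketch the merged leading axis runs over (repetition, coil) with coil varying fastest, so the caller still needs to be told that order to undo the merge.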

KrisThielemans commented 5 years ago

I vote for option i. My reasons:

It makes the "discovery" process of "what is in the MR data" easier
option ii will lead to surprising results for the unsuspecting user.
option iii could always be added later. It could also be used to allow people to use option ii.

DANAJK commented 5 years ago

I agree with Kris.

ReconFrame (software from Gyrotools for handling Philips raw data) provides an insight into this issue. Simplified slightly, it handles data and dimensions as follows:

user specifies name of raw data file.
user can specify only certain parameters to be read in (e.g. a specific slice) - useful when debugging and if RAM is limited.
k-space is available as an array with size [nx nprofiles] where nx is the number of samples in a readout and nprofiles corresponds to all the loaded profiles in the order they were measured. - useful if you need the time order, or you want to sort yourself, or the data isn't easily going to fit into a regular matrix.
sort method is called
after sort, k-space data is available as a 12D array. The dimensions are labelled: x – y – z – coils – dynamics – cardiac phases – echoes – locations – mixes – extr1 – extr2 – averages. Two of these dimensions (extr1 and extr2) have different meanings depending upon the type of acquisition (for example, in a diffusion scan they would index b-value and gradient direction).
after FFT of the spatial dimensions, the data available is still 12D.
during the recon process, the sizes of the dimensions change (e.g. with zero-padding or cropping of data, or combinations of coils), but the order of dimensions remains the same.

In SIRF, I don't like the first dimension sometimes being a combination of dimensions (I would rather the last). I'm not even sure that reshape works correctly when dimensions have been combined at the front, rather than the end?
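
On the reshape question, a small NumPy check (sizes chosen arbitrarily) suggests that merging adjacent dimensions round-trips at either end, as long as the same index order is used for merging and splitting:

```python
import numpy as np

a, b, c = 2, 3, 4
x = np.arange(a * b * c).reshape(a, b, c)

# Merge the first two axes (front) and split them again, row-major order.
front = x.reshape(a * b, c)
assert np.array_equal(front.reshape(a, b, c), x)

# Merge the last two axes (end) and split them again, row-major order.
end = x.reshape(a, b * c)
assert np.array_equal(end.reshape(a, b, c), x)

# The same round trip using column-major (MATLAB-like) index order.
front_f = x.reshape((a * b, c), order="F")
assert np.array_equal(front_f.reshape((a, b, c), order="F"), x)

# Merging in one order and splitting in the other scrambles the data.
mixed = x.reshape(a * b, c).reshape((a, b, c), order="F")
print(np.array_equal(mixed, x))  # False
```

So the position of the merged dimension matters less than knowing which index varies fastest inside it, which differs between MATLAB and NumPy defaults.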

mdmlsjm2 commented 5 years ago

I will comment on the PET data dimensions. The first point is that I would separate the spatial dimensions from the temporal dimensions for both the raw and image PET data. The rationale is that the temporal aspects are shared between image and raw data. I would argue that the spatial dimension of the raw PET data is 5-dimensional, although 2 are commonly combined as indicated (i.e. 4-dimensional is fine). I do not see detector layer as part of this. It relates to a point of detection and not to a sinogram address, since the mapping from paired detection positions to sinogram bins is typically many-to-one.

On top of the spatial dimensions there is the type of event. In addition to the energy window, there is whether it is a prompt or delayed event. As the number of these options is typically small, it might be better to treat this as a type of event, with some option or flag to determine what as_array returns.

If we are particularly concerned about multidimensional arrays then one solution could be to return a 2-D array (spatial by temporal) with additional information on how values increment within these dimensions.

DANAJK commented 5 years ago

Also for MR - if data is returned as 2D [readout temporal_profile], then also returning the labels that index the profiles is useful for subsequent sorting. These labels are in the ISMRMRD file.
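
A minimal sketch of sorting with such labels, using synthetic data. The label arrays slice_idx and pe_idx are hypothetical stand-ins for per-profile counters read from the file, and the array is stored profile-first here (NumPy order), which is an assumption:

```python
import numpy as np

# Synthetic unsorted data: one row per measured profile, nx samples per readout.
nx, n_slices, n_pe = 8, 2, 4
n_profiles = n_slices * n_pe
rng = np.random.default_rng(0)
unsorted = (rng.standard_normal((n_profiles, nx))
            + 1j * rng.standard_normal((n_profiles, nx)))

# Per-profile labels in acquisition order (hypothetical names).
order = rng.permutation(n_profiles)
slice_idx = order // n_pe
pe_idx = order % n_pe

# Sorting: place each profile at the position its labels point to.
kspace = np.zeros((n_slices, n_pe, nx), dtype=unsorted.dtype)
kspace[slice_idx, pe_idx, :] = unsorted
print(kspace.shape)  # (2, 4, 8)
```

This only works directly when every (slice, phase-encode) pair is measured exactly once; repeated or missing profiles would need averaging or masking.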

ckolbPTB commented 5 years ago

Regarding MR, I would either keep the data unsorted (i.e. 3D: sample + coil + acquisition) or keep all possible ISMRMRD dimensions (12?), similar to ReconFrame. I would not try to reduce the dimensionality of either the MR raw or image data just to make things easier to read for the user.

If we restrict the image data now to a 4D matrix and later realize that for a certain application we need a 5D matrix, then (at least in Python and probably in C++ too) we would need to rewrite the entire code. If we treat everything as a 12D matrix, then it will be more work now, but at least it is future-proof.

The raw data should probably be left unsorted for as long as possible because that would then also directly allow for non-Cartesian trajectories for which a rectangular raw data matrix might not make much sense.

KrisThielemans commented 5 years ago

Regarding PET, @mdmlsjm2, I'm not sure how scanners with dual detector layers store their sinograms (possibly they indeed forget about the layers and put all events on a "virtual scanner", similar to what @jafische did), but they could keep them as "normal" sinograms if they wanted to (4 of them for 2 layers). Maybe it's a corner case that we don't have to worry about.

Good point about prompts and delayeds. I guess I thought we'd return them as different datasets with a flag, but that's almost like the flags given in option iii.

"one solution could be to return a 2-D array (spatial by temporal) with additional information on how values increment within these dimensions."

That is essentially the same as option ii (just with fewer dimensions) and therefore has the same drawbacks.

@mdmlsjm2 note that you didn't comment on energy windows. If you have 2 energy windows, you have 4 combinations per spatial sinogram-bin.
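
The count of 4 comes from each of the two detections in a coincidence falling in either window. A tiny enumeration, with made-up window names:

```python
from itertools import product

# Hypothetical window names; each detection of a pair can fall in either one.
windows = ["lower", "upper"]
pairs = list(product(windows, repeat=2))

for pair in pairs:
    print(pair)
print(len(pairs))  # 4
```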

ctsoumpas commented 5 years ago

I will comment mainly on PET at this stage. Personally, I like to keep things as simple as possible. So I would prefer to stay with list-mode (LM) data: once we have so many dimensions, I cannot see the point of multidimensional matrices that will hold only a tiny number of counts. And for LM data, I would even make it as generic as possible, i.e. x, y, z position of detection at time t. If we record these four, we should be able to do everything else, like TOF, gates and frames. The previous discussion did not mention bed position, but I would dare to say that the generic approach would cover this as well. Then we obviously need an energy dimension as well, but this is less complicated, I guess. I understand that we may wish to keep some compatibility and move from LM to sinograms, but perhaps this is not necessary and not so important. Happy to hear counter-arguments.
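
As a rough illustration of deriving frames from generic records, here is a NumPy sketch. The record layout (x, y, z, t) and the frame boundaries are assumptions for this example, not an existing list-mode format:

```python
import numpy as np

# Hypothetical list-mode record: detection position and time.
event_dtype = np.dtype([("x", "f4"), ("y", "f4"), ("z", "f4"), ("t", "f8")])
events = np.zeros(6, dtype=event_dtype)
events["t"] = [0.5, 1.2, 3.9, 4.0, 7.5, 9.99]

# Time frames are chosen after acquisition: [0, 4) and [4, 10) seconds.
frame_edges = np.array([0.0, 4.0, 10.0])
frame_of_event = np.digitize(events["t"], frame_edges) - 1
counts_per_frame = np.bincount(frame_of_event, minlength=len(frame_edges) - 1)
print(counts_per_frame)  # [3 3]
```

Changing frame_edges re-frames the same events without touching the stored data, which is the appeal of keeping the records generic.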

With respect to images, things are a little simpler, as there we will have the 3 dimensions of space (including bed positions) and the 1 dimension of time. It is a bit challenging when we collapse multiple bed positions acquired at different time points into one volume, but this is a current approximation on clinical systems anyway. If we then have gates, we obviously add these one or two extra dimensions to any of the above; I do not see anything special in this. If we have kinetic parameters (which, by the way, can also appear in some MRI acquisitions), this gets a little more complicated, because in theory there can be too many of them for some kinetic models, and there are also kinetic macroparameters derived from microparameters. So we could store only the kinetic parameters of interest in the 4D volume and always record the name (and units) of each kinetic parameter in the header file, without having to be consistent with the numbering in the array. Note that this concept could also be considered for multi-parametric voxel-wise images, which may also appear in MRI.
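
A possible shape for the "names in the header" idea, as a small Python sketch. The class ParametricImage, its fields and the parameter names and units are placeholders for illustration:

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class ParametricImage:
    """4D volume plus a header naming each parameter (hypothetical container)."""
    data: np.ndarray   # shape (n_params, nz, ny, nx)
    names: tuple       # one name per entry along the first axis
    units: tuple       # one unit string per entry along the first axis

    def get(self, name):
        # Look the parameter up by name, so callers never rely on its index.
        return self.data[self.names.index(name)]

# Placeholder parameter names and units, chosen only for the example.
img = ParametricImage(
    data=np.zeros((2, 4, 8, 8), dtype=np.float32),
    names=("param_a", "param_b"),
    units=("unit_a", "unit_b"),
)
img.data[1] = 0.05
param_b = img.get("param_b")
print(param_b.shape, float(param_b.mean()))
```

Because lookup goes through the names, two files can store the same parameters in a different order without breaking the calling code.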