MRtrix3 / mrtrix3

MRtrix3 provides a set of tools to perform various advanced diffusion MRI analyses, including constrained spherical deconvolution (CSD), probabilistic tractography, track-density imaging, and apparent fibre density
http://www.mrtrix.org
Mozilla Public License 2.0

dwidenoise: try using bidiagonal divide and conquer SVD #2906

Open jdtournier opened 1 month ago

jdtournier commented 1 month ago

This changes the mode of operation from our current fast but less numerically stable eigenvalue decomposition approach to the bidiagonal divide & conquer SVD implementation now provided with Eigen 3.4. It seems to work in my limited testing, producing results that are practically indistinguishable from (though not identical to) those of the current implementation on my test data (BATMAN tutorial data).
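To illustrate the difference between the two routes (this is a minimal numpy sketch of the general idea, not the MRtrix3 code; numpy's `svd` uses LAPACK's divide-and-conquer driver, the same family of algorithm as Eigen 3.4's BDCSVD):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 30))  # stand-in for a patch data matrix

# Route 1 (current approach, conceptually): eigen-decomposition of the
# Gram matrix.  Forming X'X squares the condition number, which is where
# the precision loss comes from for ill-conditioned data.
evals = np.linalg.eigvalsh(X.T @ X)               # ascending eigenvalues
sv_from_evd = np.sqrt(np.maximum(evals, 0.0))[::-1]

# Route 2 (this PR, conceptually): SVD of X directly, which computes the
# singular values without ever forming the squared matrix.
sv_from_svd = np.linalg.svd(X, compute_uv=False)

# For a well-conditioned random matrix the two agree closely; the gap
# widens as the smallest singular values shrink relative to the largest.
print(np.max(np.abs(sv_from_evd - sv_from_svd)))
```

This is only meant to show why the SVD route is the numerically safer way to obtain the singular values used in the Marchenko–Pastur truncation.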

However, this comes at a considerable additional computational cost. On my system (Arch Linux, AMD Ryzen 9 3950X 16-Core Processor, 64GB RAM), these are the runtimes:

Thoughts welcome.

github-actions[bot] commented 1 month ago

clang-tidy review says "All clean, LGTM! :+1:"

dchristiaens commented 1 month ago

I've tested this on the Workshop data (single subject Siemens). On these data too, the denoised data and noise level maps are nearly identical. My timings on these data are similar to yours:

The JacobiSVD in double precision should have the best numerical stability of all tested methods. I therefore calculated the error of the current implementation and the BDC-SVD implementation w.r.t. this baseline. For the noise level, the mean absolute and relative errors are:

This means that the result of BDC-SVD is one order of magnitude more accurate than the current EVD-based implementation in single precision. The EVD in double precision leads to an absolute error that is exactly 0, but this isn't a fair comparison because the input data are in single precision. These results confirm the theory I seem to remember from my numerical linear algebra class ages ago :wink:
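For reference, the error metrics quoted above can be computed as follows (a hedged sketch; the function name and the toy data are illustrative, not taken from the PR):

```python
import numpy as np

def error_stats(estimate, baseline):
    """Mean absolute and mean relative error of an estimated noise-level
    map against a higher-precision baseline (illustrative helper)."""
    abs_err = np.abs(estimate - baseline)
    rel_err = abs_err / np.abs(baseline)
    return abs_err.mean(), rel_err.mean()

# Toy stand-in for a sigma map: a float64 baseline versus the same values
# rounded through float32, mimicking a single-precision pipeline.
baseline = np.linspace(10.0, 20.0, 1000, dtype=np.float64)
estimate = baseline.astype(np.float32).astype(np.float64)

mae, mre = error_stats(estimate, baseline)
# mre is on the order of float32 machine epsilon (~1e-7) for this toy case
```

The double-precision JacobiSVD result plays the role of `baseline` in the comparison described above.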

Based on this, I'd be in favour of switching to BDC-SVD. I think the longer run time is a fair price to pay for accuracy in single precision, and it's not that much longer than using the current implementation in double precision. I'd then keep float32 as the default, except perhaps when the input data already has datatype float64.

Lestropie commented 1 month ago

Given the relatively small differences in code, it would not be too difficult to template by decomposition method and provide a command-line option.

dchristiaens commented 1 month ago

I've made a change to the code to avoid calculating the full (thin) matrix V. Only the central column of S.V' is needed to recombine the data, so we can take advantage of that to save some computation time. I've tested this for both cases M < N and M > N; the output is identical to Donald's BDC-SVD implementation. I've also tried calculating only the non-truncated part of this vector, but this had no significant effect on the run time and would impede adding other shrinkage functions later.

This reduces the run time by ~20%:
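The central-column shortcut can be sketched in numpy as follows (an illustrative sketch, not the MRtrix3 code; the hard-threshold shrinkage here is a stand-in for the actual Marchenko–Pastur truncation):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((60, 25))       # stand-in for a patch data matrix
U, s, Vh = np.linalg.svd(X, full_matrices=False)

# Illustrative shrinkage: keep only the top r singular values.
r = 10
s_shrunk = np.where(np.arange(s.size) < r, s, 0.0)

c = X.shape[1] // 2                     # index of the "central" column

# Full reconstruction, then select the one column actually needed ...
col_full = (U @ np.diag(s_shrunk) @ Vh)[:, c]

# ... versus computing only the central column of S.V' followed by a
# single matrix-vector product, which is what the optimisation amounts to.
col_fast = U @ (s_shrunk * Vh[:, c])

assert np.allclose(col_full, col_fast)
```

The two results are identical up to floating-point rounding, while the fast path avoids forming the full reconstructed matrix.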

github-actions[bot] commented 1 month ago

clang-tidy review says "All clean, LGTM! :+1:"