Branching train of thought from #2857, as much of the discussion there relates specifically to the `std::lgamma()` function, whereas the generation of lookup tables for computing Z-scores is an issue in its own right.
Specifically from the latest comment there, I'm considering point 5: doing an up-front generation of the necessary lookup tables before multi-threaded processing, such that those tables can be shared across threads as `const`.
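As a minimal sketch of that pattern (not MRtrix3 code; `t_to_z_exact()` here is a hypothetical placeholder for the real conversion), the table would be filled on the main thread and then only ever read by the workers:

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Placeholder only: the real conversion would go through the Student-t
// CDF for the given degrees of freedom and the inverse standard normal CDF.
static double t_to_z_exact(double t, double dof) { (void)dof; return t; }

int main() {
  // Built once, up front, on the main thread:
  const double dof = 28.0, t_min = -10.0, t_max = 10.0;
  const std::size_t bins = 4096;
  std::vector<double> values(bins);
  for (std::size_t i = 0; i != bins; ++i)
    values[i] = t_to_z_exact(t_min + (t_max - t_min) * double(i) / double(bins - 1), dof);

  // Workers only ever read the table, so sharing it as const needs no locking.
  const std::vector<double>& table = values;
  std::vector<std::thread> workers;
  for (int w = 0; w != 4; ++w)
    workers.emplace_back([&table, w] {
      volatile double z = table[std::size_t(w) * 100];  // read-only lookup
      (void)z;
    });
  for (auto& worker : workers)
    worker.join();
  return 0;
}
```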
I also think that the heteroscedastic case may require different handling. I need to do some testing to confirm, but I suspect that the Welch-Satterthwaite degrees of freedom will regularly fail to attain precise floating-point equivalence, and therefore generating lookup tables for specific DoF values doesn't make sense; it probably needs a 2D lookup table.
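If so, one option is a regular grid over both the statistic and the (continuous) degrees of freedom, with bilinear interpolation between grid points. A rough sketch under that assumption (again with a placeholder conversion function, not MRtrix3 code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Placeholder; the real conversion would use the t CDF for the given
// (possibly non-integer) degrees of freedom and the inverse normal CDF.
static double t_to_z_exact(double t, double dof) { (void)dof; return t; }

// 2D lookup over (t, dof): Welch-Satterthwaite degrees of freedom are
// continuous, so exact per-DoF tables are replaced by interpolation
// along the dof axis as well as the statistic axis.
class TDofTable {
public:
  TDofTable(double t_min, double t_max, std::size_t nt,
            double dof_min, double dof_max, std::size_t ndof)
    : t_min_(t_min), t_step_((t_max - t_min) / double(nt - 1)),
      dof_min_(dof_min), dof_step_((dof_max - dof_min) / double(ndof - 1)),
      nt_(nt), z_(nt * ndof) {
    for (std::size_t j = 0; j != ndof; ++j)
      for (std::size_t i = 0; i != nt; ++i)
        z_[j * nt + i] = t_to_z_exact(t_min + double(i) * t_step_,
                                      dof_min + double(j) * dof_step_);
  }
  double operator() (double t, double dof) const {
    const double x = clamp((t - t_min_) / t_step_, 0.0, double(nt_ - 1));
    const double y = clamp((dof - dof_min_) / dof_step_, 0.0, double(z_.size() / nt_ - 1));
    const std::size_t i = std::size_t(x), j = std::size_t(y);
    const std::size_t i1 = std::min(i + 1, nt_ - 1);
    const std::size_t j1 = std::min(j + 1, z_.size() / nt_ - 1);
    const double fx = x - double(i), fy = y - double(j);
    const double z0 = (1.0 - fx) * z_[j  * nt_ + i] + fx * z_[j  * nt_ + i1];
    const double z1 = (1.0 - fx) * z_[j1 * nt_ + i] + fx * z_[j1 * nt_ + i1];
    return (1.0 - fy) * z0 + fy * z1;   // bilinear interpolation
  }
private:
  static double clamp(double v, double lo, double hi) { return std::min(std::max(v, lo), hi); }
  double t_min_, t_step_, dof_min_, dof_step_;
  std::size_t nt_;
  std::vector<double> z_;
};

int main() {
  const TDofTable table(-10.0, 10.0, 2048, 2.0, 200.0, 256);
  volatile double z = table(2.3, 17.42);  // trivial use
  (void)z;
  return 0;
}
```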
So my current logic is:
- [ ] For no NaNs, homoscedastic errors:
  - [ ] For t-tests:
    - [ ] Generate 1D lookup table: t->Z
  - [ ] For F-tests:
    - [ ] For each unique hypothesis rank:
      - [ ] Generate 1D lookup table: F->Z
- [ ] For NaNs present, homoscedastic errors:
  - [ ] For t-tests:
    - [ ] For every possible DoF following data scrubbing:
      - [ ] Generate 1D lookup table: t->Z
  - [ ] For F-tests:
    - [ ] For every possible DoF following data scrubbing:
      - [ ] Generate 1D lookup table: F->Z
- [ ] For heteroscedastic errors:
  - [ ] For t-tests:
    - [ ] Generate 2D lookup table: (t, dof)->Z
  - [ ] For F-tests:
    - [ ] Generate 2D lookup table: (F, dof)->Z
- [ ] Ensure `vectorstats` tests cover all use cases
- [ ] Ensure pre-generated tables always cover the requirements of subsequent processing
- [ ] Consider whether the F-test lookup tables should be constructed in the logarithmic domain (see the sketch below). For F < 1.0 I'm not particularly concerned about loss of precision, since statistical enhancement will zero out anything with Z < 0.0; but the log domain might nevertheless be a better basis, given that the requisite lookup range is quadratically larger than that of the t-tests.
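For the log-domain option, a minimal sketch (same caveats as above: hypothetical names, placeholder conversion) would make the table grid uniform in ln(F), so that a fixed number of bins spans the much wider F range:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Placeholder; the real conversion would use the F CDF for the relevant
// degrees of freedom and the inverse standard normal CDF.
static double f_to_z_exact(double F) { return std::log1p(F); }

// Table uniform in ln(F): the same number of bins covers a range of F
// that is quadratically larger than the corresponding t range, at the
// cost of coarse resolution for F < 1 (where Z < 0 is discarded anyway).
struct LogFTable {
  double log_min, step;
  std::vector<double> z;
  double operator() (double F) const {
    const double pos = std::min(std::max((std::log(F) - log_min) / step, 0.0),
                                double(z.size() - 1));
    const std::size_t lo = std::size_t(pos);
    const std::size_t hi = std::min(lo + 1, z.size() - 1);
    const double frac = pos - double(lo);
    return (1.0 - frac) * z[lo] + frac * z[hi];
  }
};

LogFTable build_logf_table(double F_min, double F_max, std::size_t n) {
  LogFTable table { std::log(F_min),
                    (std::log(F_max) - std::log(F_min)) / double(n - 1), {} };
  table.z.resize(n);
  for (std::size_t i = 0; i != n; ++i)
    table.z[i] = f_to_z_exact(std::exp(table.log_min + double(i) * table.step));
  return table;
}

int main() {
  const LogFTable table = build_logf_table(1.0e-3, 1.0e4, 4096);
  return table(3.7) > 0.0 ? 0 : 1;
}
```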
Also of note: if the utilisation of `std::lgamma()` is fixed in #2857, `MRTRIX_USE_ZSTATISTIC_LOOKUP` could be commented out on `dev` until there's greater confidence in the operation of the lookup tables proposed here.