Branching train of thought from #2857, as much of the discussion there relates specifically to the `std::lgamma()` function, whereas the generation of lookup tables for computing Z-scores is an issue in its own right.
Specifically from the latest comment there, I'm considering point 5: doing an up-front generation of the necessary lookup tables before multi-threaded processing, such that those tables can be shared across threads as `const`.
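As a minimal sketch of that pattern (not MRtrix3 code; `t_to_z_exact()` here is a hypothetical placeholder for the real conversion), the table would be filled on the main thread and then only ever read by the workers:

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Placeholder only: the real conversion would go through the Student-t
// CDF for the given degrees of freedom and the inverse standard normal CDF.
static double t_to_z_exact(double t, double dof) { (void)dof; return t; }

int main() {
  // Built once, up front, on the main thread:
  const double dof = 28.0, t_min = -10.0, t_max = 10.0;
  const std::size_t bins = 4096;
  std::vector<double> values(bins);
  for (std::size_t i = 0; i != bins; ++i)
    values[i] = t_to_z_exact(t_min + (t_max - t_min) * double(i) / double(bins - 1), dof);

  // Workers only ever read the table, so sharing it as const needs no locking.
  const std::vector<double>& table = values;
  std::vector<std::thread> workers;
  for (int w = 0; w != 4; ++w)
    workers.emplace_back([&table, w] {
      volatile double z = table[std::size_t(w) * 100];  // read-only lookup
      (void)z;
    });
  for (auto& worker : workers)
    worker.join();
  return 0;
}
```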
I also think that the heteroscedastic case may require different handling. I need to do some testing to confirm, but I suspect that the Welch-Satterthwaite degrees of freedom will regularly fail to attain precise floating-point equivalence, and therefore generating lookup tables for specific DoF values doesn't make sense; it probably needs a 2D lookup table.
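If so, one option is a regular grid over both the statistic and the (continuous) degrees of freedom, with bilinear interpolation between grid points. A rough sketch under that assumption (again with a placeholder conversion function, not MRtrix3 code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Placeholder; the real conversion would use the t CDF for the given
// (possibly non-integer) degrees of freedom and the inverse normal CDF.
static double t_to_z_exact(double t, double dof) { (void)dof; return t; }

// 2D lookup over (t, dof): Welch-Satterthwaite degrees of freedom are
// continuous, so exact per-DoF tables are replaced by interpolation
// along the dof axis as well as the statistic axis.
class TDofTable {
public:
  TDofTable(double t_min, double t_max, std::size_t nt,
            double dof_min, double dof_max, std::size_t ndof)
    : t_min_(t_min), t_step_((t_max - t_min) / double(nt - 1)),
      dof_min_(dof_min), dof_step_((dof_max - dof_min) / double(ndof - 1)),
      nt_(nt), z_(nt * ndof) {
    for (std::size_t j = 0; j != ndof; ++j)
      for (std::size_t i = 0; i != nt; ++i)
        z_[j * nt + i] = t_to_z_exact(t_min + double(i) * t_step_,
                                      dof_min + double(j) * dof_step_);
  }
  double operator() (double t, double dof) const {
    const double x = clamp((t - t_min_) / t_step_, 0.0, double(nt_ - 1));
    const double y = clamp((dof - dof_min_) / dof_step_, 0.0, double(z_.size() / nt_ - 1));
    const std::size_t i = std::size_t(x), j = std::size_t(y);
    const std::size_t i1 = std::min(i + 1, nt_ - 1);
    const std::size_t j1 = std::min(j + 1, z_.size() / nt_ - 1);
    const double fx = x - double(i), fy = y - double(j);
    const double z0 = (1.0 - fx) * z_[j  * nt_ + i] + fx * z_[j  * nt_ + i1];
    const double z1 = (1.0 - fx) * z_[j1 * nt_ + i] + fx * z_[j1 * nt_ + i1];
    return (1.0 - fy) * z0 + fy * z1;   // bilinear interpolation
  }
private:
  static double clamp(double v, double lo, double hi) { return std::min(std::max(v, lo), hi); }
  double t_min_, t_step_, dof_min_, dof_step_;
  std::size_t nt_;
  std::vector<double> z_;
};

int main() {
  const TDofTable table(-10.0, 10.0, 2048, 2.0, 200.0, 256);
  volatile double z = table(2.3, 17.42);  // trivial use
  (void)z;
  return 0;
}
```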
So my current logic is:
- [ ] For no NaNs, homoscedastic errors:
  - [ ] For t-tests:
    - [ ] Generate 1D lookup table: t->Z
  - [ ] For F-tests:
    - [ ] For each unique hypothesis rank:
      - [ ] Generate 1D lookup table: F->Z
- [ ] For NaNs present, homoscedastic errors:
  - [ ] For t-tests:
    - [ ] For every possible DoF following data scrubbing:
      - [ ] Generate 1D lookup table: t->Z
  - [ ] For F-tests:
    - [ ] For every possible DoF following data scrubbing:
      - [ ] Generate 1D lookup table: F->Z
- [ ] For heteroscedastic errors:
  - [ ] For t-tests:
    - [ ] Generate 2D lookup table: (t, dof)->Z
  - [ ] For F-tests:
    - [ ] Generate 2D lookup table: (F, dof)->Z
- [ ] Ensure `vectorstats` tests cover all use cases
- [ ] Ensure pre-generated tables always cover the requirements of subsequent processing
- [ ] Consider whether the F-test lookup tables should be constructed in the logarithmic domain (see the sketch below). For F < 1.0 I'm not particularly concerned about loss of precision, since statistical enhancement will zero out anything with Z < 0.0; but the log domain might nevertheless be a better basis, given that the requisite lookup range is quadratically larger than that of the t-tests.
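For the log-domain option, a minimal sketch (same caveats as above: hypothetical names, placeholder conversion) would make the table grid uniform in ln(F), so that a fixed number of bins spans the much wider F range:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Placeholder; the real conversion would use the F CDF for the relevant
// degrees of freedom and the inverse standard normal CDF.
static double f_to_z_exact(double F) { return std::log1p(F); }

// Table uniform in ln(F): the same number of bins covers a range of F
// that is quadratically larger than the corresponding t range, at the
// cost of coarse resolution for F < 1 (where Z < 0 is discarded anyway).
struct LogFTable {
  double log_min, step;
  std::vector<double> z;
  double operator() (double F) const {
    const double pos = std::min(std::max((std::log(F) - log_min) / step, 0.0),
                                double(z.size() - 1));
    const std::size_t lo = std::size_t(pos);
    const std::size_t hi = std::min(lo + 1, z.size() - 1);
    const double frac = pos - double(lo);
    return (1.0 - frac) * z[lo] + frac * z[hi];
  }
};

LogFTable build_logf_table(double F_min, double F_max, std::size_t n) {
  LogFTable table { std::log(F_min),
                    (std::log(F_max) - std::log(F_min)) / double(n - 1), {} };
  table.z.resize(n);
  for (std::size_t i = 0; i != n; ++i)
    table.z[i] = f_to_z_exact(std::exp(table.log_min + double(i) * table.step));
  return table;
}

int main() {
  const LogFTable table = build_logf_table(1.0e-3, 1.0e4, 4096);
  return table(3.7) > 0.0 ? 0 : 1;
}
```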
Also of note: if the utilisation of `std::lgamma()` is fixed in #2857, `MRTRIX_USE_ZSTATISTIC_LOOKUP` could be commented out on `dev` until there's greater confidence in the operation of the lookup tables proposed here.