Open daljit46 opened 8 months ago
I found an implementation of the natural logarithm of the gamma function in the Ruby language repository. After substituting std::lgamma with that implementation, the race condition seems to disappear.
The other alternative is to use Boost, but that may be overkill for our needs.
Yikes. Okay. I was not aware of the std::lgamma
non-thread-safety. Having said that, I'd have thought that the code would be okay: if the surrounding code was doing as intended, it should have precluded this. So hopefully the issue can be rectified there.
The transformation from t-statistic / F-statistic to Z-statistic is computationally expensive. So I use pre-generation of a lookup table, on which linear interpolation is then performed. In the TestFixedHomoscedastic
case here, there's only one such lookup table, since the degrees of freedom is always the same. However, because there are other more complex cases (different DoF for different elements being tested), I don't immediately generate this table up-front; instead, I generate a table whenever an element is tested that has a DoF for which a table has not yet been generated.
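The lookup step itself is just a uniform-grid linear interpolation. A minimal sketch of that idea (hypothetical function, not the MRtrix3 code; the caller is assumed to keep x within the sampled range):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: read a pre-generated table with linear interpolation.
// `x0` is the abscissa of table[0] and `dx` the (uniform) sample spacing;
// the caller must ensure x lies within the sampled range.
double lookup(const std::vector<double> &table, double x, double x0, double dx) {
  const double pos = (x - x0) / dx;  // fractional index into the table
  const std::size_t i = static_cast<std::size_t>(pos);
  const double frac = pos - static_cast<double>(i);
  return table[i] * (1.0 - frac) + table[i + 1] * frac;
}
```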
What I suspect it doesn't like is the fact that I'm trying to avoid mutexing every single time I read from one of those tables. What I do is (code):
So is the problem that the release of the mutex lock is insufficient for the newly generated data to become thread-synchronized? There's probably some other more appropriate design pattern. Been too long since I played with threading primitives. But since these transformations are done for every element (eg. voxel, fixel) for every permutation, I would like for it to be possible to read from shared data without having to thread lock every single time.
This is still only for the fixed homoscedastic case. If std::lgamma() isn't thread-safe, then over and above whether the current code is doing as intended, I shouldn't be permitting two different threads to be generating two different tables at the same time.
It's also currently possible to disable use of the lookup tables by commenting this line. If std::lgamma() is not thread safe on Linux, then it shouldn't be permissible to compile with that line uncommented. Unless we now include a non-std lgamma() implementation.
I shouldn't be permitting two different threads to be generating two different tables at the same time.
Yes, effectively we need to provide a global mutex (or equivalent logic) that protects calls to lgamma in the entire app (which I think is what the Mac implementation is doing anyway).
This simple piece of code compiled with TSAN reports a race condition:
```cpp
#include <cmath>
#include <iostream>
#include <thread>

int main() {
  std::thread t1([] {
    for (int i = 0; i < 100000000; ++i)
      std::lgamma(i);
  });
  std::thread t2([] {
    for (int i = 0; i < 100000000; ++i)
      std::lgamma(i);
  });
  t1.join();
  t2.join();
  std::cout << "Done" << std::endl;
}
```
So I think we need to find an alternative to lgamma, or just ensure that no two threads are calling it simultaneously.
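The bluntest form of the latter would be a wrapper that serialises every call behind a single global mutex (a sketch for illustration, not a recommendation; `lgamma_locked` is a hypothetical name):

```cpp
#include <cmath>
#include <mutex>

// Thread-safe wrapper: serialises all calls to std::lgamma behind one mutex,
// protecting the shared signgam state at the cost of contention.
double lgamma_locked(double x) {
  static std::mutex m;
  std::lock_guard<std::mutex> lock(m);
  return std::lgamma(x);
}
```

This is correct but forces every worker thread through the same lock, which is exactly the cost the lookup tables were designed to avoid.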
I wrote a small microbenchmark here to get a vague idea of how that implementation in the Ruby repo might perform, and it appears reasonably fast compared to lgamma. So using that may be an option (I'm not sure what its accuracy guarantees are, but since it's by the Ruby guys it may be reliable enough).
The code around the t->Z and F->Z conversions is complex precisely because of numerical accuracy problems. Indeed I added a unit test for the erfinv() function for that reason. So if changing one of the underlying functions, I'd definitely need to verify closely.
My concern is that addressing having two threads generating tables at the same time may not fully resolve the issue: in the use case you pasted, there should only need to be one table generated, yet a problem is still reported. So presumably two threads are generating the same table at the same time despite the current checks.
Perhaps we should avoid std::lgamma() as a first pass, given that I'd like for it to be possible to run without use of the lookup tables; then, if there are still detected threading problems, I can look into the shared data synchronization.
I wrote a small microbenchmark here
Your benchmark runs the log-abs-gamma against log-gamma only, across positive values only. It's been too long since I did this work to recall whether compatibility with negative values is required here; but at least the former will have a code branch that the latter doesn't, which could bias the benchmark.
RE code structure:
I'm way out of practice here. I want threads to be able to generate these as needed, but for each unique table to be generated only once, by whichever thread needs it first, and for other threads to then synchronise with the shared data and be able to make use of it. And only one thread should be generating a table at any given time (unlike the intent of the current code, which tries to allow different threads to generate different tables at the same time). Importantly, data reads need to be cheap and non-locking wherever possible.
So I think I need something like:
?
Looking at the log gamma function and where it's invoked from, the only way that the negative sign bit could be set would be if either the DoF or the rank were to be between 2 and 4.
It might not necessarily be the Linux / Mac distinction: if MRTRIX_HAVE_EIGEN_UNSUPPORTED_SPECIAL_FUNCTIONS is set, then MR::Math::betaincreg(double, double, double) won't be called; an internal Eigen function is used instead.
On that point, Eigen itself has an lgamma() function, and that appears to be in the main Eigen namespace, not Unsupported. So that might be the better solution.
If we were to grab Eigen code in cmake as per #2584, we could hard-code to grab a version that includes unsupported functions, in which case the std::lgamma() calls would disappear.
I misread the definition of lgamma: what it does is compute the natural log of the absolute value of the gamma function. So the benchmark above is useless and silly. Here's a better benchmark, which tests values between -4 and 4 (skipping -3, -2, -1, as $\Gamma(z)$ is undefined there). The alternative implementation seems to be slightly slower, but not by much (plus it's reentrant, so can be safely used from multiple threads). Of course this small test is not really indicative of actual performance and far from conclusive. Only proper testing within our codebase will tell us about any performance differences.
To evaluate precision, I wrote a very simple (and naive) program to see how the value of the alternative implementation differs from std::lgamma here, and found that we can expect a difference in values on the order of $10^{-11}$. Of course, this is by no means conclusive, but it seems promising.
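Not the linked program, but a naive comparison of this kind can be sketched as follows; here lgamma_r merely stands in for an alternative thread-safe implementation, since the actual candidate isn't reproduced in this thread:

```cpp
#include <algorithm>
#include <cmath>

extern "C" double lgamma_r(double, int *);  // POSIX reentrant variant

// Naive precision probe: maximum absolute difference between std::lgamma
// and a candidate replacement over a grid of positive values.
double max_lgamma_diff() {
  double max_diff = 0.0;
  for (double x = 0.1; x < 100.0; x += 0.1) {
    int sign = 0;
    max_diff = std::max(max_diff, std::fabs(std::lgamma(x) - lgamma_r(x, &sign)));
  }
  return max_diff;
}
```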
It might not necessarily be the Linux / Mac distinction: if MRTRIX_HAVE_EIGEN_UNSUPPORTED_SPECIAL_FUNCTIONS is set, then MR::Math::betaincreg(double, double, double) won't be called, an internal Eigen function is used instead.
In my testing, this was disabled. Furthermore, I cannot reproduce the race condition even with the simple two-threaded program I posted above on MacOS, but I can on Linux.
If we were to grab Eigen code in cmake as per https://github.com/MRtrix3/mrtrix3/issues/2584, we could hard-code to grab a version that includes unsupported functions, in which case the std::lgamma() calls would disappear.
That's good to hear! If Eigen has an independent implementation then using it would also be my preferred solution. I just hope that internally it doesn't call the OS' implementation of lgamma. Will test later.
In the way that std::lgamma() is used inside MR::Math::betaincreg(), it's based on the degrees of freedom in the model and the rank of the hypothesis, both of which are strictly positive (and DoF could be up to, hypothetically, say, 10,000).

There is some concern that the source of the code (website; repository) may be unaware of the distinction, and be using std::lgamma() erroneously, not realising that it yields the log of the absolute value...
OK, so testing with MRTRIX_HAVE_EIGEN_UNSUPPORTED_SPECIAL_FUNCTIONS, I can see that the data race is no longer there. In the implementation of Eigen::lgamma I can see that they are using lgamma_r, which is the re-entrant version of lgamma. If this is not present, they still call lgamma internally.

Looking at their code, lgamma_r should be available on Linux with glibc >= 2.19, but not on Apple platforms. However, on macOS the function seems to be implemented using a global mutex (e.g. see here), thus thread-safe but inherently slower.
Should we just stick with the Eigen implementation and effectively keep using lgamma on macOS? The other alternative is to use a different implementation of lgamma (other than the Ruby repo, I also found an implementation in scipy).

We could also do manual synchronisation ourselves, but I'm not sure the effort would be worth it.
The bigger question is, in the case of having MRTRIX_HAVE_EIGEN_UNSUPPORTED_SPECIAL_FUNCTIONS, whether Eigen::betainc() actually makes use of any form of lgamma(). If the implementation of that function doesn't rely on lgamma() in any way, and we can set up the compilation environment to guarantee its presence, then lgamma() becomes a red herring, since there would no longer be any dependence on it.
whether Eigen::betainc() actually makes use of any form of lgamma().
This seems to be the case (e.g. see here).
OK, so there's not likely to be any clean solution without downsides here. Options:

- Force use of Eigen::betainc() via automatic build dependency (currently only used if available on system).
- Change to using lgamma_r(): it is missing only on a handful of platforms. The page you linked makes no mention of lgamma_r(), so maybe those posters were just unaware of it. If it were not for the mutex locking on Mac, this would be my preference. Can you confirm for sure that lgamma_r() is absent on Mac? Because if it's there, we could, as a quick fix, change to using that function within MR::Math::betainc(), with a vision of possibly removing that function later.
- Provide our own implementation of the log of the absolute value of the gamma function to replace std::lgamma().
- Flag the threading race condition to be ignored by the sanitiser (we never read from the sign bit, so don't care if there is a race in what writes to it, or when).
Notably the Eigen code seems to be using lgamma() in a similar way. So if the signedness of the gamma function is relevant, Eigen has made the same error. Which probably makes it more likely that it's not a problem.
I think I've been tying myself in knots again. It's not about whether the logarithm of the gamma function is negative, but whether the gamma function itself is negative. Here the inputs (DoF / rank) are exclusively positive, in which case the gamma function is always positive. So in our use case, what's written to the sign bit is not only never read from, but should also be 1 each and every time.
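That claim is easy to sanity-check against the reentrant variant, which reports the sign of $\Gamma(x)$ through its second argument (a sketch assuming lgamma_r is available on the platform; `gamma_sign` is a hypothetical helper):

```cpp
#include <cmath>

// lgamma_r is POSIX, not standard C++, so declare it explicitly here
// (assumption: the platform provides it, e.g. glibc >= 2.19 or macOS libc).
extern "C" double lgamma_r(double, int *);

// Returns the sign of Gamma(x) as reported through lgamma_r's second argument.
int gamma_sign(double x) {
  int sign = 0;
  lgamma_r(x, &sign);
  return sign;
}
```

For any strictly positive argument the reported sign is +1; only negative arguments (where the gamma function alternates sign between the poles) can set it to -1.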
So I think the solution here is actually to tell thread sanitiser to ignore this race condition. It's inconsequential, and we don't want to incur mutex locking on Mac.
Also, the regularised incomplete beta function is defined only for a, b > 0.
@daljit46 I might need your help with cmake here.
What I think I want to do is:
[ ] Have MR::Math::betaincreg() throw an exception not only on x being outside of [0.0, 1.0], but also on a or b not being positive
[ ] If MRTRIX_HAVE_EIGEN_UNSUPPORTED_SPECIAL_FUNCTIONS is not defined (I'm currently not seeing where / how this is being set):
    [ ] Determine whether std::lgamma_r() is present, and set an environment variable accordingly
    [ ] In MR::Math::betaincreg(), #ifdef to use std::lgamma_r() rather than std::lgamma() if available
[ ] Either:
Previously in configure we'd specify our own test commands to attempt to compile and run, and set envvars accordingly. With cmake it looks like find_package() and target_compile_definitions() are doing a lot of the heavy lifting, so I'm not immediately seeing a template that I can modify to evaluate whether this function is available.
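For what it's worth, cmake does ship a module for exactly this kind of compile probe: CheckCXXSourceCompiles. A hedged sketch (the cache variable name MRTRIX_HAVE_LGAMMA_R and the target name mrtrix-core are hypothetical placeholders):

```cmake
include(CheckCXXSourceCompiles)

# Probe for the POSIX reentrant variant of lgamma().
check_cxx_source_compiles("
  extern \"C\" double lgamma_r(double, int *);
  int main() { int sign = 0; lgamma_r(5.5, &sign); return 0; }
" MRTRIX_HAVE_LGAMMA_R)

if(MRTRIX_HAVE_LGAMMA_R)
  # 'mrtrix-core' is a placeholder for whichever target compiles betainc.cpp.
  target_compile_definitions(mrtrix-core PRIVATE MRTRIX_HAVE_LGAMMA_R)
endif()
```

The definition can then gate the #ifdef between std::lgamma() and lgamma_r() in the source.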
So I think the solution here is actually to tell thread sanitiser to ignore this race condition. It's inconsequential, and we don't want to incur mutex locking on Mac.
My opinion is that we should try to avoid this "solution". The problem is that in C++ a data race is undefined behaviour, and there are virtually zero guarantees that there will be no fatal consequences. Thus a C++ program is allowed to crash, or worse, perform a nonsensical transformation (or even set fire to your device) without contradicting the ISO C++ specification. Even if we know that a specific implementation of the data race on a certain platform behaves as expected, there is no guarantee that this will hold with future versions of the compiler and the platform.
Additionally, I see that lgamma_r is available in libc on Mac OSX (at least since 10.5). So we could explicitly use it like this:
```cpp
#include <iostream>

extern "C" double lgamma_r(double, int *);

int main() {
  int sign;
  std::cout << lgamma_r(5.5, &sign) << std::endl;
  return 0;
}
```
OK, I've had a quick look into this, and came across much the same posts as you already have. The main thing I was looking into was whether a data race that involves writes only, with no intended or actual reading of the racing variable, was likely to be a problem. As far as I can tell, all the examples of undefined behaviour I've seen involve getting the wrong value of the variable - which may indeed have nasty side effects - but only if the code actually ever needs to read the variable. If all the code does is write to the variable, it really doesn't matter how it reorders operations - at least I can't see how any of the examples I've come across would apply.
There is however one interesting example that involves re-ordering operations and speculative early loading - and that seems to apply to precisely the use case that @Lestropie was talking about in this previous post, with this pseudo-code:
- Check for the presence of the required table
- If it doesn't exist:
- Acquire a mutex lock
- Check again for its presence (chance another thread may have created it between first check and our acquiring the lock)
- If it still doesn't exist, generate it
- Make use of table
This looks similar to the example in section 2.1 Double checks for lazy initialization of Hans-J. Boehm's How to miscompile programs with “benign” data races paper. Maybe that's all we need to fix, assuming the generation of the table is then guaranteed to be single-threaded...?
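The standard remedy for that double-checked pattern in C++ is to publish the pointer through an atomic with release/acquire ordering, so the fast-path check is well-defined and lock-free. A sketch with hypothetical names (single table for simplicity; the real code would hold one per DoF):

```cpp
#include <atomic>
#include <mutex>
#include <vector>

using Table = std::vector<double>;  // stand-in for the real lookup table

std::atomic<const Table *> g_table{nullptr};
std::mutex g_table_mutex;

const Table &get_table() {
  // Fast path: acquire-load. If non-null, the release-store below guarantees
  // the table's contents are fully visible to this thread -- no lock needed.
  const Table *t = g_table.load(std::memory_order_acquire);
  if (!t) {
    std::lock_guard<std::mutex> lock(g_table_mutex);
    t = g_table.load(std::memory_order_relaxed);  // re-check under the lock
    if (!t) {
      // Deliberately leaked: the table lives for the whole process.
      auto *fresh = new Table(16, 1.0);  // placeholder for the real generation
      g_table.store(fresh, std::memory_order_release);  // publish when complete
      t = fresh;
    }
  }
  return *t;
}
```

The acquire/release pair is what the plain-pointer version of the pattern lacks: without it, a compiler or CPU may let the speculative early load observe the pointer before the table contents are written.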
The lookup table generation may be a more difficult fix than just the use of std::lgamma() vs. lgamma_r().
The theory there is that:
Therefore there may be different t->Z / F->Z transforms required.
What the current code is attempting to achieve is: for any given rank / degrees of freedom, the lookup table should be generated just once, and then made available for all threads to read from. The current design is however seemingly Bad.
Options:
Thread sanitizer has reported another data race in our codebase. Here is the output:

The issue seems to be triggered in the function std::lgamma in core/math/betainc.cpp:65. It's not fully clear to me whether this is a problem with our code or not (addressing #2795 would make the code much more readable and easier to debug). However, it seems likely that this is an issue in the function itself, as apparently std::lgamma is not guaranteed to be thread safe (e.g. see here). From cppreference:

It's worth noting that I can only reproduce this on Linux, while on macOS there is no race condition. This helps corroborate the idea that the issue lies in the specific implementation of lgamma on Linux.