bangerth closed this 1 month ago
I was told that the following stack-trace corresponds to the issue:
#0 in __pthread_once_slow
#1 in std::call_once<void
at /data1/GCC/gcc-14.2.0/include/c++/14.2.0/mutex:916
#2 in std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base,
at /data1/GCC/gcc-14.2.0/include/c++/14.2.0/future:435
#3 in std::promise<std::unique_ptr<dealii::FiniteElement<2,
at /data1/GCC/gcc-14.2.0/include/c++/14.2.0/future:1166
#4 in dealii::Threads::internal::evaluate_and_set_promise<std::unique_ptr<dealii::FiniteElement<2,
at dealii/include/deal.II/base/thread_management.h:921
#5 in dealii::Threads::Task<std::unique_ptr<dealii::FiniteElement<2,
at dealii/include/deal.II/base/thread_management.h:1095
#6 in dealii::Threads::new_task<std::unique_ptr<dealii::FiniteElement<2,
at dealii/include/deal.II/base/thread_management.h:1484
#7 in dealii::Threads::new_task<dealii::FEValues<2,
at dealii/include/deal.II/base/thread_management.h:1571
.
.
.
#69 in main
at mcmc-laplace.cc:1293
So my hunch was right.
Fixed by 75b36041acb9bb5589d44efa3653b56e8c072943.
John reports that there are many many many calls to futex and sched_yield from here:
The code in question looks like this:
The reason this is taking mutex locks so often is that we're not actually using task-based parallelism in this benchmark (other than the thread pool stuff in the sampler) but end up here:
which I believe ends up in the second marked line, which itself looks like this:
That said, it could also be the first of the two marked lines -- investigation necessary.
This should be relatively easy to avoid, at least for the benchmark. I just need to not call new_task() here, which shouldn't be doing anything here anyway.
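To illustrate the idea, here is a minimal sketch in standard C++ only. It is not the deal.II code and not the actual fix. The helper name evaluate and its use_task flag are hypothetical. The task path hands the result over through a std::promise, which is where the std::call_once frames in the stack trace above come from. The direct path simply calls the function and touches no synchronization primitives at all. The sketch assumes a non-void return type.

```cpp
#include <exception>
#include <future>
#include <thread>

// Hypothetical helper, not part of deal.II: evaluate `f` either through a
// promise/future pair on a separate thread, or by calling it directly.
// Assumes that f() returns a non-void type.
template <typename F>
auto evaluate(F f, const bool use_task) -> decltype(f())
{
  using R = decltype(f());

  if (!use_task)
    return f(); // plain function call: no thread, no promise, no locking

  std::promise<R> promise;
  std::future<R>  future = promise.get_future();

  // Setting the promise's value synchronizes on the shared state; doing
  // this once per call is what becomes expensive in a hot loop.
  std::thread worker([&promise, &f]() {
    try
      {
        promise.set_value(f());
      }
    catch (...)
      {
        promise.set_exception(std::current_exception());
      }
  });

  // Join first so the thread is never left joinable if get() rethrows.
  worker.join();
  return future.get();
}
```

Both paths return the same value, so a caller that knows it is not running in a task-parallel context can pass use_task = false and skip the promise machinery entirely.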