oneapi-src / oneTBB

oneAPI Threading Building Blocks (oneTBB)
https://oneapi-src.github.io/oneTBB/
Apache License 2.0

Test test_buffer_node sporadically hangs on x86_64 #1467

Open phprus opened 1 month ago

phprus commented 1 month ago

Summary

Test test_buffer_node sporadically hangs on x86_64.

Previous discussion: https://github.com/oneapi-src/oneTBB/pull/1445#issuecomment-2260671869

@pavelkumbrasev , @kboyarinov

Version

Commit: 306c75f5e906fc221e12f45dae45900f0d8db6ba

Environment

OS: Debian 12
Compiler: gcc version 12.2.0 (Debian 12.2.0-14)

Steps To Reproduce

cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_STANDARD=20 -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON ../..
cmake --build . --verbose --config Release -j8
ctest --timeout 0 --build-config RelWithDebInfo -R test_buffer_node --repeat-until-fail 10000000
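
Because the hang is sporadic, it can help to put a time limit on each run so that a stuck iteration is reported instead of blocking forever. The following is a minimal sketch, not part of the original report: `run_until_hang` is a hypothetical helper, and it assumes GNU coreutils `timeout`, which exits with status 124 when the limit is reached.

```shell
# Hypothetical helper (not from the report): rerun a command until one run
# exceeds a per-run time limit.
# Usage: run_until_hang LIMIT_SECONDS MAX_RUNS COMMAND [ARGS...]
run_until_hang() {
    limit=$1
    max_runs=$2
    shift 2
    i=1
    while [ "$i" -le "$max_runs" ]; do
        # GNU timeout kills the command and exits with 124 on expiry.
        timeout "$limit" "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then
            echo "run $i: timed out after ${limit}s (possible hang)"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no hang in $max_runs runs"
    return 0
}
```

For example, `run_until_hang 60 100000 ./test_buffer_node` would stop at the first run that takes longer than 60 seconds (the limit and run count here are arbitrary). Note that `timeout` kills the stuck process, so this only detects a hang; to attach a debugger to the live process, use the `ctest` command above.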

Backtrace:

phprus@srv2:~$ gdb /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/build/cpp20rwd/gnu_12.2_cxx20_64_relwithdebinfo/test_buffer_node -p 2468289
GNU gdb (Debian 13.1-3) 13.1
This GDB was configured as "x86_64-linux-gnu".
Reading symbols from /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/build/cpp20rwd/gnu_12.2_cxx20_64_relwithdebinfo/test_buffer_node...
Attaching to program: /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/build/cpp20rwd/gnu_12.2_cxx20_64_relwithdebinfo/test_buffer_node, process 2468289
[New LWP 2468290]
[New LWP 2468307]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
tbb::detail::d1::aggregating_functor<tbb::detail::d2::buffer_node<int>, tbb::detail::d2::buffer_node<int>::buffer_operation>::operator() (
    op_list=0x7ffff4159d90, this=0x7ffff4159fc0)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/detail/_aggregator.h:167
167         my_object->handle_operations(op_list);
(gdb) thread 1
[Switching to thread 1 (Thread 0x7feac2c36940 (LWP 2468289))]
#0  tbb::detail::d1::aggregating_functor<tbb::detail::d2::buffer_node<int>, tbb::detail::d2::buffer_node<int>::buffer_operation>::operator() (
    op_list=0x7ffff4159d90, this=0x7ffff4159fc0)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/detail/_aggregator.h:167
167         my_object->handle_operations(op_list);
(gdb) bt
#0  tbb::detail::d1::aggregating_functor<tbb::detail::d2::buffer_node<int>, tbb::detail::d2::buffer_node<int>::buffer_operation>::operator() (
    op_list=0x7ffff4159d90, this=0x7ffff4159fc0)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/detail/_aggregator.h:167
#1  tbb::detail::d1::aggregator_generic<tbb::detail::d2::buffer_node<int>::buffer_operation>::start_handle_operations<tbb::detail::d1::aggregating_functor<tbb::detail::d2::buffer_node<int>, tbb::detail::d2::buffer_node<int>::buffer_operation> > (handle_operations=..., this=0x7ffff4159fb0)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/detail/_aggregator.h:129
#2  tbb::detail::d1::aggregator_generic<tbb::detail::d2::buffer_node<int>::buffer_operation>::execute<tbb::detail::d1::aggregating_functor<tbb::detail::d2::buffer_node<int>, tbb::detail::d2::buffer_node<int>::buffer_operation> > (long_life_time=true, handle_operations=..., op=0x7ffff4159d90,
    this=0x7ffff4159fb0)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/detail/_aggregator.h:87
#3  tbb::detail::d1::aggregator<tbb::detail::d1::aggregating_functor<tbb::detail::d2::buffer_node<int>, tbb::detail::d2::buffer_node<int>::buffer_operation>, tbb::detail::d2::buffer_node<int>::buffer_operation>::execute (op=0x7ffff4159d90, this=0x7ffff4159fb0)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/detail/_aggregator.h:150
#4  tbb::detail::d2::buffer_node<int>::try_get (v=@0x7ffff4159cf8: -1, this=0x7ffff4159f20)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/flow_graph.h:1422
#5  spin_try_get<int> (value=@0x7ffff4159cf8: -1, b=...)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/test/tbb/test_buffer_node.cpp:37
#6  test_serial<int>() [clone .isra.0] () at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/test/tbb/test_buffer_node.cpp:395
#7  0x000055ae0c3ce04a in operator() (__closure=0x7ffff415a500)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/test/tbb/test_buffer_node.cpp:468
#8  tbb::detail::d1::task_arena_function<DOCTEST_ANON_FUNC_326()::<lambda()>, void>::operator()(void) const (this=<optimized out>)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/task_arena.h:68
#9  0x00007feac2c7ced1 in tbb::detail::r1::task_arena_impl::execute (ta=..., d=warning: RTTI symbol not found for class 'tbb::detail::d1::task_arena_function<DOCTEST_ANON_FUNC_326()::{lambda()#1}, void>'
...)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/arena.cpp:822
#10 0x00007feac2c7d8f5 in tbb::detail::r1::execute (ta=..., d=...)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/arena.cpp:519
#11 0x000055ae0c3bc2f4 in tbb::detail::d1::task_arena::execute_impl<void, DOCTEST_ANON_FUNC_326()::<lambda()> > (f=..., this=0x7ffff415a540)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/task_arena.h:251
#12 tbb::detail::d1::task_arena::execute<DOCTEST_ANON_FUNC_326()::<lambda()> > (f=..., this=0x7ffff415a540)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/../../include/tbb/../oneapi/tbb/task_arena.h:404
#13 DOCTEST_ANON_FUNC_326 () at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/test/tbb/test_buffer_node.cpp:466
#14 0x000055ae0c3c1f32 in doctest::Context::run (this=this@entry=0x7ffff415a9e0)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/test/common/doctest.h:7060
#15 0x000055ae0c3ad58e in main (argc=<optimized out>, argv=<optimized out>)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/test/common/doctest.h:7138
(gdb) thread 2
[Switching to thread 2 (Thread 0x7feac20aa6c0 (LWP 2468290))]
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
38  ../sysdeps/unix/sysv/linux/x86_64/syscall.S: No such file or directory.
(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007feac2c8980c in tbb::detail::r1::futex_wait (comparand=2, futex=0x7feac23b6124)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/semaphore.h:100
#2  tbb::detail::r1::binary_semaphore::P (this=0x7feac23b6124)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/semaphore.h:253
#3  tbb::detail::r1::rml::internal::thread_monitor::wait (this=0x7feac23b6120)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/rml_thread_monitor.h:235
#4  tbb::detail::r1::rml::private_worker::run (this=0x7feac23b6100)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/private_server.cpp:273
#5  tbb::detail::r1::rml::private_worker::thread_routine (arg=0x7feac23b6100)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/private_server.cpp:221
#6  0x00007feac27c9134 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#7  0x00007feac28497dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) thread 3
[Switching to thread 3 (Thread 0x7feac1ca96c0 (LWP 2468307))]
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
38  in ../sysdeps/unix/sysv/linux/x86_64/syscall.S
(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007feac2c8980c in tbb::detail::r1::futex_wait (comparand=2, futex=0x7feac23b60a4)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/semaphore.h:100
#2  tbb::detail::r1::binary_semaphore::P (this=0x7feac23b60a4)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/semaphore.h:253
#3  tbb::detail::r1::rml::internal::thread_monitor::wait (this=0x7feac23b60a0)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/rml_thread_monitor.h:235
#4  tbb::detail::r1::rml::private_worker::run (this=0x7feac23b6080)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/private_server.cpp:273
#5  tbb::detail::r1::rml::private_worker::thread_routine (arg=0x7feac23b6080)
    at /home/phprus/test/tbb/oneTBB-306c75f5e906fc221e12f45dae45900f0d8db6ba/src/tbb/private_server.cpp:221
#6  0x00007feac27c9134 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#7  0x00007feac28497dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) thread 4
Unknown thread 4.
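
The per-thread backtraces above were collected by switching threads interactively. As a sketch of a non-interactive alternative, the same information can be captured in one step with GDB's batch mode and `thread apply all bt`. The function name `dump_all_threads` and the `GDB` override variable are assumptions made for illustration only.

```shell
# Hypothetical helper: print backtraces of all threads of a running process.
# Usage: dump_all_threads PID
# The GDB variable (an assumption of this sketch) lets the debugger binary
# be overridden; it defaults to "gdb".
dump_all_threads() {
    pid=$1
    # -batch exits after running the commands; "thread apply all bt"
    # prints a backtrace for every thread of the attached process.
    "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt"
}
```

For the process in this report, that would be `dump_all_threads 2468289 > hang-backtrace.txt`. Attaching to a running process may require elevated privileges, depending on the system's ptrace settings.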

pavelkumbrasev commented 1 month ago

Hi @phprus, thank you for reproducing the bug.