oneapi-src / oneTBB

oneAPI Threading Building Blocks (oneTBB)
https://oneapi-src.github.io/oneTBB/
Apache License 2.0

Test ``test_collaborative_call_once`` sporadically hangs on CI #712

Open phprus opened 2 years ago

phprus commented 2 years ago

CI error in my fork for upstream commit d1667d514df697f05d771602b268e92560c434c4:

https://github.com/phprus/oneTBB/runs/4691593788?check_suite_focus=true

Build: ubuntu-20.04_g++_cxx17_release_preview=ON

        Start   5: test_collaborative_call_once
  5/134 Test   #5: test_collaborative_call_once .............***Timeout 180.11 sec

...
99% tests passed, 1 tests failed out of 134

Total Test time (real) = 403.71 sec

The following tests FAILED:
      5 - test_collaborative_call_once (Timeout)
Errors while running CTest
Error: Process completed with exit code 8.
anton-potapov commented 2 years ago

@isaevil could you please take a look?

isaevil commented 2 years ago

@phprus @anton-potapov Yeah, I've seen this sporadic hang on CI but I couldn't reproduce it locally.

phprus commented 2 years ago

Another hang: https://github.com/oneapi-src/oneTBB/runs/4919469037?check_suite_focus=true

phprus commented 2 years ago

I was able to reproduce this bug locally.

Commit: bc7e83a2f081868ef6bd05acbdc767da5e0a0af3
Compiler: Visual Studio 2019 14.29
C++: 20
Arch: Win32
OS: Windows 7 SP1 with all updates
RAM: 4GB
CPU: 2 virtual cores (1 physical core)

First - crash:

115/953 Test #115: test_collaborative_call_once .....................................***Failed    0.09 sec
TBB Warning: The number of workers is currently limited to 1. The request for 247836375 workers is ignored. Further requests for more workers will be silently ignored until the limit changes.

[doctest] doctest version is "2.4.7"
[doctest] run with "--help" for options
===============================================================================
F:\...\3rdparty\onetbb\src\test\tbb\test_collaborative_call_once.cpp(211):
TEST CASE:  only calls once - stress test

F:\...\3rdparty\onetbb\src\test\tbb\test_collaborative_call_once.cpp(211): FATAL ERROR: test case CRASHED: SIGABRT - Abort (abnormal termination) signal

===============================================================================
[doctest] test cases:   4 |   3 passed | 1 failed | 5 skipped
[doctest] assertions: 779 | 779 passed | 0 failed |
[doctest] Status: FAILURE!

        Start 116: test_concurrent_lru_cache

188/953 Test #188: conformance_collaborative_call_once ..............................***Failed    0.03 sec
TBB Warning: The number of workers is currently limited to 1. The request for 4227199 workers is ignored. Further requests for more workers will be silently ignored until the limit changes.

[doctest] doctest version is "2.4.7"
[doctest] run with "--help" for options
===============================================================================
F:\...\3rdparty\onetbb\src\test\conformance\conformance_collaborative_call_once.cpp(89):
TEST CASE:  Exception is received only by winner thread

F:\...\3rdparty\onetbb\src\test\conformance\conformance_collaborative_call_once.cpp(89): FATAL ERROR: test case CRASHED: SIGABRT - Abort (abnormal termination) signal

===============================================================================
[doctest] test cases: 3 | 2 passed | 1 failed | 0 skipped
[doctest] assertions: 7 | 7 passed | 0 failed |
[doctest] Status: FAILURE!

Second - hang: test_collaborative_call_once used one core at 100%

Backtrace thread 0:

    [External Code] 
    ntdll.dll![Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]   Unknown
>   [Inline Frame] tbb12.dll!tbb::detail::r1::binary_semaphore::P() Line 217    C++
    tbb12.dll!tbb::detail::r1::sleep_node<tbb::detail::r1::market_context>::wait() Line 172 C++
    [Inline Frame] tbb12.dll!tbb::detail::r1::concurrent_monitor_base<tbb::detail::r1::market_context>::commit_wait(tbb::detail::r1::wait_node<tbb::detail::r1::market_context> &) Line 232 C++
    tbb12.dll!tbb::detail::r1::concurrent_monitor_base<tbb::detail::r1::market_context>::wait<tbb::detail::r1::sleep_node<tbb::detail::r1::market_context>,`tbb::detail::r1::external_waiter::pause'::`2'::<lambda_1> &>(tbb::detail::r1::external_waiter::pause::__l2::<lambda_1> & pred, tbb::detail::r1::sleep_node<tbb::detail::r1::market_context> && node) Line 262   C++
    [Inline Frame] tbb12.dll!tbb::detail::r1::sleep_waiter::sleep(unsigned int uniq_tag, tbb::detail::r1::external_waiter::pause::__l2::<lambda_1> wakeup_condition) Line 118   C++
    tbb12.dll!tbb::detail::r1::external_waiter::pause(tbb::detail::r1::arena_slot &) Line 144   C++
    tbb12.dll!tbb::detail::r1::task_dispatcher::receive_or_steal_task<0,tbb::detail::r1::external_waiter>(tbb::detail::r1::thread_data & tls, tbb::detail::r1::execution_data_ext & ed, tbb::detail::r1::external_waiter & waiter, int isolation, bool fifo_allowed, bool critical_allowed) Line 232    C++
    tbb12.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all<0,tbb::detail::r1::external_waiter>(tbb::detail::d1::task * t, tbb::detail::r1::external_waiter & waiter) Line 350   C++
    [Inline Frame] tbb12.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all(tbb::detail::d1::task *) Line 458 C++
    tbb12.dll!tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task * t, tbb::detail::d1::wait_context & wait_ctx, tbb::detail::d1::task_group_context & w_ctx) Line 172 C++
    tbb12.dll!tbb::detail::r1::execute_and_wait(tbb::detail::d1::task & t, tbb::detail::d1::task_group_context & t_ctx, tbb::detail::d1::wait_context & wait_ctx, tbb::detail::d1::task_group_context & w_ctx) Line 121 C++
    [Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::execute_and_wait(tbb::detail::d1::task &) Line 191 C++
    [Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned int>,`call_once_in_parallel_for<`DOCTEST_ANON_FUNC_57'::`2'::<lambda_1> &,move_only_type>'::`2'::<lambda_1>,tbb::detail::d1::auto_partitioner const>::run(const tbb::detail::d1::blocked_range<unsigned int> &) Line 114 C++
    [Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned int>,`call_once_in_parallel_for<`DOCTEST_ANON_FUNC_57'::`2'::<lambda_1> &,move_only_type>'::`2'::<lambda_1>,tbb::detail::d1::auto_partitioner const>::run(const tbb::detail::d1::blocked_range<unsigned int> &) Line 103 C++
    [Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::parallel_for(const tbb::detail::d1::blocked_range<unsigned int> &) Line 231    C++
    test_collaborative_call_once.exe!call_once_in_parallel_for<`DOCTEST_ANON_FUNC_57'::`2'::<lambda_1> &,move_only_type>(unsigned int body, DOCTEST_ANON_FUNC_57::__l2::<lambda_1> & <args_0>, move_only_type &&) Line 91   C++
    test_collaborative_call_once.exe!DOCTEST_ANON_FUNC_57() Line 195    C++
    test_collaborative_call_once.exe!doctest::Context::run() Line 6725  C++
    test_collaborative_call_once.exe!main(int argc, char * * argv) Line 6809    C++
    [External Code] 

Backtrace thread 1:

    [External Code] 
    ntdll.dll![Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]   Unknown
>   [Inline Frame] test_collaborative_call_once.exe!std::this_thread::yield() Line 180  C++
    [Inline Frame] test_collaborative_call_once.exe!tbb::detail::d0::atomic_backoff::pause() Line 71    C++
    [Inline Frame] test_collaborative_call_once.exe!tbb::detail::d0::spin_wait_while(const std::atomic<bool> &) Line 99 C++
    [Inline Frame] test_collaborative_call_once.exe!tbb::detail::d0::spin_wait_while_eq(const std::atomic<bool> &) Line 109 C++
    test_collaborative_call_once.exe!tbb::detail::d1::collaborative_once_runner::assist() Line 118  C++
    [External Code] 

and crash conformance_collaborative_call_once:

F:\...\build\target_v142_win32>onetbb_bin_release\conformance_collaborative_call_once.exe
[doctest] doctest version is "2.4.7"
[doctest] run with "--help" for options
TBB Warning: The number of workers is currently limited to 1. The request for 67108863 workers is ignored. Further requests for more workers will be silently ignored until the limit changes.

===============================================================================
F:\...\3rdparty\onetbb\src\test\conformance\conformance_collaborative_call_once.cpp(59):
TEST CASE:  collaborative_call_once executes function exactly once

F:\...\3rdparty\onetbb\src\test\conformance\conformance_collaborative_call_once.cpp(59): FATAL ERROR: test case CRASHED: SIGABRT - Abort (abnormal termination) signal

===============================================================================
[doctest] test cases: 2 | 1 passed | 1 failed | 1 skipped
[doctest] assertions: 6 | 6 passed | 0 failed |
[doctest] Status: FAILURE!

and hang conformance_collaborative_call_once (used one core at 100%):

Backtrace thread 0:

    [External Code] 
    ntdll.dll![Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]   Unknown
>   [Inline Frame] tbb12.dll!tbb::detail::r1::binary_semaphore::P() Line 217    C++
    tbb12.dll!tbb::detail::r1::sleep_node<tbb::detail::r1::market_context>::wait() Line 172 C++
    [Inline Frame] tbb12.dll!tbb::detail::r1::concurrent_monitor_base<tbb::detail::r1::market_context>::commit_wait(tbb::detail::r1::wait_node<tbb::detail::r1::market_context> &) Line 232 C++
    tbb12.dll!tbb::detail::r1::concurrent_monitor_base<tbb::detail::r1::market_context>::wait<tbb::detail::r1::sleep_node<tbb::detail::r1::market_context>,`tbb::detail::r1::external_waiter::pause'::`2'::<lambda_1> &>(tbb::detail::r1::external_waiter::pause::__l2::<lambda_1> & pred, tbb::detail::r1::sleep_node<tbb::detail::r1::market_context> && node) Line 262   C++
    [Inline Frame] tbb12.dll!tbb::detail::r1::sleep_waiter::sleep(unsigned int uniq_tag, tbb::detail::r1::external_waiter::pause::__l2::<lambda_1> wakeup_condition) Line 118   C++
    tbb12.dll!tbb::detail::r1::external_waiter::pause(tbb::detail::r1::arena_slot &) Line 144   C++
    tbb12.dll!tbb::detail::r1::task_dispatcher::receive_or_steal_task<0,tbb::detail::r1::external_waiter>(tbb::detail::r1::thread_data & tls, tbb::detail::r1::execution_data_ext & ed, tbb::detail::r1::external_waiter & waiter, int isolation, bool fifo_allowed, bool critical_allowed) Line 232    C++
    tbb12.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all<0,tbb::detail::r1::external_waiter>(tbb::detail::d1::task * t, tbb::detail::r1::external_waiter & waiter) Line 350   C++
    [Inline Frame] tbb12.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all(tbb::detail::d1::task *) Line 458 C++
    tbb12.dll!tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task * t, tbb::detail::d1::wait_context & wait_ctx, tbb::detail::d1::task_group_context & w_ctx) Line 172 C++
    tbb12.dll!tbb::detail::r1::wait(tbb::detail::d1::wait_context & wait_ctx, tbb::detail::d1::task_group_context & w_ctx) Line 126 C++
    [Inline Frame] conformance_collaborative_call_once.exe!tbb::detail::d1::wait(tbb::detail::d1::wait_context &) Line 197  C++
    [Inline Frame] conformance_collaborative_call_once.exe!tbb::detail::d1::task_group_base::wait::__l2::<lambda_1>::operator()() Line 582  C++
    [Inline Frame] conformance_collaborative_call_once.exe!tbb::detail::d0::try_call_proxy<`tbb::detail::d1::task_group_base::wait'::`2'::<lambda_1>>::on_completion(tbb::detail::d1::task_group_base::wait::__l2::<lambda_2>) Line 230 C++
    [Inline Frame] conformance_collaborative_call_once.exe!tbb::detail::d1::task_group_base::wait() Line 581    C++
    conformance_collaborative_call_once.exe!DOCTEST_ANON_FUNC_54() Line 118 C++
    conformance_collaborative_call_once.exe!doctest::Context::run() Line 6725   C++
    conformance_collaborative_call_once.exe!main(int argc, char * * argv) Line 6809 C++
    [External Code] 

Backtrace thread 1:

>   conformance_collaborative_call_once.exe!tbb::detail::d1::task_arena::initialize() Line 315  C++
    [Inline Frame] conformance_collaborative_call_once.exe!tbb::detail::d1::task_arena::execute_impl(tbb::detail::d1::collaborative_once_runner::assist::__l2::<lambda_1> &) Line 253   C++
    [Inline Frame] conformance_collaborative_call_once.exe!tbb::detail::d1::task_arena::execute(tbb::detail::d1::collaborative_once_runner::assist::__l2::<lambda_1> &&) Line 412   C++
    conformance_collaborative_call_once.exe!tbb::detail::d1::collaborative_once_runner::assist() Line 119   C++
    conformance_collaborative_call_once.exe!tbb::detail::d1::collaborative_once_flag::do_collaborative_call_once<`tbb::detail::d1::collaborative_call_once<``DOCTEST_ANON_FUNC_54'::`4'::<lambda_1>::operator()'::`3'::<lambda_1>>'::`5'::<lambda_1> &>(tbb::detail::d1::collaborative_call_once::__l5::<lambda_1> & f) Line 191    C++
    [Inline Frame] conformance_collaborative_call_once.exe!tbb::detail::d1::collaborative_call_once(tbb::detail::d1::collaborative_once_flag & flag, DOCTEST_ANON_FUNC_54::__l4::<lambda_1>::()::__l3::<lambda_1> &&) Line 220  C++
    conformance_collaborative_call_once.exe!`DOCTEST_ANON_FUNC_54'::`4'::<lambda_1>::operator()() Line 104  C++
    [Inline Frame] conformance_collaborative_call_once.exe!tbb::detail::d2::?A0x97c44df4::task_ptr_or_nullptr(const DOCTEST_ANON_FUNC_54::__l4::<lambda_1> &) Line 132  C++
    conformance_collaborative_call_once.exe!tbb::detail::d1::function_task<`DOCTEST_ANON_FUNC_54'::`4'::<lambda_1>>::execute(tbb::detail::d1::execution_data & ed) Line 453 C++
    tbb12.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all<0,tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task *) Line 322    C++
    [Inline Frame] tbb12.dll!tbb::detail::r1::task_dispatcher::local_wait_for_all(tbb::detail::d1::task *) Line 458 C++
    tbb12.dll!tbb::detail::r1::arena::process(tbb::detail::r1::thread_data & tls) Line 140  C++
    tbb12.dll!tbb::detail::r1::market::process(rml::job & j) Line 600   C++
    tbb12.dll!tbb::detail::r1::rml::private_worker::run() Line 271  C++
    tbb12.dll!tbb::detail::r1::rml::private_worker::thread_routine(void * arg) Line 223 C++
    [External Code] 
    ucrtbase.dll![Frames below may be incorrect and/or missing, no symbols loaded for ucrtbase.dll] Unknown
phprus commented 2 years ago

Compile and link command with all options:

test_collaborative_call_once.exe:

ClCompile:
  C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x86\CL.exe /c /IF:\...\3rdparty\onetbb\src\test\.. /IF:\...\3rdparty\onetbb\src\test /IF:\...\3rdparty\onetbb\src\src\tbb\..\..\include /Zi /W4 /WX /diagnostics:column /O2 /Ob2 /Oy- /GL /D WIN32 /D _WINDOWS /D NDEBUG /D _WIN32_WINNT=0x0601 /D "CMAKE_INTDIR=\"Release\"" /D _MBCS /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /GR /Fo"test_collaborative_call_once.dir\Release\\" /Fd"test_collaborative_call_once.dir\Release\vc142.pdb" /external:W4 /Gd /TP /analyze- /errorReport:queue  /bigobj /Zc:__cplusplus /utf-8 /bigobj /volatile:iso /FS -std:c++20 F:\...\3rdparty\onetbb\src\test\tbb\test_collaborative_call_once.cpp
  Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30145 for x86
  Copyright (C) Microsoft Corporation.  All rights reserved.
  cl /c /IF:\...\3rdparty\onetbb\src\test\.. /IF:\...\3rdparty\onetbb\src\test /IF:\...\3rdparty\onetbb\src\src\tbb\..\..\include /Zi /W4 /WX /diagnostics:column /O2 /Ob2 /Oy- /GL /D WIN32 /D _WINDOWS /D NDEBUG /D _WIN32_WINNT=0x0601 /D "CMAKE_INTDIR=\"Release\"" /D _MBCS /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /GR /Fo"test_collaborative_call_once.dir\Release\\" /Fd"test_collaborative_call_once.dir\Release\vc142.pdb" /external:W4 /Gd /TP /analyze- /errorReport:queue  /bigobj /Zc:__cplusplus /utf-8 /bigobj /volatile:iso /FS -std:c++20 F:\...\3rdparty\onetbb\src\test\tbb\test_collaborative_call_once.cpp
  test_collaborative_call_once.cpp
Link:
  C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x86\link.exe /ERRORREPORT:QUEUE /OUT:"F:\...\build\target_v142_win32\onetbb_bin_release\test_collaborative_call_once.exe" /NOLOGO ..\..\onetbb_bin_release\tbb12.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /manifest:embed /DEBUG /PDB:"F:/.../build/target_v142_win32/onetbb_bin_release/test_collaborative_call_once.pdb" /SUBSYSTEM:CONSOLE /LARGEADDRESSAWARE /LTCG:incremental /LTCGOUT:"test_collaborative_call_once.dir\Release\test_collaborative_call_once.iobj" /TLBID:1 /DYNAMICBASE /NXCOMPAT /IMPLIB:"F:/.../build/target_v142_win32/onetbb_bin_release/test_collaborative_call_once.lib" /MACHINE:X86 /SAFESEH  /machine:X86 test_collaborative_call_once.dir\Release\test_collaborative_call_once.obj
  Generating code
  Previous IPDB not found, fall back to full compilation.
  All 3066 functions were compiled because no usable IPDB/IOBJ from previous compilation was found.
  Finished generating code
  test_collaborative_call_once.vcxproj -> F:\...\build\target_v142_win32\onetbb_bin_release\test_collaborative_call_once.exe

conformance_collaborative_call_once.exe:

ClCompile:
  C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x86\CL.exe /c /IF:\...\3rdparty\onetbb\src\test\.. /IF:\...\3rdparty\onetbb\src\test /IF:\...\3rdparty\onetbb\src\src\tbb\..\..\include /Zi /W4 /WX /diagnostics:column /O2 /Ob2 /Oy- /GL /D WIN32 /D _WINDOWS /D NDEBUG /D _WIN32_WINNT=0x0601 /D "CMAKE_INTDIR=\"Release\"" /D _MBCS /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /GR /Fo"conformance_collaborative_call_once.dir\Release\\" /Fd"conformance_collaborative_call_once.dir\Release\vc142.pdb" /external:W4 /Gd /TP /analyze- /errorReport:queue  /bigobj /Zc:__cplusplus /utf-8 /bigobj /volatile:iso /FS -std:c++20 F:\...\3rdparty\onetbb\src\test\conformance\conformance_collaborative_call_once.cpp
  Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30145 for x86
  Copyright (C) Microsoft Corporation.  All rights reserved.
  cl /c /IF:\...\3rdparty\onetbb\src\test\.. /IF:\...\3rdparty\onetbb\src\test /IF:\...\3rdparty\onetbb\src\src\tbb\..\..\include /Zi /W4 /WX /diagnostics:column /O2 /Ob2 /Oy- /GL /D WIN32 /D _WINDOWS /D NDEBUG /D _WIN32_WINNT=0x0601 /D "CMAKE_INTDIR=\"Release\"" /D _MBCS /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /GR /Fo"conformance_collaborative_call_once.dir\Release\\" /Fd"conformance_collaborative_call_once.dir\Release\vc142.pdb" /external:W4 /Gd /TP /analyze- /errorReport:queue  /bigobj /Zc:__cplusplus /utf-8 /bigobj /volatile:iso /FS -std:c++20 F:\...\3rdparty\onetbb\src\test\conformance\conformance_collaborative_call_once.cpp
  conformance_collaborative_call_once.cpp
Link:
  C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x86\link.exe /ERRORREPORT:QUEUE /OUT:"F:\...\build\target_v142_win32\onetbb_bin_release\conformance_collaborative_call_once.exe" /NOLOGO ..\..\onetbb_bin_release\tbb12.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /manifest:embed /DEBUG /PDB:"F:/.../build/target_v142_win32/onetbb_bin_release/conformance_collaborative_call_once.pdb" /SUBSYSTEM:CONSOLE /LARGEADDRESSAWARE /LTCG:incremental /LTCGOUT:"conformance_collaborative_call_once.dir\Release\conformance_collaborative_call_once.iobj" /TLBID:1 /DYNAMICBASE /NXCOMPAT /IMPLIB:"F:/.../build/target_v142_win32/onetbb_bin_release/conformance_collaborative_call_once.lib" /MACHINE:X86 /SAFESEH  /machine:X86 conformance_collaborative_call_once.dir\Release\conformance_collaborative_call_once.obj
  Generating code
  Previous IPDB not found, fall back to full compilation.
  All 2171 functions were compiled because no usable IPDB/IOBJ from previous compilation was found.
  Finished generating code
  conformance_collaborative_call_once.vcxproj -> F:\...\build\target_v142_win32\onetbb_bin_release\conformance_collaborative_call_once.exe
phprus commented 2 years ago

Build without LTO:

cd C:\tmp\oneTBB-323260671b40db33c9fc0d66d1f1eed6ecc82ce2\build\vc142_win32

cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_CXX_STANDARD=20 -DCMAKE_CXX_FLAGS_INIT="/D_WIN32_WINNT=0x0601" -G "Visual Studio 16 2019" -A Win32 -T v142,version=14.29 ../..

cmake --build . --config Release -v -j 2

ctest --build-config Release --output-on-failure

All tests passed.

Build with LTO:

cd C:\tmp\oneTBB-323260671b40db33c9fc0d66d1f1eed6ecc82ce2\build\vc142_win32_lto

cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON -DCMAKE_CXX_STANDARD=20 -DCMAKE_CXX_FLAGS_INIT="/D_WIN32_WINNT=0x0601" -G "Visual Studio 16 2019" -A Win32 -T v142,version=14.29 ../..

cmake --build . --config Release -v -j 2

Crash:

C:\tmp\oneTBB-323260671b40db33c9fc0d66d1f1eed6ecc82ce2\build\vc142_win32_lto>msvc_19.29_cxx20_32_md_release\test_collaborative_call_once.exe
[doctest] doctest version is "2.4.7"
[doctest] run with "--help" for options
===============================================================================
C:\tmp\oneTBB-323260671b40db33c9fc0d66d1f1eed6ecc82ce2\test\tbb\test_collaborative_call_once.cpp(211):
TEST CASE:  only calls once - stress test

C:\tmp\oneTBB-323260671b40db33c9fc0d66d1f1eed6ecc82ce2\test\tbb\test_collaborative_call_once.cpp(236): FATAL ERROR: REQUIRE( f.ct == i ) is NOT correct!
  values: REQUIRE( -1610612736 == 1 )

C:\tmp\oneTBB-323260671b40db33c9fc0d66d1f1eed6ecc82ce2\test\tbb\test_collaborative_call_once.cpp(211): FATAL ERROR: test case CRASHED: SIGABRT - Abort (abnormal termination) signal

===============================================================================
[doctest] test cases:   4 |   3 passed | 1 failed | 5 skipped
[doctest] assertions: 489 | 488 passed | 1 failed |
[doctest] Status: FAILURE!
phprus commented 2 years ago

Reproduced in GitHub Actions!

Commit with new actions: https://github.com/phprus/oneTBB/commit/eeb0154a8ca95e7ec12e5d4209225cb22195372e
Branch: https://github.com/phprus/oneTBB/tree/msvc_vs_ci

Test logs:
https://github.com/phprus/oneTBB/runs/7818187404?check_suite_focus=true
https://github.com/phprus/oneTBB/runs/7818187463?check_suite_focus=true

phprus commented 2 years ago

cc @pavelkumbrasev, @kboyarinov

phprus commented 2 years ago

@pavelkumbrasev, @isaevil,

On win32:

[doctest] doctest version is "2.4.7"
[doctest] run with "--help" for options
===============================================================================
C:\tmp\oneTBB\test\tbb\test_collaborative_call_once.cpp(181):
TEST CASE:  only calls once - move only argument

C:\tmp\oneTBB\test\tbb\test_collaborative_call_once.cpp(181): FATAL ERROR: test case CRASHED: SIGSEGV - Segmentation violation signal

https://github.com/oneapi-src/oneTBB/blob/13544f9432c946d11cbdd39b4726407c8aa9e3a9/include/oneapi/tbb/collaborative_call_once.h#L190

shared_runner->m_storage.task_arena has bad values (from the Visual Studio debugger):

        my_version_and_traits   0   int
+       my_initialization_state pending (1) std::atomic<enum tbb::detail::d0::do_once_state>
+       my_arena    0x00000000 <NULL>   std::atomic<tbb::detail::r1::arena *>
        my_max_concurrency  4   int
        my_num_reserved_slots   1   unsigned int
        my_priority 1   tbb::detail::d1::task_arena_base::priority
        my_numa_id  0   int
        my_core_type    -1  int
        my_max_threads_per_core 1   int
phprus commented 2 years ago

@pavelkumbrasev, @isaevil,

Commit 07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0 from branch https://github.com/phprus/oneTBB/tree/msvc_vs_ci-1 (with new CI tasks for this issue).

Error:

Exception thrown: read access violation.
**ta** was 0x13EFD90.

Stacktrace:

tbb12.dll!tbb::detail::r1::task_arena_impl::initialize(tbb::detail::d1::task_arena_base & ta) Line 437
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\src\tbb\arena.cpp(437)
[Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::task_arena::initialize::__l2::<lambda_1>::operator()() Line 315
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\task_arena.h(315)
test_collaborative_call_once.exe!tbb::detail::d0::run_initializer<`tbb::detail::d1::task_arena::initialize'::`2'::<lambda_1>>(const tbb::detail::d1::task_arena::initialize::__l2::<lambda_1> & f, std::atomic<enum tbb::detail::d0::do_once_state> & state) Line 288
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\detail\_utils.h(288)
test_collaborative_call_once.exe!tbb::detail::d0::atomic_do_once<`tbb::detail::d1::task_arena::initialize'::`2'::<lambda_1>>(const tbb::detail::d1::task_arena::initialize::__l2::<lambda_1> & initializer={...}, std::atomic<enum tbb::detail::d0::do_once_state> & state) Line 283
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\detail\_utils.h(283)
[Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::task_arena::initialize() Line 315
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\task_arena.h(315)
[Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::task_arena::execute_impl(tbb::detail::d1::collaborative_once_runner::assist::__l2::<lambda_1> &) Line 253
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\task_arena.h(253)
[Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::task_arena::execute(tbb::detail::d1::collaborative_once_runner::assist::__l2::<lambda_1> &&) Line 412
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\task_arena.h(412)
[Inline Frame] test_collaborative_call_once.exe!tbb::detail::d1::collaborative_once_runner::assist() Line 120
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\collaborative_call_once.h(120)
test_collaborative_call_once.exe!tbb::detail::d1::collaborative_once_flag::do_collaborative_call_once<`tbb::detail::d1::collaborative_call_once<increment_functor &>'::`5'::<lambda_1> &>(tbb::detail::d1::collaborative_call_once::__l5::<lambda_1> & f={...}) Line 193
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\collaborative_call_once.h(193)
test_collaborative_call_once.exe!tbb::detail::d1::collaborative_call_once<increment_functor &>(tbb::detail::d1::collaborative_once_flag & flag, increment_functor & fn={...}) Line 225
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\include\oneapi\tbb\collaborative_call_once.h(225)
[Inline Frame] test_collaborative_call_once.exe!call_once_threads::__l4::<lambda_1>::operator()() Line 118
    at C:\tmp\oneTBB-07415c5e0ad86a5fe5a20c647ef68145c1e5c8a0\test\tbb\test_collaborative_call_once.cpp(118)
[Inline Frame] test_collaborative_call_once.exe!std::invoke(call_once_threads::__l4::<lambda_1> &&) Line 1524
    at C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\type_traits(1524)
test_collaborative_call_once.exe!std::thread::_Invoke<std::tuple<`call_once_threads<increment_functor &>'::`4'::<lambda_1>>,0>(void * _RawVals=0x0030e560) Line 56
    at C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\thread(56)
ucrtbase.dll!0fc562e4()
ucrtbase.dll![Frames below may be incorrect and/or missing, no symbols loaded for ucrtbase.dll]
kernel32.dll!764d344d()
ntdll.dll!___RtlUserThreadStart@8()
ntdll.dll!__RtlUserThreadStart@8()

Dump with heap and pdb files: dump.zip

isaevil commented 2 years ago

@phprus Thanks for your logs and investigation. Based on our own investigation, the alignment of collaborative_once_runner does not work when LTO is enabled on Windows. The implementation relies on the fact that an object aligned to 2^N bytes has its N lowest address bits free. With LTO enabled on Windows this guarantee appears to be broken, which leads to the segmentation fault, so the hangs you found on Windows are caused by it as well. We need to reproduce this without using TBB.
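
To make the reliance on free low bits concrete, here is a minimal sketch of the general technique, not oneTBB's actual code. The names `Runner`, `kRunnerAlignment`, `kTagMask`, `pack`, `unpack_pointer`, `unpack_tag`, and `round_trips` are hypothetical, and 128 is only an assumed alignment. It shows that a small counter can share a word with a pointer only when the pointed-to object really is aligned, and that a misaligned address yields a wrong pointer after unpacking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration; oneTBB's real types and constants differ.
constexpr std::size_t kRunnerAlignment = 128;                // 2^7 bytes
constexpr std::uintptr_t kTagMask = kRunnerAlignment - 1;    // the 7 lowest bits

struct alignas(kRunnerAlignment) Runner {
    int payload = 0;
};

// True when the low bits of the address are zero and therefore free to hold a tag.
inline bool is_aligned(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & kTagMask) == 0;
}

// Store a small counter in the low bits of an aligned pointer.
inline std::uintptr_t pack(Runner* r, std::uintptr_t tag) {
    assert(is_aligned(r) && "low bits must be zero before tagging");
    assert(tag <= kTagMask && "tag must fit into the free bits");
    return reinterpret_cast<std::uintptr_t>(r) | tag;
}

// Recover the pointer by clearing the tag bits.
inline Runner* unpack_pointer(std::uintptr_t word) {
    return reinterpret_cast<Runner*>(word & ~kTagMask);
}

// Recover the counter by keeping only the tag bits.
inline std::uintptr_t unpack_tag(std::uintptr_t word) {
    return word & kTagMask;
}

// Checks on raw numbers whether an address and a tag survive packing.
// For a misaligned address the pointer part comes back wrong.
inline bool round_trips(std::uintptr_t address, std::uintptr_t tag) {
    const std::uintptr_t word = address | tag;
    return (word & ~kTagMask) == address && (word & kTagMask) == tag;
}
```

If the compiler places a `Runner` at an address that is not a multiple of 128, `unpack_pointer` returns an address below the real object. Reading through such a pointer would be consistent with the invalid task_arena state and the access violation reported above, although this sketch does not prove that this is what happens.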

phprus commented 2 years ago

@isaevil Hangs happen not only on Windows but also on Ubuntu (in GitHub Actions CI). With Visual Studio 2019 (Win32) and LTO enabled, a hang or crash occurs on every launch.

isaevil commented 2 years ago

@phprus I mean that the hangs on Windows are caused by the issue I mentioned above. The hang on Ubuntu seems to be a different issue.

phprus commented 2 years ago

@isaevil You're right! This will not work in Win32.

In MSVC, alignment requirements/promises only really apply to variables on the heap, not on the stack.

Source: https://www.boost.org/doc/libs/1_80_0/libs/type_traits/doc/html/boost_typetraits/reference/aligned_storage.html

What do you think of the following workaround for Visual Studio?

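// Holds a collaborative_once_runner in an oversized buffer and aligns it manually,
// so the result does not depend on the compiler honoring alignas for stack objects.
class collaborative_once_runner_stackholder : no_copy {
    collaborative_once_runner *m_ptr;
    // max_nfs_size spare bytes leave room to round the start address up.
    char m_storage[max_nfs_size + sizeof(collaborative_once_runner)];
public:
    collaborative_once_runner_stackholder() {
        // Round m_storage up to the next address whose mask bits are zero.
        // Assuming the mask equals alignment - 1, the full mask must be added:
        // adding (mask - 1) can yield an address below m_storage.
        m_ptr = new (reinterpret_cast<char *>((reinterpret_cast<std::uintptr_t>(m_storage) +
                                               collaborative_once_references_mask) &
                                              ~collaborative_once_references_mask))
                    collaborative_once_runner();
    }
    ~collaborative_once_runner_stackholder() {
        // Placement new requires an explicit destructor call.
        m_ptr->~collaborative_once_runner();
    }
    collaborative_once_runner *operator->() const noexcept {
        return m_ptr;
    }
};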

And use collaborative_once_runner_stackholder instead of collaborative_once_runner?

I can make a PR.
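
For readers who want to try the manual-alignment idea outside of TBB, the following is a self-contained sketch under stated assumptions. `Payload`, `AlignedHolder`, `kHolderAlignment`, and `align_up` are hypothetical names, and 128 is an assumed alignment rather than a value taken from oneTBB. It uses the standard `std::align` to find an aligned address inside an oversized buffer, with `align_up` showing the equivalent power-of-two arithmetic.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Assumed alignment for illustration only.
constexpr std::size_t kHolderAlignment = 128;

// Round an address up to the next multiple of a power-of-two alignment.
constexpr std::uintptr_t align_up(std::uintptr_t address, std::uintptr_t alignment) {
    return (address + (alignment - 1)) & ~(alignment - 1);
}

// Hypothetical payload standing in for the object that needs alignment.
struct Payload {
    int value = 42;
};

class AlignedHolder {
    // kHolderAlignment spare bytes guarantee that an aligned slot exists in the buffer.
    unsigned char m_storage[kHolderAlignment + sizeof(Payload)];
    Payload* m_ptr;
public:
    AlignedHolder() {
        void* p = m_storage;
        std::size_t space = sizeof(m_storage);
        // std::align returns the first suitably aligned address in the buffer;
        // it cannot fail here because of the spare bytes.
        void* aligned = std::align(kHolderAlignment, sizeof(Payload), p, space);
        m_ptr = new (aligned) Payload();
    }
    ~AlignedHolder() {
        // Objects created with placement new need an explicit destructor call.
        m_ptr->~Payload();
    }
    AlignedHolder(const AlignedHolder&) = delete;
    AlignedHolder& operator=(const AlignedHolder&) = delete;

    Payload* operator->() const noexcept { return m_ptr; }
    Payload* get() const noexcept { return m_ptr; }
};
```

The holder costs up to one extra alignment unit of stack space per object, which is the price of not depending on how the compiler lays out over-aligned stack variables.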

pavelkumbrasev commented 2 years ago

We will discuss this problem. Thank you for the proposed solution. At first glance, it should not be used for all configurations: if this issue is related only to MSVC + LTO, the workaround should cover only that case.

phprus commented 2 years ago

Yes, only for MSVC. Unfortunately, LTO cannot be detected using predefined macros.

phprus commented 2 years ago

@isaevil, @pavelkumbrasev Commit with workaround: https://github.com/phprus/oneTBB/commit/d7055c53d601af8914dea0ef2ed1807b33fa907b

All tests passed. Should I create a PR?

Prerequisite: #870, because the current build system sets an invalid working directory for tests.