catchorg / Catch2

A modern, C++-native, test framework for unit-tests, TDD and BDD - using C++14, C++17 and later (C++11 support is in v2.x branch, and C++03 on the Catch1.x branch)
https://discord.gg/4CWS9zD
Boost Software License 1.0

ThreadSanitizer: signal-unsafe call inside of a signal #1833

Open greenrobot opened 4 years ago

greenrobot commented 4 years ago

Describe the bug

Using a ThreadSanitizer build with clang, our CI seems to hit an assertion, probably in a background thread. After that, our log is flooded with sanitizer warnings that appear to originate in Catch, starting like this:

terminate called without an active exception
==================
WARNING: ThreadSanitizer: signal-unsafe call inside of a signal (pid=46206)
    #0 operator new(unsigned long) /tmp/llvm-project/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:64 (sync-test+0x531217)
    #1 __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/ext/new_allocator.h:111:27 (sync-test+0x53c998)
    #2 std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/basic_string.tcc:1057:49 (sync-test+0x53c70b)
    #3 char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/basic_string.tcc:578:14 (sync-test+0x5410b4)
    #4 char* std::string::_S_construct_aux<char const*>(char const*, char const*, std::allocator<char> const&, std::__false_type) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/basic_string.h:5033:18 (sync-test+0x540f58)
    #5 char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/basic_string.h:5054:11 (sync-test+0x540ed8)
    #6 std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/basic_string.tcc:657:19 (sync-test+0x5ca878)
    #7 Catch::StringRef::operator std::string() const /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:610:20 (sync-test+0x5a90aa)
    #8 Catch::RunContext::handleFatalErrorCondition(Catch::StringRef) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:12693:55 (sync-test+0x55f2a5)
    #9 (anonymous namespace)::reportFatal(char const*) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:10644:56 (sync-test+0x5505dd)
    #10 Catch::FatalConditionHandler::handleSignal(int) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:10738:9 (sync-test+0x550443)
    #11 __tsan::CallUserSignalHandler(__tsan::ThreadState*, bool, bool, bool, int, __sanitizer::__sanitizer_siginfo*, void*) /tmp/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1914 (sync-test+0x487bd3)
    #12 __gnu_cxx::__verbose_terminate_handler() <null> (libstdc++.so.6+0x607d4)
    #13 [application code stack follows...]

Another example:

WARNING: ThreadSanitizer: signal-unsafe call inside of a signal (pid=46206)
    #0 operator delete(void*) /tmp/llvm-project/compiler-rt/lib/tsan/rtl/tsan_new_delete.cc:126 (sync-test+0x531ae9)
    #1 __gnu_cxx::new_allocator<std::string>::deallocate(std::string*, unsigned long) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/ext/new_allocator.h:125:2 (sync-test+0x5ccde3)
    #2 std::allocator_traits<std::allocator<std::string> >::deallocate(std::allocator<std::string>&, std::string*, unsigned long) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/alloc_traits.h:462:13 (sync-test+0x5ccd93)
    #3 std::__cxx1998::_Vector_base<std::string, std::allocator<std::string> >::_M_deallocate(std::string*, unsigned long) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/stl_vector.h:180:4 (sync-test+0x612b0b)
    #4 void std::__cxx1998::vector<std::string, std::allocator<std::string> >::_M_realloc_insert<std::string const&>(__gnu_cxx::__normal_iterator<std::string*, std::__cxx1998::vector<std::string, std::allocator<std::string> > >, std::string const&) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/vector.tcc:448:7 (sync-test+0x63d638)
    #5 std::__cxx1998::vector<std::string, std::allocator<std::string> >::push_back(std::string const&) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/bits/stl_vector.h:948:4 (sync-test+0x63d19e)
    #6 std::__debug::vector<std::string, std::allocator<std::string> >::push_back(std::string const&) /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7/../../../../include/c++/7/debug/vector:467:9 (sync-test+0x5b9dbe)
    #7 Catch::XmlWriter::startElement(std::string const&, Catch::XmlFormatting) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:15326:16 (sync-test+0x5709e7)
    #8 Catch::XmlWriter::scopedElement(std::string const&, Catch::XmlFormatting) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:15334:9 (sync-test+0x570c3e)
    #9 Catch::JunitReporter::writeGroup(Catch::CumulativeReporterBase<Catch::JunitReporter>::Node<Catch::TestGroupStats, Catch::CumulativeReporterBase<Catch::JunitReporter>::Node<Catch::TestCaseStats, Catch::CumulativeReporterBase<Catch::JunitReporter>::SectionNode> > const&, double) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:16558:42 (sync-test+0x57a2a7)
    #10 Catch::JunitReporter::testGroupEnded(Catch::TestGroupStats const&) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:16550:9 (sync-test+0x57a1a6)
    #11 Catch::RunContext::testGroupEnded(std::string const&, Catch::Totals const&, unsigned long, unsigned long) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:12505:21 (sync-test+0x55c1b4)
    #12 Catch::RunContext::handleFatalErrorCondition(Catch::StringRef) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:12720:9 (sync-test+0x55f5b4)
    #13 (anonymous namespace)::reportFatal(char const*) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:10644:56 (sync-test+0x5505dd)
    #14 Catch::FatalConditionHandler::handleSignal(int) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_sync_2/cbuild/Debug-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:10738:9 (sync-test+0x550443)
    #15 __tsan::CallUserSignalHandler(__tsan::ThreadState*, bool, bool, bool, int, __sanitizer::__sanitizer_siginfo*, void*) /tmp/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1914 (sync-test+0x487bd3)
    #16 __gnu_cxx::__verbose_terminate_handler() <null> (libstdc++.so.6+0x607d4)
   [...]

In total, there are 68 warnings like that in the log.

Expected behavior

Catch should not trigger tsan warnings.

Reproduction steps

Currently, around 3 out of 4 builds hit the assertion and thus trigger the "warning flood". No simple repro is known.

Platform information:

Additional context

async-signal-safe functions

Of course, we still need to fix whatever triggers the assertion; that is not the issue here. The question is whether Catch does things in a signal handler that it should not, such as calling new and delete.
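
For illustration, here is a minimal sketch of a handler that restricts itself to async-signal-safe operations. It is not Catch2 code: `g_lastSignal`, `safeWrite`, and `minimalSignalHandler` are hypothetical names, and the sketch assumes a POSIX system. It only stores the signal number in a `volatile std::sig_atomic_t` and emits a fixed message with `write(2)`, so nothing in it reaches `operator new` or `operator delete`.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <unistd.h>

// Hypothetical names throughout; none of this is taken from Catch2.

// sig_atomic_t is the one object type the C and C++ standards guarantee
// can be written safely from inside a signal handler.
volatile std::sig_atomic_t g_lastSignal = 0;

// Writes a NUL-terminated message to stderr using only write(2), which
// POSIX lists as async-signal-safe. No heap allocation takes place.
void safeWrite(const char* msg) noexcept {
    std::size_t len = 0;
    while (msg[len] != '\0') {
        ++len;
    }
    // The result is deliberately ignored: nothing useful can be done on failure.
    const ssize_t written = ::write(STDERR_FILENO, msg, len);
    (void)written;
}

// Handler that avoids std::string, iostreams, and any other allocating call.
extern "C" void minimalSignalHandler(int sig) {
    g_lastSignal = sig;
    safeWrite("fatal signal received\n");
    // A real fatal-error handler would typically restore the default
    // disposition and re-raise the signal here; that step is omitted so
    // the sketch can be exercised without terminating the process.
}
```

Installing it with `std::signal(SIGUSR1, minimalSignalHandler)` and calling `std::raise(SIGUSR1)` runs the handler synchronously, which should be enough to confirm under tsan that no signal-unsafe call is reported for this path.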

greenrobot commented 2 years ago

Ping. This still happens with Catch v2.13.6 and clang-13.

WARNING: ThreadSanitizer: signal-unsafe call inside of a signal (pid=79673)
    #0 operator new(unsigned long) /tmp/llvm-project/compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp:64 (sync-test+0x5286f4)
    #1 std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) <null> (libstdc++.so.6+0xbdcd8)
    #2 (anonymous namespace)::reportFatal(char const*) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_cluster/cbuild/Release-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:10772:56 (sync-test+0x598a62)
    #3 Catch::handleSignal(int) /home/jenkins/agent/workspace/ObjectBox-Sanitizers_cluster/cbuild/Release-tsan/objectbox/src/main/cpp/sync/test/../../../../../../../../external/catch/catch.hpp:10908:9 (sync-test+0x598a62)
    #4 __tsan::CallUserSignalHandler(__tsan::ThreadState*, bool, bool, bool, int, __sanitizer::__sanitizer_siginfo*, void*) /tmp/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1967 (sync-test+0x476f03)

justhecuke commented 4 months ago

Ping. We still see this as well w/ v3.5.2.

WARNING: ThreadSanitizer: signal-unsafe call inside of a signal (pid=949967)
    #0 operator new(unsigned long) /drive/drive-linux_src/yocto/build/tmp/work-shared/llvm-project-source-17.0.4-r0/git/compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp(64,3) (trt_engines_test+0x3a5ce7) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #1 __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) external/_main~yocto~x86_64_sysroot/skylake-64-gnu-linux/usr/include/c++/9.3.0/ext/new_allocator.h(114,27) (trt_engines_test+0x4263ab) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #2 std::allocator_traits<std::allocator<char>>::allocate(std::allocator<char>&, unsigned long) external/_main~yocto~x86_64_sysroot/skylake-64-gnu-linux/usr/include/c++/9.3.0/bits/alloc_traits.h(444,20) (trt_engines_test+0x4263ab)
    #3 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_create(unsigned long&, unsigned long) external/_main~yocto~x86_64_sysroot/skylake-64-gnu-linux/usr/include/c++/9.3.0/bits/basic_string.tcc(153,14) (trt_engines_test+0x4263ab)
    #4 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::reserve(unsigned long) external/_main~yocto~x86_64_sysroot/skylake-64-gnu-linux/usr/include/c++/9.3.0/bits/basic_string.tcc(293,24) (trt_engines_test+0x4263ab)
    #5 Catch::AssertionResult::getExpression[abi:cxx11]() const external/catch2~3.5.2/src/catch2/catch_assertion_result.cpp(59,32) (trt_engines_test+0x42ea5d) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #6 Catch::JunitReporter::writeAssertion(Catch::AssertionStats const&) external/catch2~3.5.2/src/catch2/reporters/catch_reporter_junit.cpp(279,54) (trt_engines_test+0x4bebc4) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #7 Catch::JunitReporter::writeAssertions(Catch::CumulativeReporterBase::SectionNode const&) external/catch2~3.5.2/src/catch2/reporters/catch_reporter_junit.cpp(243,17) (trt_engines_test+0x4be56e) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #8 Catch::JunitReporter::writeSection(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, Catch::CumulativeReporterBase::SectionNode const&, bool) external/catch2~3.5.2/src/catch2/reporters/catch_reporter_junit.cpp(225,13) (trt_engines_test+0x4be56e)
    #9 Catch::JunitReporter::writeTestCase(Catch::CumulativeReporterBase::Node<Catch::TestCaseStats, Catch::CumulativeReporterBase::SectionNode> const&) external/catch2~3.5.2/src/catch2/reporters/catch_reporter_junit.cpp(190,9) (trt_engines_test+0x4bdcf9) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #10 Catch::JunitReporter::writeRun(Catch::CumulativeReporterBase::Node<Catch::TestRunStats, Catch::CumulativeReporterBase::Node<Catch::TestCaseStats, Catch::CumulativeReporterBase::SectionNode>> const&, double) external/catch2~3.5.2/src/catch2/reporters/catch_reporter_junit.cpp(161,13) (trt_engines_test+0x4bcafb) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #11 Catch::JunitReporter::testRunEndedCumulative() external/catch2~3.5.2/src/catch2/reporters/catch_reporter_junit.cpp(126,9) (trt_engines_test+0x4bc421) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #12 Catch::CumulativeReporterBase::testRunEnded(Catch::TestRunStats const&) external/catch2~3.5.2/src/catch2/reporters/catch_reporter_cumulative_base.cpp(147,9) (trt_engines_test+0x4ad735) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #13 Catch::MultiReporter::testRunEnded(Catch::TestRunStats const&) external/catch2~3.5.2/src/catch2/reporters/catch_reporter_multi.cpp(161,26) (trt_engines_test+0x4c0336) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #14 Catch::RunContext::handleFatalErrorCondition(Catch::StringRef) external/catch2~3.5.2/src/catch2/internal/catch_run_context.cpp(475,21) (trt_engines_test+0x466f58) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #15 (anonymous namespace)::reportFatal(char const*) external/catch2~3.5.2/src/catch2/internal/catch_fatal_condition_handler.cpp(62,56) (trt_engines_test+0x46ad04) (BuildId: d244502fdf22bbe08fd142612bd70a6a)
    #16 Catch::handleSignal(int) external/catch2~3.5.2/src/catch2/internal/catch_fatal_condition_handler.cpp(199,9) (trt_engines_test+0x46ad04)
    #17 __tsan::CallUserSignalHandler(__tsan::ThreadState*, bool, bool, int, __sanitizer::__sanitizer_siginfo*, void*) /drive/drive-linux_src/yocto/build/tmp/work-shared/llvm-project-source-17.0.4-r0/git/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp(2094,5) (trt_engines_test+0x327220) (BuildId: d244502fdf22bbe08fd142612bd70a6a)

greenrobot commented 4 months ago

Another instance of this with v3.5.2:

WARNING: ThreadSanitizer: signal-unsafe call inside of a signal (pid=15007)
    #0 operator new(unsigned long) <null> (libtsan.so.2+0x8d389)
    #1 void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) <null> (libstdc++.so.6+0x12fd0c)
    #2 reportFatal ../../catch2-src/src/catch2/internal/catch_fatal_condition_handler.cpp:62 (objectbox-test+0xae9c27)
    #3 handleSignal ../../catch2-src/src/catch2/internal/catch_fatal_condition_handler.cpp:199 (objectbox-test+0xae9c8d)
    #4 __tsan::CallUserSignalHandler(__tsan::ThreadState*, bool, bool, int, __sanitizer::__sanitizer_siginfo*, void*) <null> (libtsan.so.2+0x3bc0b)

The problem is that Catch2 just prevented valuable information from being reported on a non-reproducible issue that is otherwise hard to track down.

greenrobot commented 4 months ago

I dug a bit into the Catch2 sources; at least one of the issues is the string creation on the last line here:

    void RunContext::handleFatalErrorCondition( StringRef message ) {
        // First notify reporter that bad things happened
        m_reporter->fatalErrorEncountered(message);

        // Don't rebuild the result -- the stringification itself can cause more fatal errors
        // Instead, fake a result data.
        AssertionResultData tempResult( ResultWas::FatalErrorCondition, { false } );
        tempResult.message = static_cast<std::string>(message);

This creates a new string (using a newly allocated buffer). I wonder whether we could preallocate a string with a large enough buffer somewhere, just in case it is needed for signal handling. Or would that be just the tip of the iceberg, with the real issue being much larger? Hard to tell at first glance. If you point me in the right direction, I could try to start a PR.
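
As a rough sketch of the preallocation idea, the message could be copied into storage reserved before any signal arrives. This version uses a fixed-size character array with static storage rather than a `std::string`. `FixedMessage` and `g_fatalMessage` are hypothetical names, the capacity of 256 bytes is an arbitrary assumption, and none of this exists in Catch2. It covers only the message copy; the reporter paths visible in the traces above (`JunitReporter`, `XmlWriter`) allocate as well and are not addressed here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity message holder; not part of Catch2.
class FixedMessage {
public:
    // Arbitrary capacity chosen for this sketch, including the trailing NUL.
    static constexpr std::size_t capacity = 256;

    // Copies at most capacity - 1 bytes and always NUL-terminates.
    // Uses a plain loop, so no library call and no heap allocation is involved.
    void assign(const char* data, std::size_t size) noexcept {
        const std::size_t n = size < capacity - 1 ? size : capacity - 1;
        for (std::size_t i = 0; i < n; ++i) {
            m_buffer[i] = data[i];
        }
        m_buffer[n] = '\0';
        m_size = n;
    }

    const char* c_str() const noexcept { return m_buffer; }
    std::size_t size() const noexcept { return m_size; }

private:
    char m_buffer[capacity] = {};
    std::size_t m_size = 0;
};

// Static storage duration: the buffer exists for the whole program run,
// so a signal handler can fill it without calling operator new.
FixedMessage g_fatalMessage;
```

A handler could then call `g_fatalMessage.assign(message.data(), message.size())` on the incoming `StringRef`. Overlong messages are silently truncated, which seems an acceptable trade-off for a fatal-error path.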

greenrobot commented 4 months ago

Well, as a workaround for sanitizer builds, I now disable Catch2 signal handling like this:

    target_compile_definitions(Catch2 PRIVATE CATCH_CONFIG_NO_POSIX_SIGNALS)
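
For projects that still use the single-header v2 `catch.hpp`, a comparable effect can probably be had from the source side by defining the same macro before the header is included in the implementation file. The sketch below assumes that setup. It detects ThreadSanitizer at compile time through GCC's `__SANITIZE_THREAD__` macro and clang's `__has_feature(thread_sanitizer)`. `EXAMPLE_TSAN_BUILD`, `kTsanBuild`, and `kPosixSignalsDisabled` are made-up names for illustration. With the compiled v3 library, the definition has to be applied when building Catch2 itself, as the CMake line above does.

```cpp
#include <cassert>

// Detect ThreadSanitizer: GCC defines __SANITIZE_THREAD__, clang exposes
// __has_feature(thread_sanitizer). EXAMPLE_TSAN_BUILD is a hypothetical name.
#if defined(__SANITIZE_THREAD__)
#  define EXAMPLE_TSAN_BUILD 1
#elif defined(__has_feature)
#  if __has_feature(thread_sanitizer)
#    define EXAMPLE_TSAN_BUILD 1
#  endif
#endif

#ifndef EXAMPLE_TSAN_BUILD
#  define EXAMPLE_TSAN_BUILD 0
#endif

// Only turn off Catch2's POSIX signal handling in tsan builds.
#if EXAMPLE_TSAN_BUILD
#  define CATCH_CONFIG_NO_POSIX_SIGNALS
#endif

// In a real v2 test binary, the implementation file would continue with:
//   #define CATCH_CONFIG_MAIN
//   #include "catch.hpp"

// Compile-time flags that let the rest of the build (or a test) check
// which configuration was selected.
constexpr bool kTsanBuild = (EXAMPLE_TSAN_BUILD != 0);

#ifdef CATCH_CONFIG_NO_POSIX_SIGNALS
constexpr bool kPosixSignalsDisabled = true;
#else
constexpr bool kPosixSignalsDisabled = false;
#endif
```

Regular builds keep Catch2's fatal-signal reporting, while tsan builds leave signal handling to the sanitizer runtime.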