morganstanley / binlog

A high performance C++ log library, producing structured binary logs
http://opensource.morganstanley.com/binlog/
Apache License 2.0
608 stars 72 forks source link

TSAN error when accessing default_thread_local_writer #131

Open andrewkcorcoran opened 3 years ago

andrewkcorcoran commented 3 years ago

The below minimal example consistently fails with a TSAN error. The original use case was when trying to change the name and id of the default thread local writer to match internal thread names/ids.

#include <thread>
#include <vector>

#include "binlog/binlog.hpp"

int main()
{
    auto v = std::vector<std::thread>{};
    for(int i = 0; i < 10; ++i)
    {
        v.emplace_back(std::thread{[](){
              binlog::default_thread_local_writer().setId(42);
                }});
    }
    for(auto & t : v)
    {
        t.join();
    }
    return 0;
}
andrewkcorcoran commented 3 years ago

Adding the below snippet to https://github.com/morganstanley/binlog/blob/main/include/binlog/default_session.hpp#L17 fixes the issue but I can't say I fully understand why as at first glance the code in that method looks thread safe to me.

inline std::string this_thread_id_string() {
   static auto m = std::mutex{};
   auto lock = std::scoped_lock{m};
....
}
erenon commented 3 years ago

Hi, thanks for the report. I'm unable to reproduce. Here's what I tried:

$ /usr/bin/c++ --version
c++ (GCC) 11.1.0
$ cat ../../test/integration/ConcurrentSetId.cpp
#include <thread>
#include <vector>

#include <binlog/binlog.hpp>

int main()
{
  auto v = std::vector<std::thread>{};
  for(int i = 0; i < 10; ++i)
  {
    v.emplace_back(std::thread{[](){
        binlog::default_thread_local_writer().setId(42);
        }});
  }
  for(auto & t : v)
  {
    t.join();
  }
  return 0;
}
$ cmake ../.. -DBINLOG_USE_TSAN=On
$ make ConcurrentSetId VERBOSE=1 -B
...
[ 50%] Building CXX object CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o
/usr/bin/c++  -I/home/erenon/src/binlog/include -O2 -g -DNDEBUG -fsanitize=thread -Wall -Wextra -Werror -pedantic -Wconversion -Wsign-conversion -Wold-style-cast -Wsuggest-override -Wshadow -std=c++14 -MD -MT CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o -MF CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o.d -o CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o -c /home/erenon/src/binlog/test/integration/ConcurrentSetId.cpp
[100%] Linking CXX executable ConcurrentSetId
/usr/bin/cmake -E cmake_link_script CMakeFiles/ConcurrentSetId.dir/link.txt --verbose=1
/usr/bin/c++ -O2 -g -DNDEBUG -fsanitize=thread CMakeFiles/ConcurrentSetId.dir/test/integration/ConcurrentSetId.cpp.o -o ConcurrentSetId 
...
$ ./ConcurrentSetId 
$ echo $?
0
andrewkcorcoran commented 3 years ago

Interesting.

I can't duplicate on either gcc or clang when adding my sample code into the binlog integration tests versus what was working before (having it as a standalone binary using binlog as an library).

If I add the line binlog::default_thread_local_writer().setName("test"); I can recreate the error in clang (but not gcc). Possibly related - I had to disable https://github.com/morganstanley/binlog/blob/main/cmake/ThreadSanitizer.cmake#L23 as it was giving me a false return re. clangs support for thread sanitizer?

andrewkcorcoran commented 3 years ago
$ git diff --staged
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 859229b..828eedc 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -286,6 +286,15 @@ add_example(TscClock)

   message(STATUS "Build integration tests")

+  add_executable(ConcurrentSetId test/integration/ConcurrentSetId.cpp)
+    target_compile_definitions(ConcurrentSetId PRIVATE
+      DOCTEST_CONFIG_SUPER_FAST_ASSERTS
+      DOCTEST_CONFIG_NO_MULTI_LANE_ATOMICS
+    )
+    target_link_libraries(ConcurrentSetId headers)
+    target_include_directories(ConcurrentSetId SYSTEM PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/test) # for doctest/doctest.h
+    set_target_properties(ConcurrentSetId PROPERTIES CXX_CLANG_TIDY "")
+
   add_executable(IntegrationTest test/integration/IntegrationTest.cpp)
     target_compile_definitions(IntegrationTest PRIVATE
       DOCTEST_CONFIG_SUPER_FAST_ASSERTS
diff --git a/cmake/ThreadSanitizer.cmake b/cmake/ThreadSanitizer.cmake
index c9d340e..0256947 100644
--- a/cmake/ThreadSanitizer.cmake
+++ b/cmake/ThreadSanitizer.cmake
@@ -20,7 +20,8 @@ if (BINLOG_USE_TSAN)
     add_compile_options("-fsanitize=thread" "-static-libtsan")
     add_link_options("-fsanitize=thread" "-static-libtsan")
   else ()
-    message(FATAL_ERROR "Thread Sanitizer requested by BINLOG_USE_TSAN, but appears to be not supported on this platform")
+    add_compile_options("-fsanitize=thread")
+    add_link_options("-fsanitize=thread")
   endif ()

   set(MEMORYCHECK_TYPE ThreadSanitizer)
diff --git a/test/integration/ConcurrentSetId.cpp b/test/integration/ConcurrentSetId.cpp
new file mode 100644
index 0000000..9a91c14
--- /dev/null
+++ b/test/integration/ConcurrentSetId.cpp
@@ -0,0 +1,21 @@
+#include <thread>
+#include <vector>
+
+#include <binlog/binlog.hpp>
+
+int main()
+{
+       auto v = std::vector<std::thread>{};
+       for(int i = 0; i < 10; ++i)
+       {
+               v.emplace_back(std::thread{[](){
+                               binlog::default_thread_local_writer().setId(42);
+                               binlog::default_thread_local_writer().setName("test");
+                               }});
+       }
+       for(auto & t : v)
+       {
+               t.join();
+       }
+       return 0;
+}
$ gcc --version
gcc (GCC) 10.2.1 20200804 (Red Hat 10.2.1-2)
$ clang --version
clang version 10.0.1 (Red Hat 10.0.1-4.el7)
erenon commented 3 years ago

cmake issue: it seems that recent setups require -fsanitize=thread passed to the linker as well, here's the fix:

$ git diff cmake/         
diff --git a/cmake/ThreadSanitizer.cmake b/cmake/ThreadSanitizer.cmake
index c9d340e..fbbc187 100644
--- a/cmake/ThreadSanitizer.cmake
+++ b/cmake/ThreadSanitizer.cmake
@@ -5,7 +5,11 @@ option(BINLOG_USE_TSAN "Build with Thread Sanitizer (TSAN) enabled" OFF)

 if (BINLOG_USE_TSAN)

-  check_cxx_compiler_flag("-fsanitize=thread" HAS_TSAN)
+  CMAKE_PUSH_CHECK_STATE(RESET)
+    # Make check_cxx_compiler_flag pass required flags to linker as well:
+    set(CMAKE_REQUIRED_FLAGS "-fsanitize=thread")
+    check_cxx_compiler_flag("-fsanitize=thread" HAS_TSAN)
+  CMAKE_POP_CHECK_STATE()

   CMAKE_PUSH_CHECK_STATE(RESET)
     # Make check_cxx_compiler_flag pass required flags to linker as well:

Even if I add binlog::default_thread_local_writer().setName("test"); , I'm unable to repro (tried both clang and gcc):

$ gcc --version
gcc (GCC) 11.1.0
$ clang --version
clang version 12.0.0
andrewkcorcoran commented 3 years ago

I guess the only other difference I can see is the compiler version - perhaps a false positive in clang 10? I don't currently have access to clang 11 - are you able to downgrade to 10 and see if you can reproduce?

erenon commented 3 years ago

I can reproduce it with clang 10 and clang 11, but not with clang 12.

erenon commented 3 years ago

FTR, the race is on:

#0 memcmp
#1 std::ctype<char>::_M_widen_init
#2 binlog::default_thread_local_writer() 

Frame 2 creates a std::string, in the constructor of the function local static. I'm not sure if this is a real issue, or a false positive. I'm trying to minimize it (reproduce without binlog), unfortunately, the report is not deterministic: sometimes it throws a warning, sometimes it does not.

andrewkcorcoran commented 2 years ago

Any update on this? We've moved to clang 12 and gcc 11 and this error is occurring on both regularly (though not 100%)

erenon commented 2 years ago

hi, no update. I wasn't able to reproduce it on clang 13, and discarded this as a clang/tsan bug. The short piece of code I was able to repro the issue on older clangs (10, 11, 12) seemed perfectly fine to me.