google / sanitizers

AddressSanitizer, ThreadSanitizer, MemorySanitizer
Other
11.35k stars 1.02k forks source link

TSan false positive with standalone memory fence #1415

Open alexey-katranov opened 3 years ago

alexey-katranov commented 3 years ago

Thread Sanitizer reports a race over data in the following code snippet. I do not know if TSan supports standalone fences but I see some __tsan_atomic_thread_fence in the code. In accordance with the C++ standard the code seems valid: the release operation X is observed with any fenced load Y followed with acquire fence F should give us happens before between operations before X and after F. (strictly speaking there is the second path over numThreadsFinished++ but it seems pretty simple) (surely if Y operation with acquire fence, everything is Ok).

#include <atomic>
#include <thread>

std::atomic<int> numThreadsFinished{};
std::atomic<int> epoch{};

void wait() {
    if (numThreadsFinished++ == 0) {
        while (epoch.load(std::memory_order_relaxed) == 0) {} // Y
        std::atomic_thread_fence(std::memory_order_acquire); // F
        return;
    }
    epoch++; // X
}

int main() {
    int data{};
    std::thread thr([&] {
        data = 1;
        wait();
    });
    wait();
    if (data != 1) {
        return -1;
    }
    thr.join();
    return 0;
}
[user@localhost build]$ ./clang_10.0_cxx11_64_debug/test_task
LLVMSymbolizer: error reading file: No such file or directory
==================
WARNING: ThreadSanitizer: data race (pid=2277694)
  Read of size 4 at 0x7ffd3fe3f2b8 by main thread:
    #0 main /home/user/c/github/oneTBB/test/tbb/test_task.cpp:24:9 (test_task+0x4cefce)

  Previous write of size 4 at 0x7ffd3fe3f2b8 by thread T1:
    #0 main::$_0::operator()() const /home/user/c/github/oneTBB/test/tbb/test_task.cpp:20:14 (test_task+0x4cf768)
    #1 void std::__invoke_impl<void, main::$_0>(std::__invoke_other, main::$_0&&) /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:60:14 (test_task+0x4cf710)
    #2 std::__invoke_result<main::$_0>::type std::__invoke<main::$_0>(main::$_0&&) /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:95:14 (test_task+0x4cf620)
    #3 void std::thread::_Invoker<std::tuple<main::$_0> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/thread:264:13 (test_task+0x4cf5c8)
    #4 std::thread::_Invoker<std::tuple<main::$_0> >::operator()() /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/thread:271:11 (test_task+0x4cf568)
    #5 std::thread::_State_impl<std::thread::_Invoker<std::tuple<main::$_0> > >::_M_run() /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/thread:215:13 (test_task+0x4cf44f)
    #6 <null> <null> (libstdc++.so.6+0xd8ad3)

  Location is stack of main thread.

  Location is global '??' at 0x7ffd3fe21000 ([stack]+0x00000001e2b8)

  Thread T1 (tid=2277696, finished) created by main thread at:
    #0 pthread_create <null> (test_task+0x464181)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd8d98)
    #2 main /home/user/c/github/oneTBB/test/tbb/test_task.cpp:19:17 (test_task+0x4cefb3)

SUMMARY: ThreadSanitizer: data race /home/user/c/github/oneTBB/test/tbb/test_task.cpp:24:9 in main
==================
ThreadSanitizer: reported 1 warnings
dvyukov commented 3 years ago

1352 is related

alexey-katranov commented 3 years ago

@dvyukov , I prototyped simple implementation of standalone fences. The idea is that each thread has release fence clock and acquire fence clock. Additionally, SyncVar is extended with clock for the fences. The algorithm is following:

  1. The release fence updates the release fence clock
  2. Any atomic store (incl. relaxed) copies the thread release fence clock into SyncVar fence clock
  3. Any atomic load (incl. relaxed) copies SyncVar fence clock into the thread acquire fence clock
  4. The acquire fence updates thread clock with acquire fence clock

The prototype seems to work correctly for the example above:

#include <atomic>
#include <thread>

std::atomic<int> numThreadsFinished{};
std::atomic<int> epoch{};

int racy_data{};

void wait() {
    if (numThreadsFinished++ == 0) {
        while (epoch.load(std::memory_order_relaxed) == 0) {} // Y

        volatile auto racy_load = racy_data; // RACE is reported!
        (void)racy_load;

        std::atomic_thread_fence(std::memory_order_acquire); // F
        return;
    }
    epoch++; // X
}

int main() {
    int data{};
    std::thread thr([&] {
        data = 1;
        racy_data = 1;
        wait();
    });
    wait();
    if (data != 1) { // RACE is NOT reported!
        return -1;
    }
    thr.join();
    return 0;
}

Did you think about such implementation in the past? Does it contain hidden issues that are difficult to resolve? Are you interested in further development? Surely, the prototype works only for this example and really suboptimal.

dvyukov commented 3 years ago

Hi Alex,

Please see: https://reviews.llvm.org/D47107#1125052 It should answer some of your questions.

Overall I think tsan needs to support standalone fences. There are correct programs that use fences and there are no good workarounds for these cases. However, if it increases runtime costs, it should be guarded by a separate flag.

Another aspect is that I have a change in-flight which effectively rewrites all of the runtime: https://github.com/dvyukov/llvm-project/pull/3/files It's not ready but we should decide soon on its fate. I am not sure what's the right way to order this change vs your change, I guess it may depend on the size of your change.

But either way a good first step would be a set of positive/negative test cases that will now just document the current behavior, but will also contains the expected behavior as comments

alexey-katranov commented 3 years ago

https://reviews.llvm.org/D47107#1125052 It should answer some of your questions.

It seems I reinvent the wheel. It was just one evening effort, so it is definitely better to finish your major change. Moreover, I am not in hurry and it is our of scope of my job. As for initial problem, we just rework the code not to use standalone fences.

It seems that the above review has not been updated for three years. What did block the patch and why not to finish it?

As I understand, ktsan implements the fences. It seems that it reimplements the logic of tsan, why it was not separated into a common module?

As for tests, I can try prepare some tests, do you have any BKMs, how to build and run tests (it would be good if I do not need to build clang).

dvyukov commented 3 years ago

BKMs

dvyukov commented 3 years ago

https://reviews.llvm.org/D47107#1125052 It should answer some of your questions.

It seems I reinvent the wheel. It was just one evening effort, so it is definitely better to finish your major change. Moreover, I am not in hurry and it is our of scope of my job. As for initial problem, we just rework the code not to use standalone fences.

A non-upstreammed wheel does not count :)

It seems that the above review has not been updated for three years. What did block the patch and why not to finish it?

I have no idea. That's how OSS works. You spend time writing an extensive reply and then you just never hear from the author again :)

As I understand, ktsan implements the fences. It seems that it reimplements the logic of tsan, why it was not separated into a common module?

Different languages, different code styles, different environments, different specifics.

As for tests, I can try prepare some tests, do you have any BKMs, how to build and run tests (it would be good if I do not need to build clang).

I don't think there is a way to avoid building llvm, but you need to do this only once. We have these docs, but there also should be some llvm docs: https://github.com/google/sanitizers/wiki/AddressSanitizerHowToBuild For testing you run ninja check-tsan in the build dir. There is also a way to run a single test: https://reviews.llvm.org/D103054

RalfJung commented 2 years ago

The problem with tsan not supporting fences still exists, right? Shouldn't the issue be reopened, or is this tracked somewhere else?

dvyukov commented 2 years ago

Not sure why I closed it. I usually write explanations when closing issues. Maybe just hit Close accidentally...

OlivierNicole commented 2 years ago

Is there any form of annotation or other to avoid a false positive due to TSan not taking into account an acquire fence?

dvyukov commented 2 years ago
  1. Change the code to not use stand-alone fences.
  2. There are __tsan_acquire/release annotations.
  3. It's also possible to use a fake aromic operations with acquire/release ordering in tsan build as annotations.
OlivierNicole commented 2 years ago

I see. Thanks!