google / sanitizers

AddressSanitizer, ThreadSanitizer, MemorySanitizer

Any way to force tsan error? #1489

Open lindsayad opened 2 years ago

lindsayad commented 2 years ago

This is based off of https://github.com/libMesh/libmesh/pull/3146#issuecomment-1024526256. This simple function reports as tsan clean when run with multiple threads:

void
foo()
{
  const bool initially_done = done.load(std::memory_order_acquire);
  const auto val = payload;

  if (!initially_done)
  {
    std::lock_guard<std::mutex> lock(cv_m);
    if (!done.load(std::memory_order_relaxed))
    {
      payload = true;
      done.store(true, std::memory_order_release);
    }
  }
}

even though there is a read-write race on payload if the acquire load happens before the release store. If you swap the first two lines of the function, the race is detected. If I make the threads wait a while, with one waiting a bit longer than the other before proceeding to the racy section of the program, then I again get race detection:

int
fib(int n)
{
  if (n <= 1)
    return n;
  return fib(n - 1) + fib(n - 2);
}

void
foo(const int thread_id)
{
  const bool initially_done = done.load(std::memory_order_acquire);
  assert(!initially_done);

  // Make thread 1 wait longer, but make both threads wait such that initially_done should be
  // false for both
  if (thread_id)
    fib(31);
  else
    fib(30);

  const auto val = payload;

  if (!initially_done)
  {
    std::lock_guard<std::mutex> lock(cv_m);
    if (!done.load(std::memory_order_relaxed))
    {
      payload = true;
      done.store(true, std::memory_order_release);
    }
  }
}

I was wondering if there is a way to help tsan with "possible" race detection in the initial program? I've looked through the run-time and compile-time options and didn't really see anything that fit, but I could easily be missing something. I understand that conceptually the race is contingent on the acquire happening before the release, and that conceptually there is no race if the release happens before the acquire. Maybe it is impossible to bake in some logic that assumes no synchronizes-with relationship will actually occur at run-time, which I would think would then be capable of catching the possible race. I am still not 100% comfortable with the terminology, so this could be a stupid post ... in which case please close.
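For anyone who wants to reproduce the first snippet, a minimal self-contained sketch could look like the following. The post does not show the declarations of done, payload, and cv_m, so their types here are assumptions inferred from how they are used, and run_two_threads is a hypothetical driver that is not part of the original code.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Assumed declarations: the post does not show these, types are inferred from usage.
std::atomic<bool> done{false};
bool payload = false;
std::mutex cv_m;

void
foo()
{
  const bool initially_done = done.load(std::memory_order_acquire);
  // Deliberately racy read: it is only ordered after the write below
  // if the acquire load above observed the release store.
  const auto val = payload;
  (void)val;

  if (!initially_done)
  {
    std::lock_guard<std::mutex> lock(cv_m);
    if (!done.load(std::memory_order_relaxed))
    {
      payload = true;
      done.store(true, std::memory_order_release);
    }
  }
}

// Hypothetical driver: runs foo() on two threads and waits for both.
void
run_two_threads()
{
  std::thread t0(foo);
  std::thread t1(foo);
  t0.join();
  t1.join();
  // After both joins, one of the threads must have published the payload.
  assert(done.load(std::memory_order_acquire));
  assert(payload);
}
```

Calling run_two_threads() from main and building with something like clang++ -fsanitize=thread -g -O1 -pthread should give a binary to experiment with. Whether tsan reports anything depends on how the two threads happen to interleave, which is the behaviour described above.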

dvyukov commented 2 years ago

Hi Alex,

There is currently no option in tsan that can help detect this. TSan could randomize the scheduling of threads to provoke more bugs. That was discussed somewhere before, but I can't find any references now. It is not implemented in the current tsan version.
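Since tsan does not randomize scheduling itself, one rough way to approximate the idea by hand is to add a random delay between the acquire load and the racy read, much like the fib() calls in the second snippet. The sketch below is only an illustration and not a tsan feature: random_jitter, foo_with_jitter, and run_with_jitter are hypothetical names, the 0 to 2000 microsecond range is arbitrary, and the declarations of done, payload, and cv_m are again assumed.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <random>
#include <thread>

// Assumed declarations, inferred from the snippets in the post.
std::atomic<bool> done{false};
bool payload = false;
std::mutex cv_m;

// Hypothetical helper: sleep for a random 0 to 2000 microseconds so the
// relative timing of the threads differs from run to run.
void
random_jitter()
{
  thread_local std::mt19937 rng{std::random_device{}()};
  std::uniform_int_distribution<int> dist(0, 2000);
  std::this_thread::sleep_for(std::chrono::microseconds(dist(rng)));
}

void
foo_with_jitter()
{
  const bool initially_done = done.load(std::memory_order_acquire);

  // Widen the window between the acquire load and the racy read so that
  // both threads are more likely to have seen done == false.
  random_jitter();

  const auto val = payload; // deliberately racy read
  (void)val;

  if (!initially_done)
  {
    std::lock_guard<std::mutex> lock(cv_m);
    if (!done.load(std::memory_order_relaxed))
    {
      payload = true;
      done.store(true, std::memory_order_release);
    }
  }
}

// Hypothetical driver: runs the jittered function on two threads.
void
run_with_jitter()
{
  std::thread t0(foo_with_jitter);
  std::thread t1(foo_with_jitter);
  t0.join();
  t1.join();
  assert(done.load(std::memory_order_acquire));
  assert(payload);
}
```

This only makes the problematic interleaving more likely. It does not guarantee a report on any given run, so repeating the test in a fresh process several times may still be needed.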

lindsayad commented 2 years ago

Understood. Thanks for the quick reply!