microsoft / STL

MSVC's implementation of the C++ Standard Library.
Other
9.88k stars 1.45k forks source link

`<condition_variable>`: `condition_variable_any::wait_for` returns `cv_status::timeout` when the elapsed time is shorter than requested #4723

Open cpplearner opened 2 weeks ago

cpplearner commented 2 weeks ago

Describe the bug

Revealed by libc++ test test/std/thread/thread.condition/thread.condition.condvarany/wait_for.pass.cpp.

[thread.condvarany.wait]/13:

Returns: cv_status​::​timeout if the relative timeout ([thread.req.timing]) specified by rel_time expired, otherwise cv_status​::​no_timeout.

But MSVC STL's condition_variable_any::wait_for sometimes returns cv_status​::​timeout even though the elapsed time (measured by high_resolution_clock) is shorter than the requested timeout.

Command-line test case

D:\test>type test-condvarany.cpp
#include <condition_variable>
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <print>

template <class Mutex>
struct MyLock : std::unique_lock<Mutex> {
  using std::unique_lock<Mutex>::unique_lock;
};

template <class Function>
std::chrono::microseconds measure(Function f) {
  std::chrono::high_resolution_clock::time_point start = std::chrono::high_resolution_clock::now();
  f();
  std::chrono::high_resolution_clock::time_point end = std::chrono::high_resolution_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(end - start);
}

template <class Lock>
void test() {
  using Mutex = typename Lock::mutex_type;

  // Test unblocking via a timeout.
  //
  // To test this, we create a thread that waits on a condition variable
  // with a certain timeout, and we never awaken it. To guard against
  // spurious wakeups, we wait again whenever we are awoken for a reason
  // other than a timeout.
  {
    auto timeout = std::chrono::milliseconds(250);
    std::condition_variable_any cv;
    Mutex mutex;

    std::thread t1([&] {
      Lock lock(mutex);
      std::cv_status result;
      do {
        auto elapsed = measure([&] { result = cv.wait_for(lock, timeout); });
        if (result == std::cv_status::timeout)
          if (elapsed < timeout) {
            std::println("elapsed: {}", elapsed);
            std::println("timeout: {}", timeout);
          }

      } while (result != std::cv_status::timeout);
    });

    t1.join();
  }
}

int main(int, char**) {
  test<std::unique_lock<std::mutex>>();
  test<std::unique_lock<std::timed_mutex>>();
  test<MyLock<std::mutex>>();
  test<MyLock<std::timed_mutex>>();
  return 0;
}

D:\test>cl /EHs /std:c++latest test-condvarany.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.41.33901 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

/std:c++latest is provided as a preview of language features from the latest C++
working draft, and we're eager to hear about bugs and suggestions for improvements.
However, note that these features are provided as-is without support, and subject
to changes or removal as the working draft evolves. See
https://go.microsoft.com/fwlink/?linkid=2045807 for details.

test-condvarany.cpp
Microsoft (R) Incremental Linker Version 14.41.33901.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:test-condvarany.exe
test-condvarany.obj

D:\test>.\test-condvarany.exe
elapsed: 249758us
timeout: 250ms

(You might need to execute the program several times to see the output.)

Expected behavior

.\test-condvarany.exe should consistently produce no output

STL version

v17.11.0-pre.1.0
frederick-vs-ja commented 2 weeks ago

~The round-up in _Clamped_rel_time_ms_count (introduced in #4457) looks like the reason. 249758μs would be turned into 250ms here.~

https://github.com/microsoft/STL/blob/e36ee6c2b9bc6f5b1f70776c18cf5d3a93a69798/stl/inc/thread#L187-L198

CC @AlexGuteniev.

AlexGuteniev commented 2 weeks ago

The round-up in _Clamped_rel_time_ms_count (introduced in #4457) looks like the reason

Why? It rounds the parameter passed to wait, so would err on the other side.

frederick-vs-ja commented 2 weeks ago

The round-up in _Clamped_rel_time_ms_count (introduced in #4457) looks like the reason

Why? It rounds the parameter passed to wait, so would err on the other side.

Oh, I see. Probably I was wrong then.

Can we reproduce the error with SleepConditionVariableSRW? If so, we may need to work around it.

StephanTLavavej commented 2 weeks ago

This looks like an STL bug that wasn't fixed by #4457 (despite how I Changelogged it, oopsy).

According to WG21-N4981 [thread.condvarany.wait]/11:

template<class Lock, class Rep, class Period>
  cv_status wait_for(Lock& lock, const chrono::duration<Rep, Period>& rel_time);

Effects: Equivalent to:

return wait_until(lock, chrono::steady_clock::now() + rel_time);

condition_variable_any::wait_for should use the steady_clock. But we don't do that:

https://github.com/microsoft/STL/blob/e36ee6c2b9bc6f5b1f70776c18cf5d3a93a69798/stl/inc/condition_variable#L117-L126 https://github.com/microsoft/STL/blob/e36ee6c2b9bc6f5b1f70776c18cf5d3a93a69798/stl/inc/condition_variable#L227-L233 https://github.com/microsoft/STL/blob/e36ee6c2b9bc6f5b1f70776c18cf5d3a93a69798/stl/src/sharedmutex.cpp#L44-L54

We're calling GetTickCount64() which is the system clock.

We need to audit this area again (both condition_variable and condition_variable_any) and make sure we Do What The Standard SaysTM.

Arup-Chauhan commented 2 weeks ago

@AlexGuteniev @frederick-vs-ja @StephanTLavavej @cpplearner I would like to collaborate on this, requested in Discord chat space too :)

AlexGuteniev commented 2 weeks ago

We're calling GetTickCount64() which is the system clock.

Actually it is not a system clock, it is a steady clock, but an imprecise steady clock not matching std::steady_clock.