boostorg / thread

Boost.org thread module
http://boost.org/libs/thread
203 stars 162 forks source link

Updates to fix interruption-point inconsistencies. #151

Closed austin-beer closed 7 years ago

austin-beer commented 7 years ago

NOTE: These changes also fix the two test failures with sleep_for/sleep_until when BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC is disabled.

viboes commented 7 years ago

Thanks for working on this. Please, could you share the results fro the 3 variants linu-mono linux-no-mono and windows?

austin-beer commented 7 years ago

Here are the test results from this PR.

linux_results_austin_no_monotonic.txt linux_results_austin_yes_monotonic.txt windows_results_austin_no_monotonic.txt

The following two failures in the no-monotonic test are unavoidable. This is because hidden::sleep_for() has to use a condition variable (so that it has an interruption point), but it doesn't have access to a monotonic clock to see how much time has passed (unlike the sleep_for() that takes a Chrono time).

boost::this_thread::sleep(), posix duration:
    While system clock remains stable:        Expected: 580 ms, Actual: 580 ms, Returned: timeout, Passed
    While system clock jumps back:            Expected: 580 ms, Actual: 810 ms, Returned: timeout, FAILED: TOO LONG
    While system clock jumps forward (short): Expected: 580 ms, Actual: 350 ms, Returned: timeout, FAILED: TOO SHORT
    While system clock jumps forward (long):  Expected: 580 ms, Actual: 110 ms, Returned: timeout, FAILED: TOO SHORT

 boost::this_thread::sleep(), no thread, posix duration:
    While system clock remains stable:        Expected: 580 ms, Actual: 580 ms, Returned: timeout, Passed
    While system clock jumps back:            Expected: 580 ms, Actual: 810 ms, Returned: timeout, FAILED: TOO LONG
    While system clock jumps forward (short): Expected: 580 ms, Actual: 350 ms, Returned: timeout, FAILED: TOO SHORT
    While system clock jumps forward (long):  Expected: 580 ms, Actual: 110 ms, Returned: timeout, FAILED: TOO SHORT
viboes commented 7 years ago

Thanks.

Shouldn't boost::this_thread::sleep(), posix duration: waits for 100ms? this will reduce the gap, isn't it?

austin-beer commented 7 years ago

I'm afraid I don't quite understand your question. Could you rephrase the question?

viboes commented 7 years ago

The current definition is

        template<typename TimeDuration>
        inline BOOST_SYMBOL_VISIBLE void sleep(TimeDuration const& rel_time)
        {
          boost::this_thread::hidden::sleep_for(detail::timespec_duration(rel_time));
        }

Instead of using boost::this_thread::hidden::sleep_for, why not pool on a condition as we do in https://github.com/boostorg/thread/pull/151/commits/f1dec7935fc2198702a2a5acf5512838628b384f#diff-52550ac33fe611a20ce1f27ce707309cR141

viboes commented 7 years ago

I'll take this PR already, as aside the posix-time all in linux is ok.

BTW,

I've see a regression on windows.

Before

boost::this_thread::sleep_until(), system time:
    While system clock remains stable:        Expected: 580 ms, Actual: 592 ms, Returned: timeout, Passed
    While system clock jumps back:            Expected: 810 ms, Actual: 826 ms, Returned: timeout, Passed
    While system clock jumps forward (short): Expected: 350 ms, Actual: 373 ms, Returned: timeout, Passed
    While system clock jumps forward (long):  Expected: 110 ms, Actual: 217 ms, Returned: timeout, Passed
boost::this_thread::sleep_until(), custom time:
    While system clock remains stable:        Expected: 580 ms, Actual: 592 ms, Returned: timeout, Passed
    While system clock jumps back:            Expected: 810 ms, Actual: 841 ms, Returned: timeout, Passed
    While system clock jumps forward (short): Expected: 350 ms, Actual: 357 ms, Returned: timeout, Passed
    While system clock jumps forward (long):  Expected: 110 ms, Actual: 233 ms, Returned: timeout, FAILED: TOO LONG

After

boost::this_thread::sleep_until(), system time:
    While system clock remains stable:        Expected: 580 ms, Actual: 607 ms, Returned: timeout, Passed
    While system clock jumps back:            Expected: 810 ms, Actual: 841 ms, Returned: timeout, Passed
    While system clock jumps forward (short): Expected: 350 ms, Actual: 358 ms, Returned: timeout, Passed
    While system clock jumps forward (long):  Expected: 110 ms, Actual: 248 ms, Returned: timeout, FAILED: TOO LONG
boost::this_thread::sleep_until(), custom time:
    While system clock remains stable:        Expected: 580 ms, Actual: 592 ms, Returned: timeout, Passed
    While system clock jumps back:            Expected: 810 ms, Actual: 810 ms, Returned: timeout, Passed
    While system clock jumps forward (short): Expected: 350 ms, Actual: 358 ms, Returned: timeout, Passed
    While system clock jumps forward (long):  Expected: 110 ms, Actual: 124 ms, Returned: timeout, Passed
austin-beer commented 7 years ago

I'm not sure if that is a true regression on Windows or not. I'm running all of my windows tests in a VM, and so I've seen up to 50 milliseconds of variability in the test results, which sometimes causes false results. I do have a dedicated Windows machine that I can use, but I don't have easy access to it at the moment. Once we start focusing on the Windows side of things, I plan to move to running my tests on that system.

austin-beer commented 7 years ago

The current definition is

    template<typename TimeDuration>
    inline BOOST_SYMBOL_VISIBLE void sleep(TimeDuration const& rel_time)
    {
      boost::this_thread::hidden::sleep_for(detail::timespec_duration(rel_time));
    }

Instead of using boost::this_thread::hidden::sleep_for, why not pool on a condition as we do in f1dec79#diff-52550ac33fe611a20ce1f27ce707309cR141

We're able to poll every 100 milliseconds in other places in the code because we have a steady clock to compare against. In this function, all we have is the internal clock. When the internal clock is the system clock, there's no way to know when the system clock jumps, and so there's no way to wait the correct amount of time.

viboes commented 7 years ago

Right, we are able to implement sleep_for(chrono::duration) because we have a steady::clock. posix-duration is another kind of duration, but we don't have a Date-Time steady clock.
If we have BOOST_CHRONO we should be able to implement convert to chrono::duration, isn't it? It will left the case when the user uses Boost.DateTime and not use Boost.Chrono. I believe this is a really minor set of users. What do you think?

viboes commented 7 years ago

Don't worry for the Windows possible regression. You are surely right. I've observed much more differences in the past on Windows.

austin-beer commented 7 years ago

Yes, if we have BOOST_THREAD_USES_CHRONO we could convert to chrono::duration and that would fix the issue. And if BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC is defined that already fixes the issue. I'm not sure if it would be good to start mixing Chrono functions with DateTime functions. I like the fact that, currently, the two are completely separate from each other.

But most users do have BOOST_THREAD_USES_CHRONO and don't have BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC, so in that situation we should probably make sleep() work correctly by using chrono::duration under the covers.

viboes commented 7 years ago

I agree that it would be desirable that Chrono and DateTime code stay independent, but from a pragmatic point of view, it would be a pity to don't provide the correct functionality when we have Boost.Chrono.