Timer precision: Timers are off on Windows by 15ms, on Linux by 1.13ms

Matthias247 commented 1 year ago

Version v1.21.1 (master)

Platform Windows 11

Description When setting up a tokio timer with a timeout of 1ms on Windows, the timer is late by an average of 15ms. Interestingly, the current-thread runtime on Windows seems to perform even worse: every timer is late by about 14ms (min and max lateness are all in the 14-16ms range). In the multi-threaded runtime some timers expired after 7ms.

Linux precision is much better. The timer is late on average by 1.13 ms. However, I would still expect it to be in the 500us range if the tokio timer wheel has a granularity of 1ms and some timers get rounded up while others get rounded down.
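To make the 500us expectation concrete, here is a minimal sketch of a simplified model: every deadline is rounded up to the next tick of a 1ms grid, and start times are spread evenly across one tick. The function names `round_up_to_tick` and `average_lateness` are made up for this illustration, and the model is an assumption, not a description of how the tokio timer wheel actually rounds.

```rust
use std::time::Duration;

/// Hypothetical model: a timer fires at the first tick boundary
/// at or after its deadline. `tick` must be non-zero.
fn round_up_to_tick(deadline: Duration, tick: Duration) -> Duration {
    let tick_ns = tick.as_nanos();
    let ticks = (deadline.as_nanos() + tick_ns - 1) / tick_ns;
    Duration::from_nanos((ticks * tick_ns) as u64)
}

/// Average lateness of a 1ms timeout when the start time is spread
/// evenly across one tick. `samples` must be non-zero.
fn average_lateness(tick: Duration, samples: u32) -> Duration {
    let mut total = Duration::ZERO;
    for i in 0..samples {
        // Offset of the start time within the current tick.
        let offset = tick * i / samples;
        let deadline = offset + Duration::from_millis(1);
        total += round_up_to_tick(deadline, tick) - deadline;
    }
    total / samples
}
```

Calling `average_lateness(Duration::from_millis(1), 1000)` under this model gives a value close to half a tick, which is where the 500us figure comes from.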

Repro (placed into time_rt.rs integration test):

#[test]
fn timer_precision() {
    use std::time::Duration;
    use tokio::runtime::Builder;
    use tokio::time::{sleep_until, Instant};

    let rt = Builder::new_current_thread().enable_all().build().unwrap();
    const ITERATIONS: u32 = 1000;

    rt.block_on(async move {
        let mut total = Duration::ZERO;
        let mut min = Duration::MAX;
        let mut max = Duration::ZERO;

        for _ in 0 .. ITERATIONS {
            let now = Instant::now();
            let when = now + Duration::from_millis(1);
            sleep_until(when).await;
            let delta = when.elapsed();
            total += delta;
            min = min.min(delta);
            max = max.max(delta);
        }

        let avg = total / ITERATIONS;
        assert!(
            avg <= Duration::from_millis(1),
            "Expected the timer not to be off more than one 1ms, but is was off by {:?} [Min: {:?}, Max: {:?}]",
            avg, min, max);
    });
}

Windows results

Multi-threaded runtime

---- timer_precision stdout ---- thread 'timer_precision' panicked at 'Expected the timer not to be off more than one 1ms, but is was off by 14.588249ms [Min: 7.4141ms, Max: 15.409ms]', tokio\tests\time_rt.rs:70:9

Current-thread runtime

thread 'timer_precision' panicked at 'Expected the timer not to be off more than one 1ms, but is was off by 14.607315ms [Min: 13.5838ms, Max: 15.3081ms]', tokio\tests\time_rt.rs:70:9

Linux results

Current-thread runtime

thread 'timer_precision' panicked at 'Expected the timer not to be off more than one 1ms, but is was off by 1.135601ms [Min: 95.816µs, Max: 1.480869ms]', tokio/tests/time_rt.rs:70:9

Multi-threaded runtime

thread 'timer_precision' panicked at 'Expected the timer not to be off more than one 1ms, but is was off by 1.133061ms [Min: 143.385µs, Max: 1.52983ms]', tokio/tests/time_rt.rs:70:9

Matthias247 commented 1 year ago

I checked what happens with std timers by modifying the test-case above to just call std::thread::sleep in a blocking context instead of using an async sleep. Seems they are off by 15ms too.
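A minimal sketch of what such a std-only measurement could look like is below. It is not the exact modification used for the numbers above; the helper name `measure_sleep_lateness` is made up for this example.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Sleeps `iterations` times for `timeout` using the blocking std sleep
/// and returns (average, min, max) lateness relative to the deadline.
/// `iterations` must be non-zero.
fn measure_sleep_lateness(timeout: Duration, iterations: u32) -> (Duration, Duration, Duration) {
    let mut total = Duration::ZERO;
    let mut min = Duration::MAX;
    let mut max = Duration::ZERO;

    for _ in 0..iterations {
        let when = Instant::now() + timeout;
        thread::sleep(timeout);
        // How far past the intended deadline the thread woke up.
        let delta = when.elapsed();
        total += delta;
        min = min.min(delta);
        max = max.max(delta);
    }

    (total / iterations, min, max)
}
```

Running `measure_sleep_lateness(Duration::from_millis(1), 1000)` and printing the tuple with `{:?}` gives figures that can be compared directly with the async results.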

Improvements were discussed in https://github.com/rust-lang/rust/issues/43376, but nothing has been implemented so far.

ChrisDenton commented 1 year ago

Indeed, the 15ms is due to Windows timer resolution. Applications can use timeBeginPeriod if they require a higher resolution.

On modern Windows a higher resolution timer is also possible using CREATE_WAITABLE_TIMER_HIGH_RESOLUTION. Though I'll admit to being somewhat ambivalent about it being the default. In terms of battery life, it can be advantageous to not have the timers be too precise unless required.
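For applications that want to try the timeBeginPeriod route themselves, a rough sketch follows. It declares the two winmm functions by hand and wraps them in a guard so that every successful timeBeginPeriod call is paired with timeEndPeriod. The type `TimerResolutionGuard` is a hypothetical name for this example, not something tokio or std provides, and on non-Windows targets the guard does nothing.

```rust
// Hand-written declarations of the Windows multimedia timer functions.
// Both take the requested period in milliseconds and return 0
// (TIMERR_NOERROR) on success.
#[cfg(windows)]
#[link(name = "winmm")]
extern "system" {
    fn timeBeginPeriod(period: u32) -> u32;
    fn timeEndPeriod(period: u32) -> u32;
}

/// Hypothetical RAII guard: requests a timer resolution on creation
/// and releases the request again when dropped.
pub struct TimerResolutionGuard {
    period_ms: u32,
    active: bool,
}

impl TimerResolutionGuard {
    pub fn request(period_ms: u32) -> Self {
        #[cfg(windows)]
        let active = unsafe { timeBeginPeriod(period_ms) } == 0;
        // Assumption for this sketch: other platforms need no request.
        #[cfg(not(windows))]
        let active = false;
        Self { period_ms, active }
    }

    /// True if a higher resolution was actually requested from the OS.
    pub fn is_active(&self) -> bool {
        self.active
    }
}

impl Drop for TimerResolutionGuard {
    fn drop(&mut self) {
        #[cfg(windows)]
        if self.active {
            // Must be called with the same period as timeBeginPeriod.
            unsafe { timeEndPeriod(self.period_ms) };
        }
        let _ = self.period_ms;
    }
}
```

An application would hold `let _guard = TimerResolutionGuard::request(1);` for as long as it needs the finer resolution. Note that this changes state beyond a single timer, which is part of why making it a default is debatable.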

Noah-Kennedy commented 1 year ago

Yeah, Windows is kinda... a mess as far as timers are concerned.

Ralith commented 1 year ago

> Though I'll admit to being somewhat ambivalent about it being the default.

This is a major footgun for e.g. Quinn users, since we require higher resolution timers for pacing, otherwise performance suffers drastically. Silently changing global process (system?) state inside Quinn is also unappealing for obvious reasons.

I think providing consistent timer resolution would be a better default for tokio (particularly given the use cases tokio has traditionally targeted), and if battery-sensitive uses come up, configuration could be explored.

Matthias247 commented 1 year ago

Thanks for listing all the alternatives, @ChrisDenton!

Apparently the Windows developers even thought about the battery impact of timeBeginPeriod:

> Starting with Windows 11, if a window-owning process becomes fully occluded, minimized, or otherwise invisible or inaudible to the end user, Windows does not guarantee a higher resolution than the default system resolution. See SetProcessInformation for more information on this behavior.

That's neat, but also not useful for the networking use-case. One wouldn't want worse performance in a network stack (which might e.g. do audio streaming, or a download) just because the window is minimized.

The CREATE_WAITABLE_TIMER_HIGH_RESOLUTION route seems the most interesting one: it wouldn't make a process-wide change, and might not silently change precision. Apparently one would need to do runtime checks to see whether it's supported, and on Windows 8 one would only get 15ms precision. But that's probably acceptable and better than the status quo.

Should it be the default in tokio and the Rust std library? Good question! I'm leaning towards "if a developer started a timer of 1ms, they probably really care that it finishes in the 1-2ms time range. But if the timer is set for 60s they probably won't mind it being 100ms off". So opting in to higher precision based on the accuracy the timeout implies might work. But I assume tokio creates a timer just once for the runtime and then reuses it, so the decision would happen before the usage is known.
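The "decide by timeout length" idea could be sketched roughly as below. Everything here is a made-up illustration: the 10% tolerance, the 15ms figure used for the default OS resolution, and the function name `wants_high_resolution` are assumptions, not anything tokio or std implements.

```rust
use std::time::Duration;

/// Assumed default Windows timer resolution (illustrative value).
const DEFAULT_OS_RESOLUTION: Duration = Duration::from_millis(15);

/// Assumed tolerance: a timer may be late by 1/10 of its timeout.
const TOLERANCE_DIVISOR: u32 = 10;

/// Hypothetical heuristic: ask for a high-resolution timer only when
/// the lateness a caller would tolerate is smaller than what the
/// default OS timer can deliver.
fn wants_high_resolution(timeout: Duration) -> bool {
    let tolerated_lateness = timeout / TOLERANCE_DIVISOR;
    tolerated_lateness < DEFAULT_OS_RESOLUTION
}
```

With these numbers a 1ms timer opts in to high resolution and a 60s timer does not. As noted above, a runtime that creates its OS timer once up front could not apply this per timeout without further changes.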

coder137 commented 1 year ago

Is there any fix on the way for this issue? Just hit this problem (writeup is here)

Any alternatives or suggestions for high-resolution/accurate async timers would be appreciated.

Darksonn commented 1 year ago

There are two limiting factors to how accurate your timer is:

1. How accurate is the timer that the OS provides?
2. What is the resolution of the data structure that stores the timers?

On Windows, the bottleneck is the first factor. The standard OS timer only has a resolution of 15 ms or so.

On Linux, the bottleneck is the second factor. The data structure that Tokio uses to store timers only has a resolution of 1 ms.

If the Tokio timer does not meet your needs, then I would try the following crates and see if any of them work for you:

tokio-timerfd
async-io
async-timer (you probably want the 1.0.0 beta)

Ralith commented 1 year ago

> On windows, the bottleneck is the first factor. The standard OS timer only has a resolution of 15 ms or so.

Tokio could easily work around this, however. The current inconsistent behavior across platforms is surprising.

Darksonn commented 1 year ago

I don't really know anything about the situation on Windows. How do you configure it?

Ralith commented 1 year ago

There were detailed discussions of a solution just above in this thread: https://github.com/tokio-rs/tokio/issues/5021#issuecomment-1249793113, https://github.com/tokio-rs/tokio/issues/5021#issuecomment-1249994013
