rust-windowing / winit

Window handling library in pure Rust
https://docs.rs/winit/
Apache License 2.0
4.81k stars 902 forks source link

State of the Windows platform ? #1429

Closed filnet closed 4 years ago

filnet commented 4 years ago

Hi,

As a foreword, I am new to Rust and winit. Also, English is not my native language, so please take what follows with a grain of salt.

My understanding about winit on Windows is that:

I've been looking at the code to address some issues [1][2][3] and it looks like the Windows code has inherited some complexity from the time it was multi threaded (?).

Thing's I have noticed:

I am open to be given directions or to discuss ways to improve the situation if it needs to.

[1] https://github.com/rust-windowing/winit/issues/1391 [2] https://github.com/rust-windowing/winit/issues/1427 [3] https://github.com/rust-windowing/winit/issues/1428

PS: Looks like some projects are rolling back their winit 0.20.0 update because of issue [1]. See https://github.com/aclysma/skulpin/issues/39

PPS : I have two PRs open :

filnet commented 4 years ago

One concrete example:

pub(crate) struct SubclassInput<T: 'static> {
    pub window_state: Arc<Mutex<WindowState>>,
    pub event_loop_runner: EventLoopRunnerShared<T>,
    pub file_drop_handler: FileDropHandler,
}

Is the Mutex protecting WindowState still needed ?

filnet commented 4 years ago

@Osspial, I believe this recent change has broken the Windows platform in a few ways : https://github.com/rust-windowing/winit/commit/6a330a2894873d29fbbfdeebfc1a215577213996#diff-d4d8c8dcf07874a35d08c6a70e0a6daf

It most probably introduced this regression: https://github.com/rust-windowing/winit/issues/1427 The change introduced a borrow() that can fail and fails. Similar borrows done right after it use try_borrow() and handle the case when it fails.

I am also pretty sure it introduced https://github.com/rust-windowing/winit/issues/1391.

And I found a suspect code change:

               if 0 == winuser::PeekMessageW(&mut msg, ptr::null_mut(), 0, 0, 0) {
                    if msg.message != 0 && msg.message != winuser::WM_PAINT {
                        queue_call_again();
                        return 0;

Here the test on msg after the PeekMessageW() call is pointless as we are in the case where there is no message.

I will try to revert the fix and reproduce the missing RedrawRequested issue. Were there any specifics needed to reproduce the issue?

filnet commented 4 years ago

PS: there is an open PR for issue https://github.com/rust-windowing/winit/issues/1391 : https://github.com/rust-windowing/winit/pull/1422

But I am not too happy about it...

filnet commented 4 years ago

I reverted https://github.com/rust-windowing/winit/commit/6a330a2894873d29fbbfdeebfc1a215577213996 and was able to reproduce an issue that looks similar to the one addressed by that fix.

If the user calls window.request_redraw() while handling the MainEventsCleared event, then the RedrawRequested event is sent on the next loop iteration and not the current iteration. Calling window.request_redraw() while handling the other events works fine.

I am now trying to fix the issue in a different way...

filnet commented 4 years ago

https://github.com/rust-windowing/winit/issues/1427 was closed as a duplicate of https://github.com/rust-windowing/winit/issues/1400

filnet commented 4 years ago

For reference:

    /// Emits a `WindowEvent::RedrawRequested` event in the associated event loop after all OS
    /// events have been processed by the event loop.
    ///
    /// This is the **strongly encouraged** method of redrawing windows, as it can integrate with
    /// OS-requested redraws (e.g. when a window gets resized).
    ///
    /// This function can cause `RedrawRequested` events to be emitted after `Event::MainEventsCleared`
    /// but before `Event::NewEvents` if called in the following circumstances:
    /// * While processing `MainEventsCleared`.
    /// * While processing a `RedrawRequested` event that was sent during `MainEventsCleared` or any
    ///   directly subsequent `RedrawRequested` event.
    ///
    /// ## Platform-specific
    ///
    /// - **iOS:** Can only be called on the main thread.
    #[inline]
    pub fn request_redraw(&self) {
        self.window.request_redraw()
    }

My understanding is that calling request_redraw should cause an RedrawRequested event to be sent during the current loop iteration not matter when this is called (when handling a MainEventsCleared event or even a RedrawRequested (possible infinite loop here?)

filnet commented 4 years ago

request_redraw in 0.21.0

    pub fn request_redraw(&self) {
        unsafe {
            winuser::RedrawWindow(
                self.window.0,
                ptr::null(),
                ptr::null_mut(),
                winuser::RDW_INTERNALPAINT,
            );
        }
    }

request_redraw in 0.20.0-alpha1

   #[inline]
    pub fn request_redraw(&self) {
        unsafe {
            if self.thread_executor.trigger_newevents_on_redraw() {
                winuser::RedrawWindow(
                    self.window.0,
                    ptr::null(),
                    ptr::null_mut(),
                    winuser::RDW_INTERNALPAINT,
                );
            } else {
                winuser::PostMessageW(self.window.0, *REQUEST_REDRAW_NO_NEWEVENTS_MSG_ID, 0, 0);
            }
        }
    }
filnet commented 4 years ago

What I am seeing after reverting the commit is confirmed by the associated PR description (which I overlooked) : https://github.com/rust-windowing/winit/pull/1366

Continuing my effort to understand and fix request_redraw on Windows, I did a bit of forensic and looked at how other platforms handle it.

On Windows, the current approach is to post a WM_PAINT message. Older versions (0.20.0-alpha) also posted a REQUEST_REDRAW_NO_NEWEVENTS_MSG_ID message to distinguish request_redraw called after or before events were cleared. Not sure why. These approaches are tricky because you need to dispatch these messages after clearing the main events.

Other platforms use a different approach: they raise a flag or buffer the redraw internally when request_redraw is called.

X11:

    pub fn request_redraw(&self) {
        self.pending_redraws
            .lock()
            .unwrap()
            .insert(WindowId(self.xwindow));
    }

Wayland:

    pub fn request_redraw(&self) {
        *self.need_refresh.lock().unwrap() = true;
    }

These approaches seem much easier to handle. Problem is that on Windows, the window struct does not have access to the event_loop or runner.

@Osspial any thoughts on that ?

Osspial commented 4 years ago

Hey, thanks for opening an issue to keep track of all this. I'll write up a more thorough response within the next couple days when I've got the time to look into this more. Sorry I didn't get back on this sooner - I was fairly stressed over the past month and wasn't keeping tabs on Winit to help manage that, but I'm feeling quite a bit better now and should be able to respond more promptly for the foreseeable future.

These approaches seem much easier to handle. Problem is that on Windows, the window struct does not have access to the event_loop or runner.

@Osspial any thoughts on that ?

The rationale behind using WM_PAINT is that that's the best way to get the OS to merge user-requested and OS-requested redraw events. We could theoretically keep track of that information ourselves and emit RedrawRequested events ourselves when the event loop has been drained, but Windows strongly prefers that you execute your redrawing code in the WM_PAINT handler rather than outside of it. That creates a tricky balance for Winit, since it's difficult to both satisfy its external API and winapi's preferences at the same time. #1366 was my initial attempt to address that, which... clearly didn't work out all that great. I've got ideas on how to handle that better, but it's been a bit since I've looked at this code closely so they aren't coming to mind at this particular moment.

filnet commented 4 years ago

Hi @Osspial,

Thanks for getting back!

I have looked at the request_redraw vs WM_PAINT issue quite a bit these last couple of weeks. I looked at older 0.20.0 alpha versions and at your recent attempt at fixing related issues.

I also looked at how other platforms handle it and came up with this approach: https://github.com/rust-windowing/winit/pull/1461. Please take a look.

We can discuss it further here or in the PR.

filnet commented 4 years ago

If redrawing must occur only when receiving a WM_PAINT message then my PR does not comply.

I am now looking into another approach using UpdateWindow instead. It looks promising. Calling UpdateWindow from request_redraw will cause the WM_PAINT message to be sent immediately. This removes the need to Peek/Get/Dispatch messages after calling main_events_cleared() in the event the user calls request_redraw when handling the MainEventsCleared event (which is the root of the problems).

Looks like RedrawWindow can work in that mode too if the RDW_UPDATENOW flag is set.

filnet commented 4 years ago

The UpdateWindow approach was a dead end.

So I went full circle and simply fixed the various panics using the current approach of calling RedrawWindow. I believe that all panics are now fixed. The code is also simpler now.

Use cases that have been tested:

A few corner cases have not been tested:

filnet commented 4 years ago

The fullscreen and window_debug examples still panic when entering borderless or exclusive full screen mode. See https://github.com/rust-windowing/winit/issues/1469.

The min_max_size example also panics (but not in master).

Looking into it now.

filnet commented 4 years ago

The remaining panics are all caused by calls to UpdateWindow. UpdateWindow will directly send a WM_PAINT message which might, in some cases, cause RedrawRequested events to be buffered. The runner is currently not able to correctly handle buffered RedrawRequested events and state transition will fail to be done orderly. Additionally we also want to handle RedrawRequested events during the processing of WM_PAINT messages (to make Windows happy ?). Buffering prevents this.

For now, all calls to UpdateWindow were changed to calls to InvalidateRgn. And buffering a RedrawRequested event will now cause a panic.

filnet commented 4 years ago

And back to square one :( : https://github.com/rust-windowing/winit/pull/1461#issuecomment-591719500

Seems like PeekMessage can, under some conditions, dispatch messages [1][2]. The non modal event loop looks like this (pseudo code):

looop {
    runner.new_events();
    while PeekMessage(msg) != WM_PAINT {
        DispatchMessage(msg);
    }
    runner.main_events_cleared();
    // Drain eventual WM_PAINT messages sent if user called request_redraw() during handling of MainEventsCleared.
    while PeekMessage(msg) == WM_PAINT {
        DispatchMessage(msg);
    }
    runner.redraw_events_cleared();
}

Trouble is that the loop to drain WM_PAINT messages can end up dispatching messages other than WM_PAINT. These are then sent to the runner but the runner is in MainEventsCleared state and cannot handler them. I have seen that issue before but cannot reproduce it systematically.

This is also interesting https://devblogs.microsoft.com/oldnewthing/?p=36493

[1] https://www.gamedev.net/forums/topic/685863-win32-message-handling-questions/ [2] https://www.gamedev.net/forums/topic/670857-peekmessage-changed-in-windows-10/

filnet commented 4 years ago

Here is a backtrace that shows that PeekMessage sends messages. This is mentioned in doc. This happens while draining WM_PAINT messages after clearing the main events.

MainEventsCleared
thread 'main' panicked at 'non-redraw event in redraw phase: Some(WindowEvent { window_id: WindowId(WindowId(0x310646)), event: KeyboardInput { device_id: DeviceId(DeviceId(0)), input: KeyboardInput { scancode: 0, state: Released, virtual_keycode: None, modifiers: (empty) }, is_synthetic: true } })', src\platform_impl\windows\event_loop\runner.rs:375:17
stack backtrace:
   0:   0xe7b009 - core::fmt::write
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libcore\fmt\mod.rs:1028
   1:   0xe65691 - std::io::Write::write_fmt<std::sys::windows::stdio::Stderr>
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libstd\io\mod.rs:1412
   2:   0xe6a17a - std::panicking::default_hook::{{closure}}
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libstd\panicking.rs:188
   3:   0xe69da5 - std::panicking::default_hook
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libstd\panicking.rs:205
   4:   0xe6a818 - std::panicking::rust_panic_with_hook
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libstd\panicking.rs:464
   5:   0xe6a419 - std::panicking::continue_panic_fmt
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libstd\panicking.rs:373
   6:   0xe6a343 - std::panicking::begin_panic_fmt
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libstd\panicking.rs:328
   7:   0xe061f6 - winit::platform_impl::platform::event_loop::runner::EventLoopRunner<()>::process_event<()>
                       at C:\msys64\home\philippe.renon\winit\src\platform_impl\windows\event_loop\runner.rs:375
   8:   0xe04d88 - winit::platform_impl::platform::event_loop::runner::ELRShared<()>::send_event_unbuffered<()>
                       at C:\msys64\home\philippe.renon\winit\src\platform_impl\windows\event_loop\runner.rs:121
   9:   0xe049bf - winit::platform_impl::platform::event_loop::runner::ELRShared<()>::send_event<()>
                       at C:\msys64\home\philippe.renon\winit\src\platform_impl\windows\event_loop\runner.rs:106
  10:   0xd98317 - winit::platform_impl::platform::event_loop::SubclassInput<()>::send_event<()>
                       at C:\msys64\home\philippe.renon\winit\src\platform_impl\windows\event_loop.rs:104
  11:   0xd9c0b9 - winit::platform_impl::platform::event_loop::public_window_callback<()>
                       at C:\msys64\home\philippe.renon\winit\src\platform_impl\windows\event_loop.rs:1320
  12: 0x6548a9c0 - DefSubclassProc
  13: 0x6548a8b4 - DPA_Clone
  14: 0x7642630a - gapfnScSendMessage
  15: 0x76426d4a - GetThreadDesktop
  16: 0x76426df8 - GetThreadDesktop
  17: 0x76426e54 - GetThreadDesktop
  18: 0x7763010a - KiUserCallbackDispatcher
  19: 0x76426a9f - gapfnScSendMessage
  20: 0x76426afc - gapfnScSendMessage
  21: 0x7642630a - gapfnScSendMessage
  22: 0x76426d4a - GetThreadDesktop
  23: 0x76430d57 - GetClientRect
  24: 0x76430d7d - CallWindowProcW
  25: 0x6548a7b3 - DPA_Clone
  26: 0x6548a9c0 - DefSubclassProc
  27: 0x6548a975 - DefSubclassProc
  28:   0xd9d90f - winit::platform_impl::platform::event_loop::public_window_callback<()>
                       at C:\msys64\home\philippe.renon\winit\src\platform_impl\windows\event_loop.rs:1672
  29: 0x6548a9c0 - DefSubclassProc
  30: 0x6548a8b4 - DPA_Clone
  31: 0x7642630a - gapfnScSendMessage
  32: 0x76426d4a - GetThreadDesktop
  33: 0x76426df8 - GetThreadDesktop
  34: 0x76426e54 - GetThreadDesktop
  35: 0x7763010a - KiUserCallbackDispatcher
  36: 0x76430781 - PeekMessageW
  37:   0xd97dd3 - winit::platform_impl::platform::event_loop::EventLoop<()>::run_return<(),closure-0>
                       at C:\msys64\home\philippe.renon\winit\src\platform_impl\windows\event_loop.rs:250
  38:   0xd981f8 - winit::platform_impl::platform::event_loop::EventLoop<()>::run<(),closure-0>
                       at C:\msys64\home\philippe.renon\winit\src\platform_impl\windows\event_loop.rs:192
  39:   0xda31a7 - winit::event_loop::EventLoop<()>::run<(),closure-0>
                       at C:\msys64\home\philippe.renon\winit\src\event_loop.rs:148
  40: 0x7763010a - KiUserCallbackDispatcher
  41:   0xd956e1 - std::rt::lang_start::{{closure}}<()>
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\src\libstd\rt.rs:61
  42:   0xe687a4 - std::sys_common::backtrace::__rust_begin_short_backtrace<closure-0,i32>
  43:   0xe6a2b1 - std::panicking::try::do_call<closure-0,i32>
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libstd\panicking.rs:287
  44:   0xe6cf83 - panic_unwind::__rust_maybe_catch_panic
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libpanic_unwind\lib.rs:78
  45:   0xe6ab2f - std::rt::lang_start_internal
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\/src\libstd\rt.rs:47
  46:   0xd956bd - std::rt::lang_start<()>
                       at /rustc/73528e339aae0f17a15ffa49a8ac608f50c6cf14\src\libstd\rt.rs:61
  47:   0xe7ed8c - __scrt_common_main_seh
                       at d:\agent\_work\6\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288

In this case it is a WM_SETFOCUS that gets sent. When handling this message, winit will first emit a WindowEvent::KeyboardInput event (no idea why...) and then a WindowEvent::Focused event. But the WindowEvent::KeyboardInput causes the panic because the runner has cleared the main events and is in the HandlingRedraw state.

A fix that seems to work is to pass the PM_QS_PAINT flag when peeking. This seems to prevent unwanted messages from being sent.

Edited to add this link : Non-Queued messages

Edited: it is PM_QS_PAINT not QS_PAINT that needs to be passed to the peek function.

filnet commented 4 years ago

We can close this issue now.