rust-windowing / winit

Window handling library in pure Rust
https://docs.rs/winit/
Apache License 2.0

Pointer Event Overhaul #3833

Closed daxpedda closed 1 month ago

daxpedda commented 4 months ago

EDIT: Implemented in https://github.com/rust-windowing/winit/pull/3876.

Goals

Unified Pointer Handling

We currently handle Touch completely separately from mouse events (WindowEvent::Cursor*). The initial idea, I assume, was that the mouse event types don't really fit all the information that touch wants to express. That makes sense, but it comes with a big downside: if users don't explicitly handle the Touch event, their application simply won't work on touch devices.

This issue will only grow if we want to introduce pen/stylus support (#99). To address this, we take inspiration from the Web Pointer Events Specification, which was introduced mainly to solve this exact problem. The idea is to have a single event for all pointer input, which contains details about every possible input, be it touch, pen/stylus or anything else. That way users don't have to explicitly handle specific input types, but can still choose to do so, without accidentally failing to support one.

Addresses #1555 (doesn't fix it, because there is a performance component to this on Wayland that would not be addressed by this proposal). Fixes #336.

Pen/Stylus Support

To support pen/stylus we run into the same problem as above: we can't introduce a separate event, otherwise applications that don't explicitly handle it will suddenly stop working when a pen is used.

Additionally, this will require adding much more data to our current WindowEvent::Cursor* events. So while we are at it, we should figure out an API that lets us add support for more pointer types in the future.

Addresses #99 (implementation is being done in #3810).

Position Data in Pointer Events

We are currently missing position data in all pointer events except WindowEvent::CursorMoved. This isn't just annoying for users, who have to handle two events to get the information they want; it also requires implementations to account for it and send "fake" WindowEvent::CursorMoved events to report the position.

The solution should be quite simple: add position: PhysicalPosition<f64> to all pointer events.

Fixes #883. Fixes #1909.

Implementation

Proposed API:

```rust
pub enum DeviceEvent {
    // renamed from `MouseMotion`
    PointerMotion { ... },
    ...
}

pub enum WindowEvent {
    // renamed from `CursorMoved`
    PointerMoved {
        device_id: DeviceId,
        position: PhysicalPosition<f64>,
        // new field
        source: PointerSource,
    },
    // renamed from `CursorEntered`
    PointerEntered {
        device_id: DeviceId,
        // new field
        position: PhysicalPosition<f64>,
        // new field
        r#type: PointerType,
    },
    // renamed from `CursorLeft`
    PointerLeft {
        device_id: DeviceId,
        // new field
        position: PhysicalPosition<f64>,
        // new field
        r#type: PointerType,
    },
    // renamed from `MouseInput`
    PointerButton {
        device_id: DeviceId,
        state: ElementState,
        // new field
        position: PhysicalPosition<f64>,
        // changed type from `MouseButton`
        button: ButtonSource,
    },
    // removed, folded into `Pointer*` events
    // Touch(Touch),
    ...
}

// removed, folded into `PointerType/PointerSource::Touch`
// pub struct Touch { ... }

// new types from here on out
pub enum PointerType {
    Mouse,
    Touch,
    Unknown,
}

pub enum PointerSource {
    Mouse,
    Touch { finger_id: FingerId, force: Force },
    Unknown,
}

impl From<PointerSource> for PointerType { ... }

pub enum ButtonSource {
    Mouse(MouseButton),
    Touch { finger_id: FingerId, force: Force },
    Unknown(u16),
}

impl ButtonSource {
    fn mouse_button(self) -> MouseButton;
}
```
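
To illustrate the "unified handling" goal, here is a minimal sketch of application-side usage under the proposed API above; `App` and its methods are hypothetical stand-ins for application logic:

```rust
fn on_window_event(app: &mut App, event: WindowEvent) {
    match event {
        WindowEvent::PointerButton {
            state: ElementState::Pressed,
            position,
            button,
            ..
        } => {
            // Generic path: fires for mouse, touch, pen, ... alike, so the
            // application works on all input devices without extra code.
            app.handle_click(position);

            // Optional specialization for users who do care about the source.
            if let ButtonSource::Touch { force, .. } = button {
                app.handle_pressure(force);
            }
        }
        _ => (),
    }
}
```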

Alternative Changes

Keep WindowEvent::Touch

We could additionally emit the Touch event, and introduce a new event for each device type, for user convenience; having a dedicated place to handle a specific device type is certainly a nice touch.

It would lead to a duplication of events though, which I'm personally not a big fan of.

Remove position from WindowEvent::PointerLeft

While this is certainly useful, I don't know if it makes sense on all backends; it certainly works on Web. We could either wrap it in an Option, if this information is not available on all backends, or remove it entirely.
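
The Option variant of this alternative would look roughly like this, reusing the fields from the proposal above:

```rust
// renamed from `CursorLeft`
PointerLeft {
    device_id: DeviceId,
    // `None` on backends that can't report a position on leave
    position: Option<PhysicalPosition<f64>>,
    r#type: PointerType,
},
```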

Add WindowEvent::PointerChange event

Pointer devices like touch or pen/stylus can update their details without actually moving. In the current proposal this would use PointerMoved, but we could introduce a new PointerChange event to address this.

However, like other pointer events, we would probably still want to include a position field, so the data would look exactly like PointerMoved.

Add KeyType

Maybe "Key" is not the best term here, but we might want to make a similar device type distinction for keyboard input as well. This would probably include renaming the KeyboardInput event to KeyInput. The question is really which device types would show up here.

My understanding is that devices like joysticks, steering wheels and the like would all fall under gamepads, which we already decided can be fully implemented by external libraries without any help from Winit.

The reason this has come up here is Wayland's tablet protocol, which exposes a strangely large amount of graphics-tablet-specific input events. However, AFAIK this is very Wayland-specific and no other backend exposes this amount of specific information. No matter how we address it though, it should at least be routed to KeyboardInput, for the same reason we want to route all pointer events to Pointer* events.

Gestures

The idea would be to implement a new Gesture event with an enum containing all possible gestures. This is only related to this proposal because we might want to add PointerType/PointerSource to it (or a variation of it).

We currently have the following gestures:

And more are coming, like #3413. Additionally, I would argue that we could fit MouseWheel and pen/stylus "actions" into this as well (#3768).

I think this would make it much easier to add new gestures and centralize a place to handle them. Adding PointerType/PointerSource to it would also allow users to identify what kind of device this is coming from, which brings me to the next topic.
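
A rough sketch of the shape this could take; the gesture variants below are purely illustrative, not the actual list:

```rust
pub enum Gesture {
    // illustrative variants only
    Pinch { delta: f64 },
    Rotation { delta: f64 },
    // MouseWheel and pen/stylus "actions" (#3768) could fold in here too
}

pub enum WindowEvent {
    // ...
    Gesture {
        device_id: DeviceId,
        // lets users identify what kind of device the gesture came from
        source: PointerSource,
        gesture: Gesture,
    },
}
```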

Touchpad

This is related to gestures and #2379. The idea is that we need more information on touch events when they come from a touchpad rather than a touchscreen. This could easily be addressed by the current proposal by adding a Touchpad variant to PointerType/PointerSource (where we could also expose the touchpad coordinates).
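
Hypothetically, extending the proposed enum:

```rust
pub enum PointerSource {
    Mouse,
    Touch { finger_id: FingerId, force: Force },
    // hypothetical new variant, exposing touchpad-local coordinates
    Touchpad { finger_id: FingerId, position: PhysicalPosition<f64> },
    Unknown,
}
```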

I don't know how this would look implementation-wise on other backends, but Web has an open issue for that: #2963.

FAQ

How would touch events work?

  1. PointerEntered
  2. PointerButton with ElementState::Pressed *
  3. PointerMoved
  4. PointerButton with ElementState::Released (when not cancelled) *
  5. PointerLeft

*: Only emitted when it makes sense, e.g. on a touchscreen. A touchpad doesn't emit these unless actually pressed.

This is exactly like using a mouse, fulfilling the purpose of this proposal. The only difference is that all events come with PointerSource/PointerType::Touch, exposing the FingerId and Force.
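
To make this concrete, a sketch of tracking a touch "drag" purely through the unified events; `drag_start` is application state of type Option<PhysicalPosition<f64>> and `draw_drag` is a hypothetical helper:

```rust
match event {
    WindowEvent::PointerButton {
        state: ElementState::Pressed,
        position,
        ..
    } => drag_start = Some(position),
    WindowEvent::PointerMoved { position, .. } => {
        if let Some(start) = drag_start {
            // identical code path for mouse, touch and pen input
            draw_drag(start, position);
        }
    }
    WindowEvent::PointerButton {
        state: ElementState::Released,
        ..
    }
    | WindowEvent::PointerLeft { .. } => drag_start = None,
    _ => (),
}
```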

How would pen/stylus input work?

Exactly the same as touch events, but coming with PointerSource/PointerType::Pen/Eraser/Airbrush and exposing any relevant information.

How would pen/stylus hover work without touching the screen?

Exactly like touch events, but not emitting PointerButton unless the pen/stylus actually makes contact. This would also be exactly how touchpads work.

How would changes to e.g. tilt in pen/stylus be represented?

By using the PointerMoved event. See the "Add WindowEvent::PointerChange event" alternative.

How would graphic tablet buttons be represented?

See the "Add KeyType" alternative.

In any case, they have to go through the regular KeyboardInput, unless we want to repeat the same issue that motivates this proposal: if users don't explicitly handle it, the app won't react at all. Skipping that would sometimes make sense when certain actions are not translatable to e.g. keyboard input, but often they are.

kchibisov commented 3 months ago

> We are currently missing position data in all pointer events except WindowEvent::CursorMoved. This isn't just annoying for users, who have to handle two events to get the information they want; it also requires implementations to account for it and send "fake" WindowEvent::CursorMoved events to report the position.
>
> The solution should be quite simple: add position: PhysicalPosition<f64> to all pointer events.

This would require internal tracking on some backends, because pointer button events generally don't carry position information of their own.
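
A hypothetical sketch of what that internal tracking could look like on such a backend:

```rust
struct PointerState {
    // updated on every motion event, since button events on e.g. Wayland
    // carry no position of their own
    last_position: PhysicalPosition<f64>,
}

impl PointerState {
    fn on_motion(&mut self, position: PhysicalPosition<f64>) {
        self.last_position = position;
    }

    fn on_button(
        &self,
        device_id: DeviceId,
        state: ElementState,
        button: ButtonSource,
    ) -> WindowEvent {
        WindowEvent::PointerButton {
            device_id,
            state,
            // report the cached position alongside the click
            position: self.last_position,
            button,
        }
    }
}
```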

> Add WindowEvent::PointerChange event

I don't think it's a great idea to stuff more data into simple events when users don't really care about them.

> The reason this has come up here is Wayland's tablet protocol, which exposes a strangely large amount of graphics-tablet-specific input events. However, AFAIK this is very Wayland-specific and no other backend exposes this amount of specific information. No matter how we address it though, it should at least be routed to KeyboardInput, for the same reason we want to route all pointer events to Pointer* events.

It's not keyboard input; it's on the same level as mouse buttons, so applying anything keyboard-related to them is not going to work. I'd rather just have an extension for that which users can implement.

In general, I'd suggest that we have a generic pointer input which does emulation, but also TouchHandler/TabletHandler traits users can implement, which will take priority over the generic handling, meaning they'll deliver more specialized events.

For example, with generic touch input you want to emulate scrolling with two fingers, and maybe have some other gestures; however, that will collide with users who want their own two-finger handling (maybe some touch-based game, etc.).

So such emulation should have a way to be turned off.

If we provide a way to turn such emulation off, it basically means we don't have to do anything about the identity of the pointer at all, because once a user starts to care more about a specific source of input, they can implement a handler for Touch/Tablet/whatever input.

So I'd say that:

  1. Rename is fine.
  2. Touch should go into a fn touch_handler() -> Option<&dyn TouchHandler> extension trait.
  3. Tablet should go into a fn tablet_handler() -> Option<&dyn TabletHandler> extension trait.
  4. Add a generic PointerHandler trait which tries to plumb everything through and behaves like a generic mouse on backends where emulation inside Winit is required.
  5. Not sure about position, though; maybe we can add that, but keep in mind that on Wayland, for example, I don't get the position when handling buttons and have to track it myself, though it's already tracked IIRC.
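
A minimal sketch of what points 2-4 could look like; the handler traits and their methods here are hypothetical simplifications, not a settled API:

```rust
pub trait TouchHandler {
    // specialized touch event, taking priority over generic emulation
    fn touch(&self, finger_id: FingerId, position: PhysicalPosition<f64>);
}

pub trait PointerHandler {
    // generic pointer event, emulated as a mouse where necessary
    fn pointer_moved(&self, position: PhysicalPosition<f64>);
}

pub trait ApplicationHandler {
    // ... existing methods ...

    // Opt-in: returning `Some` routes specialized touch events here and
    // lets the backend skip (or never set up) the generic emulation.
    fn touch_handler(&self) -> Option<&dyn TouchHandler> {
        None
    }
}
```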

I think that's how it's usually done, even if you look into more complex systems with Wayland protocol emulation, etc.

This also solves the tilt problem, etc., since the specialized handler will have the cursor and all the data routed to it. It'll also save a lot on bandwidth, since you won't receive anything unless you asked for it.

Carrying fat structs around when 90% of users won't ever look into them (all the tablet tool stuff) is not really great for performance.

Besides, the trait extension means all the unused code will simply be deleted as dead code, and for Wayland I can avoid binding the protocols at all, meaning even less code will be executed on the client and things will generally be more compact.

daxpedda commented 3 months ago

> > The reason this has come up here is Wayland's tablet protocol, which exposes a strangely large amount of graphics-tablet-specific input events. However, AFAIK this is very Wayland-specific and no other backend exposes this amount of specific information. No matter how we address it though, it should at least be routed to KeyboardInput, for the same reason we want to route all pointer events to Pointer* events.
>
> It's not keyboard input; it's on the same level as mouse buttons, so applying anything keyboard-related to them is not going to work. I'd rather just have an extension for that which users can implement.

Uhm, the rest of your text argues in favor of generic input handling; e.g. pen can also be handled in your proposed PointerHandler. So are you saying that the tablet pad stuff should not be routed to a generic handler?

> I think that's how it's usually done, even if you look into more complex systems with Wayland protocol emulation, etc.

If you have experience with how others did this, that would be great to share. Who is "usually"? E.g. I was taking inspiration from the Web API, which does it the way I described.

> This also solves the tilt problem, etc., since the specialized handler will have the cursor and all the data routed to it. It'll also save a lot on bandwidth, since you won't receive anything unless you asked for it.
>
> Carrying fat structs around when 90% of users won't ever look into them (all the tablet tool stuff) is not really great for performance.

I think we are talking about around 64 bytes ... so I don't think performance plays a role here.


In conclusion, I think you are suggesting we take the alternative route I proposed under "Keep WindowEvent::Touch", which I'm fine with as well.

There was one uncertainty that came up while reading your post: I'm not sure if you intend to stop routing events to the generic handler once users have registered a specialized one, which I believe would be a mistake. The whole motivation for this overhaul is to give users a single place to handle generic pointer input; it shouldn't matter whether they registered a more specialized one, as that can be handled at the user level.

The motivation behind this is described in "Unified Pointer Handling", but to add to that: we should take into account how Winit is used in the wild, which is mostly not directly, but through e.g. Bevy or Eframe/Egui, our biggest users on crates.io (not counting Tauri, because AFAIK they don't expose Winit). In cases like this it should be possible for libraries to hook into Winit without actually changing the Winit input for users. We have talked about use cases like this in the past and come to the consensus that we want to add explicit event suppression, but not implicit!

We will discuss this proposal in the next meeting, so I will adjust it to whatever conclusion we come to then; otherwise I'm happy to change it to keep specialized event handlers.

BattyBoopers commented 3 months ago

> Carrying fat structs around when 90% of users won't ever look into them (all the tablet tool stuff) is not really great for performance.

```rust
size_of::<WindowEvent>()
// 136
```

The event type is already about five times larger than what the pointer event variants would end up being after adding all the pen axes as an f32 each.

> Besides, the trait extension means all the unused code will simply be deleted as dead code

If there is such a benefit, wouldn't the other types of events also benefit from the same kind of API, such that other code paths could be optimized out? This sounds more like a general refactor of the event system. Why does the implementation of touch and pen input in particular have to be burdened by it?

Why can't this be a feature flag instead?

Why can't this be a const bool in ApplicationHandler instead?

Besides, this is quite a verbose interface: you'd have to implement a trait, and also implement a method that returns a dyn trait reference.

daxpedda commented 3 months ago

> If there is such a benefit, wouldn't the other types of events also benefit from the same kind of API, such that other code paths could be optimized out? This sounds more like a general refactor of the event system. Why does the implementation of touch and pen input in particular have to be burdened by it?

Because it's exactly how the Wayland API works. All other backends, except maybe X11, won't benefit much from this. But I don't mind that; I think we should try to use each backend to its fullest if possible, and in this case it comes with few downsides.

> Besides, this is quite a verbose interface: you'd have to implement a trait, and also implement a method that returns a dyn trait reference.

We know that the trait-based approach has in general been poorly received, but on the other hand we haven't really been offered an alternative solution; see https://github.com/rust-windowing/winit/issues/3432. But as long as we are already going with traits, this would be in line with the rest of the API.

kchibisov commented 3 months ago

> Uhm, the rest of your text argues in favor of generic input handling; e.g. pen can also be handled in your proposed PointerHandler. So are you saying that the tablet pad stuff should not be routed to a generic handler?

Yes, it should be left on its own, since it's probably similar to mouse buttons.

> There was one uncertainty that came up while reading your post: I'm not sure if you intend to stop routing events to the generic handler once users have registered a specialized one, which I believe would be a mistake. The whole motivation for this overhaul is to give users a single place to handle generic pointer input; it shouldn't matter whether they registered a more specialized one, as that can be handled at the user level.

Hm, thinking about it: users have to store some state for their renderer anyway, probably on the ApplicationHandler trait, so we can just use both interfaces and deliver the auxiliary data via the specialized events.

The only issue is when you want to do scrolling for touch input, since you generally want to stop the emulation in that case, so I guess implementing the trait should disable some of the generic handling.

That would also mean that we'd need a pointer device type identity, but that shouldn't be a problem.

> If you have experience with how others did this, that would be great to share. Who is "usually"? E.g. I was taking inspiration from the Web API, which does it the way I described.

I think GTK; they usually don't bind stuff unless you explicitly ask for it. At least it was like that with some handlers, like text-input (IME).

> If there is such a benefit, wouldn't the other types of events also benefit from the same kind of API, such that other code paths could be optimized out? This sounds more like a general refactor of the event system. Why does the implementation of touch and pen input in particular have to be burdened by it?

That's what the plan is, see #3432.

> Why can't this be a feature flag instead?

You can't test all the feature permutations: imagine you have 12 features, that's already 2^12 = 4096 combinations, and that's just from the events. Now add other features, and backend-specific features, and you have untestable code.

> Why can't this be a const bool in ApplicationHandler instead?

You can't do much with such a bool, so no. A bool would still require your application to have every possible method, and it won't result in all of that code being deleted, unlike the dyn trait approach.

> Besides, this is quite a verbose interface: you'd have to implement a trait, and also implement a method that returns a dyn trait reference.

It's as verbose as enabling a feature. Doing

```rust
fn pointer_handler(&self) -> Option<&dyn PointerHandler> {
    Some(self)
}
```

is not much different from figuring out which feature to enable and adding a bunch of cfgs to your code. In fact, it's less verbose, because you won't ever need a cfg with this approach, yes, even for Web/macOS/Darwin, since you don't have to cfg them; you'll just always get None on a backend that doesn't support it.

> Because it's exactly how the Wayland API works. All other backends, except maybe X11, won't benefit much from this. But I don't mind that; I think we should try to use each backend to its fullest if possible, and in this case it comes with few downsides.

macOS is the same: you can basically stop registering selectors if the user doesn't want certain events handled. I think Windows is also the same, since you can omit some event handling altogether.

kchibisov commented 3 months ago

@daxpedda in general, we can have both traits working at the same time; it's just that we'd change certain behavior, like disabling touch scrolling emulation once you enable touch or other advanced emulations we might implement.

daxpedda commented 3 months ago

> > Uhm, the rest of your text argues in favor of generic input handling; e.g. pen can also be handled in your proposed PointerHandler. So are you saying that the tablet pad stuff should not be routed to a generic handler?
>
> Yes, it should be left on its own, since it's probably similar to mouse buttons.

No, what I mean is: why are we not routing it to mouse buttons then? Ergo the proposed PointerButton.

> > Because it's exactly how the Wayland API works. All other backends, except maybe X11, won't benefit much from this. But I don't mind that; I think we should try to use each backend to its fullest if possible, and in this case it comes with few downsides.
>
> macOS is the same: you can basically stop registering selectors if the user doesn't want certain events handled. I think Windows is also the same, since you can omit some event handling altogether.

On macOS you don't need specific events for different pointer types; e.g. right now we already handle some pen input on macOS without registering any additional selectors. But maybe there is more that I don't know about.

> @daxpedda in general, we can have both traits working at the same time; it's just that we'd change certain behavior, like disabling touch scrolling emulation once you enable touch or other advanced emulations we might implement.

I'm rather strongly against adding exceptions like this; I believe we should decide to go either one way or the other. But I'm glad we were able to come to a consensus on everything so far except this one point; it's a detail we can get to at some later point.

kchibisov commented 3 months ago

> I'm rather strongly against adding exceptions like this; I believe we should decide to go either one way or the other. But I'm glad we were able to come to a consensus on everything so far except this one point; it's a detail we can get to at some later point.

I mean, users who do multi-touch input want to handle scrolling themselves, while the default behavior for clients that don't handle touch input is two-finger scroll, so maybe we'd need something like

```rust
fn emulate_two_finger_scroll(&self) -> bool {
    true
}
```

on the TouchHandler to control this behavior, because it's not always desired.

daxpedda commented 3 months ago

Changes from the latest meeting: