I agree that this is a slippery slope, but I think that deviating from the behavior of other browser vendors here could be quite painful for web developers in the future.
We may want to keep a list of the ambiguous cases. A small non-normative table might be adequate.
Yeah, I'm not opposed to some non-normative suggestions here. I already kind of did that with the mention of resize.
I made some improvements in 1b11ca1cef560bd1948e9ef9fd04501006e5083e to attempt to specify more precisely how sourceDevice is supposed to behave in an abstract way (without needing to resort to event-specific text beyond examples).
Maybe this is good enough now?
In particular, are there any other scenarios whose behavior seems ambiguous? I think the "all events that follow from a single action must have the same sourceDevice" rule removes a lot of the ambiguity.
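To illustrate what that rule would let pages rely on, here's a rough sketch (assuming the proposed `event.sourceDevice` attribute, and assuming events derived from a single action carry the very same InputDevice object; none of this is normative):

```js
// Sketch only: assumes the proposed event.sourceDevice, and that events
// that follow from a single user action report the same InputDevice object.
let lastTouchDevice = null;

document.addEventListener('touchstart', e => {
  lastTouchDevice = e.sourceDevice;
});

document.addEventListener('mousedown', e => {
  // A mousedown synthesized from the tap above must report the same device.
  if (e.sourceDevice && e.sourceDevice === lastTouchDevice) {
    console.log('mousedown derived from the preceding touch');
  }
});
```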
This is a significant improvement. I'm not sure we can make a hard guarantee that all events triggered from a single user action will have the same sourceDevice though.
For instance, on Windows, if the user taps an unfocused window, I think we get a WM_SETFOCUS message, which causes us to fire a focus event, as well as a WM_TOUCH message, which causes us to dispatch a touch event. Setting the sourceDevice of the focus event to the touchscreen that created the touch event may not be possible.
Thanks, if we can find examples like that then I agree we'd have to change the definition. I'm not sure I buy your example, though. WM_SETFOCUS causes window focus, right? Does that in any scenario generate a 'focus' event (which is about indicating which element has keyboard focus)? For that scenario I thought the web app would always see the touchstart first, and only get 'focus' as a result of a tap gesture.
If you touch down on a text input in an unfocused window (without releasing), then on the touch press you receive a focus event on the text input, even though no tap has occurred.
You can easily test this here (it just logs to the console every time the text input is focused).
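(The linked test page isn't reproduced here; a minimal equivalent, assuming the page contains `<input id="target" type="text">`, would be something like:)

```js
// Minimal sketch of the test: log whenever the text input gains focus.
// Assumes the page contains <input id="target" type="text">.
document.getElementById('target').addEventListener('focus', () => {
  console.log('text input focused');
});
```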
Ah, that makes sense - thanks. I think we need to differentiate between the case where you press on the window chrome vs. on the web contents. On the chrome this is just like any other system-determined interaction (like resize), so 'null' is appropriate.
On the contents (at least on the frame), a touchstart event will get dispatched, so ideally I think the focus event would have a sourceDevice for the touchscreen. E.g., I can imagine affordance scenarios where you might like to know at focus time whether you should expect a touchstart or a mousedown following it. It seems like it might not be too hard to fix this in theory: e.g., queue 'focus' events on WM_SETFOCUS in this case and release the queue only when we see the input event (see the sketch below). It seems like a bit of an edge case, though; I'd personally treat it as a pretty low-priority bug unless we find some compelling use case. WDYT?
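To make the queue idea concrete, here's a rough model of the intended logic (purely illustrative; this is not real browser internals, and all the names are made up):

```js
// Illustrative model of the queue idea; not actual browser code.
const pendingFocusEvents = [];

// Stand-in for the browser's real event dispatch.
function dispatch(event) {
  console.log('dispatch', event.type, event.sourceDevice);
}

// Called on WM_SETFOCUS: buffer the focus event instead of firing it.
function onSystemFocus(focusEvent) {
  pendingFocusEvents.push(focusEvent);
}

// Called when the input event that caused the focus change arrives:
// stamp each queued focus event with that input's device, then release
// them all in order (nothing is coalesced or dropped).
function onInputEvent(inputEvent) {
  while (pendingFocusEvents.length > 0) {
    const focusEvent = pendingFocusEvents.shift();
    focusEvent.sourceDevice = inputEvent.sourceDevice;
    dispatch(focusEvent);
  }
  dispatch(inputEvent);
}
```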
This is getting pretty esoteric, but could we receive two focus events before dispatching a touch event?
I suspect that the queue approach would be adequate, and you're right that this is low enough priority that we can leave it for now. But I'm a bit worried that the queue could give us the same problems as the scroll event, where a single event can be contributed to by multiple InputDevices and it isn't clear what the sourceDevice should be.
If the queue could reduce the number of focus events dispatched, that could be an incompatible change regardless of the reported sourceDevice.
I doubt eliding some focus events (ones immediately followed by a different focus event) would be a breaking change. But regardless, we could certainly implement a queue that preserves all the events if that's what we wanted (there's no reason we have to coalesce the way 'scroll' events do).
Of course, thanks.
I'm happy with the state of this currently. I'll close this issue for now, please re-open if you think there's additional work that needs to be done here.
There are a number of subtle implementation details for some events. E.g., does the sourceDevice of a keypress from an on-screen keyboard represent the touchscreen or a logical keyboard device (probably the latter)?
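Either answer would be directly observable by pages. A sketch of how (assuming the proposed firesTouchEvents attribute on the returned device):

```js
// Sketch only: shows how the on-screen-keyboard question leaks to pages,
// assuming the proposed sourceDevice.firesTouchEvents attribute.
document.addEventListener('keypress', e => {
  if (e.sourceDevice && e.sourceDevice.firesTouchEvents) {
    console.log('keypress attributed to the touchscreen');
  } else {
    console.log('keypress attributed to a logical keyboard device');
  }
});
```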
My thinking is that these are implementation details, and since the API describes the behaviors you should observe, it's OK if some user agents expose their subtly different behavior through it. There's certainly a slippery slope here where we could add a lot of complexity to the spec and implementations for little actual developer value.
My preference is to wait for specific concrete use cases before considering adding any such complexity to the spec.
@tdresser WDYT?