Re-enable Gamepad Access from HTTP (aka insecure) Context

zhangyx1998 commented 1 month ago

As pointed out in #113 and #120, an insecure context carries privacy risks from code injection, and the Gamepad API is now planned to be completely disabled in any context other than a secure connection (or one allowed through local-network-access, which looks irrelevant to me since it focuses on HTTP request handling rather than device permission).

The security concerns brought up by @marcoscaceres are totally understandable and should indeed be treated seriously. However, addressing them should not come at the cost of killing many existing or potential DIY projects and local applications that cannot run over SSL. Therefore, I opened this issue to discuss better solutions that allow access to gamepads while keeping out malicious code injection attacks over HTTP connections.

I've provided my use case here. Other people also have concerns about this, as shown under the original issue and PR.

My proposal is to let the browser engine prompt the user for permission to grant full access to the gamepad. Full access means the gamepad's full feature set is exposed through the Gamepad API, even when the window is not focused.

Before full access is granted, code from an insecure context may still access an abstracted controller that does not contain any device-specific information or physical actuators, and that is only updated while the window is focused (behaving just like KeyboardEvent and PointerEvent).
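
To make the abstracted controller concrete, here is a minimal sketch of the stripping step. All names (`RawPadState`, `AbstractPadState`, `toAbstractPad`) and the placeholder id string are assumptions for illustration, not part of the Gamepad API.

```typescript
// Hypothetical shapes for illustration; these are not the real Gamepad interface.
interface RawPadState {
    id: string;            // device-specific string
    index: number;
    buttons: number[];     // button values in [0, 1]
    axes: number[];        // axis values in [-1, 1]
    hasVibration: boolean; // stands in for physical actuators
}

interface AbstractPadState {
    id: string;            // always the generic placeholder
    index: number;
    buttons: readonly number[];
    axes: readonly number[];
}

// Keep only the input values; drop the device id and any actuator access.
function toAbstractPad(raw: RawPadState): AbstractPadState {
    return {
        id: "Generic Gamepad", // assumed placeholder value
        index: raw.index,
        buttons: [...raw.buttons],
        axes: [...raw.axes],
    };
}

const rawPad: RawPadState = {
    id: "Example Vendor Controller (hypothetical)",
    index: 0,
    buttons: [1, 0, 0, 0],
    axes: [0.25, -0.5],
    hasVibration: true,
};
const abstractPad = toAbstractPad(rawPad);
console.log(abstractPad.id, abstractPad.buttons.length);
```

The focus-gated updating is a separate concern and is not modeled in this sketch.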

This proposal is inspired by @BlobTheKat in https://github.com/w3c/gamepad/issues/120#issuecomment-2101353498. Since the gamepad is just another Human Input Device (HID) like the keyboard, it makes no sense to completely block access to it even with potential presence of malicious code.

Please feel free to leave comments if your use case is also affected by this restriction. All ideas and opinions on this issue are welcome.

BlobTheKat commented 1 month ago

thank you 🙏 This issue describes exactly what I meant but better worded

nondebug commented 1 month ago

> My proposal is to let the browser engine prompt the user for their permission for full access to the gamepad.

I'm generally in favor of adding a gamepad permission since that will make it possible to expose more powerful gamepad features. However, it will take significant effort to add a permission and I feel this use case is not common enough to justify the work. I hope we do add a permission eventually, but I expect it will be in conjunction with adding some powerful new capability.

We've discussed adding a permission in the past. The current API is not well suited for a permission; normally we would want an async API so we can insert a permission prompt between calling getGamepads() and returning a sequence<Gamepad>. The current API is synchronous, so even if we do show a prompt we still need to return something. If we add a permission we will probably also add a new async way to request gamepads. This could be a new method like getGamepadsAsync() or maybe we could add options to getGamepads() that modify its return value.
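
To illustrate why an async entry point fits a permission prompt better, here is a rough sketch. `getGamepadsAsync` is only the name floated above; its signature, the `PromptFn` stand-in, and the empty result on denial are all assumptions.

```typescript
// Hypothetical sketch: none of these names exist in the Gamepad API.
type PermissionDecision = "granted" | "denied";

interface PadInfo {
    index: number;
    id: string;
}

// Stand-in for a browser permission prompt; a real one would wait on the user.
type PromptFn = () => Promise<PermissionDecision>;

// An async variant can await the prompt before resolving, so it never has to
// hand back a placeholder value the way a synchronous call would.
async function getGamepadsAsync(
    connected: readonly PadInfo[],
    prompt: PromptFn,
): Promise<PadInfo[]> {
    const decision = await prompt();
    if (decision !== "granted") {
        return []; // assumption: denial resolves to an empty sequence
    }
    return connected.map((pad) => ({ ...pad }));
}

const connectedPads: PadInfo[] = [{ index: 0, id: "Example Pad (hypothetical)" }];

const grantedResult = getGamepadsAsync(connectedPads, async () => "granted");
const deniedResult = getGamepadsAsync(connectedPads, async () => "denied");

grantedResult.then((pads) => console.log("granted:", pads.length));
deniedResult.then((pads) => console.log("denied:", pads.length));
```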

Something we should consider is what happens to the already-returned Gamepad objects when the user grants permission. Presumably there will be some attributes that expose different values when the permission is granted vs. not granted. Are the "granted" and "not granted" versions of the Gamepad considered the "same" Gamepad? If so, are the Gamepad attributes updated "live" or does the caller need to call getGamepads() again to access the extra functionality? Are pre-grant Gamepad objects still usable? Would we fire disconnect/connect events when transitioning between permission states?

> Full access means the full feature of the gamepad being exposed from gamepad API, even when window is not focused.

Gamepad API as specified doesn't consider window focus, only page visibility. Some implementations may have focus requirements due to the underlying operating system API used to read gamepad inputs. The OS-level API for reading gamepad inputs may deliver events only to focused applications. This behavior isn't defined as part of the spec and it's unlikely that implementations would be able to work around it without finding an alternative for the OS gamepad API.

In the Chromium implementation, different OS-level APIs are used for different types of gamepads so in some cases the focus-related behavior can differ between gamepads.

I feel gamepad inputs should not be tied to focus because it's unusual to use a gamepad and keyboard/mouse at the same time. Interfaces designed for use with gamepads typically don't have input focus, since gamepad text input is usually modal and involves an on-screen keyboard. Also, there are known use cases where it's useful for script to access gamepad inputs without focus.

> Before full access was granted, code from an insecure context may still access an abstracted controller that do not contain any device-specific information nor physical actuators

I like this idea. I agree it should be possible to expose an abstracted controller while preserving user privacy and security.

marcoscaceres commented 1 month ago

We dropped the secure context requirement a few weeks ago, so I think this should now work ok? Please see: https://github.com/w3c/gamepad/pull/194

@zhangyx1998 can you confirm?

zhangyx1998 commented 1 month ago

> We dropped the secure context requirement a few weeks ago, so I think this should now work ok? Please see: #194
>
> @zhangyx1998 can you confirm?

Thanks! It seems like a roll-back to me, so it would definitely solve the problem for now.

Are you (the community) still interested in finding other approaches to mitigate privacy risks? Or has the community settled on not making any changes to the Gamepad permission?

zhangyx1998 commented 1 month ago

In response to @nondebug's post:

Thanks for commenting!

We can add a boolean field in each gamepad instance, e.g. gamepad.hasFullAccess, to indicate if full access was granted to a specific gamepad. This change is unlikely to break any existing use case.

In this way, navigator.getGamepads() will always return immediately, but the returned instances will initially have limited access. Since sensitive fields are abstracted, e.g. gamepad.id === "Generic Gamepad", this might also help with #73 .
From my understanding, the gamepad instance, once grabbed, is immutable (i.e. it only contains one "frame" of user input, taken at the time it was returned). Therefore, gamepad objects instantiated before full access is granted should stay as-is (i.e. the abstracted version).
Since gamepads are uniquely identified by their index, a non-intrusive change can be made to the API by simply adding a new function, for example:
```ts
// Proposed addition only; this method does not exist in the Gamepad API.
interface Navigator {
    requestGamePadFullAccess(
        index: number, // gamepad.index
        callback?: (ack: boolean, gamepad: Gamepad) => void
    ): void;
}
```
This will trigger an async prompt asking the user for full access. Optionally, the web page can explain to the user why it would like to have this permission. Also, someone mentioned that browsers are trying to minimize prompts to the user. In that case, an additional option can be added to "always allow" a website to have full gamepad access by default.
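
As a rough illustration of how this could behave, here is a mock that runs outside the browser. `MockPadHost`, `PadView`, and the `askUser` stand-in are hypothetical, and the mock returns a Promise only so the example can wait for the prompt; the proposal itself uses just the callback.

```typescript
// Hypothetical mock of the proposed call; nothing here is a real browser API.
interface PadView {
    index: number;
    id: string;
    hasFullAccess: boolean;
}

type AccessCallback = (ack: boolean, gamepad: PadView) => void;

class MockPadHost {
    private granted = new Set<number>();
    private deviceIds: Map<number, string>;
    private askUser: (index: number) => Promise<boolean>; // stand-in for the prompt

    constructor(
        deviceIds: Map<number, string>,
        askUser: (index: number) => Promise<boolean>,
    ) {
        this.deviceIds = deviceIds;
        this.askUser = askUser;
    }

    // Always returns immediately; abstracted unless access was granted.
    getPad(index: number): PadView {
        const full = this.granted.has(index);
        const realId = this.deviceIds.get(index) ?? "unknown";
        return {
            index,
            id: full ? realId : "Generic Gamepad",
            hasFullAccess: full,
        };
    }

    async requestGamePadFullAccess(index: number, callback?: AccessCallback): Promise<void> {
        const ack = await this.askUser(index);
        if (ack) this.granted.add(index);
        callback?.(ack, this.getPad(index));
    }
}

const host = new MockPadHost(
    new Map([[0, "Example Pad (hypothetical)"]]),
    async () => true, // simulated user approval
);
const before = host.getPad(0); // grabbed before the grant, stays abstracted
const accessDone = host.requestGamePadFullAccess(0, (ack, pad) => {
    console.log(ack, pad.hasFullAccess, before.hasFullAccess);
});
```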

Upon user approval, any new Gamepad instance returned from the API, either through a gamepad event or through navigator.getGamepads(), will expose all properties of the physical device, and its gamepad.hasFullAccess property will be set to true.
The window focus issue turned out to be a little trickier than I expected. Thanks for bringing that up! The idea is basically to block the gamepad from being updated while the window is unfocused until the user explicitly allows it. This can be implemented by no longer updating the gamepad instance, i.e. any subsequent access to sequence<Gamepad> stays frozen at the frame from when the window lost focus.
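
A small sketch of the freezing behavior, assuming a hypothetical `FocusGatedPad` wrapper that sits between the device and the page (neither the class nor its fields exist in any browser):

```typescript
// Hypothetical sketch; FocusGatedPad is not part of any browser API.
interface PadFrame {
    timestamp: number;
    buttons: number[];
    axes: number[];
}

class FocusGatedPad {
    private visible: PadFrame;
    focused = true;
    backgroundAllowed = false; // set to true once the user grants full access

    constructor(initial: PadFrame) {
        this.visible = initial;
    }

    // Called for every frame the device produces.
    push(frame: PadFrame): void {
        if (this.focused || this.backgroundAllowed) {
            this.visible = frame;
        }
        // Otherwise the frame is dropped and readers keep seeing the last focused frame.
    }

    read(): PadFrame {
        return this.visible;
    }
}

const gated = new FocusGatedPad({ timestamp: 0, buttons: [0], axes: [0] });
gated.push({ timestamp: 1, buttons: [1], axes: [0.5] });
gated.focused = false;
gated.push({ timestamp: 2, buttons: [0], axes: [0.9] }); // dropped while unfocused
const frozenTimestamp = gated.read().timestamp;
gated.backgroundAllowed = true;
gated.push({ timestamp: 3, buttons: [1], axes: [-0.2] });
const resumedTimestamp = gated.read().timestamp;
console.log(frozenTimestamp, resumedTimestamp); // logs: 1 3
```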

In this way, I expect most applications (including mine) to get along fine with the abstracted gamepad. Those that need the gamepad's full feature set in an HTTP context, or that need to access the gamepad in the background, can simply prompt for user permission instead of asking users to install and trust a self-signed SSL certificate.

Please let me know what you think of this, thank you!

nondebug commented 1 month ago

> Are you (the community) still interested in finding other approaches to mitigate privacy risks? Or have the community been settled on not making any changes to the GamePad permission?

We're always interested in better protecting the user's privacy. The permission discussion is ongoing. There are features we want to support through the API that wouldn't be appropriate if the only user consent is a button press, so I think this is bound to happen eventually.

I also maintain device APIs like WebUSB and WebHID which use a per-device permission model. One of the frustrations with these APIs is you sometimes need to access the same device through multiple APIs and this currently requires multiple permission prompts. In my opinion, one prompt should be sufficient because the user shouldn't be expected to know or care about such low-level implementation details. If the user has already granted access for a site to control the device through a low-level interface, then we may as well grant permission to access the device through any interface. This puts the permission at a level that the user can actually understand.

If this sort of cross-API per-device permission were available then we could use it for Gamepad API as well. That's one way we could add a permission without an explicit "gamepad" permission. I think a per-device permission makes sense for gamepads because the capabilities can vary significantly. Just because the user is okay with a site accessing their button and thumbstick inputs doesn't mean they're okay with it accessing microphone data or motion sensor data.

> From my understanding, the gamepad instance, once grabbed, is immutable

This is true in the current version of the spec, but the spec was initially unclear on whether getGamepads returns live objects or immutable snapshots, and the current implementations don't agree. The Chromium implementation returns immutable snapshots while Safari and Firefox are closer to live objects. It's unfortunate that the behavior differs, but it isn't easily fixable without potentially breaking applications that rely on the current behaviors.
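
For application code that has to cope with both behaviors, one common approach is to re-query on every poll rather than cache a pad object. A minimal sketch, using stand-in types (`PadSample`, `PadSource`, `isButtonPressed` are illustrative names) so it runs outside a browser:

```typescript
// Minimal stand-in types so the sketch runs outside a browser.
interface PadSample {
    index: number;
    buttons: readonly { pressed: boolean }[];
}
type PadSource = () => (PadSample | null)[];

// Re-query the source on every poll instead of caching a pad object.
// This reads fresh input whether the source hands out snapshots or live objects.
function isButtonPressed(getPads: PadSource, padIndex: number, button: number): boolean {
    const pad = getPads()[padIndex];
    return pad?.buttons[button]?.pressed ?? false;
}

// Snapshot-style source: every call returns a new, independent copy.
let deviceState = false;
const snapshotSource: PadSource = () => [
    { index: 0, buttons: [{ pressed: deviceState }] },
];

const cachedPad = snapshotSource()[0]; // goes stale once the device state changes
deviceState = true;

const cachedReading = cachedPad?.buttons[0]?.pressed ?? false;
const polledReading = isButtonPressed(snapshotSource, 0, 0);
console.log(cachedReading, polledReading); // logs: false true
```

In a page, the source would presumably be something like `() => navigator.getGamepads()`, called once per animation frame.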

I don't have a strong opinion either way. I think we should do what's best for developers, and I suspect that's a live object. So, there's a chance we may change the API to return live objects someday.

> The idea is basically blocking the gamepad from being updated outside of user focus before user explicitly allows it.

Yes, this makes sense. I think we should improve the API's user consent by amending the spec to require window focus when handling the initial gamepad gesture. Filed as #206

zhangyx1998 commented 1 month ago

Seems like all my concerns are well taken care of. I'll close this issue.

w3c / gamepad

Re-enable Gamepad Access from HTTP (aka insecure) Context #203