w3ctag / design-reviews

WebHID API (Human Interface Device) #370

Status: Closed (nondebug closed 3 years ago)

nondebug commented 5 years ago

Good day, TAG!

I'm requesting a TAG review of:

Further details (optional):

You should also know that...

Most HID peripherals are also USB and Bluetooth devices, and in many cases a HID peripheral can also be accessed using one of these APIs. This specification is intended to follow the same usage patterns established by WebUSB and Web Bluetooth.

We'd prefer the TAG provide feedback as (please select one):

dbaron commented 5 years ago

(Ignore what I previously wrote here.)

It seems like some of the material in the security and privacy self assessment could also be in the explainer. If I'd seen that in the explainer I wouldn't have asked what I originally wrote here.

nondebug commented 5 years ago

Thanks for taking a look! I've added a "Security and privacy considerations" section to the explainer which distills the responses from the questionnaire.

lknik commented 5 years ago

"A user must explicitly grant access for an origin to access a device by selecting the device from a chooser list. "

Will you gate it with permissions too?

Thanks for the nice initial S&P considerations. I agree there is a risk of profiling and possibly of leaking environmental data, close to the risks from sensors. However, sensors are a quasi-standard thing, and the HID API potentially allows more.

nondebug commented 5 years ago

Yes, besides the per-device, per-origin permission granted when a user selects a device from the chooser, there will be an additional guard permission that determines whether any site can open a device chooser dialog. Enabled is "Ask when a site wants to access HID devices", disabled is "Do not allow any sites to access HID devices".

cynthia commented 5 years ago

We've looked at this during the Tokyo F2F.

Blunt question: will the device selector dialog come up every time the user tries to access the same device on an already approved origin?

One simple actionable point of feedback is that HID should only be available to the current active tab. If you had particular use cases in mind that allow HID devices to be read from multiple tabs, we'd like to hear what you have in mind. (From a user's perspective, input hitting multiple tabs would be analogous to keyboard events being broadcast to all tabs.)

One more bit: we aren't quite sure whether the benefit of making lots of device-specific details readable would outweigh the cost of increasing the fingerprinting surface. Would obscuring this information be problematic?

Aside from that, I think we're fairly happy to see this move forward.

(Background tabs for example could either detach or just not give back any messages, which way is more appropriate we don't have a strong opinion.)

nondebug commented 5 years ago

Thanks Cynthia!

> Blunt question: will the device selector dialog come up every time the user tries to access the same device on an already approved origin?

No, the final implementation will persist permissions so that the chooser dialog only needs to be shown once per origin/device pair.
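To make the two permission levels concrete, here is a minimal sketch of how a browser might track them. It is only a model under stated assumptions: `HidPermissionStore`, `GuardSetting`, and the string device identifier are hypothetical names invented for illustration, not part of the WebHID specification or any browser's implementation.

```typescript
// Hypothetical model of the permission scheme described in this thread.
// "ask" mirrors "Ask when a site wants to access HID devices";
// "block" mirrors "Do not allow any sites to access HID devices".
type GuardSetting = "ask" | "block";

class HidPermissionStore {
  private guard: GuardSetting;
  // Persisted grants, keyed by origin/device pair.
  private grants: { [key: string]: boolean } = {};

  constructor(guard: GuardSetting) {
    this.guard = guard;
  }

  private static key(origin: string, deviceId: string): string {
    return origin + "|" + deviceId;
  }

  // The guard permission decides whether any site may open a chooser at all.
  canShowChooser(): boolean {
    return this.guard === "ask";
  }

  // True once the user has picked this device for this origin.
  hasGrant(origin: string, deviceId: string): boolean {
    return this.grants[HidPermissionStore.key(origin, deviceId)] === true;
  }

  // Called when the user selects a device in the chooser.
  // Returns false if the guard permission forbids choosers entirely.
  recordChooserSelection(origin: string, deviceId: string): boolean {
    if (!this.canShowChooser()) return false;
    this.grants[HidPermissionStore.key(origin, deviceId)] = true;
    return true;
  }
}
```

In this model the chooser is needed only while `hasGrant` returns false for the pair, so a second visit from the same origin to the same device skips the dialog, while a different origin still has to ask.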

> One simple actionable point of feedback is that HID should only be available to the current active tab. If you had particular use cases in mind that allow HID devices to be read from multiple tabs, we'd like to hear what you have in mind.

I'm not aware of any use cases that would require a single HID device to be accessible from multiple tabs at the same time; perhaps we can restrict access to a single tab per device. However, at the OS level the device is still accessible to other apps, which could allow access from multiple tabs (e.g., a second browser instance).

If HID is only available from the active tab then this will break use cases where the tab accessing the device is expected to be in the background. It would also make it impossible to access HID from contexts with no active tab, e.g. Service Worker.

Ex: USB headsets often provide advanced features using HID (mute button, volume controls, status LEDs). A user is using her USB headset to make a call in one tab, then switches to another tab. The audio functionality of the headset will continue to work but HID-based features will not work until she switches back.

Ex: WebHID can be used to control LEDs on connected HID devices. This could be used to match the color of RGB LEDs to the action of a game, or to match colors across multiple connected devices. If WebHID is only accessible to the current active tab then the LED colors cannot be changed when the tab is backgrounded, limiting possibilities for animated patterns.

> From a user's perspective, input hitting multiple tabs would be analogous to keyboard events being broadcast to all tabs.

This is dangerous for keyboards, but keyboard inputs will not be accessible through WebHID. For other types of HID devices, shared access is expected. For instance, gamepad inputs are typically available in all apps (or browser tabs) simultaneously.

> One more bit: we aren't quite sure whether the benefit of making lots of device-specific details readable would outweigh the cost of increasing the fingerprinting surface. Would obscuring this information be problematic?

Can you point out which details you are concerned about? Most of the device-specific details are in the HIDDevice.collections member which is a representation of the information in the HID report descriptor. The report descriptor allows the host to detect a device's capabilities without specific knowledge about the device. If we obscure too much information in the report descriptor we may prevent apps from building generic device drivers, which would be a shame since this is one of the greatest advantages of the HID protocol.

The report descriptor is typically going to be identical for any two devices with the same vendor and product IDs. As such, it usually does not expose more fingerprinting surface than just exposing the IDs.

Other device details in HIDDevice are necessary:

Vendor/Product ID are used to identify the device. Removing these IDs would make it impossible to detect when specific devices are connected.

The product name is the string shown in the chooser dialog and is the only user-facing identifier for the device. In some cases the site will need its own chooser (separate from the permissions dialog) to allow the user to select from multiple connected devices. In this scenario, it is best if the device identifier used by the site's chooser matches the identifier used by Chrome's permissions dialog.

cynthia commented 4 years ago

> No, the final implementation will persist permissions so that the chooser dialog only needs to be shown once per origin/device pair.

Okay, we'll do the security assessment based on this information. One thing we discussed as a hypothetical attack vector is cross-origin identifier leaking through stateful HID devices - this is entirely hypothetical though.

> For instance, gamepad inputs are typically available in all apps (or browser tabs) simultaneously.

I'm not quite sure if this is intended design or an oversight. Off the top of my head, this doesn't feel right. Will take a look at the spec on this and report back.

> Ex: WebHID can be used to control LEDs on connected HID devices. This could be used to match the color of RGB LEDs to the action of a game, or to match colors across multiple connected devices. If WebHID is only accessible to the current active tab then the LED colors cannot be changed when the tab is backgrounded, limiting possibilities for animated patterns.

So, for this particular example, LED control can potentially race (one game flashing it in orange, while one game trying to flash it in green will definitely look completely bizarre), which we believe can confuse users. The easiest mitigation would be to allow only the current active tab. Unfortunately, as you mention, this has limitations too, like the service worker constraint, which is an extremely good point. We'll have to give that bit some more thought.

> Can you point out which details you are concerned about?

The main bit that came up was, for example, that if serial numbers of devices were surfaced, that would be an extremely reliable cross-origin tracking ID. Whether or not this would be exposed seems entirely device dependent, so whether it would be a problem depends, I think, on the number of devices that come with unique identifiers built in.

(Note: I wrote "we" above, but this isn't group thinking yet as everyone is split up in TPAC meetings.)

nondebug commented 4 years ago

It is possible for a stateful HID device to be used to communicate an identifier across origins.

Ex: ThingM Blink(1) is a USB HID device with an RGB LED that can be controlled using HID reports as described here. Note the presence of both "Set RGB color now" and "Read current RGB color" reports. It wouldn't be difficult to implement serial communication between two origins using these reports.
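The channel can be illustrated with a small simulation. This is a sketch only: `SimulatedRgbDevice` is a hypothetical in-memory stand-in for a device with "set color" and "read color" reports, and `leakBytes` / `recoverBytes` are invented helper names; no real HID reports or WebHID calls are involved.

```typescript
// Simulated stateful device: one RGB LED whose color can be set and read back.
// This stands in for a pair of set/read reports; it is not a real HID binding.
class SimulatedRgbDevice {
  private color: [number, number, number] = [0, 0, 0];

  setColor(r: number, g: number, b: number): void {
    // Each channel holds one byte, like an 8-bit LED value.
    this.color = [r & 0xff, g & 0xff, b & 0xff];
  }

  readColor(): [number, number, number] {
    return [this.color[0], this.color[1], this.color[2]];
  }
}

// Origin A: hide up to three bytes of an identifier in the LED color.
function leakBytes(device: SimulatedRgbDevice, bytes: number[]): void {
  device.setColor(bytes[0] ?? 0, bytes[1] ?? 0, bytes[2] ?? 0);
}

// Origin B: recover the bytes later, with no direct channel to origin A.
function recoverBytes(device: SimulatedRgbDevice): number[] {
  const [r, g, b] = device.readColor();
  return [r, g, b];
}
```

Because the state lives on the device, the two origins do not need to be open at the same time: origin B can read whatever origin A last wrote, three bytes per exchange in this toy model.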

I agree that exposing an identifier like a serial number is a major concern for this API. WebHID will need to capture a stable identifier in order to implement persistent permissions. The stable identifier will only be used internally and will never be exposed to script. For USB we plan to use the USB serial number (fetched along with the USB device descriptor when the device is connected) and for Bluetooth we plan to use the Bluetooth MAC address. (Other transports may not support persistent permissions due to the lack of a suitable stable ID.)

These identifiers are provided by the transport-level protocols and are generally not accessible at the HID level. However, it is possible to access this information on some devices using device-specific reports.

Ex: Sony DualShock 4 is a HID gamepad that supports both USB and Bluetooth modes, and may be concurrently connected to the same host over both transports. In USB mode, the device has a HID report that exposes the Bluetooth MAC address. The host can use this to detect when the device is double-connected. Blocking this report would require device-specific logic.

> I'm not quite sure if this is intended design or an oversight. Off the top of my head, this doesn't feel right. Will take a look at the spec on this and report back.

The Gamepad API working draft doesn't specify, and behavior varies between browser implementations. Chrome and Firefox provide simultaneous access to all tabs. In Safari, any tab can see the list of connected gamepads but only the active tab receives gamepad inputs.

At the OS level, shared access is typical. For instance, if Firefox has already opened a connection to a gamepad device, Chrome is not blocked from accessing the same device.

> LED control can potentially race (one game flashing it in orange, while one game trying to flash it in green will definitely look completely bizarre), which we believe can confuse users

Agreed that this is confusing. Gamepad API has a similar concern where multiple tabs can issue vibration commands to the same device. For LEDs or gamepads it is relatively easy for the user to diagnose the issue and close the other tab, but for other types of HID devices it may not be as obvious why the device is behaving strangely.

I think introducing the concept of exclusive access would be helpful, but it comes with its own potential for confusion. For instance, if one tab has exclusive access to a device it may not be obvious why the device is unavailable in another tab. This can be mitigated by displaying a tab indicator icon when a HID device is in use, but it doesn't help for service workers which have no suitable indicator area.

Exclusive access needs to be implemented carefully. If one tab takes exclusive access of a device, other tabs can observe this change by trying to open the device and detecting failure. It may be appropriate to signal other tabs about the access change by simulating disconnection until the tab releases exclusive access. However, this would allow a tab to flood other tabs with connect/disconnect events by rapidly taking and releasing exclusive access. This could be mitigated by requiring a user activation for requesting exclusive access or imposing a cooldown after releasing access (connection events would be delayed until the end of this cooldown in case the tab re-acquires exclusive access). It might make sense to restrict some API features (like write access) to exclusive access mode.
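A rough sketch of the cooldown idea follows, assuming a single device and a caller-supplied clock. `ExclusiveAccessArbiter` and its method names are hypothetical and only model the behavior described above: other tabs see a simulated disconnection while a tab holds exclusive access, and the reconnection is delayed until the cooldown ends so that a quick release and re-acquire does not produce a burst of events.

```typescript
// Hypothetical arbiter for exclusive access to one device.
// Times are plain millisecond numbers passed in by the caller.
class ExclusiveAccessArbiter {
  private holder: string | null = null;
  private lastHolder: string | null = null;
  private hiddenUntilMs = 0;
  private cooldownMs: number;

  constructor(cooldownMs: number) {
    this.cooldownMs = cooldownMs;
  }

  // Returns true if tabId holds exclusive access after the call.
  acquire(tabId: string, nowMs: number): boolean {
    if (this.holder !== null) return this.holder === tabId;
    // During the cooldown only the previous holder may re-acquire.
    if (nowMs < this.hiddenUntilMs && this.lastHolder !== tabId) return false;
    this.holder = tabId;
    return true;
  }

  // Releasing starts the cooldown; other tabs are not notified yet.
  release(tabId: string, nowMs: number): void {
    if (this.holder !== tabId) return;
    this.holder = null;
    this.lastHolder = tabId;
    this.hiddenUntilMs = nowMs + this.cooldownMs;
  }

  // Whether the device appears connected to the given tab at time nowMs.
  appearsConnectedTo(tabId: string, nowMs: number): boolean {
    if (this.holder !== null) return this.holder === tabId;
    if (nowMs < this.hiddenUntilMs) return this.lastHolder === tabId;
    return true;
  }
}
```

With a 1000 ms cooldown, a tab that releases at t=100 and re-acquires at t=600 never becomes visible to other tabs in between, so they observe one long disconnection rather than a connect/disconnect pair. The user-activation requirement mentioned above is a separate mitigation and is not modeled here.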

kenchris commented 4 years ago

Could values like "english-linear" be called "imperial-linear" or similar, as there are English-speaking countries that use SI units?

plinss commented 4 years ago

Several of us took another look during our Wellington F2F. We think this is a valuable use case, but:

1. The Gamepad API should be based on this API, particularly since many of the stated use cases are around gamepads. E.g., could the Gamepad API be implemented as a polyfill using this API?
2. The ability to communicate across domains which should be isolated from one another via the device seems like a major security hole which we are deeply concerned about.

kenchris commented 4 years ago

> 2. The ability to communicate across domains which should be isolated from one another via the device seems like a major security hole which we are deeply concerned about.

Actually, this can be a major problem if sites use a common JS library for gamepad support, as this can mean that the same gamepad is registered across multiple sites (which might be open at the same time), and then the library might do things it is not supposed to, like allowing communication across these sites without the user knowing.

plinss commented 4 years ago

> Could values like "english-linear" be called "imperial-linear" or similar, as there are English-speaking countries that use SI units?

Alternatively, would it be possible for the API to simply expose SI units and do the conversion?

cynthia commented 4 years ago

I realized that we haven't filed issues on the group's repository (as requested in the template) so I'll take an action to do that.

nondebug commented 4 years ago

> Could values like "english-linear" be called "imperial-linear" or similar, as there are English-speaking countries that use SI units?

Sure, let's s/english/imperial/g

The names come from the USB HID spec but the exact strings aren't important.

> Alternatively, would it be possible for the API to simply expose SI units and do the conversion?

I don't think this is practical. The only values that would make sense to scale are Physical Minimum and Physical Maximum, which are "long" type because that's the type reported by the device. If we want to convert them, they need to be floating point and the converted values will stand out as obviously faked. I think it would be better to expose the information reported by the device without modification.
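A small worked example of the conversion problem, assuming a hypothetical device that reports its physical range in whole inches with no unit exponent. The 7-inch value and the helper name are made up for illustration.

```typescript
// One inch is exactly 2.54 centimeters.
const CM_PER_INCH = 2.54;

// Convert a physical extent reported in inches to centimeters.
function inchesToCentimeters(inches: number): number {
  return inches * CM_PER_INCH;
}

// Hypothetical device: Physical Minimum = 0 in, Physical Maximum = 7 in.
const physicalMaxInches = 7;

// The converted maximum is about 17.78 cm, which no longer fits an integer field.
const physicalMaxCm = inchesToCentimeters(physicalMaxInches);
```

The device-reported value is a clean integer, while the converted one is fractional, so an application comparing the two can tell that the number was derived rather than reported.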

nondebug commented 4 years ago

> The Gamepad API should be based on this API, particularly since many of the stated use cases are around gamepads. E.g., could the Gamepad API be implemented as a polyfill using this API?

Only a subset of gamepads are HID. Additionally, I believe it's better for browsers to consume gamepad input through the platform's gamepad-specific input API. HID should be used as a last resort for gamepads that aren't exposed through higher level APIs.

That said, I do think there's an opportunity for WebHID-based polyfills that provide support for novel HID gamepads. Note that a WebHID-based polyfill would require a chooser dialog that isn't needed for normal Gamepad access.

nondebug commented 4 years ago

> The ability to communicate across domains which should be isolated from one another via the device seems like a major security hole which we are deeply concerned about.

I don't see a good way to mitigate this. I'll note that the same security hole exists on all major OSes. If two applications have write access to the same device then they can use the shared state for communication.

> Actually, this can be a major problem if sites use a common JS library for gamepad support, as this can mean that the same gamepad is registered across multiple sites (which might be open at the same time), and then the library might do things it is not supposed to, like allowing communication across these sites without the user knowing.

In principle it doesn't matter if the access is simultaneous since information can also be communicated asynchronously, although simultaneous access would allow faster communication. Supposing we block simultaneous access at the API level, it doesn't actually protect the user since they can still access the device from another browser instance running the same malicious library. To prevent simultaneous access we need to request exclusive access to the device at the platform level, which isn't possible on all OSes and may interfere with other running applications.

torgo commented 4 years ago

Also relevant to this discussion - the TAG published a finding about unsanctioned tracking. We're concerned about opening up any new avenue for such activity.

cynthia commented 4 years ago

@nondebug We'd like to have a separate (possibly synchronous) in-depth discussion about the security/privacy implications around this. (I can be found on irc.w3.org, if that makes things easier.)

roderickc commented 4 years ago

@cynthia @nondebug

I just became aware of the WebHID spec and am a bit worried about the security aspects. At Sony our gamepads (DualShock 3, 4 and next-gen controller) are all HID. The devices are quite complicated and a lot happens over HID, not just input.

From the security side you can issue a feature report to get the MAC address, firmware version and some more, so fingerprinting is an issue. What for me is even scarier than fingerprinting is that you can issue feature reports to perform firmware updates. A naughty website could brick controllers.

As I mentioned, our devices are complicated: even audio data and microphone data all work over HID. That stuff has to be handled using proper device drivers, but anyone using raw HID with our controllers (even when not using audio), e.g. for rumble or lights, would cause interference.

Similarly, using HID we can change our power settings and other settings. A platform driver (e.g. hid-sony on Linux) manages such settings. Having both user-mode drivers and kernel drivers manage a device is asking for trouble.

nondebug commented 4 years ago

@roderickc

> From the security side you can issue a feature report to get the MAC address, firmware version and some more, so fingerprinting is an issue.

Access to device information and input/output/feature reports is gated behind a chooser dialog. By default, an origin cannot access any information about connected devices. This protects users from fingerprinting in the common case where an origin has no permissions for any devices.

In the case where permission is already granted, WebHID avoids exposing sensitive identifiers like USB serial number or Bluetooth MAC address. However, there is always the potential for vendor-specific functionality (like DS4's MAC address report) that can be used to create a persistent identifier.

Some devices are too sensitive to be exposed to script at all. In Chrome, we maintain an internal block list of USB vendor/product IDs; perhaps DS4 should be added to that list.

> What for me is even scarier than fingerprinting is that you can issue feature reports to perform firmware updates. A naughty website could brick controllers.

I agree this is scary. There are probably many devices that support unsigned firmware upgrades through HID. These devices could be hijacked by a malicious script and reprogrammed to do any number of terrible things.

When a device is known to be vulnerable, we can add it to the block list or otherwise restrict access to the vulnerable reports. Let's discuss potential mitigations for Sony devices off-thread. In the general case it's not possible to know whether a device supports unsigned firmware updates or other dangerous features.
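As a sketch of what "add it to the block list or otherwise restrict access to the vulnerable reports" could look like, here is a hypothetical rule format and matcher. The `HidBlocklistRule` shape and the function names are assumptions made for illustration, not Chrome's actual blocklist format, and the IDs used with it would be placeholders.

```typescript
// Hypothetical blocklist rule. A rule without reportId blocks the whole device;
// a rule with reportId blocks only that report on matching devices.
interface HidBlocklistRule {
  vendorId: number;
  productId?: number; // omitted: rule covers every product from this vendor
  reportId?: number; // omitted: rule blocks the whole device
}

interface HidDeviceIds {
  vendorId: number;
  productId: number;
}

function ruleCoversDevice(rule: HidBlocklistRule, device: HidDeviceIds): boolean {
  if (rule.vendorId !== device.vendorId) return false;
  return rule.productId === undefined || rule.productId === device.productId;
}

// The device should not appear in the chooser at all.
function isDeviceBlocked(rules: HidBlocklistRule[], device: HidDeviceIds): boolean {
  return rules.some((rule) => rule.reportId === undefined && ruleCoversDevice(rule, device));
}

// The device may be usable, but this particular report must be refused.
function isReportBlocked(rules: HidBlocklistRule[], device: HidDeviceIds, reportId: number): boolean {
  return rules.some(
    (rule) =>
      ruleCoversDevice(rule, device) &&
      (rule.reportId === undefined || rule.reportId === reportId)
  );
}
```

Report-level rules let a device stay usable for ordinary input while a known-dangerous report, such as a firmware update or identifier report, is refused. As noted above, this only helps for devices whose dangerous reports are already known.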

On some level, the risk here is no different than the risk of accessing a device through WebUSB. Both APIs use a similar chooser-based permission model. With WebUSB device access you can create persistent identifiers, rewrite firmware, etc.

> anyone using raw HID with our controllers (even when not using audio) e.g. for rumble or lights would cause interference

This is something that must be addressed at the platform level with an exclusive driver. Desktop OSes generally default to non-exclusive access to HID devices, so the expectation should be that multiple applications could access the device at the same time.

cynthia commented 4 years ago

We've discussed this at length during our weekly conference call, and would like to hear more about how you folks plan to mitigate this. Would it be possible for us to be part of the mitigation discussion?

@nondebug @roderickc

kenchris commented 3 years ago

FYI @reillyeon, do you have any comments on @cynthia's concerns above?

torgo commented 3 years ago

@cynthia discussed the idea of bringing together some kind of discussion at TPAC around this…

nondebug commented 3 years ago

@cynthia @roderickc Will you be able to attend the Devices and Sensors WG meeting on Oct 22-23? If not, let's schedule another time to discuss security and privacy mitigations.

cynthia commented 3 years ago

@nondebug I can block my calendar if I have a time slot to work with. ~Is there a detailed schedule? (Note that I'm in JST.)~ Found it.

anssiko commented 3 years ago

FTR: WebHID TPAC discussion scheduled for the Devices and Sensors WG meeting: https://github.com/w3c/devicesensors-wg/issues/31

toreini commented 3 years ago

Hi, just a quick thought on the S&P aspects of this document: I think this API will likely be used in serious situations such as DRE machines for electronic voting (let us assume it gets implemented somewhere in the future, either in local or general elections). Should we add some example use cases for the S&P implications?

Cheers, Ehsan

nondebug commented 3 years ago

We discussed WebHID privacy and security concerns at the Devices and Sensors WG meeting at TPAC 2020 (minutes).

@cynthia Do you have any additional comments for the TAG review?

nondebug commented 3 years ago

Summarizing the TPAC discussion.

@cynthia Do you have any topics you'd like to follow up on?

cynthia commented 3 years ago

@nondebug thanks a lot for the summary! The last two points look good to me. I thought about this for a bit after the meeting, but then forgot to follow up, so thanks for the reminder.

> Also there are use cases where multiple origins may need to access the same device

If the permission dialog has three (allow shared, allow exclusive, deny) buttons instead of two (allow, deny) - this case might be coverable. (Whether or not current browser permission management lets you do that, that is another problem.)

Would this be a possible option to consider?

> any vendor may contribute new rules by pull request.

I had one question I did not manage to ask during the meeting - and that's about how to validate the provenance of the pull request. I'm assuming past requests were through corporate e-mail, which has some level of verification power.

But with pull requests, GitHub basically lets you claim anyone as your employer, so provenance forgery is definitely possible. (And unfortunately we've never had to deal with this kind of problem before.) Aside from first-party requests, there is the case of valid third party requests - for example from a security researcher. This is more of a policy problem than a technical problem, so I might want to bring it up with W3C to figure out how we want to deal with this.

Do you have any thoughts on this?

cynthia commented 3 years ago

Aside from the questions above, I'm happy with the outcome of the discussion - it's still slightly scary to think about corner case implications, but I think this is worth experimenting with in the wild.

nondebug commented 3 years ago

> If the permission dialog has three (allow shared, allow exclusive, deny) buttons instead of two (allow, deny) - this case might be coverable. (Whether or not current browser permission management lets you do that, that is another problem.)
>
> Would this be a possible option to consider?

I'm concerned that it would be difficult to educate users on the difference between shared and exclusive access. 99% of the time either "allow" option would work. In the instances where one or the other doesn't work, the user will experience failures that can't be easily traced back to the permission choice.

Applications could mitigate this by guiding the user to select one or the other, but if the correct choice is already known by the application prior to requesting permission then perhaps it should just be an option for requestDevice instead of a choice presented to the user.
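A sketch of how an application-declared access mode might be arbitrated, assuming a hypothetical `mode` option. Neither `AccessMode` nor `canOpen` exists in the WebHID specification; this only models the idea of moving the shared/exclusive choice from the user to the request.

```typescript
// Hypothetical access mode an application could declare when requesting a device.
type AccessMode = "shared" | "exclusive";

interface OpenRequest {
  origin: string;
  mode: AccessMode;
}

// Decide whether a new request can be honored given the sessions
// that are already open on the same device.
function canOpen(existing: OpenRequest[], incoming: OpenRequest): boolean {
  // Nothing is open yet, so either mode is fine.
  if (existing.length === 0) return true;
  // Exclusive access cannot be granted while anyone else has the device open.
  if (incoming.mode === "exclusive") return false;
  // Shared access is fine only if no current session is exclusive.
  return existing.every((session) => session.mode === "shared");
}
```

Under this model the permission dialog stays at two buttons (allow, deny), and a request that cannot be honored fails in a way the application can detect and explain to the user.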

> Aside from first-party requests, there is the case of valid third party requests - for example from a security researcher. This is more of a policy problem than a technical problem, so I might want to bring it up with W3C to figure out how we want to deal with this.
>
> Do you have any thoughts on this?

That's a good point, I think we should be able to accept third party requests but with a higher bar for proof that the blocked functionality presents a credible risk to users. Even if we're convinced that some blocking is warranted, in most cases we will want to follow up with vendors to make sure the proposed blocklist rules have the correct scope.

cynthia commented 3 years ago

> Applications could mitigate this by guiding the user to select one or the other, but if the correct choice is already known by the application prior to requesting permission then perhaps it should just be an option for requestDevice instead of a choice presented to the user.

This actually seems like an idea that might work! Can we try to get this into the spec?

> That's a good point, I think we should be able to accept third party requests but with a higher bar for proof that the blocked functionality presents a credible risk to users. Even if we're convinced that some blocking is warranted, in most cases we will want to follow up with vendors to make sure the proposed blocklist rules have the correct scope.

Sounds good. Just wanted to make sure we had some sort of protocol for this.

I think if we can try a direction where the application requests what they think they need, we think our work here is done. Thanks for the patience!