WebAudio / web-midi-api

The Web MIDI API, developed by the W3C Audio WG
http://webaudio.github.io/web-midi-api/

getMIDIAccess should be [SecureContext] #183

Closed cwilso closed 6 years ago

marcoscaceres commented 6 years ago

Probably also anything with a Constructor too. Although the events and objects are benign, it’s just one less thing to worry about in insecure contexts.

cwilso commented 6 years ago

Fixed.

jwt27 commented 1 year ago

I think I've looked everywhere, but I've yet to find an explanation as to why this was needed. What exactly is insecure about MIDI, and how does requiring an SSL-enabled host alleviate that? Do people really send out their credit card details via sysex?

For someone looking to build a hardware synth with a web interface, they now have to support SSL/TLS in their embedded http server, somehow deal with certificates, accept the increased latency, etc.

Would appreciate if someone could explain the rationale behind this.

mjwilson-google commented 1 year ago

Yes, we need to explain this better and we have an open issue about adding more information to the Security section of the spec here: #185 -- the first comment has a link to Mozilla's standards position which has some discussion about MIDI security.

Briefly, the main security concerns are:

I think it is unlikely that we will change the context to not be secure, but if you feel strongly please feel free to discuss on #185 (that issue is still open and will probably get more visibility).

cwilso commented 1 year ago

I will call out that these have been detailed in the Web MIDI spec since its inception: https://webaudio.github.io/web-midi-api/#security-and-privacy-considerations-of-midi. There is good, solid reasoning behind why MIDI should be considered a powerful API, given its long history (and legacy of drivers and devices).

jwt27 commented 1 year ago

> I will call out that these have been detailed in the Web MIDI spec since its inception: https://webaudio.github.io/web-midi-api/#security-and-privacy-considerations-of-midi. There is good, solid reasoning behind why MIDI should be considered a powerful API, given its long history (and legacy of drivers and devices).

Doh! Somehow I completely missed that, sorry. To address those points:

For fingerprinting, I don't see how requiring a secure context helps here. The solution is the same in either case; ask the user for permission to enumerate devices.

Sysex output, sure, that is a reasonable concern (but maybe there's something to be said about security-by-obscurity here). Again though, I don't think SSL offers much protection here. After granting permission, there is nothing stopping a "legitimate" server from transmitting malicious sysex packets.

With regular channel/realtime messages, at worst, someone could listen in on your music. I think it should be up to the end user to decide if that is an acceptable risk.

cwilso commented 1 year ago

To address both of these, powerful APIs (like those that require permissions) are typically only offered on secure contexts today. The W3C Technical Architecture Group has guidance on this in their design principles for web APIs: https://w3ctag.github.io/design-principles/#secure-context. In short, their position is that powerful APIs should only be offered on secure contexts, because in insecure contexts it's too easy to break the integrity of the context (i.e. get access from some other context; it's not just the domain you granted permission to that you need to worry about, some other domain might use that to get access too).
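For page authors, the practical effect of [SecureContext] is that the entry point is simply not exposed on an insecure page. A minimal sketch of how a page might tell the two failure cases apart is below. It assumes the current method name `navigator.requestMIDIAccess` (this issue's title uses the older `getMIDIAccess` name) and the standard `isSecureContext` global; the `MidiEnvironment` type and `midiAvailability` function are hypothetical names introduced only for illustration.

```typescript
// Minimal structural type for the parts of the global object this sketch reads.
// (Hypothetical helper type, not part of any spec.)
interface MidiEnvironment {
  isSecureContext?: boolean;
  navigator?: {
    requestMIDIAccess?: (options?: { sysex?: boolean }) => Promise<unknown>;
  };
}

type MidiAvailability = "available" | "insecure-context" | "unsupported";

// Reports why Web MIDI can or cannot be requested in the given environment.
function midiAvailability(env: MidiEnvironment): MidiAvailability {
  // Assumption: an environment that does not report isSecureContext at all
  // is treated the same as an insecure one.
  if (env.isSecureContext !== true) {
    return "insecure-context";
  }
  // In a secure context, a missing method means the browser lacks Web MIDI.
  if (typeof env.navigator?.requestMIDIAccess !== "function") {
    return "unsupported";
  }
  return "available";
}
```

In a browser, the global object should already have this shape, so the function could be called with `window`; the plain-object parameter exists only so the sketch can be exercised outside a browser.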

jwt27 commented 1 year ago

I have noticed that trend, and while I'm generally in favor of it, for a protocol as well defined as MIDI (sysex aside) it seems very much overkill. The "powerful" aspect of this API is really just device enumeration (which should always require user permission, SSL or not), and manufacturer-specific sysex, which has no well-defined meaning. All other messages can safely be categorized as "entirely benign", and generally carry no confidential data. Not quite on the same level as say, WebUSB.

cwilso commented 1 year ago

Except that even just sending note-on/note-off messages cannot be safely categorized as benign. For example, in the past there was a potentially-exploitable system-crashing bug we discovered on Windows machines, when you sent a very large number and high density of only note-on/off messages to the built in Windows software synthesizer. It is likely true that the worst that would happen sending MIDI note-on/off/controller messages to arbitrary devices is nuisance (stuck notes on hardware synthesizers, etc.) - but even that is enough to need to treat it as powerful.

Sysex is potentially far, far more powerful; yes, there is no "well-defined meaning", but firmware updates to MIDI devices are frequently delivered via sysex. It is possible to discover legacy USB-MIDI devices that would allow a firmware update that would rewrite the USB interface to enable other USB interface classes, enabling other kinds of system attacks. Unlikely and narrow? Yes; but that's not the same as secure. The large number of varied legacy MIDI devices makes it more concerning, not evidence that it is safe in a web context.

(Keep in mind I am somewhat playing devil's advocate here; I believe quite strongly in enabling MIDI on the Web, and I believe with appropriate precautions, it is absolutely acceptable risk.)

jwt27 commented 1 year ago

I don't see any way to fully guard against such exploits, except by patching the affected software. You still have to trust the host not to send any malicious data - even if it does present a legitimate SSL certificate.

The same risk exists when downloading a .mid file, and you don't need any SecureContext there. (As an aside, I've also yet to see any browser complain that "this type of file can harm your computer".)

cwilso commented 1 year ago

That same risk did NOT occur when downloading a .mid file, however - because 1) .mid files cannot play without user intervention - haven't been able to since BGSOUND stopped being supported in IE (and this is one of the reasons why it was locked down) - and 2) .mid files would be saved, and then needed to be opened by another application. Even in cases like Windows, where you might have a player associated in the browser (e.g. the file would be handled by the Windows Media Player), it would end up being opened and played in the WMP application context - where it might crash, but it would not be crashing the browser process.

I get this may seem like I'm splitting hairs. I'm not. We went through a deep and long set of security reviews, over time, with MIDI support; it is not as universally safe as it might seem.

And yes, the user does need to express trust in a [secure, legitimate SSL cert] host. The reason for secure contexts is it's much harder to pass data/execution context BETWEEN secure contexts - so if the user lets https://a.com/ have MIDI access, that doesn't accidentally bleed through so b.com can execute against that API.

jwt27 commented 1 year ago

Okay, thanks, I'm starting to see how this gets tricky. Must also say this is the first time I've heard about such bugs/exploits. But of course, I shouldn't be surprised. They had to exist somewhere.

How about input though, surely that is safe in an "insecure" context? Users would have to be informed about potential eavesdropping, same as when entering login data on a plain http page. And enumeration could be anonymized, as I mentioned in the other thread, so that would open up at least half of the API (and incidentally, would be sufficient for my web synth idea).

cwilso commented 1 year ago

I can't immediately come up with a significant concern with offering an API that ONLY offered MIDI input (from a default device, or all attached devices, but with no device enumeration or identification, and ONLY when the page has focus), and limited it to note-on/note-off and controller messages (no MIDI machine control, sysex, timing). This is eavesdropping on your MIDI device, though - it's just unlikely to be anything but an empty stream unless you're actually playing the device. It's also a separate API entry point (instead of getMIDIAccess, you'd need something like attachMIDIListener that only ever got anonymized input short messages).

The concern, though, is that if there is any risk to potential eavesdropping, you fail the bar for insecure contexts - if users need to give permission or be informed, it's considered a powerful API. I'm not sure you could convince people that users would expect an arbitrary web page to be able to listen in on this in the same way as you can listen in on keystrokes when your page has focus.
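To make the restricted input idea above concrete, here is a rough sketch of the message filter such an entry point might apply. Nothing like `attachMIDIListener` exists in the spec; the names `isAllowedShortMessage` and `filterInput` are hypothetical, and the rules (three-byte note-on, note-off, and control change messages only, and nothing at all when the page lacks focus) are one reading of the constraints described in the comment, not a proposed design.

```typescript
// Status-byte high nibbles this hypothetical listener would let through:
// note-off (0x8n), note-on (0x9n), and control change (0xBn).
const ALLOWED_STATUS_NIBBLES = new Set<number>([0x80, 0x90, 0xb0]);

// Returns true only for complete note-on, note-off, or control change messages.
function isAllowedShortMessage(data: Uint8Array): boolean {
  // All three allowed message types are one status byte plus two data bytes.
  if (data.length !== 3) return false;
  const status = data[0];
  const first = data[1];
  const second = data[2];
  if (status === undefined || first === undefined || second === undefined) {
    return false;
  }
  // The low nibble is the channel; only the high nibble selects the type.
  // Sysex (0xF0) and timing clock (0xF8) have a high nibble of 0xF0 and fail here.
  if (!ALLOWED_STATUS_NIBBLES.has(status & 0xf0)) return false;
  // MIDI data bytes always have the high bit clear.
  return first < 0x80 && second < 0x80;
}

// Drops everything when the page is not focused; otherwise keeps only
// the allowed short messages. No device identity is attached to the output.
function filterInput(messages: Uint8Array[], pageHasFocus: boolean): Uint8Array[] {
  if (!pageHasFocus) return [];
  return messages.filter(isAllowedShortMessage);
}
```

Because MIDI machine control is carried in sysex and timing clock is a single-byte system message, both are rejected by the same length and status checks without needing separate rules.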