WebAudio / web-midi-api

The Web MIDI API, developed by the W3C Audio WG
http://webaudio.github.io/web-midi-api/

Virtual MIDI ports #45

Open cwilso opened 11 years ago

cwilso commented 11 years ago

In speaking with Korg at NAMM, they really wanted to have the ability to define virtual MIDI input/output ports - e.g., to build a software synthesizer and add it to the available MIDI devices when other apps/pages query the system.

Yamaha requested the same feature, even to the point of potentially creating a reference software synth.

We had talked about this feature early on, but cut it from v1; truly adding the device to the system's list of available MIDI devices (e.g. so other native Windows/OSX apps could access it) would likely be quite hard, involving writing virtual device drivers, etc., which is why we decided not to include it in v1. We might consider a more limited feature of virtual devices that are only exposed to web applications; this might still be a substantial amount of work to add, but I wanted to capture the feedback.

jussi-kalliokoski commented 10 years ago

I've spent some time thinking about this feature lately and I want to share some ideas and open questions I have. At the simplest, what I see we could have is this:

dictionary MIDIPortOptions {
  DOMString name;
  DOMString manufacturer;
  DOMString version;
};

partial interface MIDIAccess {
  MIDIInput createVirtualOutput(MIDIPortOptions options);
  MIDIOutput createVirtualInput(MIDIPortOptions options);
};

As you can see, createVirtualInput() returns a MIDIOutput, naturally, since the return value is not the resulting port but an API for driving it: for the virtual input port to produce data, the page needs something to write that data to.

The resulting ports would naturally have no id since they don't correspond to any actual ports, virtual or physical.
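
To illustrate, here's how a page-hosted soft synth might use this sketch (hypothetical usage of the proposed factories, not a shipping API; playNote() is a placeholder for a Web Audio voice):

// inside an async context
const access = await navigator.requestMIDIAccess();

// A soft synth appears to other apps as an *output* device; the
// MIDIInput returned here is how the page receives what they send.
const synthIn = access.createVirtualOutput({
  name: 'JS Soft Synth',
  manufacturer: 'example.com',
  version: '1.0',
});

synthIn.onmidimessage = (e) => {
  const [status, note, velocity] = e.data;
  if ((status & 0xf0) === 0x90 && velocity > 0) {
    playNote(note, velocity); // render with Web Audio (placeholder)
  }
};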

One open question is whether we want to have these as factories on the MIDIAccess interface or as global constructors. On the one hand, for the basic uses of the ports it doesn't make any sense to require user consent: there's no fingerprinting potential, since even the id is not exposed, nor do the ports really do anything unless the user explicitly connects them somewhere, manually or via another piece of software (s)he's running.

However, this needs to be thought through. I've heard that some iOS apps use virtual MIDI ports to communicate with each other. If that is the case, we need to consider whether a web app pretending to be another native application should be considered a potential risk. In the worst case nightmare scenario an app would be transmitting Lua (or similar) code via MIDI which could result in a cross-application scripting attack, possibly leveraging all the privileges the user has granted that application. Another, much likelier case would be that a user's credentials would be transferred from one application to another, similar to OAuth except that the authentication would happen in another application instead of on the web, and an intercepting application could steal these credentials.

toyoshim commented 10 years ago

Here are my thoughts on virtual ports. Sorry, but this note focuses on another point.

If we start to support virtual ports, we should consider each device's latency more seriously. I assume the major use cases for virtual ports are software synthesizers, which incur much more latency than real devices. As a result, without latency information it would be very hard to use hardware and software simultaneously in one web application.

Virtual ports can also be used for controlling remote MIDI devices over the internet. In this use case too, latency is important and should be handled correctly.

So in the v2 spec, MIDIPort may want an additional attribute reporting the latency from event arrival to audio playback.

jussi-kalliokoski commented 10 years ago

I agree about the latency; we need to take that into account. Some use cases:

So basically, I think what we need is a way for normal ports to report their latency (if not available, report 0) and a way for virtual ports to set their latency, e.g.

partial interface MIDIInput {
  readonly attribute double latency;
};

partial interface MIDIOutput {
  readonly attribute double latency;
};

partial interface VirtualMIDIInput {
  attribute double latency;
};

partial interface VirtualMIDIOutput {
  attribute double latency;
};
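
For example, an application mixing a hardware synth with a slower virtual one could use this to line them up (a sketch; it assumes latency is in milliseconds to match DOMHighResTimeStamp, and hardwareOut/softSynthOut are ports obtained elsewhere):

function sendAligned(outputs, message, when) {
  // Delay every port to the slowest one so they all sound at when + maxLatency.
  const maxLatency = Math.max(...outputs.map(o => o.latency || 0));
  for (const out of outputs) {
    out.send(message, when + maxLatency - (out.latency || 0));
  }
}

// note-on for middle C on both ports, 100 ms from now
sendAligned([hardwareOut, softSynthOut], [0x90, 0x3c, 0x7f], performance.now() + 100);
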
marcoscaceres commented 10 years ago

I'd prefer these were constructors instead of a factory. Agree about the latency, but I'm not sure about using 0 to mean both zero latency and unknown... but then, making the attribute nullable might not be great either.

jussi-kalliokoski commented 10 years ago

I'm not sure about using 0 as meaning both 0 latency and unknown

I can't think of any case where the default behavior in the case of unknown latency would not be to assume zero latency, so for most cases there would be no extra work to account for the unknown latency situation, hence the suggestion. If we're able to come up with sane scenarios where you'd benefit from it being non-zero and unknown, I'll be happy to use another value.

I prefer these were constructors instead of a factory

I agree, but we'll have to carefully assess whether there's a security risk in that.

marcoscaceres commented 10 years ago

I agree, but we'll have to carefully assess whether there's a security risk in that.

I understand the security issues you mentioned above - but those appear to be orthogonal to having a constructor or factory (maybe I'm missing something).

jussi-kalliokoski commented 10 years ago

but those appear to be orthogonal to having a constructor or factory

It comes down to whether we need to ask for permission or not, and if we do, the factory method has the permission model already set up (to get the MIDIAccess instance), whereas for global constructors there isn't one. That is, unless the constructor takes a MIDIAccess instance as an argument (in which case I'd argue that it doesn't make sense to detach it from the MIDIAccess) or we throw if a valid MIDIAccess instance hasn't been created during the current session.
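
For concreteness, the constructor-takes-MIDIAccess variant might look like this from script (entirely hypothetical; VirtualMIDIOutput was never specified):

// Reuses the permission grant carried by an existing MIDIAccess;
// would throw if no valid access has been granted this session.
const access = await navigator.requestMIDIAccess();
const virtualOut = new VirtualMIDIOutput(access, { name: 'My Port' });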

cwilso commented 10 years ago

I'm not sure that the security/privacy model for virtual ports will be the same as for non-virtual ports, as I expect one would want to have virtual ports exposed to native software as well?

cwilso commented 9 years ago

I continue to hear demand for this from nearly every vendor I talk to.

notator commented 9 years ago

#126 is pretty close to this issue, but:

  1. I'm only asking for a virtual output device. :-)
  2. I'm asking for one that can be loaded with custom sounds.

The situation has become more urgent than it was last year because operating systems are no longer providing GM Synths.

Whether this gets into the Web MIDI API itself or not, it would be great to have a shim.

toyoshim commented 9 years ago

Just for playing back an SMF file with GM synths or your own custom synths, Web MIDI is not needed at all; Web Audio is enough for that purpose. Did you see the link I posted in #126?

The important thing here is that we need a standardized way to share software synths written in JavaScript.

cwilso commented 9 years ago

@notator The delta between virtual input and virtual output ports is nearly zero. If we do one, we should do the other.

On the other hand, as for "I'm asking for 'one' that can be loaded with custom sounds" - you're asking for an IMPLEMENTATION of a virtual device that enables custom sound loading; this isn't going to get baked into the Web MIDI API, since that would be like declaring one programmable synth the right one.

joeberkovitz commented 9 years ago

I think this is a great idea but feels wide open to very different definitions of what constitutes a "virtual device", and where its implementation might live (in a browser page? in the browser itself, persistently? in native apps?).

Also there's some overlap with existing "virtual" device mechanisms for MIDI, e.g. in iOS.

And how persistent or transient is such a device? Is it only there when the page that instantiates it happens to be open?

Not to mention all the security considerations.

In short virtual devices seem cool and I'm sure vendors are asking for them (hey I want them too), but I wonder if we all know exactly what we mean when we say it, and if we mean the same thing. It feels like more research and discussion is needed to nail the use cases and make this idea definite enough to implement.

Also, Web Audio has a very similar problem to solve in terms of inter-app communication. I would hope that Web Audio and Web MIDI could wind up adopting a similar approach to abstracting sources and destinations for both audio and MIDI data.

agoode commented 9 years ago

If we solved issue #99 and then used ServiceWorkers (https://github.com/slightlyoff/ServiceWorker/blob/master/explainer.md) then it wouldn't be hard to extend the spec to allow for virtual ports and have a reasonable lifecycle story.

(At least on Linux and Mac. Windows has no standard way to do virtual ports.)

toyoshim commented 9 years ago

SGTM on the Worker story. I think exposing virtual ports to applications outside the web browser could be optional on platforms whose underlying system supports such a feature.

notator commented 9 years ago

@cwilso Hi Chris, I think it would be a good idea to concentrate on virtual output devices first, then see what the answer implies for input. That's because I can see from toyoshim's link [1] that output looks pretty ripe already...

@toyoshim Hi! Thanks for the link. Great stuff! Is that your site?

There is, of course, quite a lot to discuss:

  1. Web MIDI applications send messages to real output devices in Uint8Arrays. As an application author, I don't want to have to convert each message to a string just so that the device can do string analysis on every message I send. It would save a lot of code and processor time if the device just read the Uint8Array. That could easily be implemented in WebMidiLink, and in the existing devices, by defining a new message type (maybe "webmidi"); see the sketch after this list.
  2. Unfortunately, Yuuta Imaya's sf2player.js fails to load. Yes, there's even an sf2player there that says it supports loading SoundFonts!
  3. WebMidiLink uses parallel processes created by opening new tabs in Chrome. That's not ideal: many of the devices look great, but I don't actually want/need to look at them. Also, I think I need to spawn subthreads in the device's thread, and that's not going to be easy. We need a better approach to creating threads.
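
For reference, WebMidiLink (Link Level 0) frames each MIDI message as a comma-separated hex string posted to the synth's window, so a bridge from Web MIDI's Uint8Array data is tiny (a sketch; synthWindow stands for whatever tab/iframe hosts the device):

function toWebMidiLink(bytes) {
  // e.g. note-on [0x90, 0x3c, 0x7f] -> "midi,90,3c,7f"
  return 'midi,' + Array.from(bytes, b => b.toString(16).padStart(2, '0')).join(',');
}

synthWindow.postMessage(toWebMidiLink(new Uint8Array([0x90, 0x3c, 0x7f])), '*');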

To help thinking about threading, here's a use case: Let's say I'm writing a prepared piano emulator. (That's actually not quite true, but close enough to the truth for present purposes.) There's a (real) midi keyboard attached to my application, and I have things set up so that each key is associated with a (rather noisy) sequence of midi messages waiting to be played. Sending a noteOn from the keyboard triggers its sequence. Sending the noteOff tells the sequence to stop. I have no control over how many keys are playing at once, or when they are depressed relative to each other. The whole point is that the keyboard is never blocked. The performer is free to just play.

I'm a beginner with web workers, but currently imagine setting up something like this before the performance begins: The input device is in the browser's user thread. The output device is in a SharedWorker (@agoode or ServiceWorker?). Let's call this the Marshall. The keys' sequences run in their own (ordinary) web workers, and access the output device by sending messages to the Marshall.

If things do indeed work like that, then I'm the one who has control over the life cycle of all the threads and devices.

But can SharedWorkers or ServiceWorkers access virtual output devices? I have been unable to find out.
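
Roughly, the setup I have in mind looks like this (file names are mine, and it assumes MIDI access is available in workers, which is what #99 asks for; outputDevice stands for the virtual output):

// main.js: wire each key's sequence worker to the Marshall
const marshall = new SharedWorker('marshall.js');
const sequence = new Worker('sequence.js');
const channel = new MessageChannel();
marshall.port.postMessage('add-source', [channel.port1]);
sequence.postMessage('marshall-port', [channel.port2]);

// sequence.js: post raw MIDI bytes; the keyboard is never blocked
onmessage = (e) => {
  const marshallPort = e.ports[0];
  marshallPort.postMessage(new Uint8Array([0x90, 0x3c, 0x7f])); // note-on
};

// marshall.js (SharedWorker): the single owner of the output device
onconnect = (e) => {
  e.ports[0].onmessage = (msg) => {
    const source = msg.ports[0]; // a sequence worker's channel
    if (source) source.onmessage = (m) => outputDevice.send(m.data);
  };
};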

[1] http://www.g200kg.com/en/docs/webmidilink/

toyoshim commented 9 years ago

@notator It is not my site, but my friend's - he is a famous music application developer in Japan. It is a good example of how many software synths can be developed by the community in a short period. Once he proposed the WebMidiLink idea, many people developed their own synths supporting it. This is why I don't insist on the OS providing synths: we can develop a great synth with Web Audio, and the web community has the power to create all kinds of synths.

notator commented 9 years ago

@toyoshim Ah yes, I forgot: +1 for "The important thing here is that we need a standardized way to share software synths written in JavaScript."

I think the interface that should be implemented by software synths can be more or less dictated by the powers-that-be here. This is much easier than defining an interface for hardware manufacturers. The interface should, I think, be modelled on the one for hardware synthesizers. For starters, there should be a function that returns the synth's metadata. See the properties in http://www.g200kg.com/en/docs/webmidilink/synthlist.html

And, if it's not clear enough already, my original request from #126 is no longer on the table. :-)

cwilso commented 9 years ago

@agoode Service Worker WILL NOT fix this. You wouldn't be able to keep an AudioContext alive inside a SW; SWs are designed to come alive when needed, but not be resident/running all the time. For a soft synth, you need to wake up and be alive (when routed, usually).

Let's keep this focused: THIS issue is about creating virtual MIDI ports, input and output, that can then be used by other programs on the system while this application is resident - i.e., creating a MIDI "pipe". The other issue (#124) is for managing and referring to virtual instruments - including, presumably, how you initialize them. (ServiceWorker might be involved there, but it's not going to solve it by itself.)

Whether the MIDI API is available to Workers (#99) is relevant in that you'll probably need it in the context of whatever the initialization context is for #124, but it's also useful in more narrow contexts (e.g. I want to run my sequencer in a non-UI thread).

cwilso commented 9 years ago

Forgot to say: @joeberkovitz: Note the above - I'm trying to keep each of these issues separated, because the bedrock they detail is independently useful. This issue, for example - you could utilize a Web MIDI synth that you loaded in your browser from Ableton (since it could show up as a MIDI device in OSX). The top-of-the-heap "virtual device" spec is #124, and yes, it's heavily related to the virtual device/routing issue in Web Audio; I'd expect at the very least you'd want to be able to bundle them.

bja-ableton commented 7 years ago

I'd like to lend my support to this proposal. It would be wonderfully powerful for integrating the browser with other desktop software, synths and so on - upgrading the browser to a "first class" player in desktop audio. I see there has been no further comment for around a year and a half - any progress to report?

cwilso commented 7 years ago

This was moved to v2. It's extremely challenging - in particular, because you would want a Service-Worker-like "resident background app" type of environment for these, and that just doesn't exist in the Web. (Something that could register, and then be woken up to run when an app - native or web - opens the port.) In addition, even just registering a virtual port requires a virtual device driver to be installed on Windows, which is a really deep security challenge.

(I'm not saying "I don't think this is important" - just "it's really hard, and has to be thought through from many angles".)

toyoshim commented 7 years ago

Yeah, I myself like this idea too, and occasionally consider what this API design should be and make some prototypes. But this should probably be discussed after a second browser implementation is ready.

neauoire commented 5 years ago

Any update on this?

toyoshim commented 5 years ago

At this moment, this is out of scope for v1.

F1LT3R commented 4 years ago

I would also like to add support for this feature. In my case I would like to be able to create Virtual MIDI Devices from inside the web browser that could trigger events in my DAW and my Video Composition software at the same time.

Currently, I am doing this with Node.js. I use Reason Suite to generate sound and MIDI CC data, then pipe those MIDI CC messages to Resolume Avenue to control visuals and live video mixes.

I have an example here: https://github.com/F1LT3R/remidi

For this I am using the Node.js EasyMidi package, which is cross-platform, but more of a pain to set up in Windows (I have not tried Linux yet). It would be game-changing to have a cross-platform way to create virtual devices without needing to compile code.

I think the Web Browser is a great candidate for this.

The Web MIDI API has the opportunity to provide a platform-consistent workflow for creating fully integrated MIDI applications that are applicable across multiple industries.

Use cases for Virtual MIDI Devices:

  1. Arts & Entertainment - In production & live settings.
  2. Broadcasting
    • Video Mixing Consoles
    • DMX Lighting
  3. Industrial
    • Robotics in the lab and the production line.

Many of these could have web user-interfaces that consumed external MIDI data over Web Sockets, passing those messages forward to connected devices.
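
A minimal version of that bridge might look like this (the endpoint and framing are invented for illustration; it assumes one complete MIDI message per binary frame, and an async context):

const ws = new WebSocket('wss://example.com/midi'); // hypothetical endpoint
ws.binaryType = 'arraybuffer';

const access = await navigator.requestMIDIAccess();
const out = [...access.outputs.values()][0]; // first available output

ws.onmessage = (e) => out.send(new Uint8Array(e.data));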

notator commented 4 years ago

Apropos @F1LT3R's https://github.com/WebAudio/web-midi-api/issues/45#issuecomment-614408730 above and https://github.com/WebAudio/web-midi-api/issues/124#issuecomment-155523069:

I'd like to mention that I (provisionally) completed my ResidentWAFSynthHost (GitHub, application) project a couple of weeks ago. The hosted ResidentWAFSynth synthesizer has also been added to the more general WebMIDISynthHost (GitHub, application) and a couple of other projects. Such WebMIDISynths use the Web Audio API to implement the Web MIDI API Output Device interface (i.e. they know what to do with MIDI messages when they arrive).

If software that implements the Web MIDI API Input and Output Device Interfaces can be programmed to do things other than use the Web Audio API to create sounds, then maybe browsers can be spared the effort of having to implement Virtual MIDI ports?

F1LT3R commented 4 years ago

@notator can you help me understand what you mean here?

If software that implements the Web MIDI API Input and Output Device Interfaces can be programmed to do things other than use the Web Audio API to create sounds, then maybe browsers can be spared the effort of having to implement Virtual MIDI ports?

It sounds like you're making a case against implementing Virtual MIDI Devices in the browser, but I don't yet understand how that follows from "doing things other than use the Web Audio API to create sounds". Why would the use of the Web Audio API have any bearing on what features the Web MIDI API included?

notator commented 4 years ago

@F1LT3R Maybe there's also something I'm not understanding, and there are snags down the line. We'll see. Here's what I was thinking:

My ResidentWAFSynthHost implements the Web MIDI API Input Device interface, so it understands MIDI messages coming from an external MIDI device. I'm using a hardware MIDI keyboard, but it could just as well be the MIDI data coming from your Reason Suite. The Host app sends the messages on to its "resident" synth (which is a Virtual MIDI Device that implements the Web MIDI Output Device interface), which processes the incoming info and uses the Web Audio API to produce sound. It would also be possible to process the incoming MIDI messages and send them to some other implementation of the Web MIDI API Output Device. Maybe you could make such an implementation that would pipe the messages to Resolume Avenue?
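
To sketch the shape of such an implementation (the class name is mine, the body is illustrative, and only note-on is handled):

class WebAudioSynthPort {
  constructor(audioContext) {
    this.ctx = audioContext;
    this.type = 'output'; // mimics MIDIOutput.type
  }
  // Same signature as MIDIOutput.send(); the synthesis is a placeholder.
  send(data) {
    const [status, note, velocity] = data;
    if ((status & 0xf0) === 0x90 && velocity > 0) {
      const osc = this.ctx.createOscillator();
      osc.frequency.value = 440 * 2 ** ((note - 69) / 12); // MIDI note -> Hz
      osc.connect(this.ctx.destination);
      osc.start();
      osc.stop(this.ctx.currentTime + 0.3);
    }
  }
}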

cwilso commented 4 years ago

@notator At the system level, this isn't particularly hard to do. You can set up a MIDI loopback device today, and use it as a virtual MIDI port.

The two "hard" things about this are the dynamic creation of virtual MIDI devices (this is challenging to do in Windows, IIRC, but maybe that's no longer true?), and more than anything, the security of this exposure might become concerning. (Because this would provide a cross-process communication mechanism, among other reasons.). I'm not saying that's a reason not to address it, but it does need some deep review.

jbflow commented 3 years ago

Use cases for Virtual MIDI Devices:

1. Arts & Entertainment - In production & live settings.

   * General DAW Control: Transport, Parameters, Instruments
   * Studio control: [motorized faders](https://www.behringer.com/Categories/Behringer/Computer-Audio/Desktop-Controllers/X-TOUCH-COMPACT/p/P0B3L#googtrans(en%7Cen)), etc.
   * [DMX Lighting](https://chamsyslighting.com/products/quickq-10)
   * Music Video/Visuals Generation
   * MIDI Routing ([MidiPipe](http://www.subtlesoft.square7.net/MidiPipe.html), [virtualMIDI](https://www.tobias-erichsen.de/software/virtualmidi.html))
   * MIDI Control Surface feedback: Eg: Ableton Live and [Novation Launch Pad](https://novationmusic.com/en/launch/launchpad-mini)

2. Broadcasting

   * Video Mixing Consoles
   * DMX Lighting

3. Industrial

   * [Robotics](https://github.com/mattsteinke/midi-robotics) in the lab and the production line.


I'd like to add support for this feature, for all of the reasons mentioned here. I have a project that would make good use of virtual ports. Currently I'm using IAC and/or LoopMIDI.

some1else commented 3 years ago

Similar use case to @jbflow

I'd like to send MIDI from https://qwerkey.xyz into any other MIDI compatible app. Currently the user has to create and enable virtual MIDI ports in IAC on the Mac, or use third party software on Windows / Linux, that comes with opaque binaries or complicated setup.

Creating virtual MIDI ports in the browser would decrease the friction in connecting browser-based MIDI apps, elevating the experience and flow for the user. Very hopeful we'll get a spec like this landed in the standard some day.

cwilso commented 3 years ago

So I wanted to capture some thinking on this - to be frank, I expect this request is going to sit for a long time. The more deeply I've thought about it, the more concerns I have about building a comprehensive, coherent strategy for security and privacy mitigation here. In the short term, there is a workaround (the IAC virtual port on macOS, or a loopback driver on Windows). I'm happy to use this issue as a place to incubate security mitigations or the like, or have someone take it up, but this will be "hard" to get right.

some1else commented 3 years ago

A prompt that allows the domain to create virtual MIDI ports would be entirely acceptable, somewhat like the request for geolocation access.

cwilso commented 3 years ago

That would certainly be a component of the solution, but just like with geolocation (this is a hot topic for Geolocation: https://github.com/w3c/geolocation-api/issues/47), the lifetime of such a virtual port, and the user's understanding that such a port is around, would be a concern. As a user, you might expect that once you allowed a port it would be persistent, but it's a background service that might have significant, concerning side effects. (E.g.: you go to a well-known MIDI site; it asks to create a virtual port. You say yes. Months later, other sites ask for access to that port, because it turns out they can use it to communicate and circumvent cross-site restrictions.)


some1else commented 3 years ago

Thanks for sharing that scenario.

It does seem like two malicious apps working in tandem could use the feature to track a user for advertising/profiling purposes. It's currently possible to track users with third-party cookies and CORS requests, but using MIDI would make that activity opaque and harder to prevent.

I see two mitigation approaches, inspired by the WebRTC generateCertificate spec.

It seems that, at the expense of some complexity, Virtual MIDI could be made as secure as cookies and fetch. Is there a comprehensive list of security considerations regarding this spec that we could add to, and attempt to potentially resolve within the standard?

Apologies if I'm talking rubbish. I would love to get involved and deepen my knowledge of the issues somehow, instead of just badgering here, if that's an option. Is there a more specific mailing list than public-audio where this is being discussed?

Thank you for your continued involvement in bringing this feature to life.

caseydwayne commented 2 years ago

Adding my support for virtual MIDI ports. Using external software to create them is sufficient, but an embedded solution would be much more desirable.

xXGoziXx commented 2 years ago

I'm currently working on a project that should be able to interact with SoundTrap and other MIDI-enabled sites. Adding support for virtual MIDI would make this so much easier, especially on platforms like Android, where setting up virtual MIDI ports is next to impossible.

tobiasBora commented 2 years ago

I'm actually not sure I understand: isn't the current implementation already suffering from the same security issues as the ones mentioned in this thread regarding virtual ports? I mean, two malicious websites can certainly already communicate using the default MIDI Through ports that are created (at least that is the case on Linux + PipeWire); I don't see how virtual ports change this.

Regarding privacy, I was also thinking that it may be more private not to list all ports in JavaScript directly, but rather to provide a way to select the port through the browser itself - a bit like what is done for screen sharing, where we can choose a specific window to share without giving the website full access to the desktop. This menu could even let the user connect to an already existing port or create a brand-new, unconnected port. That way, the only way to make two websites communicate with each other would be to ask the user to explicitly allow both of them to communicate via a common MIDI port.

Of course, this may be a bit less nice from the user's point of view, since a new menu will pop up when they want to add new MIDI ports (at least the first time they allow the app to communicate with a port). This could be avoided by slightly changing the way MIDI ports are chosen: instead of choosing a single MIDI port, the user could be presented with a list of ports and check the ones the app is allowed to use... but I'm not sure this is really necessary, as it may complicate the settings for only a mild improvement.
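
In API terms, such a picker could be as small as this (entirely hypothetical, modelled on getDisplayMedia's user-mediated choice; no such method exists today):

// The browser shows its own chooser; the page never sees the full port list.
const port = await navigator.requestMIDIPort({ type: 'output' }); // invented API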

dderiso commented 1 year ago

To mitigate security issues, will the committee consider creating a user request dialog similar to "allow access to {microphone, camera, USB, bluetooth}" ? e.g. "Site is requesting access to MIDI. Allow creation of MIDI port?"

rianhunter commented 6 months ago

Given how long this GH issue has existed, what is blocking a feature like this? This feature is necessary for enabling critical functionality in many potential applications. @toyoshim is it possible we can implement an experimental version of this API in Chromium? If you provide me with the high level constraints and instructions I'm happy to write the patch.

mjwilson-google commented 6 months ago

Hi @rianhunter, there are a few things going on here:

cwilso commented 6 months ago

I do want to caution that the primary reason I didn't push harder on a design for this is because (as I mentioned above) it has some significant security concerns, as it enables a cross-domain communication channel outside the normal constraints of such things. It wouldn't necessarily be impossible to safely construct such a feature, but it would definitely not be trivial to do the security design.

rianhunter commented 6 months ago

Thanks for the info @mjwilson-google. If you'll spare me a few moments, I just have some questions.

1) I had assumed that WebMIDI was already a standard of some sort by some body. From your response and the standard to which this repo is linked, it now seems to me there is no completed standard around this API. It has been many years, so I must be missing something w.r.t. how these types of specs are approved. I guess my question now is: why hasn't v1 WebMIDI been approved? What is blocking that?

2) Given that there was no spec, what was the process for getting the existing API/implementations into browsers? I'm assuming you work at Google. Perhaps you understand the Chromium process better.

3) What would be the ideal process for getting this into the v1 or v2 spec? Should I focus on a working prototype in Chromium, ironing out potential implementation issues there, then going about proposing a spec?

4) @cwilso I don't fully understand the security issues blocking such a feature. Isn't the user required to explicitly opt in to WebMIDI on a per-domain basis? Why is that not sufficient? It seems like a feature like this (and sufficiently powerful web apps that integrate with the desktop in general) will always require a certain level of trust in the domain by the user. If opt-ins are not considered secure or trustworthy enough, then I don't fully understand the subtleties of the logic behind their inclusion in the first place. FWIW, on Linux, all MIDI implementations already provide a loopback interface by default that can already be used for inter-domain communication - provided, of course, that the user has already explicitly allowed both domains to use MIDI capabilities.

mjwilson-google commented 6 months ago

  1. Web MIDI is a standard in the "Working Draft" stage of the "Recommendation Track" right now; the process is described here: https://www.w3.org/2023/Process-20231103/#w3c-recommendation-track. One reason that it stayed in this stage for so long is that Chromium had the only implementation of Web MIDI; now that Firefox has also implemented Web MIDI, we have the go-ahead to clean things up and make this a full "Recommendation".

  2. I do work at Google. The general process is that first the API is specified in a web standard, and then the browsers will independently implement the API. We sometimes "work ahead" in Chromium, but this is discouraged and usually that work won't be released to general users until standardized.

  3. The ideal process is we discuss on GitHub, make a perfect spec revision, take it to an Audio Working Group meeting and get unanimous approval, then we merge the spec change and the browsers implement it exactly as specified. In reality it usually isn't that simple. The advantage of prototyping first is that we could shake out some issues that wouldn't be clear from just talking about it here. The disadvantage is we could spend a lot of time working on something that never actually gets merged or released to users, or that has to be completely rewritten depending on the final form of the spec. With the caveat that I really haven't spent enough time thinking about this, my intuition is that we should probably spend more time discussing here on GitHub and at least get Mozilla's opinion (@padenot) before working on an implementation. But if you have a clear idea and are motivated to prototype now I will try to support you as best as I can.

rianhunter commented 6 months ago

Thanks again for the info @mjwilson-google. I just gave the thread a more thorough once-over and there are three major concerns preventing a practical implementation:

1) Security: potentially allows a new channel through which web sites on different domains can communicate.

2) No native "virtual port" functionality on Windows

3) Latency, and whether this functionality should be coupled to a ServiceWorker-based API.


Addressing 1): One option is to not expose virtual ports created by the browser to other domains by default. This could be gated on the pre-existing MIDIOptions.software member when calling requestMIDIAccess. Or a new member could be introduced, such as MIDIOptions.domains: an allow list of domains with which the current domain may interact using virtual MIDI ports, with "*" meaning all domains. I like the idea of simply not allowing inter-domain communication by default and later potentially adding a MIDIOptions.domains feature if there is sufficient demand.
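
As a sketch (MIDIOptions.domains is the proposed, non-existent member; sysex and software are the ones that exist today):

const access = await navigator.requestMIDIAccess({
  software: true,
  // proposed allow list: virtual ports are visible only to these domains
  domains: ['daw.example', 'visuals.example'],
});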

Addressing 2): This is an annoying snag I wasn't aware of. This implementation detail may not matter for specifying standard behavior, but it does affect whether specifying this is worth the effort. Without any numbers, I assume the vast majority of WebMIDI users are Windows users (at least in my experience this is true). Distributing a kernel driver with a browser is potentially significant additional complexity, considering automatic updates and code signing. It may even be the case that future versions of Windows drastically limit which software vendors are able to load kernel modules at all, making this not future-proof. This is something that needs discussion with the Firefox and Chromium release teams, or whoever is responsible for considering these types of issues. It may be the case that there is existing infrastructure for distributing kernel modules in those browser projects; if so, that reduces the added complexity of this feature quite a bit.

Addressing 3): In my opinion, the API sketch in https://github.com/WebAudio/web-midi-api/issues/45#issuecomment-22088781 is more than sufficient for an initial spec and for the vast majority of applications. It is something I could build on now, and it is similar to the existing API. I think decreasing latency and a ServiceWorker-based API should be treated mostly orthogonally, and should also cover the existing mechanism for accessing MIDIPorts.

cwilso commented 6 months ago

@rianhunter on your #1: it's not just exposing virtual ports created by the browser to other domains - it's the fact that this feature would (intentionally!) open up communications between arbitrary native apps as well as web domains, and it is really hard to ensure that such interactions are safe, given the unknown interactions possible. (For example: early in the time of Web MIDI shipping on Chrome, we enabled access to the built-in Windows software synthesizer; it turned out that you could cause it to crash by sending enough notes through it too quickly!) There are a LOT of potential interactions.

On #2, IIRC it wasn't that it wasn't possible to create a virtual port on Windows - it's that the old Windows APIs wouldn't allow you to add/remove MIDI devices without literally rebooting (i.e. kernel restart). I don't believe that's true anymore with the newer Windows MIDI APIs, though - but it's been well over a decade since I wrote any Windows runtime midi code.

On 3, it would be really really important to iterate all the use cases before doing a design; there are a lot of things modern browsers do to background pages, for example (like garbage-collecting them, or freezing their responsiveness) that would probably destroy the usefulness accidentally.

As former editor: as @mjwilson-google implied, I pretty much stalled the spec waiting for another major browser vendor to do an implementation, which is why Web MIDI was in draft for over a decade. Mozilla finally did, so I'm excited to see Michael finish it off.

rianhunter commented 6 months ago

it's the fact that this feature would (intentionally!) open up communications between arbitrary native apps as well as web domains, and it is really hard to ensure that such interactions are safe, given the unknown interactions possible.

I do appreciate this concern, but I think it might be a bit overstated in this specific case. Virtual ports expose a sort of "opt-in" MIDI behavior: other applications reserve the right to connect or not connect to your virtual port at their discretion, which is usually at the discretion of the user. The existing WebMIDI API, where all existing ports are exposed to the running JS application without their discretion, seems more dangerous in this vein. The new harm that could be done in this case is from external native applications to the JS application; I don't think this feature exposes new potential harm from JS applications to external native applications.

I don't believe that's true anymore with the newer Windows MIDI APIs, though

If that's the case then that's great news. I think conditionally enabling virtual ports on newer versions of Windows would be acceptable.

On 3, it would be really really important to iterate all the use cases before doing a design; there are a lot of things modern browsers do to background pages, for example (like garbage-collecting them, or freezing their responsiveness) that would probably destroy the usefulness accidentally.

That's fair but I just wanted to point out that these concerns more or less equally apply to the existing API when subscribing to MIDIInput ports. WebMIDI as it exists today is sufficient for many useful applications despite potential latency issues in practice. I wouldn't want to stall implementing a feature that could be useful today in a "perfect is the enemy of the good" fashion.

As former editor: as @mjwilson-google implied, I pretty much stalled the spec waiting for another major browser vendor to do an implementation, which is why Web MIDI was in draft for over a decade. Mozilla finally did, so I'm excited to see Michael finish it off.

Thanks for the context, that makes sense and it clears up that confusion for me.

rianhunter commented 6 months ago

Just wanted to add another note re: Security and unforeseen app interactions. Since creating a virtual port offers other native applications the ability to "opt-in" to sending or receiving data from JavaScript WebMIDI applications, this could potentially be used to mimic hardware devices to fool native applications into sending JS MIDI applications data that they should not receive, or vice versa.

I think an easy way to avoid something like this is to force an identifier onto ports created by WebMIDI applications. E.g. if the application at www.foo.com creates a virtual port with the name "Test Port", a forced prefix would be added so that it appears to other applications as "WebMIDI (foo.com): Test Port". There may be other metadata available that could also be used to disambiguate between hardware devices and WebMIDI applications, but I just wanted to demonstrate the basic idea.
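
For example (reusing the factory sketch from earlier in this thread; the exact prefix format is just an illustration):

// Page at https://www.foo.com:
const port = access.createVirtualInput({ name: 'Test Port' });
// Other applications would then enumerate the device as:
//   "WebMIDI (foo.com): Test Port"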


@mjwilson-google For my next step I will file a bug in Chromium and start prototype / implementation discussion there. I would also like for this to go hand-in-hand with a living spec. How should we move forward on that end? Should we continue to use this issue to discuss spec?

mjwilson-google commented 6 months ago

@mjwilson-google For my next step I will file a bug in Chromium and start prototype / implementation discussion there. I would also like for this to go hand-in-hand with a living spec. How should we move forward on that end? Should we continue to use this issue to discuss spec?

Yes, spec discussions belong here. For reference, #185 is about the overall security section of the spec and has some information and links that describe why Web MIDI security isn't always as straightforward as it seems.