w3c / mediasession

Media Session API
https://w3c.github.io/mediasession/
Other
129 stars 30 forks source link

Add a way of detecting which actions a UA supports #228

Open beccahughes opened 5 years ago

beccahughes commented 5 years ago

From a discussion with @beaufortfrancois and @mounirlamouri. We should have a way of detecting what action a UA supports e.g.

if (navigator.mediaSession.supports('pause'))
  navigator.mediaSession.setActionHandler('pause', _ => { /* ... */ });

or

if ('pause' in MediaSessionAction)
  navigator.mediaSession.setActionHandler('pause', _ => { /* ... */ });
if ('stop' in MediaSessionAction)
  navigator.mediaSession.setActionHandler('stop', _ => { /* ... */ });
// TODO: Set other action handlers...
jernoble commented 5 years ago

I think the discovery is flowing the wrong direction. The UA should be discovering what the page supports, and calling the appropriate handler, and not vice versa.

beaufortfrancois commented 5 years ago

My point is that we need a proper way of detecting which actions a UA supports to avoid code below.


if (!('mediaSession' in navigator)) {
  // Basic media session is supported but some actions might not be supported.
  // You'll have to try/catch some of them specifically for Chrome. Sorry ;(

  // Pause action is ok... I think...
  navigator.mediaSession.setActionHandler('pause', function() { /* ... */ });

  // I'm quite sure this one may fail before Chrome 77 so let's use try/catch.
  try {
    navigator.mediaSession.setActionHandler('stop', function() { /* ... */ });
  } catch(error) {
    // Sorry ;(
  }

  // TODO: Set other action handlers.
  ...
}
jernoble commented 5 years ago

I don’t understand; where in the spec does it specify that setActionHandler() should throw when it receives an unsupported action? Is it just due to that it takes an enum and that WebIDL requires a throw if the parameter isn’t in the enum? If so, isn’t there a defined way to do this already? Like:

if (‘stop’ in MediaSectionAction) {
    navigator.mediaSession.setActionHandler('stop', function() { /* ... */ });
}
beaufortfrancois commented 5 years ago

Indeed. navigator.mediaSession.setActionHandler('foo', _ => {}) throws a "TypeError" due to WebIDL bindings.

Failed to execute 'setActionHandler' on 'MediaSession': The provided value 'foo' is not a valid enum value of type MediaSessionAction.

And MediaSectionAction enum is not exposed to web developers as well sadly. See https://github.com/web-platform-tests/wpt/blob/master/mediasession/idlharness.window.js

jernoble commented 5 years ago

How do all other specs that take enums as arguments deal with this? It seems unnecessary and duplicative just to have to add a supports() method in every API that uses an enum. (Also, we couldn’t spec supports(MediaSessionAction action) as that would throw on supports('pause') too!)

jernoble commented 5 years ago

If we can’t find a way to make action in MediaSessionAction work, one workaround would be to add a method with no side-effects which takes a MediaSessionAction, like

partial interface MediaSession {
    bool hasActionHandler(MediaSessionAction);
}

Which could be used in a simple function:

bool supportsAction(action){
    try { navigator.mediaSession.hasAction(action); }
    catch(e) { return false; }
    return true;
}

Edit: people have told me this example is confusing. See my comment https://github.com/w3c/mediasession/issues/228#issuecomment-523581885 for clarification.

chrisn commented 5 years ago

We might need to think about the terminology. The spec currently defines a "supported media session action" as an action with an associated handler function, rather than an action that a UA doesn't support.

So would mediaSession.supports(action) be a different query to mediaSession.hasActionHandler(action)? And which of these would action in MediaSessionAction potentially correspond to?

jernoble commented 5 years ago

Yes that’s a great question. Is the end goal that the “stop” action is “supported” on all instances of (e.g.) Safari >= 15.0, or is the end goal that the ”stop” action is “supported” when the user has a keyboard with a “Stop” key.

Personally, I’m much more supportive of the former than the latter.

But the terminology we come up with will depend on the answer to that question. If we collectively agree on the latter answer, then supports() is a natural fit. However if we decide the former answer is correct, then supports() suffers from a lexical ambiguity that could lead clients to assume we meant the latter.

jernoble commented 5 years ago

FWIW, this is still being argued about in the WebIDL spec: https://github.com/heycam/webidl/issues/107

jernoble commented 5 years ago

One “solution” suggested by @tabatkins in the OP of that issue, when applied in this context, would look like:

”stop” in navigator.mediaSession.MediaSessionAction

This would be an alternative to just exposing the enum directly in global scope.

Edit: “solution” in quotes only because it’s not a WebIDL solution, not that it wouldn’t solve the issue.

mounirlamouri commented 5 years ago

Is the end goal that the “stop” action is “supported” on all instances of (e.g.) Safari >= 15.0, or is the end goal that the ”stop” action is “supported” when the user has a keyboard with a “Stop” key.

In my opinion, we should do the former, not the latter. The issue with the latter is that in some cases, we wouldn't know: for example, a keyboard may or may not have a next track button. In some rare cases, we know that an action will never be executed because of some specificity with an OS version but I don't think that's worth putting in spec.

I think originally the issue raised because @beaufortfrancois was thinking of how to write backward compatible code that uses the new stop action. Is that correct Francois?

jernoble commented 5 years ago

Someone pointed out off-list that my bool hasActionHandler(MediaSessionAction) example is confusing. In my straw man example, if we added a hasActionHandler() function, it would return true if someone had previously called addActionHandler() with a valid action and handler, return false if no handlers had been added for that action, or if all handlers had been removed, or would throw an exception if the action was not in the MediaSessionAction enum. But since there are no side-effects to calling hasActionHandler(), it could be used to detect whether a given action string was a valid MediaSessionAction.

That said, maybe adding a readonly attribute sequence<MediaSessionAction> MediaSessionAction to MediaSession is a much more direct solution to the problem at hand.

chrisn commented 5 years ago

I think originally the issue raised because @beaufortfrancois was thinking of how to write backward compatible code that uses the new stop action.

I certainly have this use case, experimenting with the Chrome for Android implementation which only partially implements the spec so far (e.g., adding a seekto handler currently throws in the Chrome version I'm using). I guess it would also be useful as a mechanism for future extensibility.

beaufortfrancois commented 3 years ago

Following up on https://groups.google.com/a/chromium.org/g/blink-dev/c/sXhlwO_R9To/m/c1ZG4qx8AgAJ discussion to figure out a safer way to enable feature detection for actions, I'd like to propose adding a new actions readonly attribute to the MediaSession interface that would contain an array of MediaSessionAction supported by the UA.

partial interface MediaSession {
  readonly attribute FrozenArray<MediaSessionAction> actions;
};
console.log(navigator.mediaSession.actions);
// ["play", "pause", "previoustrack", "nexttrack", "seekbackward", "seekforward",
// "skipad", "stop", "seekto", "togglemicrophone", "togglecamera", "hangup"]
if (navigator.mediaSession.actions.includes('hangup')) {
  navigator.mediaSession.setActionHandler('hangup', () => { ... }
}

Note that a sequence of enums is not supported in WebIDL, hence the use of a FrozenArray.

@jernoble @steimelchrome @chrisn WDYT?

chrisn commented 3 years ago

This seems reasonable to me, although I'd want to know if there's any change planned in WebIDL or other pattern specs should use for this kind of feature.

We have a WG meeting coming up next week, please add the agenda label if you'd like time in the meeting to discuss.

beaufortfrancois commented 3 years ago

WebIDL changes won't be needed. https://w3ctag.github.io/design-principles/#feature-detect is probably the closed we have from a spec design principle point of view.

I'll be OOO next week but I'd love if it could be discussed at the WG meeting.

beaufortfrancois commented 3 years ago

@chrisn Please add the agenda label, I don't have access.

tidoust commented 3 years ago

@chrisn Please add the agenda label, I don't have access.

Just a side note that you should now be able to manage issue labels on this repository.

domenic commented 3 years ago

I'm not sure I quite understand what problem this is solving. Is the main goal to replace try/catch with if?

domenic commented 3 years ago

To expand a bit more:

chrisn commented 3 years ago

Discussion from April Media WG meeting: https://www.w3.org/2021/04/13-mediawg-minutes.html#t04. The context was adding new action handlers (https://github.com/w3c/mediasession/issues/264).

jan-ivar commented 3 years ago

I agree with @domenic's option 1 here. This is why enums throw.

My point is that we need a proper way of detecting which actions a UA supports to avoid code below.

I think we have that. If a site doesn't care whether actions are supported or not, they can use the following pattern:

function setActionHandlerIfSupported(action, func) {
  try {
    navigator.mediaSession.setActionHandler(action, func);
  } catch (e) {
    if (e.name != "TypeError") throw e;
  }
}

// Basic media session is supported but some actions might not be supported.
setActionHandlerIfSupported('pause', () => { /* ... */ });
setActionHandlerIfSupported('stop', () => { /* ... */ });

FWIW, this is still being argued about in the WebIDL spec: heycam/webidl#107

I believe that issue is instead about cases where no other means of feature detection exists (dictionary members). That's not the case here. We have an API that throws, i.e. we already have feature detection.

jan-ivar commented 3 years ago

I think it's rare to need pure feature detection, so while I think it needs to be possible, I don't think it needs to be ergonomic.

one workaround would be to add a method with no side-effects

While contemplating how to write that using setActionHandler (by removing its potential side-effects), the lack of a getActionHandler function became apparent (a way to set, but not get?) — If we had that, then it would also be that method:

function supportsAction(action) {
  try {
    navigator.mediaSession.getActionHandler(action);
    return true;
  } catch (e) {
    if (e.name != "TypeError") throw e;
    return false;
  }
}
jan-ivar commented 3 years ago

It's also worth noting that navigator.mediaSession.setActionHandler(action, null) has no side-effects initially. This becomes a problem the same time the fact that setActionHandler is last-come-last-serve becomes one.

yoavweiss commented 3 years ago

My concern with using an enum and throwing as a feature detection method, is that developers MUST use try and catch for any new actions we add. Otherwise, they are running a risk of completely breaking their app in non-supporting browsers, unless they explicitly test in those browsers.

A supports() API coupled with silent failure would alleviate that concern.

jan-ivar commented 3 years ago

That's an ergonomics argument, and those have been rejected in the past for all feature detection.

beaufortfrancois commented 3 years ago

@jan-ivar Can you share links to similar issues where this was rejected?

yoavweiss commented 3 years ago

That's an ergonomics argument, and those have been rejected in the past for all feature detection.

The argument is not that it's hard for devs to perform such feature detection, but that it's a potential foot gun that can cause interop issues when introducing new values.

domenic commented 3 years ago

@yoavweiss can you share more about why you believe if with a new method is better than try with an existing method?

(I agree silent failure would change the story; that's the gist of https://github.com/w3c/mediasession/issues/228#issuecomment-876549150 . But at that point a new supports() method seems unneeded and potentially undesirable.)

yoavweiss commented 3 years ago

I think we need both a silent failure (to avoid the footgun) and supports() as an alternative feature detection mechanism. I agree that if instead of a try on its own will not change the risk of introducing new actions.

jan-ivar commented 3 years ago

@jan-ivar Can you share links to similar issues where this was rejected?

@beaufortfrancois I wasn't immediately able to locate some of the old discussions, but the web would be full of supports APIs by now, yet it is not. MDN has a section on feature detection, and there are hardly any custom APIs there.

Otherwise, they are running a risk of completely breaking their app in non-supporting browsers, unless they explicitly test in those browsers

@yoavweiss silent failure as a desirable error handling strategy assumes that doing nothing about errors is an acceptable strategy. I don't think this is always correct. As a web developer, you haven't necessarily "handled" an error by merely ignoring it. It depends on what happens next.

Also, web developers who don't test other browsers deserve to have their apps break "completely". This is often the fastest way to fix them. We need to weigh: what's more likely to help web developers fix stuff quickly to work in the most places?

  1. A report with an error message saying "X is not supported, on line 1234"?
  2. Some report of other misbehavior, or no report at all?

There may be reasons to choose to have setActionHandler take a DOMString instead of an enum — and presumably adding prose to bail silently — but it would leave developers with no way to know whether the handler got registered or not, which seems poor to me.

beaufortfrancois commented 3 years ago

@jan-ivar Can you share links to similar issues where this was rejected?

@beaufortfrancois I wasn't immediately able to locate some of the old discussions, but the web would be full of supports APIs by now, yet it is not. MDN has a section on feature detection, and there are hardly any custom APIs there.

Here are some Web APIs that provide those kind of "supports" patterns:

jan-ivar commented 3 years ago

Thanks, but none of those are for enums (except one redundant one), and there are over 200 enums in the platform.

MediaDevices.getSupportedConstraints()

Again, this one's in the category of "no other means of feature detection exists (dictionary members)". I introduced that API, which was a mistake (see https://github.com/heycam/webidl/issues/107#issuecomment-304334556). But it had complicated reasons for existing: It A) applied to dictionary members which are harder to avoid silent failure of, and B) the getUserMedia API explicitly supports both ignore semantics (ideal constraints) and reject semantics (exact constraints) of the same defined members at the same time, and it's permission related. This made try/catch detection difficult.

In contrast, what we seem to be discussing here is a simple method that throws on invalid input.

MediaRecorder.isTypeSupported() AudioDecoder.isConfigSupported() and VideoDecoder.isConfigSupported()

These aren't purely about feature detection of an API, since an open-ended number of mimetypes and codecs may exist, and results may vary by platform ("User agents are not required to support any particular codec type or configuration.")

XRSystem: isSessionSupported()

Permission related.

BarcodeDetector.getSupportedFormats()

This API does more than feature-detection, where it's a bit redundant, since the constructor already throws:

new BarcodeDetector({formats: ["foo"]});

VM93:1 Uncaught TypeError: Failed to construct 'BarcodeDetector': The provided value 'foo' is not a valid enum value of type BarcodeFormat. at :1:1

jan-ivar commented 3 years ago

Looking at setActionHandler more closely, I see success doesn't tell us anything about whether the handler can ever be called.

I now understand @jernoble's https://github.com/w3c/mediasession/issues/228#issuecomment-522773929: that this API is for pages to declare what actions they support, irrespective of whether the user agent can deliver on any of these actions (which seems irrelevant to most apps).

Also, the spec seems to suggest a user agent MUST implement all these enum values at once (even if it never calls a handler. Why would a user agent not do this?)

With that understanding, I agree silent failure seems appropriate.

But if we prefer silent failure, trivial feature detection, and last-come-only-served behavior, did anyone consider:

navigator.mediaSession.onpause = () => { ... };
navigator.mediaSession.onstop = () => { ... };
navigator.mediaSession.onseekbackward = () => { ... };
navigator.mediaSession.onseekforward = () => { ... };
navigator.mediaSession.onprevioustrack = () => { ... };
navigator.mediaSession.onnexttrack = () => { ... };
navigator.mediaSession.onskipad = () => { ... };
navigator.mediaSession.onstop = () => { ... };
navigator.mediaSession.onseekto = () => { ... };
navigator.mediaSession.ontogglemicrophone = () => { ... };
navigator.mediaSession.ontogglecamera = () => { ... };
navigator.mediaSession.onhangup = () => { ... };

? This seems like the most idiomatic semantics for these 3 behaviors. And it won't throw on Chrome 77.

beaufortfrancois commented 3 years ago

XRSystem: isSessionSupported()

Permission related.

That's not only about permissions. It checks for system hardware abilities such as AR and VR.

BarcodeDetector.getSupportedFormats()

This API does more than feature-detection, where it's a bit redundant, since the constructor already throws:

new BarcodeDetector({formats: ["foo"]});

VM93:1 Uncaught TypeError: Failed to construct 'BarcodeDetector': The provided value 'foo' is not a valid enum value of type BarcodeFormat. at :1:1

BarcodeDetector.getSupportedFormats() returns asynchronously a list of supported BarcodeFormats by the platform, not the full list of enums. See https://wicg.github.io/shape-detection-api/#dom-barcodedetector-getsupportedformats and https://web.dev/shape-detection/#featuredetection For instance, new BarcodeDetector({formats: ["qr_code"]}); would not throw but "qr_code" may not be available on some platforms. That's why this API was added.

beaufortfrancois commented 3 years ago

But if we prefer silent failure, trivial feature detection, and last-come-only-served behavior, did anyone consider:

navigator.mediaSession.onpause = () => { ... };
navigator.mediaSession.onstop = () => { ... };
navigator.mediaSession.onseekbackward = () => { ... };
navigator.mediaSession.onseekforward = () => { ... };
navigator.mediaSession.onprevioustrack = () => { ... };
navigator.mediaSession.onnexttrack = () => { ... };
navigator.mediaSession.onskipad = () => { ... };
navigator.mediaSession.onstop = () => { ... };
navigator.mediaSession.onseekto = () => { ... };
navigator.mediaSession.ontogglemicrophone = () => { ... };
navigator.mediaSession.ontogglecamera = () => { ... };
navigator.mediaSession.onhangup = () => { ... };

? This seems like the most idiomatic semantics for these 3 behaviors. And it won't throw on Chrome 77.

See https://github.com/w3c/mediasession/issues/148 to understand why we didn't go with this pattern.

beaufortfrancois commented 3 years ago

Having navigator.mediasession.actions to return a list of actions supported by the browser in this platform would also help websites to know whether an action is supported.

For instance, even though skipad is a valid enum, navigator.mediaSession.setActionHandler('skipad', _ => { /* ... */ }); doesn't throw but doesn't do anything on Safari.

jan-ivar commented 3 years ago

See #148 to understand why we didn't go with this pattern.

I don't mean EventHandlers. Just attributes. E.g.

callback ActionHandler = any ();

[Exposed=Window]
interface MediaSession {
  attribute ActionHandler? onpause;
  attribute ActionHandler? onstop;
  ...
}

would also help websites to know whether an action is supported.

if ('onpause' in navigator.mediaSession) { /* supported */ }
chrisn commented 3 years ago

Notes from a brief discussion at the 10 Aug 2021 Media WG meeting are here.

jernoble commented 3 years ago

Having navigator.mediasession.actions to return a list of actions supported by the browser in this platform would also help websites to know whether an action is supported.

For instance, even though skipad is a valid enum, navigator.mediaSession.setActionHandler('skipad', _ => { /* ... */ }); doesn't throw but doesn't do anything on Safari.

This is the kind of feature creep I would like to avoid. I most definitely don't want to add a .actions API that queries for whether the UA will ever call that action. Hypothetically speaking, "skipad" might never be called for a user with a USB keyboard, but might be called for a user with a Touch Bar. I wouldn't want the .actions to become a stalking horse for detecting what kind of keyboard a user is using.

The current use case for this feature is to avoid throwing an exception when setting a handler, not for detecting the circumstances under which a UA will call the handler.

beaufortfrancois commented 3 years ago

The current use case for this feature is to avoid throwing an exception when setting a handler, not for detecting the circumstances under which a UA will call the handler.

Let's focus on that then.

Does https://github.com/w3c/mediasession/issues/228#issuecomment-871985154 proposal looks good to you? I'm happy to send a spec PR.

domenic commented 3 years ago

I still don't understand why you'd add anything here. Did anyone have a rebuttal to my https://github.com/w3c/mediasession/issues/228#issuecomment-876549150 ?

yoavweiss commented 3 years ago

I thought https://github.com/w3c/mediasession/issues/228#issuecomment-886455386 covered it:

domenic commented 3 years ago

No, sorry, it did not cover that.

Using an enum that throws breaks less-tested browsers in case developers don't actually use try/catch when using the API with new additions they don't yet support.

The same is true for a new feature detection API, when developers don't use if. There's an explicit isomorphism here between if and try which you can use to show that the same argument applies to both.

An explicit feature detection mechanism coupled with silent failures for non-supporting browsers doesn't suffer from the same issue, and is therefore more resilient to future additions to the API.

Yes, but nobody (except you maybe?) is proposing a silent failure.

To be clear, my objection is specifically to a new API as long as failures continue being loud. If failures are silent, then a new feature detection API makes some sense (although IMO it's not clear what the use case would be). But there's no WG agreement for silent failure, and every time that's been proposed so far it's been rejected, from what I can see.

chrisn commented 3 years ago

I thought we were converging on silent failure, i.e., non-throwing setActionHandler(), based on @jan-ivar's https://github.com/w3c/mediasession/issues/228#issuecomment-887788017:

With that understanding, I agree silent failure seems appropriate.

We now have two proposals: from @beaufortfrancois in https://github.com/w3c/mediasession/issues/228#issuecomment-871985154:

and from @jan-ivar in https://github.com/w3c/mediasession/issues/228#issuecomment-889505257 (which would be a bigger change from what has already shipped).

domenic commented 3 years ago

I apologize for missing that part of https://github.com/w3c/mediasession/issues/228#issuecomment-887788017!

chrisn commented 3 years ago

Thanks @domenic. What are your thoughts, @jernoble and @jan-ivar?

jan-ivar commented 3 years ago

and from @jan-ivar in #228 (comment) (which would be a bigger change from what has already shipped).

Define bigger. Let me modify my (non-event-handler) proposal slightly to avoid it looking like event-handlers (which shouldn't have side-effects), so users don't see onplay and think addEventListener also works:

- navigator.mediaSession.onpause = () => { ... };
+ navigator.mediaSession.actions.pause = () => { ... };

Now the two proposal seem to be:

  1. if (!("pause" in navigator.mediaSession.actions)) console.log("not supported");
    
    navigator.mediaSession.actions.pause = () => { ... }); // silent
  2. if (navigator.mediaSession.actions?.pause != "supported") console.log("not supported");
    
    if (navigator.mediaSession.setActionHandler) {
     navigator.mediaSession.setActionHandler(pause, () => { ... }); // silent
    }

    Pros of 1: gives us getters for the previous action handlers set, for free. Cons of 1: Not entirely free: feature detection only works ahead of setting an action (which is when you should feature detect). Cons of 2: A lot of surface dedicated to feature detection we no longer need to use the API, yet no getter?

Web compat: both proposals will fail to detect support in Chrome today.

I vote for 1 since feature detection is no longer really necessary to use the API.

jernoble commented 3 years ago

@beaufortfrancois said:

...I'd like to propose adding a new actions readonly attribute to the MediaSession interface that would contain an array of MediaSessionAction supported by the UA.

Making that attribute readonly would prevent polyfills from adding support for new actions in existing UAs. Not a blocker as such, but this would be a reason to prefer @jan-yvar's "Option 1" vs. "Option 2".

beaufortfrancois commented 3 years ago

@jan-ivar I believe your proposal "2" does not reflect the one at https://github.com/w3c/mediasession/issues/228#issuecomment-871985154. It should look like this:

  1. if (navigator.mediaSession.actions.includes('pause')) {
     navigator.mediaSession.setActionHandler('pause', () => { ... }
    }

Since Media Session shipped in Chrome, Firefox, and Safari already, I believe changing drastically the shape of action handlers for the sake of bringing better feature detection is not worth it. That's me wearing my web developer hat here ;)

@jernoble We can remove readonly if that helps. I don't feel strongly about it. Your point is valid.