w3c / mediacapture-extensions

Extensions to Media Capture and Streams by the WebRTC Working Group
https://w3c.github.io/mediacapture-extensions/
Other
19 stars 14 forks source link

[Track Stats API] SameObject is a confusing API shape #112

Closed henbos closed 9 months ago

henbos commented 9 months ago

In order to resolve the GC concerns in #98, we now return one and the same object every time:

partial interface MediaStreamTrack {
  [SameObject] readonly attribute MediaStreamTrackVideoStats stats;
};

The API lends itself easily to mistakes like:

const s1 = track.stats;
/* Several seconds later... */
const s2 = track.stats;
const deltaFrames = s2.totalFrames - s1.totalFrames;  // Always 0 because s1 == s2!

I see two ways forward:

Proposal A Move the stats to MediaStreamTrack, then it would be much more obvious that you had to do this:

partial interface MediaStreamTrack {
  readonly attribute unsigned long long totalFrames;
  /* Other stats here */
};

const snapshotTotalFrames = track.totalFrames;
/* Several seconds later... */
const deltaFrames = track.totalFrames - snapshotTotalFrames;

Proposal B If on the other hand the GC can handle short-lived objects well enough not to impact decisions about API ergonomics, changing it to a dictionary would look like this:

partial interface MediaStreamTrack {
  MediaStreamTrackVideoStats stats();  // Only methods can return dictionaries, not attributes.
};

const s1 = track.stats;
/* Several seconds later... */
const s2 = track.stats;
const deltaFrames = s2.totalFrames - s1.totalFrames;  // Works!

Proposal C Leave as-is, track.stats being SameObject is a feature.

No strong preference from me, either works fine, but I think SameObject track.stats is a little confusing

Edit: Proposal D

track.stats.toJSON() is explicit and gets the job done. This is my preferred option.

henbos commented 9 months ago

@jan-ivar @padenot @youennf Thoughts?

henbos commented 9 months ago

PR #113 reflects Proposal A

youennf commented 9 months ago

Agreed SameObject attribute is a little confusing. I like that we group data in a dictionary named stats. It gives the intent (observing, not realtime processing) and it does not crowd the API surface if we need to extend it. I do not think optimising for GC is crucial in that case.

I would tend to go with proposal B.

henbos commented 9 months ago

Sure, here is also Proposal B: #114

jan-ivar commented 9 months ago

The API lends itself easily to mistakes like:


const s1 = track.stats;
/* Several seconds later... */
const s2 = track.stats;
const deltaFrames = s2.totalFrames - s1.totalFrames;  // Always 0 because s1 == s2!

JS has reference semantics for objects, so anyone committing this mistake has themselves to blame I feel.

From § 6.3. Use attributes or methods appropriately: "obj.attribute === obj.attribute must always hold. Returning a new value from an attribute getter each time is not allowed."

It seems a step backwards to have gotten this far only to decide to create a dictionary after all just to read some values.

I prefer C or A. cc @padenot

jan-ivar commented 9 months ago

We could make the attributes enumerable, if we believe a common use case will be to stash the lot of them. E.g.:

const s1 = {...track.stats};
/* Several seconds later... */
const s2 = {...track.stats};
const deltaFrames = s2.totalFrames - s1.totalFrames;  // works because s1 != s2
padenot commented 9 months ago

Additionally, have a look at the warning a tthe end of this section: https://html.spec.whatwg.org/#dom-media-buffered (this anti-pattern is unfortunately present in a number of HTMLMediaElement attributes).

henbos commented 9 months ago

We could make the attributes enumerable, if we believe a common use case will be to stash the lot of them.

Actually the main reason why I didn't like the interface was when I tried to do {...track.stats} in my WPTs I ended up with an empty copy, so I had to write a custom copy method for it. But if we can just add an [Enumerable] label to the attributes and that would allow copy via the {...track.stats}, then that would be my preferred option.

Proposal D: Make MediaStreamTrackVideoStats attributes enumerable

henbos commented 9 months ago

Additionally, have a look at the warning a tthe end of this section: https://html.spec.whatwg.org/#dom-media-buffered (this anti-pattern is unfortunately present in a number of HTMLMediaElement attributes).

This refers to:

Returning a new object each time is a bad pattern for attribute getters and is only enshrined here as it would be costly to change it. It is not to be copied to new APIs.

This argues that if we use an attribute rather than a method, we should use [SameObject]. FWIW proposal B changed it from an attribute to a method when doing the dictionary (attributes are not allowed to return dictionaries), but I like proposal D since that allows the app to chose whether to read or copy, which hopefully makes everyone happy.

youennf commented 9 months ago

There are cases where implementations might not need to pass per frame information to the process running JS, for instance:

In that sense, having a separate observer object for MediaStreamTrack on which to expose all stat attributes would solve a few issues:

IIRC, Jan-Ivar proposed this in the past, should we try to get a more concrete proposal E?

henbos commented 9 months ago

Re: proposal D, I can't figure out how to make the attributes enumerable - slapping on [Enumerable] on the attributes gives a WebIDL compile error.

Re: separate observer object (Proposal E), how is this different than Proposal C? MediaStreamTrackVideoStats is already a separate interface object whose only use case is observing the stats and exposing them with an attribute. In Chromium I lazily instantiate this object the first time the track.stats getter is called.

youennf commented 9 months ago

The main difference would be that the initialisation could be done asynchronously (via a ready promise attribute on the observer or an observer promise attribute on the track).

henbos commented 9 months ago

OK you mean something like?

const statsObserver = await track.statsObserver();
console.log(statsObserver.totalFrames);

Or

await track.statsObserver.ready();
youennf commented 9 months ago

Right, but as attributes, so await track.statsObserver or await track.statsObserver.ready. statsObserver is a bit long as well but seems ok.

henbos commented 9 months ago

I have mixed feelings about an API that requires explicitly initializing it.

What if initialization happens automatically by track.stats getter being invoked the first time, and we can make initialization async if we say that the track counters start from zero at the point of initialization. That means you won't have to expose any stats until the first task execution cycle after the getter was invoked, since the counters are cached at 0 anyway.

E.g.

await getUserMedia(30 fps);
Wait one second;
console.log(track.stats.totalFrames);  // Even though 30 frames were produced by the camera, 0 is returned.
Wait another second;
console.log(track.stats.totalFrames);  // 30 is logged even though the camera has produced 60 frames.
youennf commented 9 months ago

What if initialization happens automatically by track.stats getter being invoked the first time, and we can make initialization async if we say that the track counters start from zero at the point of initialization.

I thought about that initially but then I thought you would not really like it, as the web application might want to know when stats are actually flowing.

henbos commented 9 months ago

I can see that, maybe const observer = await track.stats; is better as to let the totalFrames actually be the total number of frames and not the delta since initialization, but still allow sync observer.totalFrames in non-async contexts.

henbos commented 9 months ago

But either way, @jan-ivar what is WebIDL for making the attributes iterable? I.e. {...stats} should copy them

jan-ivar commented 9 months ago
[Default] object toJSON();
jan-ivar commented 9 months ago

There are cases where implementations might not need to pass per frame information to the process running JS, for instance:

  • getUserMedia audio -> MediaRecorder

MediaRecorder usage seems quite low, and is generally considered an old way of doing things, and its spec appears to be languishing. So I don't see a lot of advantages in optimizing for that use case.

A user agent might still be able to optimize it however, if it really wanted to (as long as JS never touches track.stats), at the cost of giving an inaccurate initial value if it ever is touched (at which point the optimization fails anyway).

Maybe we could add a note about that?

I think it's going to be much more common for use cases to involve the content (aka render) process.

bradisbell commented 9 months ago

MediaRecorder usage seems quite low

Compared to what? The usage of Web Codecs seems to be practically nil in comparison.

For all of us making audio and video applications that encode client-side, there aren't a whole lot of reasonable options beyond these, and MediaRecorder is the only option I'm aware of that we can use on a variety of browsers.

I don't believe it matters that MediaRecorder's usage is low compared to the world of web pages loaded in Chromium. The usage is surely high for web applications in this genre.

I trust you all to design a reasonable spec, knowing that there are tradeoffs that need to be decided for the best platform quality long-term. However, please consider that some features that are small in usage might actually be massive in impact. A person using MediaRecorder may be streaming video to 10,000 people, but the usage impact only counts as one hit.

henbos commented 9 months ago
[Default] object toJSON();

This allows doing track.stats.toJSON() to get a copy, and thus solves the core of the issue: not having to write your own custom copy method to get a snapshot of all of the interface's attributes. However I noticed that this is not sufficient for {...track.stats} to work, and this still just returns {}. It would be nice if it also worked, but that is not mandatory since toJSON() gets the job done too.

Edit: Since an interface isn't one of the JSON types, perhaps it makes sense that {...track.stats} doesn't work, and toJSON is the preferred method anyway? I see other examples of simple interfaces with attributes and they all seem to just do toJSON too

henbos commented 9 months ago

There's two issues being discussed here:

In order to stay on course I've filed #116 as a separate discussion, allowing PR #115 to close this one.