w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.
https://w3c.github.io/webcodecs/

Should WebCodecs be exposed in Window environments? #211

Closed · youennf closed this issue 2 years ago

youennf commented 3 years ago

As a preliminary to https://github.com/w3c/webcodecs/issues/199, let's discuss whether it is useful to expose WebCodecs in Window environment. See the good discussions in https://github.com/w3c/webcodecs/issues/199 for context.

youennf commented 3 years ago

@chcunningham, can you describe the use cases you think will benefit from using WebCodecs in Window? It seems you are referring to MSE low latency, but I would like to be sure.

chcunningham commented 3 years ago

In the previous issue I mentioned the use cases as:

... boutique use cases where using a worker is just extra hoops to jump through (e.g. pages that don't have much of a UI or pages that only care to encode/decode a handful of frames).

And I later clarified:

My argument is: for folks who do not need lots of codec io, or for which the main thread is doing little else besides codec io, the main thread is adequate. Requiring authors to use workers in this case just adds obstacles.

I used the word "boutique" to suggest that such uses need not fit into one of our well known categories. The web is a vibrant surprising place (and my own creativity is pretty limited). Can we agree that such uses will exist and they may meet one or both of the scenarios I gave above?

I didn't intend to say MSE low latency is in this camp. MSE low latency has lots of codec io and many sites in that category will have a busy main thread.

Let me grab a few other snips from that issue since we're moving here...

Keep in mind that WebCodecs explicitly does not do actual decoding/encoding work on the main (control) thread (be that window or worker). The work is merely queued there and is later performed on a "codec thread" (which is likely many threads in practice).

@aboba @padenot: opinions on the core issue? Is it reasonable to expose on Window for use cases where Window's main thread ability is sufficient?

padenot commented 3 years ago

I see it as just a control thread; to me, it's fine. I don't expect folks to do processing on the media, but I do expect developers to use this API in conjunction with the Web Audio API, WebGL or Canvas.

youennf commented 3 years ago

I do not have any issue with Window being the control thread. The current proposal is to surface media data in the control thread. This somehow conflates the control and media threads. If you look at low-level APIs like VideoToolbox (which WebCodecs tries to take inspiration from), the thread of the output data is not the thread where parameters (say bitrate) are set or where the input data is provided.

With @chcunningham, we agreed that using main thread as the output data thread is a potential issue:

web pages should use WebCodecs from a DedicatedWorker and that the above snippet has a potential memory issue.

If we look at the WebCodecs API surface, it is easy to write shims that expose codec APIs to window environments. The main advantage I see is that it tells authors what is probably best to do without forbidding them from doing what they want.
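
For illustration, a minimal sketch of such a shim, assuming a hypothetical `decoder-worker.js` that owns the real `VideoDecoder` and posts decoded frames back to the page:

```ts
// Window-side shim (sketch): the real VideoDecoder lives in a DedicatedWorker,
// while the page keeps a VideoDecoder-like surface. 'decoder-worker.js' is a
// hypothetical script that rebuilds chunks, decodes them, and posts frames back.
class VideoDecoderShim {
  private worker = new Worker('decoder-worker.js');

  constructor(init: { output: (frame: VideoFrame) => void; error: (e: DOMException) => void }) {
    this.worker.onmessage = (e: MessageEvent) => {
      if (e.data.type === 'frame') init.output(e.data.frame as VideoFrame);
      else if (e.data.type === 'error') init.error(e.data.error as DOMException);
    };
  }

  configure(config: VideoDecoderConfig): void {
    this.worker.postMessage({ type: 'configure', config });
  }

  decode(chunk: EncodedVideoChunk): void {
    // Copy the payload into a transferable buffer; the worker reconstructs an
    // EncodedVideoChunk from these fields before calling the real decode().
    const data = new ArrayBuffer(chunk.byteLength);
    chunk.copyTo(data);
    this.worker.postMessage(
      { type: 'decode', key: chunk.type === 'key', timestamp: chunk.timestamp, data },
      [data],
    );
  }

  close(): void {
    this.worker.postMessage({ type: 'close' });
    this.worker.terminate();
  }
}
```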

chcunningham commented 3 years ago

The goal of the control vs codec thread separation was to ensure that implementers don't perform actual encoding/decoding work on the main thread. We can maintain that separation while still receiving outputs on the main thread.

The actual threads used under the hood aren't what we intend to describe. For example, in Chromium the VideoToolbox APIs are invoked in a completely different sandboxed process from where the web page is rendered. And, in that separate process, we expect that actually many threads are used.

With @chcunningham, we agreed that using main thread as the output data thread is a potential issue:

I don't agree that the main thread inherently creates a memory issue. The nuance from the other issue is important. There, you wrote:

Based on this understanding, if the controlling thread is spinning, the frames might stay blocked in the controlling thread task queue, which is a potential memory problem.

I agree* with the above statement, irrespective of whether the thread is the main window thread or the main dedicated worker thread.

* nit: depending on the implementation, it may be more of a performance problem rather than a memory problem. For Chromium, I think we have a finite pool of frames to allocate to a camera stream. If users fail to release (close()) the frames back to us, the stream simply stalls.
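
For context, a minimal sketch of the pattern under discussion: whichever thread receives outputs, each `VideoFrame` should be released with `close()` as soon as it has been consumed, so the underlying buffer can return to the (implementation-defined) pool:

```ts
// Sketch: paint each decoded frame, then release it back to the implementation.
const canvas = document.querySelector('canvas') as HTMLCanvasElement;
const ctx = canvas.getContext('2d')!;

const decoder = new VideoDecoder({
  output: (frame: VideoFrame) => {
    ctx.drawImage(frame, 0, 0, canvas.width, canvas.height);
    // Without close(), a finite frame pool can run dry and the source may stall.
    frame.close();
  },
  error: (e) => console.error('decode error:', e),
});
```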

youennf commented 3 years ago

I don't agree that the main thread inherently creates a memory issue

Do you agree though that, on a worker thread, spinning the controlling thread is most certainly a bug that the web application can (and should) fix? But that this is not the case on main thread (i.e. some code outside of the control of the web application may randomly spin the web app controlling thread). If so, we can agree that it is safer for any application to not use the main thread as the 'media' thread.

So far, no use case has been brought that justifies taking this risk.

For Chromium, I think we have a finite pool of frames to allocate to a camera stream. If users fail to release (close()) the frames back to us, the stream simply stalls.

That is a very good point we should discuss. The current API design surfaces all of those OS/implementation details to the web, which is something we should think about very carefully. This can lead to issues in terms of perf/memory/fingerprinting/interoperability.

chcunningham commented 3 years ago

Do you agree though that, on a worker thread, spinning the controlling thread is most certainly a bug that the web application can (and should) fix?

Generally yes.

But that this is not the case on main thread (i.e. some code outside of the control of the web application may randomly spin the web app controlling thread).

I replied to this point in the previous issue.

I agree that some apps don't know what's running on their page. Those apps don't fit the use case I gave....

To emphasize, I expect some sites will offer very focused experiences, free of third party scripts, ads, etc, and for which the window main thread is plenty available.

So far, no use case has been brought that justifies taking this risk.

I offered scenarios in the comments above. In these scenarios, there is no memory/perf risk. Why is this not sufficient?

Maybe a concrete example would help. Imagine a simple site that streams a feed from a single security camera. This need not be a flashy big name site. It might even be someone's hobby project. The UI for this site is just a box with the video feed. The site has no ads, no third party scripts. Its main thread is doing very little. There is ample room for the main thread to manage fetching, codec io, and rendering of the video.
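
As a rough sketch of such a page (the WebSocket endpoint, codec string, and framing below are assumptions, not details from this thread), everything can run on the window main thread:

```ts
// Single security-camera page (sketch): fetch, decode, and paint on the main thread.
const canvas = document.querySelector('canvas') as HTMLCanvasElement;
const ctx = canvas.getContext('2d')!;

const decoder = new VideoDecoder({
  output: (frame) => { ctx.drawImage(frame, 0, 0, canvas.width, canvas.height); frame.close(); },
  error: (e) => console.error(e),
});
decoder.configure({ codec: 'avc1.42E01E', optimizeForLatency: true }); // hypothetical H.264 feed

const ws = new WebSocket('wss://camera.example/feed'); // hypothetical endpoint
ws.binaryType = 'arraybuffer';
let first = true;
ws.onmessage = (e) => {
  // Assume each message is one Annex B access unit and the stream starts on a keyframe.
  decoder.decode(new EncodedVideoChunk({
    type: first ? 'key' : 'delta',
    timestamp: performance.now() * 1000, // microseconds; a real app would use stream timestamps
    data: e.data as ArrayBuffer,
  }));
  first = false;
};
```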

youennf commented 3 years ago

To emphasize, I expect some sites will offer very focused experiences, free of third party scripts, ads, etc, and for which the window main thread is plenty available.

This is not really a use case for exposing to main thread, this is more a scenario where the issues I am pointing out may not happen (but see below).

First, the JS shim works equally well and does not bring any major drawback AFAIK. Do you agree?

Also, if the web site wants frame access, this is probably to do some fancy processing on each frame. A worker's processing cost is probably negligible at this point, and the fancy processing is probably best done in the worker anyway. If the page is not doing any per-frame processing and wants to really optimise things, I agree a worker might be overkill. The website would be best served by MediaStreamTrack+HTMLVideoElement.

The UI for this site is just a box with the video feed.

Why not use a MediaStreamTrack created directly by the UA from the decoder output, then?

The site has no ads, no third party scripts. Its main thread is doing very little

Are you suggesting to restrict exposure of WebCodecs on window environments to only those safe cases?

In any case, let's say I send an email containing that website's URL to a friend, and they click the link from a web mail client (say Gmail). Depending on the web mail, the website and the UA, the website may be opened in the same process as the web mail, or in a process with other pages. This might be especially true on low-end devices, as a UA decision to save memory. Let's also say that the user opened two pages of the same website, a simple one and a complex one which contains some third-party scripts. The website will not know whether the two pages are running in the same process or not.

To reliably decide whether to use a worker or not, a web developer will need to understand a lot of things and do extensive research. In practice, it will be difficult to get any guarantee across User Agents, OSes and devices.

Exposing WebCodecs solely to workers is a good hint to web developers that they should do their processing in workers.

sandersdan commented 3 years ago

I disagree with the premise that it's only correct to use WebCodecs in a worker, even in the case where the main thread is contended.

Offloading WebCodecs use can improve latency, but you can still get full WebCodecs throughput when controlling it from a contended thread. Low-latency use cases are not the only use cases that WebCodecs is intended for (if they were, we wouldn't have an input queue).

WebCodecs allows for low-latency use but I do not think we should require apps to use it that way.

First, the JS shim works equally well and does not bring any major drawback AFAIK. Do you agree?

Having to have a worker context around to be able to use the Chrome JS Console to experiment with WebCodecs would substantially frustrate my learning and debugging. And I'm an expert at this!

Depending on the web mail, the website and the UA, the website may be opened in the same process as the web mail, or in a process with other pages.

Main thread contention across sites is to me a quality of UA issue, and is substantially improved in recent history due to the widespread adoption of site isolation. Blink is currently experimenting with multiple render threads in a single process, which has the potential to resolve the remaining cases.

Exposing WebCodecs solely to workers is a good hint to web developers that they should do their processing in workers.

I think it's generally understood that the most direct solution to main thread contention is worker offload, so any sites that bother to collect performance metrics won't have any confusion here.

bradisbell commented 3 years ago

As a preliminary to #199, let's discuss whether it is useful to expose WebCodecs in Window environment.

For my projects, I can't think of a use case where I wouldn't use WebCodecs from Window.

I'm manipulating audio/video, with the video and canvas objects actually displayed on the page. Shuffling all that data off to a worker thread, just to then have it shuffled off to a codec on some other user-agent internal process or thread seems like unnecessary overhead, and is definitely a hassle for the developer.

I wholeheartedly agree with @chcunningham that there will be other use cases not imagined here.

dalecurtis commented 3 years ago

Unless TAG or another standards group has concluded that certain classes of APIs must be limited to worker contexts, I think the phrasing of the initial question is inverted. I.e., we should instead be discussing why we wouldn't expose this on Window. We shouldn't apply restrictions without reason.

The only reason I can think of is that we want to limit the ability of users to shoot themselves in the foot under certain specific low latency scenarios. Meanwhile, the reasons against such a restriction seem numerous -- especially the pain it would cause for common use cases and the impact on first-frame latency.

youennf commented 3 years ago

Low-latency use cases are not the only use cases that WebCodecs is intended for (if they were, we wouldn't have an input queue).

The driving use cases I have heard are low-latency MSE and WebRTC-like stacks. I would be interested in hearing more about the other use cases, and whether they need fine-grained access to raw video frames...

Having to have a worker context around to be able to use the Chrome JS Console to experiment with WebCodecs would substantially frustrate my learning and debugging. And I'm an expert at this!

This seems like a usability issue. I fear that the same reasoning will end up with pages using WebCodecs on the main thread when they should not. Most of the WebCodecs examples I have seen are main-thread only, even though they deal with real-time data.

Blink is currently experimenting with multiple render threads in a single process, which has the potential to resolve the remaining cases.

These are all good points that might solve the issues I described. It is great to see this coming. Given the idea is to move very quickly with WebCodecs, I think it makes sense, for V1, to restrict the feature set to what is needed for sure, and to progressively extend the API after careful consideration.

It is always easy to extend an API to Window environment in the future. I do not see WebCodecs in Window environment as a P1 but as a P2.

youennf commented 3 years ago

For my projects, I can't think of a use case where I wouldn't use WebCodecs from Window.

That is interesting. Can you provide pointers to your applications? Window APIs do not necessarily have to be the same as worker APIs.

In WebRTC, we went with a model where APIs are Window only/mostly but do not give the lowest granularity. Things might change and we are discussing exposing the lowest granularity in Worker/Worklet environments, not necessarily in Window environments.

youennf commented 3 years ago

The only reason I can think of is that we want to limit the ability of users to shoot themselves in the foot under certain specific low latency scenarios.

The low-latency scenario is one such example. A video decoder that is using a buffer pool might end up without available buffers in case they are all enqueued for delivery while there is thread contention, leading to potential decoding errors. Pages might more often end up in those cases on main thread, and these issues might not be obvious to debug given buffer pools might vary across devices.

Again, I am hopeful this can be solved. It seems safer though to gradually build the API. Exposing to Worker at first covers the major known use cases and allows early adopters, through workers or JS shims, to start doing work in Window environments. This might help validate that the model is also right for window environments.

dalecurtis commented 3 years ago

A video decoder that is using a buffer pool might end up without available buffers in case they are all enqueued for delivery while there is thread contention, leading to potential decoding errors. Pages might more often end up in those cases on main thread, and these issues might not be obvious to debug given buffer pools might vary across devices.

I agree buffer pools are an issue, but I feel that's orthogonal to window vs worker for a few reasons:

Again, I am hopeful this can be solved. It seems safer though to gradually build the API. Exposing to Worker at first covers the major known use cases and allows early adopters, through workers or JS shims, to start doing work in Window environments. This might help validate that the model is also right for window environments.

I have trouble following this logic. I don't agree that limiting to a worker covers the main use cases. We'll certainly query all the developers in our OT though for feedback. Limiting to a worker will be detrimental to high frame rate rendering (transferControlToOffscreen would help this -- but is Chromium only) and time to first frame. Especially for single-frame media such a limit would dominate the total cost.
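
For reference, the `transferControlToOffscreen()` approach mentioned above looks roughly like this (a sketch; `decode-render-worker.js` is a hypothetical script, and at the time of this discussion OffscreenCanvas shipped in Chromium only):

```ts
// Main thread: hand the canvas off so decoded frames never have to come back.
const canvas = document.querySelector('canvas') as HTMLCanvasElement;
const offscreen = canvas.transferControlToOffscreen();
const worker = new Worker('decode-render-worker.js');
worker.postMessage({ canvas: offscreen }, [offscreen]);

// decode-render-worker.js (sketch):
// onmessage = ({ data }) => {
//   const ctx = data.canvas.getContext('2d');
//   const decoder = new VideoDecoder({
//     output: (frame) => { ctx.drawImage(frame, 0, 0); frame.close(); },
//     error: console.error,
//   });
//   // ...configure the decoder and feed it EncodedVideoChunks as they arrive...
// };
```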

aboba commented 3 years ago

Youenn said:

A video decoder that is using a buffer pool might end up without available buffers in case they are all enqueued for delivery while there is thread contention

Pool exhaustion seems most likely to be caused by a memory leak (e.g. VideoFrame.close() not being called), possibly in conjunction with use of a downstream API for rendering (e.g. Canvas, MSTGenerator, WebGL or WebGPU). It seems that this problem can occur regardless of whether WebCodecs or other related APIs run in a Worker. So to address pool exhaustion concerns, you'd probably want to require related APIs to automatically free buffers.

chcunningham commented 3 years ago

We sent an email to all participants in the Chrome origin trial. On the question of "should we keep WebCodecs in Window scope", the tally was 10 in favor, 6 neutral, 1 ambivalent, 1 opposed. Overall I think this shows a compelling case for maintaining window-exposed interfaces. Breakdown below.

Click the summaries below to see reply excerpts.

The opposed response argues for forcing developers into a pattern that frees the main thread.
@willmorgan of iproov.com wrote:

> I am in favour of moving this to a DedicatedWorker scope and think that doing so would be a big tug in the right direction for freeing up the UI thread, and encouraging developers to adopt that pattern for many other things.
But many apps found no performance reason to use workers, and highlighted that workers create additional complexity.
@koush of vysor.io wrote:

> I am using web codec in the window scope just fine. For low latency as well (vysor).... I actually tried a worker implementation and saw no appreciable performance gains, and the code became more complicated: had to use workers, off screen canvas, detached array buffer issues, etc.

@etiennealbert of jitter.video wrote:

> Our use case is to convert sequences of canvas generated images into a video. When a user triggers this kind of job it becomes the one and only thing in our interface, and we do nothing else until the final video file is ready. We don't do this for performance reasons but because this "export" job is an important job to the eyes of our users.

@AshleyScirra of scirra.com wrote:

> We also intend to use WebCodecs in future for transcoding media files. Again providing the API is async it seems to only be a hindrance to make the API worker-only, as this use case is not latency sensitive, and should already be doing the heavy lifting off-main-thread with an async API.

@BenV of bash.video wrote:

> We also have a smaller internal application for prototyping camera rendering effects that currently exclusively uses WebCodecs in the main thread, mostly just out of convenience. We're using TypeScript/Babel/Webpack and the process of setting up a worker causes enough friction that it was just easier to just do everything in the Window scope. All this application does is capture frames from the camera, manipulate them, and render the output so contention with UI updates is not really an issue so simplicity won out for this case.

@bonmotbot of Google Docs wrote:

> We are using the ImageDecoder API to draw animated GIF frames to canvas. We're using it in the Window scope currently, so this change would be breaking. From our perspective, adding a Worker to this use case would add complexity without much benefit.
Also, some apps desire to use WC in combination with other APIs that are only Window-exposed.

Mentioned examples include: Canvas, RTCDataChannel, WebAudio, input (e.g. touch) events. Forcing apps to use DedicatedWorkers adds complexity to code that needs other Window-only APIs.

From the performance angle, Canvas is of particular note. It is the most common path for apps to paint VideoFrames. OffscreenCanvas is not yet shipped in Safari and Firefox, which means no way to paint directly from a worker. OffscreenCanvas may eventually ship everywhere, but its absence now adds complexity to using WebCodecs from workers and removes the theoretical performance benefit.

@surma of squoosh.app (Google) wrote:

> The decoding code path of Squoosh runs on the main thread. If we need to use Wasm, we invoke that in a worker, but for native decoding we need to rely on Canvas (as OffscreenCanvas isn’t widely supported), so we have no choice. So the WebCodecs code in the PR runs on main thread as well. In my opinion, at least in the context of *Image*Decoder, I am convinced that the API should be offered on the main thread. The API is async anyway so decoding can (and should) happen on a different thread analogous to Image.decode().

Aside: the use of Canvas above is not unique to ImageDecoder.

@AshleyScirra of scirra.com wrote:

> Further it's already a pain how many APIs work in the Window scope but are not available in workers - adding APIs that are only available in workers and not in the window scope seems like it would just be making this even more of a headache. In general we want to write context-agnostic code that works in both worker and window mode.

@jamespearce2006 of grassvalley.com wrote:

> We are (or will be) using WebCodecs in a DedicatedWorker scope, and I think it makes sense to encourage this as the normal approach. However, given that WebAudio contexts are only available in the Window scope, I think requiring use of a dedicated worker may be too burdensome for some simpler audio-only applications.
Finally, even for apps that use WC in workers, Window-interfaces are useful for synchronous feature detection.
@BenV of bash.video wrote:

> We do check for the existence of WebCodecs in the Window scope so that we can pick the appropriate rendering path synchronously based on browser features rather than having to spin up a DedicatedWorker and query it asynchronously.

youennf commented 3 years ago

There was some interesting feedback, in particular regarding 'many apps found no performance reason', at the last W3C WebRTC WG meeting. @aboba, would it be possible to have the web developer feedback here?

youennf commented 3 years ago

Also, some apps desire to use WC in combination with other APIs that are only Window-exposed.

This is a fair concern. Do we know which APIs are missing in workers?

youennf commented 3 years ago

Finally, even for apps that use WC in workers, Window-interfaces are useful for synchronous feature detection.

More and more feature detection is done asynchronously, for instance listing available codec capabilities. It is very easy to write a small shim that does this as an async function.
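
A sketch of such a shim, assuming a hypothetical `detect-worker.js` that reports what is available in its own scope:

```ts
// Window side: ask a worker whether WebCodecs can decode a given codec there.
function workerSupportsCodec(codec: string): Promise<boolean> {
  return new Promise((resolve) => {
    const worker = new Worker('detect-worker.js');
    worker.onmessage = (e) => { resolve(e.data === true); worker.terminate(); };
    worker.postMessage(codec);
  });
}

// detect-worker.js (sketch):
// onmessage = async (e) => {
//   if (typeof VideoDecoder === 'undefined') { postMessage(false); return; }
//   const { supported } = await VideoDecoder.isConfigSupported({ codec: e.data });
//   postMessage(supported === true);
// };

// Usage (hypothetical codec string):
// workerSupportsCodec('vp8').then((ok) => console.log('worker decode available:', ok));
```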

youennf commented 3 years ago

Do we know which APIs are missing in workers?

Oh, I now see the list. For Canvas, there is OffscreenCanvas; is the issue learning it, or are there missing features?

About RTCDataChannel, it is now exposed in workers in Safari and there is a PR for it. As for WebAudio, this might be feasible once MediaStream/MediaStreamTrack can be exposed in workers, which is ongoing work.

surma commented 3 years ago

For Canvas, there is OffscreenCanvas; is the issue learning it, or are there missing features?

OffscreenCanvas is not widely supported, so it can only be used as a progressive enhancement. But I envision Web Codecs to also be useful with other, main-thread-only APIs like CSS Paint API or WebUSB.

chcunningham commented 3 years ago

There was some interesting feedback, in particular regarding 'many apps found no performance reason', at the last W3C WebRTC WG meeting. @aboba, would it be possible to have the web developer feedback here?

@youennf, each of those statements in my above comment is a clickable zippy that expands to show the developer feedback. LMK if this isn't what you meant.

chcunningham commented 3 years ago

@youennf, does the web developer feedback above persuade you to maintain window exposure for WebCodecs?

koush commented 3 years ago

For Canvas, there is OffscreenCanvas; is the issue learning it, or are there missing features?

OffscreenCanvas is not widely supported, so it can only be used as a progressive enhancement. But I envision Web Codecs to also be useful with other, main-thread-only APIs like CSS Paint API or WebUSB.

@surma That's how Vysor already works today (https://app.vysor.io/). As you mentioned, WebUSB is main thread only, so removing WebCodecs from the main thread would increase complexity a good amount. As noted in @chcunningham's comment, a worker WebCodecs implementation was an avenue I tried and reverted, since it just made things complicated without any noticeable performance benefit. I think it likely actually increased latency, since I needed to copy the WebUSB array buffers before passing them along as detached buffers; this is something I could probably refactor to work around, but I didn't want to bother.

jernoble commented 3 years ago

For Canvas, there is OffscreenCanvas; is the issue learning it, or are there missing features?

OffscreenCanvas is not widely supported, so it can only be used as a progressive enhancement.

Chair hat off; Implementor hat on

This is a non-sequitur. The UAs which haven't implemented OffscreenCanvas also haven't implemented WebCodecs.

But I envision Web Codecs to also be useful with other, main-thread-only APIs like CSS Paint API or WebUSB.

Performing I/O on the main thread is in-and-of-itself a bad idea, and this incompatibility should be mitigated by allowing WebUSB in a worker. And CSS Painting already reasonably executes in a Worklet and not the main thread.

Frankly, there is specification experience in performing high-priority APIs on the main thread; the original WebAudio ScriptProcessorNode was main-thread only, and it was objectively terrible. GC hiccups, heavy main thread JS, or even just complicated layout would destroy the performance of a ScriptProcessorNode-based audio graph, even with incredibly high buffer sizes and the resulting high latency. Those same issues–main-thread stalls, dropped frames, and resulting high latency–will happen with main-thread WebCodecs as well.

In my opinion, no site should ever use the (non-control) WebCodecs APIs from the main thread if they care at all about performance. And to my ears, none of the use cases given above are compelling.

jernoble commented 3 years ago

@chcunningham said:

Can we agree that such uses will exist and they may meet one or both of the scenarios I gave above?

Chair hat on

Generally, we do not make specifications by hypothesizing that use cases will eventually exist for the changes we propose. Might I suggest that coming up with concrete use cases for window-exposed WebCodecs will help settle this issue? (As far as I can tell, the responses to your survey have all effectively said "window-exposed is easier/simpler", which strikes me as a preference, not a use case.)

youennf commented 3 years ago

@youennf, does the web developer feedback above persuade you to maintain window exposure for WebCodecs?

From the WebRTC WG meeting (https://lists.w3.org/Archives/Public/public-webrtc/2021May/0048.html), what I heard was that main thread processing is difficult due to GC, for instance, and might be disrupted by 100ms hiccups. On the other hand, a window API is nice for prototyping and for migration. @aboba, please correct me if this is an inaccurate description.

From the information you gathered, it seems that worker development is hard and might be missing some APIs.

I think we should concentrate our energy on making the platform the best we can where it will be really optimal, i.e. workers. That means exposing more APIs in workers (spec work is ongoing for the APIs I know of, though I am not sure about WebUSB et al.), complemented by implementation work, which is starting for some of these APIs, like WebRTC. That also means devoting time and effort to improving documentation, examples and even JS libraries to ease adoption.

Once we have the best video processing platform in workers, we can certainly discuss exposing such features to window.

jamespearce2006 commented 3 years ago

Presumably, any use case can be made to work with worker-scope-only web codecs. However, part of what makes an API successful is how easy it is for developers to use. WebAudio contexts are only available in the window-scope. For developers working on simple, audio-only use-cases, isn't there a risk that worker-only WebCodecs will be perceived as adding too much complexity?

surma commented 3 years ago

@jernoble I totally agree with you that it would be better to use all these APIs in a worker, and that’s the future I want to work towards for the web platform.

That being said, I usually try to take a pragmatic approach because, as of now, workers are annoyingly hard to use and have barely seen any adoption, despite being around since IE10 and being fully supported in every browser. But we still don't have ES Module support in workers, making them hard to use/integrate in modern code bases, postMessage is a painful API that often needs to be worked around, etc. Limiting an API to workers only might significantly hinder its adoption and perceived usefulness.

Let me also quickly reiterate that I, as an individual, am mostly interested in ImageDecoder, where GC pauses and other concerns are not really an issue.

This is a non-sequitur. The UAs which haven't implemented OffscreenCanvas also haven't implemented WebCodecs.

I guess this is true; it is a non-sequitur. But we also know that UAs don’t implement proposals chronologically, but rather prioritize by need/interest/agenda. So UAs not implementing OffscreenCanvas (after many years) does not mean that Web Codecs will remain unimplemented for the same amount of time.

youennf commented 3 years ago

Let me also quickly reiterate that I, as an individual, am mostly interested in ImageDecoder, where GC pauses and other concerns are not really an issue.

My concern is mostly with video codecs which are doing repetitive tasks with big data. I agree ImageDecoder has different requirements.

AshleyScirra commented 3 years ago

Transcoding media files is not latency sensitive; it is an important use case (for us) that seems reasonable to drive from the main thread, especially as the API itself is async.

As I said in my previously quoted feedback, it's already a real development headache how many APIs are supported only in window and not on worker. It makes it really difficult to write context-agnostic code that can work in either. Please don't compound the problem by starting to add APIs that are supported only in workers and not on window. In my opinion APIs should always be supported on both wherever possible.

chcunningham commented 3 years ago

This is a non-sequitur. The UAs which haven't implemented OffscreenCanvas also haven't implemented WebCodecs.

This concern comes directly from user feedback, quoted above. That feedback does not hinge on Safari shipping WebCodecs.

Additionally, Safari, as represented by yourself and Youenn, has raised several concerns about main thread performance. I think it is fair to evaluate those concerns and proposals in the context of how they would actually play out when implemented alongside other (un)shipped APIs.

Frankly, there is specification experience in performing high-priority APIs on the main thread; the original WebAudio ScriptProcessorNode was main-thread only, and it was objectively terrible.

The deadlines for audio rendering are much much tighter than for codec IO. Especially for codec IO that is not at all latency sensitive, as several real users have suggested.

Might I suggest that coming up with concrete use cases for window-exposed WebCodecs will help settle this issue? (As far as I can tell, the responses to your survey have all effectively said "window-exposed is easier/simpler", which strikes me as a preference, not a use case.)

IMO, the use cases from quoted OT users are concrete. Beyond simplicity, arguments were also presented for no appreciable performance gain (see @koush quote) as well as cases where real time performance is not desired as @AshleyScirra mentions above. Why are these not compelling?

jernoble commented 3 years ago

Again, chair hat off here

This is a non-sequitur. The UAs which haven't implemented OffscreenCanvas also haven't implemented WebCodecs.

This concern comes directly from user feedback, quoted above. That feedback does not hinge on Safari shipping WebCodecs.

That concern should be directed at hypothetical future UAs that implement WebCodecs but not OffscreenCanvas. IMO, it's super weird to change the design of a future web specification because UAs haven't fully implemented an existing, current web specification. That's putting the cart very much before the horse.

Frankly, there is specification experience in performing high-priority APIs on the main thread; the original WebAudio ScriptProcessorNode was main-thread only, and it was objectively terrible.

The deadlines for audio rendering are much much tighter than for codec IO. Especially for codec IO that is not at all latency sensitive, as several real users have suggested.

I would point out that ScriptProcessorNode was already operating on "codec IO" latency timeframes, and still encountered constant dropouts. GC delays, main thread jank, and just complicated layout can occupy the main thread for full seconds, not just the tens of milliseconds needed to swamp normal audio output. An operation as simple as the user resizing the browser window will completely stall any main thread graphics pipeline.

Might I suggest that coming up with concrete use cases for window-exposed WebCodecs will help settle this issue? (As far as I can tell, the responses to your survey have all effectively said "window-exposed is easier/simpler", which strikes me as a preference, not a use case.)

IMO, the use cases from quoted OT users are concrete. Beyond simplicity, arguments were also presented for no appreciable performance gain (see @koush quote) as well as cases where real time performance is not desired as @AshleyScirra mentions above. Why are these not compelling?

I requested use cases with my Chair hat on; but for clarity I'm answering this question with my Implementer hat on

As I said above, most of those are not use cases; those are preferences. Even in the case of, e.g., non-realtime export, main thread hiccups can stall the decoder thread, leading to longer encode times and worse end-user performance. The preference stated above is "I don't care about that, the main thread is easier," and I don't find that compelling.

A concrete use case actually was stated above: using WebCodecs in conjunction with WebUSB requires dispatching to and from the main thread. However, even this concrete use case I feel comfortable (again in my own opinion) dismissing: WebUSB's questionable design decision should be fixed by WebUSB, and not compounded by new specifications.

AshleyScirra commented 3 years ago

Our use case is to transcode media files, or import animated images via ImageDecoder, in which case our app will basically be doing nothing other than showing a progress bar while it works. In this case main thread jank should be negligible as the user is just sitting and waiting for the process to complete. So in this case it seems to me that only providing the API in a worker doesn't add anything, and just makes development more complicated.

Besides, isn't the decoder thread running separately to the main thread anyway? It's not clear to me why main thread jank is any concern (for non-latency-critical stuff at least) if the API is asynchronous and does the actual heavy lifting in a separate thread already. If the API actually blocks the JS thread, then I would have thought the API should be redesigned to be async so the browser can move the work to another thread anyway. Even in a worker, the JS thread might be busy doing something else. So I don't understand where main thread jank even comes in to play for offline processing type cases. If the API is supported in both window and worker, and someone really cares a lot about latency, then the option to use a worker is still there.

bradisbell commented 3 years ago

@jernoble From an implementer's perspective, why does it matter if the user sends data to a codec from the main thread? From the developer perspective, I'd expect this all to be async and done on other threads. It seems far more efficient to send data from where it's coming from rather than passing it all off to a buffer to be sent to a worker somewhere to then be sent to the codec.

chcunningham commented 3 years ago

IMO, it's super weird to change the design of a future web specification because UAs haven't fully implemented an existing, current web specification.

I think @surma makes a persuasive counter argument above.

As I said above, most of those are not use cases; those are preferences.

Each of those preferences is coming from a place of using WebCodecs in a particular use case. Many of the quotes, including replies from @koush and @AshleyScirra, clearly mention the use case alongside the preference. To make it super clear: the use case from @koush is low latency streaming (vysor.io) and the use case from @AshleyScirra is game editing (scirra.com), described in more detail above.

The preference stated above is "I don't care about that, the main thread is easier," and I don't find that compelling.

IMO, the users who have said they "don't care about that" have made well reasoned arguments. I find them compelling. Additionally, your comment overlooks a super important point from @koush which I would summarize as "workers did not perform better than window, even for my low latency use case". Here is his full quote once again.

@koush of vysor.io wrote:

I am using web codec in the window scope just fine. For low latency as well (vysor).... I actually tried a worker implementation and saw no appreciable performance gains, and the code became more complicated: had to use workers, off screen canvas, detached array buffer issues, etc.

koush commented 3 years ago

Agreed that WebUSB should support worker contexts and that the API design shouldn't be influenced by current gaps in other specs.

The audio worklet comparison (an API also used by Vysor), IMO, doesn't track, because the worklets need to respond to process(...) calls when requested, as quickly as possible. This differs from a decoder where opaque data is fed to the system as it is received: push vs pull. Furthermore, no processing is done on the encoded/compressed data. It's only parsed and sent. The main thread perf issues don't exist for passing encoded video buffers around vs processing raw audio buffers. In terms of main thread load, encoded video is 60 samples per second, while raw audio is tens of thousands. Not to mention that delayed video frames are imperceptible, but delayed audio samples are jarring.

Not every use case for feeding a WebCodec decoder is going to have a background worker fetching the data. So API consumers are then boxed into an implementation pattern where they have to pass data from the main thread to a worker thread, before they can send it to the decoder, just to render that frame to a canvas. Or introduce an unnecessary amount of complexity to fetch the data in a worker thread.

Workers are appropriate if raw decoded YUV frames are being processed in JavaScript before being sent along elsewhere, as that would be analogous to the AudioWorklet. But I think the primary use case is going to just be decoding video to a canvas.

Incidentally, some of this discussion would be moot if WebCodecs had an option to just let implementers change this:

[code screenshot: decoding via an output callback]

to this:

[code screenshot: decoding straight to a canvas]

No callback, no message posts. Just an API to render straight to the canvas (which is what we want 99% of the time with a decoder), and the browser can figure out the best way to do that.
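
Since the code screenshots above did not survive, here is a purely hypothetical illustration of the shape being described; the `renderTarget` option does not exist in WebCodecs and is only meant to convey the idea:

```ts
// Today (sketch): the app receives each decoded frame and paints it itself.
const canvas = document.querySelector('canvas') as HTMLCanvasElement;
const ctx = canvas.getContext('2d')!;
const decoder = new VideoDecoder({
  output: (frame) => { ctx.drawImage(frame, 0, 0); frame.close(); },
  error: console.error,
});

// Hypothetical alternative (not part of the spec): let the UA paint decoder
// output directly onto a canvas, with no output callback or message posts.
// const decoder = new VideoDecoder({
//   renderTarget: canvas, // illustrative only; no such option exists
//   error: console.error,
// });
```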

chcunningham commented 3 years ago

No callback, no message posts. Just an API to render straight to the canvas (which is what we want 99% of the time with a decoder), and the browser can figure out the best way to do that.

I've filed #266 to track that request. I'd like to split off that request to help us focus this discussion on window-main thread performance concerns with the current API. To emphasize your point above: using WebCodecs in the Window context is working well for your use case (low latency streaming), and using Workers offered no clear performance benefit.

jernoble commented 3 years ago

@bradisbell said:

@jernoble From an implementer's perspective, why does it matter if the user sends data to a codec from the main thread?

From an implementer's POV (again, Chair hat off), I think you may be confusing who is a "user" here. The user is the person sitting in front of the screen, not the author of the website. And the user isn't choosing whether to send data to the codec from a main thread or a worker thread, but the user is definitely the party that's impacted by that choice. It's this end-user's experience that I'm primarily defending here.

From the developer perspective, I'd expect this all to be async and done on other threads. It seems far more efficient to send data from where it's coming from rather than passing it all off to a buffer to be sent to a worker somewhere to then be sent to the codec.

Oh it's definitely the case that the actual decoding is happening on a background thread. But your intuition about efficiency doesn't match up with my own experience as an implementer. In my experience, it's far, far more efficient to move and enqueue coded data to a background thread to feed the decoder than it is to do the same from the main thread. An implementor can spin up entirely new OS threads according to the system load, to ensure that a given worker is not blocked by IO, GC, or long running JS operations. Even for something as "small" as enqueuing new, coded frames to the decoder, having this operation take place on a background thread means the decoder is always well fed and cared for, and constantly has something to be decoded. Especially for hardware codecs, just activating the silicon for the hardware decoder has a runtime cost, and a decoder running dry (but still active) will use up a user's battery. Even for SW codecs, it's more efficient to have the processor running full tilt, then dropping to a lower power usage level, than it is for the workload to be bursty.

jernoble commented 3 years ago

@koush said:

The audio worklet comparison (an API also used by Vysor), IMO, doesn't track, because the worklets need to respond to process(...) calls when requested, as quickly as possible.

I want to push on this point a bit, because I think it's a misunderstanding of my point in the ScriptProcessorNode example. It was already the case that ScriptProcessorNode by default operates on much looser latency requirements, for the very reason that main thread jank makes lower latencies impossible. By default, JS has ~42ms to fill its ScriptProcessorNode buffer, equivalent to a 24fps movie. So it actually has similar characteristics to video, though it's true that the script in question might be doing heavy math to fill that buffer.

This differs from a decoder where opaque data is fed to the system it is received: push vs pull. Furthermore, no processing is done on the encoded/compressed data. It's only parsed and sent. The main thread perf issues don't exist for passing encoded video buffers around vs processing raw audio buffers. In terms of main thread load, encoded video is 60 samples per second, while raw audio is tens of thousands. Not to mention that delayed video frames are imperceptible, but delayed audio samples are jarring.

Frankly, this is shortsighted. The main thread can stall out for hundreds of milliseconds. This isn't the case of a single frame being tens of milliseconds late, but an entire pipeline running dry for up to a second, which will definitely be noticeable. A small investment (marshaling to a background thread to parse and enqueue samples) will have a real performance benefit on heavily loaded or relatively low-powered devices. The fact that developers are blithely declaring "works on my machine" actually makes me more concerned, not less, about making this API available on the Window.

koush commented 3 years ago

The fact that developers are blithely declaring "works on my machine" actually makes me more concerned, not less, about making this API available on the Window.

@jernoble For what it's worth, I've already shipped WebCodecs as the default decoder to several million users (in both Chrome, which includes low power ChromeOS devices, and as a native Electron app). Previous decoders were WebAssembly and PNacL/Pepper.

jernoble commented 3 years ago

@koush said:

For what it's worth, I've already shipped WebCodecs as the default decoder to several million users (in both Chrome, which includes low power ChromeOS devices, and as a native Electron app). Previous decoders were WebAssembly and PNacL/Pepper.

Again, in my experience, this is shortsighted. You've targeted a single runtime, in an environment where you can control the entire application. And if this API was only exposed to Window in Electron, I wouldn't make a fuss (though I'd still annoyingly pontificate about threads and best practices and whatnot).

youennf commented 3 years ago

which is what we want 99% of the time with a decoder

Agreed. A MediaStreamTrack as output might be a natural choice. This would allow sharing code and JS APIs with other track producers (getUserMedia, RTCPeerConnection). There is ongoing work to build such an API in the WebRTC WG (easily, efficiently and safely transform MediaStreamTrack video content with various tools such as WebGL, WebGPU, WASM...).
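
As a sketch of that kind of pipeline, using MediaStreamTrackGenerator from the mediacapture-transform proposal (at the time a Chromium-only, window-exposed interface; the ambient declaration below exists only to keep the TypeScript sketch self-contained):

```ts
// Minimal ambient declaration for the proposal-stage interface (not in lib.dom.d.ts).
declare class MediaStreamTrackGenerator extends MediaStreamTrack {
  constructor(init: { kind: 'audio' | 'video' });
  readonly writable: WritableStream<VideoFrame | AudioData>;
}

// Sketch: route decoder output into a MediaStreamTrack and let <video> render it.
const generator = new MediaStreamTrackGenerator({ kind: 'video' });
const writer = generator.writable.getWriter();

const decoder = new VideoDecoder({
  // Assumption: the track sink takes ownership of each written frame.
  output: (frame) => { writer.write(frame); },
  error: console.error,
});

const video = document.querySelector('video') as HTMLVideoElement;
video.srcObject = new MediaStream([generator]);
video.play();
```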

AshleyScirra commented 3 years ago

I don't think the demands of one use case (latency critical things) should dictate the design of the API for all use cases. For example what if you are doing latency critical networking with WebSockets? That does not justify making WebSockets available in only workers, no matter how much better some particular web apps would be if they did their networking in a worker. It's available in both, since sometimes the main thread is actually just fine, and it's up to the developer to decide. IMO it's the browser's job to provide the tools in as flexible a way as possible, so developers can best decide and innovate with the toolset provided.

One more point: we make a web game engine that is capable of running entirely in a worker via OffscreenCanvas. In that case the worker thread is also heavily contended with running an entire game engine. So in this case the assumption "workers have less contention and so are a better place to use this API" is wrong: it's still a bad decision to use WebCodecs for latency critical stuff on that thread. Making WebCodecs only available in workers does not save us from making a bad decision to use it on that thread anyway. Sure, the main thread can jank for hundreds of milliseconds, but so can workers, if they have work to do. To me this illustrates why trying to prescribe what developers ought to be doing with an API is wrong: it doesn't actually prevent making bad decisions, and in many ways can end up acting only as an impediment.

chcunningham commented 3 years ago

FYI: Our TAG review has an update from @cynthia on this issue

Since the users have spoken, it seems like it would make sense to expose this to Window - unless there is strong opposition from the group. (Even if there is opposition, there seems to be a clear preference here so I think it's worth negotiating towards the direction of Window exposure.)

chcunningham commented 3 years ago

For those following along, here are the minutes from our discussion on this issue in today's WG call. A new meeting (details) is scheduled for next Tuesday to continue discussion and cover issues we didn't get to.

Near the end @chrisn proposed the following potential resolution:

I would suggest putting forward a proposed resolution along these lines: with supporting documentation, we would move forward with window and worker. We'd be guiding towards most appropriate usage.

I am supportive of this outcome. @jernoble @youennf would you also be supportive? If so, we could send PRs accordingly to close this discussion and use next week's meeting to focus on the other issues.

chrisn commented 3 years ago

This is the proposed resolution I'm suggesting:

I appreciate some of that documentation is outside the WG’s scope, but should be achievable through input from WG members.

chcunningham commented 3 years ago

Thanks @chrisn. That SGTM

Also, please note my proposed implications for related issue #199.

jan-ivar commented 3 years ago

we make a web game engine that is capable of running entirely in a worker via OffscreenCanvas. In that case the worker thread is also heavily contended ... Making WebCodecs only available in workers does not save us from making a bad decision ... To me this illustrates why ... it doesn't actually prevent making bad decisions, and in many ways can end up acting only as an impediment.

@AshleyScirra The bar for API design that avoids common pitfalls isn't that it must prevent all bad decisions. We make decisions about defaults all the time, which are not preventative, yet clearly have an impact on the shape of the web.

But here, unfortunately, the "default" thread is the main thread, and the pitfall is someone relying on this default when they shouldn't, because "workers are hard" (a point made by people on both sides), and because it "seems to work" (doesn't degrade reliably across browsers and platforms). We don't worry about someone already versed in workers, as creating another worker shouldn't be an impediment for them.