ennuicastr / libavjs-webcodecs-polyfill

A polyfill for the WebCodecs API. No, really.

Support of MediaStreamTrackProcessor, MediaStreamTrackGenerator and native rendering? PR welcome? #9

Closed. martenrichter closed this issue 1 year ago.

martenrichter commented 1 year ago

Hi, first, thanks for the polyfill and also for libav.js. In one of my projects I have been using features such as WebTransport and WebCodecs for audio and video quite intensively. Now comes the point where I also have to support some legacy browsers such as Firefox and Safari.

The polyfill is probably a good start, but I am missing some features. (This is not actually a feature request, but rather a question of whether PRs would be welcome in the future.)

This is a very rough sketch. Does it make sense to include something like this in your polyfill? Are you open to a PR? Or should I put this somewhere else?

Yahweasel commented 1 year ago

As a Firefox user, I don't appreciate my browser of choice being called a “legacy” browser ;)

I briefly considered including MediaStreamTrackProcessor and MediaStreamTrackGenerator when I initially created this polyfill, but I ultimately decided against it with the justification to myself that they're not actually part of the WebCodecs specification. They are sufficiently related that I think it's perfectly reasonable for them to be in the same polyfill, so I would be unopposed to PRs, but am unlikely to add anything of the sort myself. Alternatively, a separate, compatible polyfill might make more sense, since it is a separate specification. I'm ambivalent.
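For anyone weighing such a PR: the observable shape of `MediaStreamTrackProcessor` is small; it is essentially an object exposing a `ReadableStream` of frames. A rough sketch under that assumption (the `frameSource` callback interface below is an invented stand-in for a real `MediaStreamTrack`, and none of these names are part of this polyfill):

```javascript
// Minimal sketch of the MediaStreamTrackProcessor surface: an object whose
// `readable` property is a ReadableStream of frames. A real implementation
// would pull frames from a MediaStreamTrack; here, `frameSource` is any
// object on which we can install onframe/onended callbacks.
class TrackProcessorSketch {
  constructor(frameSource) {
    this.readable = new ReadableStream({
      start(controller) {
        // Each incoming frame is enqueued onto the stream...
        frameSource.onframe = (frame) => controller.enqueue(frame);
        // ...and the stream closes when the source ends.
        frameSource.onended = () => controller.close();
      }
    });
  }
}
```

A `MediaStreamTrackGenerator` polyfill would be the mirror image: a `WritableStream` whose sink renders or forwards each written frame.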

The MediaSource extensions are more relevant but also, AFAIK, hyper-experimental even by these standards? I haven't looked at it in a while. But, yes, this is the right place for such things, and yes, of course, libav.js supports muxing and demuxing in all the usual formats (indeed, I regularly pair this polyfill with libav.js muxers and demuxers).

martenrichter commented 1 year ago

> As a Firefox user, I don't appreciate my browser of choice being called a “legacy” browser ;)

I also use Firefox most of the time, but for anything involving media and higher-speed rendering, I have to go to a Chromium-based browser.

> I briefly considered including MediaStreamTrackProcessor and MediaStreamTrackGenerator when I initially created this polyfill, but I ultimately decided against it with the justification to myself that they're not actually part of the WebCodecs specification. They are sufficiently related that I think it's perfectly reasonable for them to be in the same polyfill, so I would be unopposed to PRs, but am unlikely to add anything of the sort myself. Alternatively, a separate, compatible polyfill might make more sense, since it is a separate specification. I'm ambivalent.

OK, let's see. I will definitely implement something like this; once I have it, I can show it to you and then convert it, if you want to include it. In any case, your polyfill will be the backbone of my compatibility layer for codecs.

> The MediaSource extensions are more relevant but also, AFAIK, hyper-experimental even by these standards?

Not really; I have looked into the spec, and all major browsers have been on board for a while. It has also been in production much longer than WebCodecs, though of course, due to the need to remux and perhaps some added delay, it is not as good as WebCodecs...

> I haven't looked at it in a while. But, yes, this is the right place for such things, and yes, of course, libav.js supports muxing and demuxing in all the usual formats (indeed, I regularly pair this polyfill with libav.js muxers and demuxers).

Do you have a pointer on how to handle the muxer's output when it comes to fragmenting? Alternatively, one could use https://www.npmjs.com/package/mux.js#manual-build , but that would be limited to MP4-compatible codecs, so one would have to use webm-muxer as well... It may be simpler to implement, but I have not read very deeply into the libavformat interface for fragmented output.

Yahweasel commented 1 year ago

To be honest, I have no idea what "fragment" means in this context as I haven't used these interfaces. libav needs a bit of massaging with MP4 to make streamable output, because MP4 isn't a naturally streamable format, but in general, for formats where it's possible, libav uses a one-in-one-out system: you pass in an encoded frame, it writes the chunk (fragment?) of the file that corresponds to that frame.
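The one-in-one-out pattern described above can be sketched as follows. The `muxer` object here (with `writePacket`/`finalize`) is a hypothetical stand-in, not the actual libav.js API:

```javascript
// Illustrative one-in-one-out muxing loop: each encoded packet fed to the
// muxer yields the bytes of the output file that correspond to that packet,
// so output can be streamed out immediately rather than waiting for a
// complete file. The muxer interface here is invented for illustration.
function muxStreaming(muxer, packets, onChunk) {
  for (const packet of packets) {
    const chunk = muxer.writePacket(packet); // one in...
    if (chunk && chunk.length) onChunk(chunk); // ...one out
  }
  // Some formats also emit a trailer when the stream is finalized.
  const trailer = muxer.finalize();
  if (trailer && trailer.length) onChunk(trailer);
}
```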

https://github.com/Yahweasel/libav.js/blob/master/tests/test-muxing-device.js and https://github.com/Yahweasel/libav.js/blob/master/tests/test-demuxing-device.js demonstrate muxing and demuxing, respectively, in a streaming format with libav.js.

martenrichter commented 1 year ago

The situation with fragments is as follows: for all modern streaming services, there is not one big file but many files, each containing chunks or fragments for a specific period of time (or for a specific language, for audio). For MP4, for example, it is specified which objects should be contained in each fragment (or segment, or chunk). See, for example, the options for fragmented output at https://ffmpeg.org/ffmpeg-formats.html#mov_002c-mp4_002c-ismv , or https://ffmpeg.org/ffmpeg-formats.html#dash-2 , or for WebM https://ffmpeg.org/ffmpeg-formats.html#webm_005fchunk .
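For concreteness, a typical ffmpeg invocation for fragmented MP4 output, using the `movflags` options documented at the first link above (filenames are placeholders):

```shell
# Remux an existing file into fragmented MP4 (fMP4): an initial moov box with
# no sample data (empty_moov), then one moof+mdat fragment per keyframe
# (frag_keyframe). This is the kind of layout MSE expects for MP4 input.
ffmpeg -i input.mp4 -c copy -movflags +frag_keyframe+empty_moov output-fragmented.mp4
```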

And the Media Source Extensions require exactly these formats as input, since MSE was written specifically for streaming services, which use segmented HTTP transfer rather than actual streaming.

To use MSE, the muxer should supply these fragments directly so that they can be appended to MSE buffers; otherwise additional parsing has to be done, and the normal way of outputting to just one file does not fit.
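The consuming side of this is mechanical; the main subtlety is that a `SourceBuffer` accepts only one `appendBuffer()` at a time, so appends must be queued until the previous one fires `updateend`. A minimal queueing sketch, where `sourceBuffer` is any object with that interface (the real one comes from `MediaSource.addSourceBuffer`):

```javascript
// Queue muxer fragments into an MSE SourceBuffer. appendBuffer() throws if
// called while sourceBuffer.updating is true, so each fragment waits for
// the previous append's `updateend` event before being submitted.
class FragmentAppender {
  constructor(sourceBuffer) {
    this.sb = sourceBuffer;
    this.queue = [];
    // Whenever an append finishes, try to submit the next queued fragment.
    this.sb.addEventListener("updateend", () => this.pump());
  }
  append(fragment) {
    this.queue.push(fragment);
    this.pump();
  }
  pump() {
    if (this.sb.updating || this.queue.length === 0) return;
    this.sb.appendBuffer(this.queue.shift());
  }
}
```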

Yahweasel commented 1 year ago

Aha, I see!

Well, ffmpeg obviously supports fragmented formats, and libav.js is just a compilation of ffmpeg, but it looks like it might be a bit tricky to snag the individual files. I think your best bet would be to use a custom filesystem API in the underlying Emscripten ( https://emscripten.org/docs/api_reference/Filesystem-API.html ) to catch each individual file. Alternatively, perhaps I should implement a version of the device file I already have that presents a directory, so any file in that directory is shuttled to the user. Alternatively alternatively, perhaps libav.js isn't the ideal solution; the major benefit it provides is simply that it supports everything. A custom library for a specific purpose may be better suited for that specific purpose.

martenrichter commented 1 year ago

I am not sure a custom file system API would be a good idea, since it would add delay, and it may be that the muxer requires writing the full file at once. I have to figure out whether an interface for in-memory output exists and whether the muxer allows live streaming. But at least for MP4, the JS lib I posted above seems to be a better fit. The WebM libs I have found so far lack fragmenting... The bad thing is that the codecs are probably tied to specific containers...

Yahweasel commented 1 year ago

Naw, the file system API doesn't work that way. It just intercepts, e.g., write calls. It's like FUSE. It would add no relevant delay.
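A toy illustration of that idea: the muxer believes it is writing to a file, but each `write()` call is delivered to a callback immediately, FUSE-style, with no whole-file buffering. The interface below is invented purely for illustration:

```javascript
// Write-call interception: every chunk the "file" receives is handed to
// onWrite right away, so a consumer can stream it out as it arrives.
// Nothing waits for the file to be complete, hence no added latency
// beyond the call itself.
function makeInterceptedFile(onWrite) {
  const chunks = [];
  return {
    write(data) {
      chunks.push(data); // retained only so size() can be answered
      onWrite(data);     // delivered immediately to the consumer
    },
    size() {
      return chunks.reduce((n, c) => n + c.length, 0);
    }
  };
}
```

The caveat raised below still applies: interception alone does not help if the muxer later seeks backwards to rewrite headers.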

martenrichter commented 1 year ago

OK, this sounds better, but if the code thinks it is a file, it may seek and write certain headers at the end of muxing the whole stream, and that is an actual problem if live streaming is to be used. The problem is that these files are often prepared offline, so I have to figure out whether I can tap into an interface for live fragmenting. Either it is already part of ffmpeg itself or it is not.

Yahweasel commented 1 year ago

Yeah, that I simply don't know. libav.js's system is perfectly good for streamed data, but fragmented output is a bit more specific than that, and I simply don't know how it works.

martenrichter commented 1 year ago

I have checked it; everything is very file-system-specific. I think I will stick with the JavaScript muxers; most of them seem to be a good fit for the purpose. I will come back once I have some prototypes.

martenrichter commented 1 year ago

OK, a first prototype of MediaStreamTrackProcessor is done: https://github.com/fails-components/lectureapp/blob/avstuff/src/webcodecs-ponyfills.js (it could be moved out of the main project). However, it turned out that the stream of VideoFrames cannot be transferred, because the VideoFrames get cloned. So if you want to use it, you must implement your own stream that passes and transfers the stream's VideoFrames to the worker. Of course, it is even worse, since Safari does not support transferable streams (at least according to the docs).
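The workaround described above amounts to posting each frame through a message port with a transfer list, so ownership moves to the worker instead of the frame being cloned. A sketch, with an ArrayBuffer standing in for a VideoFrame (VideoFrames are likewise transferable where supported):

```javascript
// Post a single frame over a message port, transferring rather than cloning
// it. After the call, the sender's copy is detached: the receiving side
// (e.g. a worker) now owns the underlying memory.
function sendFrame(port, frameBuffer) {
  port.postMessage({ frame: frameBuffer }, [frameBuffer]);
}
```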

The good news is that Safari seems to have an experimental version of the video part of WebCodecs, which removes the pressure on me to write a polyfill on top of MSE...

martenrichter commented 1 year ago

OK, I am dropping the idea of extending your polyfill for now. I just cannot get the libav.js package working with my module/webpack-based setup on the browser side. I am now trying to get just an Opus decoder/encoder as a fallback. But I may change my mind later.

Yahweasel commented 1 year ago

Noted.