Vanilagy / webm-muxer

WebM multiplexer in pure TypeScript with support for WebCodecs API, video & audio.
https://vanilagy.github.io/webm-muxer/demo
MIT License
197 stars, 12 forks

Support multiple audio tracks #36

Closed yume-chan closed 1 month ago

yume-chan commented 3 months ago

Maybe it's only me who needs this, so I'm OK if this PR is not accepted.

I added support for multiple audio tracks, including:

  1. User defined audio track numbers/IDs.
  2. User defined audio track names.
  3. Changes to the chunk sorting algorithm to support any number of tracks.
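For illustration, here is a minimal sketch of how sorting chunks across any number of tracks (item 3) could work. It is not the code from this PR. `Chunk`, `ChunkInterleaver`, and the `emit` callback are hypothetical names, and the sketch assumes that timestamps within each track never decrease.

```typescript
// Hypothetical sketch, not the PR's actual implementation.
interface Chunk {
  trackNumber: number;
  timestamp: number; // microseconds
}

class ChunkInterleaver {
  // One FIFO queue per track, keyed by track number.
  private readonly queues = new Map<number, Chunk[]>();
  private readonly emit: (chunk: Chunk) => void;

  constructor(trackNumbers: number[], emit: (chunk: Chunk) => void) {
    this.emit = emit;
    for (const n of trackNumbers) this.queues.set(n, []);
  }

  add(chunk: Chunk): void {
    const queue = this.queues.get(chunk.trackNumber);
    if (!queue) throw new Error(`Unknown track ${chunk.trackNumber}`);
    queue.push(chunk);
    this.drain(false);
  }

  // Call once no more chunks will arrive, to emit whatever is still queued.
  flush(): void {
    this.drain(true);
  }

  private drain(force: boolean): void {
    for (;;) {
      let best: Chunk | undefined;
      let bestQueue: Chunk[] | undefined;
      for (const queue of this.queues.values()) {
        const head = queue[0];
        if (head === undefined) {
          if (force) continue;
          // A track has nothing queued, so an earlier chunk may still arrive.
          return;
        }
        if (best === undefined || head.timestamp < best.timestamp) {
          best = head;
          bestQueue = queue;
        }
      }
      if (best === undefined || bestQueue === undefined) return;
      bestQueue.shift();
      this.emit(best);
    }
  }
}
```

A chunk is only emitted once every track has at least one chunk waiting, so the output stays ordered by timestamp no matter how many tracks there are.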

With 3. it's trivial to also support multiple video tracks, but I really don't know who needs that...

My biggest concern is that the API change makes the library harder to use for users who only need one audio track, because they always need to specify audio track numbers (in options and in addAudioChunk). Maybe:

Vanilagy commented 3 months ago

This is lovely, thanks for the PR! I knew multiple tracks were coming, especially with audio, since this is often used for multiple languages in movies, but I'm sure there are other uses for it too. Perhaps multiple video tracks are also useful for the same reason - as a kid, I watched a lot of Pixar movies, and they would translate on-screen text (like a book) into the selected audio language.

Anyhow, I'll of course have to read through this PR more thoroughly and test it (I only skimmed over the code for now, but looks promising), but I wouldn't be opposed to adding this feature.

You recognized that the biggest challenge here is likely the API. Let me ramble about this: For a while now, I have been toying with the idea of creating a more "meta" library that uses my muxers under the hood. Think FFmpeg but without FFmpeg, only WebCodecs. "Meta" in the sense that it also takes away the effort of having to set up encoders manually, but you can give it a canvas, webcam stream, microphone stream, audio file, whatever, directly, since I see a lot of people in Issues struggling with WebCodecs usage issues instead of library issues.

If I did this, I'd probably switch from an "object configuration"-style API to a more programmatic one, where you'd call methods like .addVideoTrack on the muxer. This API change would then effortlessly support multiple tracks of any media, be it audio, video, or even subtitles. And then adding chunks would be a method on the track, not on the muxer, making the association unambiguous. The question is if it's worth changing to that API or if I should keep it somewhat backwards compatible. I personally don't mind breaking things, since people can just stay on the old version, and migrating wouldn't take long anyway. So, I need to see!
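To make the idea concrete, here is a rough sketch of what such a track-object API might look like. Only the name .addVideoTrack comes from the comment above; `SketchMuxer`, `SketchTrack`, `addAudioTrack`, and `addChunk` are hypothetical and do not exist in webm-muxer. The sketch only records chunks per track and writes no actual WebM data.

```typescript
// Hypothetical API sketch; these classes are not part of webm-muxer.
type TrackKind = "video" | "audio";

interface RecordedChunk {
  timestamp: number; // microseconds
  byteLength: number;
}

class SketchTrack {
  readonly kind: TrackKind;
  readonly chunks: RecordedChunk[] = [];

  constructor(kind: TrackKind) {
    this.kind = kind;
  }

  // The chunk belongs to this track by construction, so the caller
  // never has to pass a track number.
  addChunk(data: Uint8Array, timestamp: number): void {
    this.chunks.push({ timestamp, byteLength: data.byteLength });
  }
}

class SketchMuxer {
  readonly tracks: SketchTrack[] = [];

  addVideoTrack(): SketchTrack {
    return this.addTrack("video");
  }

  addAudioTrack(): SketchTrack {
    return this.addTrack("audio");
  }

  private addTrack(kind: TrackKind): SketchTrack {
    const track = new SketchTrack(kind);
    this.tracks.push(track);
    return track;
  }
}
```

With this shape, a caller who needs a single audio track makes one addAudioTrack call and uses the returned object, while a caller who needs several just calls it again.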

yume-chan commented 3 months ago

> since this is often used for multiple languages in movies,

I considered that, but IMO the likelihood of someone producing a multi-language video in a browser is pretty low.

This use case also needs the language element; I can add that.

> but I'm sure there are other uses for it too

In my case, I want to record desktop audio and microphones (maybe multiple microphones, if the device has them) at the same time, and into separate audio tracks (for editing later).

With multiple video tracks, it can record a whole video conference, with each speaker's camera and microphone separated into their own tracks.

> where you'd call methods like .addVideoTrack on the muxer.

I like this design. Actually, the code in my project already uses a wrapper that creates individual track objects for adding chunks to them. I think it's very similar to what you are describing. I think it can also allocate track numbers automatically.

This design also doesn't make using only one video track and one audio track more complex.

I think Matroska doesn't allow adding tracks after starting, so it might need a start method to write tracks?
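A minimal sketch of that lifecycle, assuming the track list really does have to be written once up front: track numbers are allocated automatically, and a start method freezes the list. `TrackTable`, `TrackEntry`, and the method names are hypothetical, not part of webm-muxer or my wrapper.

```typescript
// Hypothetical sketch; names are illustrative only.
interface TrackEntry {
  number: number; // Matroska TrackNumber; 0 is not a valid value
  type: "video" | "audio";
  name: string;
}

class TrackTable {
  private readonly entries: TrackEntry[] = [];
  private started = false;

  // Allocates the next free track number, starting at 1.
  addTrack(type: "video" | "audio", name: string = ""): TrackEntry {
    if (this.started) {
      throw new Error("Cannot add a track after start()");
    }
    const entry: TrackEntry = { number: this.entries.length + 1, type, name };
    this.entries.push(entry);
    return entry;
  }

  // Freezes the track list. The returned entries are what a muxer
  // would write into the file header before any chunk.
  start(): readonly TrackEntry[] {
    this.started = true;
    return this.entries;
  }
}
```

Throwing in addTrack after start makes the restriction visible at the call site instead of silently producing a file whose header is missing a track.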

Vanilagy commented 3 months ago

Gonna take a while 'til I properly get to this PR, so feel free to use your own fork or something in the meantime so you're not blocked building whatever you need to build.

Vanilagy commented 1 month ago

I think I'll close this PR for now. I think the general idea is good, it's just that I would use a different API for it which would fundamentally change the entire API of this library. Since I like to keep mp4-muxer in sync with this library API-wise, I'd also need to adjust that lib. Since I have plans to unite the two in the future anyways, I will implement multiple tracks then. In which case I'll use your implementation efforts here as reference!

Again, feel free to run your own fork until then, nothing wrong with that.