shaka-project / shaka-player

JavaScript player library / DASH & HLS client / MSE-EME player
Apache License 2.0

play an mkv video with multi audio #6254

Closed iptvsmartws closed 6 months ago

iptvsmartws commented 6 months ago

Hello,

I am exploring the possibility of utilizing Shaka Player for developing an application for WebOS and Tizen platforms. The primary function of this application would be to stream Video On Demand (VOD) content directly from a Media Source Extensions (MSE) source in MKV format. Additionally, some of the VOD content will feature multiple audio tracks (e.g., English, French, German).

Could you please inform me whether Shaka Player supports this functionality? Moreover, I would appreciate guidance on whether there are any specific configurations or additional steps required to achieve this.

Thank you for your assistance.

joeyparrish commented 6 months ago

Shaka Player is a web-based player. Container format support on the web is generally up to the web platform. The player fetches a segment of video, but it is fed to the browser, and the container is parsed by the browser.

No browser supports MKV natively. So your only option for playing MKV on the web is for the parsing to happen at the application layer (in JS). Shaka Player doesn't do this.

Shaka Player does have support for transmuxing in common streaming scenarios. Transmuxing is parsing one type of container, like MPEG2-TS, and reconstructing it into another type of container the browser understands, like MP4. Nobody does web-based streaming in MKV, so we don't have an MKV transmuxer.

Further, web-based streaming formats are generally segmented. Broadly speaking, each chunk is parseable and decodable independently of every other chunk. A manifest or playlist gives the player information about each segment, and the player decides what to fetch and when. MKV does not, as far as I know, have a segmented version appropriate for streaming.
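As a rough illustration of the segmented model described above, the sketch below keeps a tiny in-memory segment index and looks up which segment covers a given playback time. The `segments` array, its field names, the placeholder URLs, and `findSegment` are all hypothetical and are not Shaka Player APIs; a real manifest parser produces a much richer structure.

```javascript
// Hypothetical segment index, roughly what a manifest or playlist describes.
// Times are in seconds; the URLs are placeholders.
const segments = [
  {url: 'seg-0.mp4', start: 0, end: 4},
  {url: 'seg-1.mp4', start: 4, end: 8},
  {url: 'seg-2.mp4', start: 8, end: 12},
];

// Return the segment whose [start, end) interval contains `time`, or null
// if no segment covers that time.
function findSegment(index, time) {
  for (const segment of index) {
    if (time >= segment.start && time < segment.end) {
      return segment;
    }
  }
  return null;
}
```

With an index like this, a player only needs to fetch the segment that covers the current playhead (plus whatever it wants to buffer ahead), rather than the whole file.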

So to play an MKV with Shaka Player, you would need a way to fetch it in segments (which may not exist), and you'd need to write a transmuxer to parse MKV (we don't have this) and generate MP4 for the browser (we do have this).

If you want to play MKV with any other web-based player or with your own code interfacing to MediaSource Extensions, you will probably need roughly the same set of things: segmented format, MKV parser, MP4 generator.

You might also be interested in the WebM format. It's a variant (maybe subset or profile?) of MKV designed for web streaming, and there is a segmented version of it. It is natively supported in some browsers, but not all of them.

Does this answer your question?

iptvsmartws commented 6 months ago

Thank you for the detailed explanation regarding the challenges of playing MKV files through Shaka Player on web platforms. I appreciate the insights into the limitations of container format support and the intricacies of transmuxing and segmented streaming.

I plan to deploy my application on Tizen and WebOS smart TVs, utilizing their native web browsers. My understanding is that these environments might offer more flexibility compared to standard web browsers on PCs or mobile devices. Could you please advise if the Tizen and WebOS browsers mitigate some of the limitations mentioned, especially regarding container format support and streaming capabilities?

Additionally, I realized that the discussion did not cover the aspect of multi-audio support, which is crucial for my application as we intend to offer VOD content with multiple audio tracks (e.g., English, French, German). Does Shaka Player support the management of multiple audio tracks in streaming scenarios, and if so, are there specific considerations or configurations required to enable this feature effectively on platforms like Tizen and WebOS?

Looking forward to your guidance on these points.

joeyparrish commented 6 months ago

> Could you please advise if the Tizen and WebOS browsers mitigate some of the limitations mentioned, especially regarding container format support and streaming capabilities?

Not that I know of. In my experience, smart TV browsers are like regular browsers, but 7 years out of date with respect to standards and with a custom media pipeline that nobody understands or can debug because it's closed source. For example: a certain version of Tizen TVs will mute the audio of any stream whose internal media timestamps exceed some arbitrary number like 2**48 milliseconds. 🤷 I would bet money that you're not going to find that they support MKV out of the box. You can check the Tizen docs, though, or write a small app that logs to the screen the results of MediaSource.isTypeSupported('video/x-matroska; codecs="avc1.42E01E,mp4a.40.2"'). That is a baseline H.264+AAC codec string.
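A minimal sketch of such a probe app might look like the following. The Matroska string is the one given in the comment above; the MP4 and WebM strings, the `CANDIDATE_TYPES` name, and the `probeTypes` helper are additions for illustration only. `MediaSource.isTypeSupported` is the standard MSE static method.

```javascript
// MIME/codec strings to probe. The first comes from the comment above;
// the other two are assumptions added here for comparison.
const CANDIDATE_TYPES = [
  'video/x-matroska; codecs="avc1.42E01E,mp4a.40.2"',
  'video/mp4; codecs="avc1.42E01E,mp4a.40.2"',
  'video/webm; codecs="vp9,opus"',
];

// `mediaSource` is anything with an isTypeSupported(type) method, normally
// the global MediaSource. Passing it in keeps the helper easy to test.
function probeTypes(mediaSource, types) {
  const results = {};
  for (const type of types) {
    results[type] = Boolean(mediaSource && mediaSource.isTypeSupported(type));
  }
  return results;
}

// On a device, write the results into the page so they show on screen.
if (typeof MediaSource !== 'undefined' && typeof document !== 'undefined') {
  const pre = document.createElement('pre');
  pre.textContent =
      JSON.stringify(probeTypes(MediaSource, CANDIDATE_TYPES), null, 2);
  document.body.appendChild(pre);
}
```

Given how dated some TV browsers are, this may need to be transpiled to older JavaScript syntax before it runs on a given device. Also note that a `true` result only says the platform claims support; it is still worth testing real playback.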

> Additionally, I realized that the discussion did not cover the aspect of multi-audio support, which is crucial for my application as we intend to offer VOD content with multiple audio tracks (e.g., English, French, German). Does Shaka Player support the management of multiple audio tracks in streaming scenarios, and if so, are there specific considerations or configurations required to enable this feature effectively on platforms like Tizen and WebOS?

Yes, Shaka supports multiple audio tracks and languages, but what you need to understand is that the industry generally does not do this through something akin to a single MKV with multiple audio tracks in it. In general, video and audio are separate, and we only fetch segments from the currently-active audio track/language. A single file is a terrible streaming format, and so DASH and HLS have been created to contain the metadata and recommend how to separate the content into streams and segments.

For an example of multi-lingual DASH content, check out this clip in our demo:

https://shaka-player-demo.appspot.com/demo/#asset=https://storage.googleapis.com/shaka-demo-assets/angel-one/dash.mpd;audiolang=en

You can change the audio language on the fly in the UI, which causes the player to switch to another audio stream. There are audio tracks for English, French, Spanish, German, and Italian. Each of those languages is a separate file containing only audio. This is how streaming works today.
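For switching languages from application code rather than the UI, a minimal sketch could look like this. It assumes `player` is a `shaka.Player` instance with content already loaded, and it uses `getAudioLanguages()` and `selectAudioLanguage()` as I understand them from the `shaka.Player` API around the time of this thread; method names have changed between releases, so check them against the version you ship. The `switchAudioLanguage` wrapper itself is hypothetical.

```javascript
// Switch the audio language if the loaded content offers it.
// Assumption: `player` behaves like a shaka.Player with content loaded.
// Returns true if the switch was requested, false if the language is absent.
function switchAudioLanguage(player, language) {
  const available = player.getAudioLanguages();
  if (!available.includes(language)) {
    return false;
  }
  player.selectAudioLanguage(language);
  return true;
}
```

In an app, this would typically be called after `player.load(manifestUrl)` resolves, for example `switchAudioLanguage(player, 'fr')`. An initial preference can also be expressed through the player configuration (a `preferredAudioLanguage` setting in the versions I am aware of), again subject to the version in use.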

You can look at the MPD (DASH manifest) file to see how it is organized: https://storage.googleapis.com/shaka-demo-assets/angel-one/dash.mpd

Here's a simplified snippet from that file showing how audio for Spanish and German are described:

    <AdaptationSet id="4" contentType="audio" lang="es">
      <Representation id="4" codecs="mp4a.40.2" mimeType="audio/mp4">
        <BaseURL>audio_es_2c_128k_aac.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
    <AdaptationSet id="5" contentType="audio" lang="de">
      <Representation id="5" codecs="mp4a.40.2" mimeType="audio/mp4">
        <BaseURL>audio_de_2c_128k_aac.mp4</BaseURL>
      </Representation>
    </AdaptationSet>

With this, we can fetch segments of the current audio stream, and not waste bandwidth on anything we won't decode and use.
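To make that concrete, here is a small sketch of the selection step, using a plain-object mirror of the two `AdaptationSet` entries in the snippet above. The object shape and the `audioUrlForLanguage` helper are hypothetical and only illustrate the idea; Shaka Player's own manifest model is different and more detailed.

```javascript
// Plain-object mirror of the two audio AdaptationSets shown above.
const audioAdaptationSets = [
  {
    lang: 'es',
    codecs: 'mp4a.40.2',
    mimeType: 'audio/mp4',
    baseUrl: 'audio_es_2c_128k_aac.mp4',
  },
  {
    lang: 'de',
    codecs: 'mp4a.40.2',
    mimeType: 'audio/mp4',
    baseUrl: 'audio_de_2c_128k_aac.mp4',
  },
];

// Return the audio file for the requested language, or null if the
// manifest has no audio in that language.
function audioUrlForLanguage(sets, lang) {
  const match = sets.find((set) => set.lang === lang);
  return match ? match.baseUrl : null;
}
```

Only the file returned for the active language needs to be fetched; the files for the other languages are never downloaded unless the user switches.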

If we had to stream everything from a single file, to get the same efficiency, we would need an extremely detailed map showing which byte ranges belong to which time ranges of which MKV tracks/languages.

In general, these "demuxed" streams (one track per file) are the way to go.

For a personal project, you can totally buck the trend if you want to. You can even fork Shaka Player and build your own streaming format based on a single MKV and some extra metadata. There are plugin interfaces for everything you would want to customize, or you can modify the player library directly.

But I would strongly advise you to consider the possibility that the multi-hundred-billion-dollar streaming industry has worked out a good way to stream that is relatively cheap and easy. I'm sure further innovations and improvements will come, but all of the streaming formats today have these common traits that solve basic problems: metadata formats that are easy to parse; separate streams for each video resolution, bitrate, and audio language; and short chunks/segments.

There are a ton of open source tools available for you to produce DASH & HLS content (the two dominant formats today) for streaming, including our own Shaka Packager: https://github.com/shaka-project/shaka-packager

We also have a higher-level project called Shaka Streamer that combines FFmpeg and Shaka Packager to ingest any input format (like MKV), separate your streams, and produce output in DASH, HLS, or both, ready to be streamed to any number of players and platforms: https://github.com/shaka-project/shaka-streamer

If you don't like our tools, there are many, many open-source tools out there that are solid and well-respected in the industry.

shaka-bot commented 6 months ago

Closing due to inactivity. If this is still an issue for you or if you have further questions, the OP can ask shaka-bot to reopen it by including @shaka-bot reopen in a comment.