WebPlatformForEmbedded / WPEWebKit

WPE WebKit port (downstream)

[wpe-2.28] Dolby Audio channels not reported correctly #1103

Closed muralitharanperumal2 closed 1 year ago

muralitharanperumal2 commented 1 year ago

We are running the MIT-xperts HbbTV tests and this failure is in the 'Multi-Codec in mp4' test.

Analyzing the test stream (mp4_multi_codec.mp4) via ffprobe:

```
ffprobe -i mp4_multi_codec.mp4 -show_streams -select_streams a:0

[STREAM]
index=1
codec_name=eac3
codec_long_name=ATSC A/52B (AC-3, E-AC-3)
profile=unknown
codec_type=audio
codec_time_base=1/48000
codec_tag_string=ec-3
codec_tag=0x332d6365
sample_fmt=fltp
sample_rate=48000
channels=6
channel_layout=5.1(side)
bits_per_sample=0
dmix_mode=-1
ltrt_cmixlev=-1.000000
ltrt_surmixlev=-1.000000
loro_cmixlev=-1.000000
loro_surmixlev=-1.000000
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/48000
start_pts=0
start_time=0.000000
duration_ts=3890688
duration=81.056000
bit_rate=192000
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=2533
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:creation_time=2013-04-30T14:24:14.000000Z
TAG:language=eng
TAG:handler_name=sound handler
[SIDE_DATA]
side_data_type=Audio Service Type
[/SIDE_DATA]
[/STREAM]
```

ffprobe correctly reports 6 channels (Dolby surround sound).

But the test fails because the reported number of channels does not match the expected value.

I dug a bit into WPE WebKit, and it seems GStreamer is not reporting back the necessary capabilities for the audio stream:


```
0:00:01.043510428  5832 0x7e7a2d70 DEBUG           decodebin gstdecodebin2.c:1521:copy_sticky_events:<'':decodepad2> store sticky event caps event: 0xb19060b0, time 99:99:99.999999999, seq-num 91, GstEventCaps, caps=(GstCaps)"audio/x-eac3\,\ framed\=\(boolean\)true\,\ rate\=\(int\)48000\,\ channels\=\(int\)2";
0:00:01.043536636  5832 0x7e7a2d70 DEBUG            GST_PADS gstpad.c:5320:store_sticky_event:<'':decodepad2> notify caps
0:00:01.043621593  5832 0x7e7a2d70 DEBUG             playbin gstplaybin2.c:4638:autoplug_continue_cb:<playbin0> continue autoplugging group 0x7e7a4328 for '':decodepad2, audio/x-eac3, framed=(boolean)true, rate=(int)48000, channels=(int)2: 1
0:00:01.043659259  5832 0x7e7a2d70 DEBUG        uridecodebin gsturidecodebin.c:1734:proxy_autoplug_continue_signal:<uridecodebin0> autoplug-continue returned 1
0:00:01.043688967  5832 0x7e7a2d70 DEBUG           decodebin gstdecodebin2.c:1416:gst_decode_bin_autoplug_continue:<decodebin0> autoplug-continue returns TRUE
0:00:01.043742174  5832 0x7e7a2d70 DEBUG            GST_CAPS gstpad.c:2737:gst_pad_get_current_caps:<qtdemux0:audio_0> get current pad caps audio/x-eac3, framed=(boolean)true, rate=(int)48000, channels=(int)2
0:00:01.043796923  5832 0x7e7a2d70 DEBUG             playbin gstplaybin2.c:4263:autoplug_factories_cb:<playbin0> factories group 0x7e7a4328 for '':decodepad2, audio/x-eac3, framed=(boolean)true, rate=(int)48000, channels=(int)2
```

So we need to understand why the GStreamer audio sink pad capabilities (audio/x-eac3) are not being read correctly.
eocanha commented 1 year ago

For reference, the test seems to be http://itv.mit-xperts.com/hbbtvtest/dolby/detail.php?id=mp4_multi_codec and the video file seems to be http://d35ru53tpwgj08.cloudfront.net/content/mitXperts/mp4/mp4_multi_codec.mp4

muralitharanperumal2 commented 1 year ago

Yes, that's correct.

eocanha commented 1 year ago

I haven't been able to make the test app work, because apparently it asks whether the browser supports the "application/vnd.hbbtv.xhtml+xml" MIME type, and it doesn't (at least not the development version I have here, from https://github.com/WebPlatformForEmbedded/WPEWebKit). I guess your version of WebKit must be patched to support HbbTV.

Anyway, the next thing I tried was to set up a basic <video> tag pointing to the location of mp4_multi_codec.mp4. It plays fine on a Raspberry Pi 3, but that's only because the avdec_eac3 software audio decoder is plugged in, so it produces audio/x-raw that can be properly connected to the default raw audio sink used on the Pi. See the attached pipeline dump.

0 00 07 571266976-media-player-0 PAUSED_PLAYING dot

I would need to check your pipeline, but in your case it can't be created because the negotiation fails.

One possible solution that I can think of is to customize the filter-caps property of GstAutoAudioSink (see again my pipeline) so that it accepts not only raw audio, but also encoded ac3. What exact audio sink are you using on your platform? Are you able to get any ac3 content playing?
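
For illustration, a minimal sketch of what I mean (the filter-caps property of autoaudiosink is standard GStreamer; where this hooks into your sink-creation code is an assumption):

```c
/* Sketch: extend autoaudiosink's "filter-caps" property so that it accepts
 * encoded (E-)AC3 in addition to raw audio when selecting a sink. */
#include <gst/gst.h>

static GstElement *
create_auto_audio_sink_with_eac3 (void)
{
  GstElement *sink = gst_element_factory_make ("autoaudiosink", "audio-sink");
  GstCaps *filter = gst_caps_from_string ("audio/x-raw; audio/x-ac3; audio/x-eac3");

  g_object_set (sink, "filter-caps", filter, NULL);
  gst_caps_unref (filter);
  return sink;
}
```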

muralitharanperumal2 commented 1 year ago

I captured the pipeline details. Can you please check whether there is anything interesting to look at? I have very limited knowledge of GStreamer pipelines.

multi_codec_playback_PAUSED_PLAYING

eocanha commented 1 year ago

Ok. The first thing I see is that your audio sink isn't based on autoaudiosink. This means that it's forced, and that's good, because that way you rule out any potential problem with autoaudiosink not finding your custom audio sink.

However, from this dump I can also see that playback is working out of the box because, as in the case of the Raspberry Pi audio sink, your sink advertises that it supports audio/x-raw (that is, decoded audio, not ac3), and an avdec_aac software decoder is used to generate that raw audio. Note that a third stream (this time encoded in ac3) is being discarded by the inputselector1 element.

I'm not sure who controls that inputselector1. As far as I know, that kind of setup is usually used to select one audio track among many (usually one audio language among many), and I haven't ever seen that selector being used to select among different audio encodings (raw and ac3). I mean I haven't seen it myself, not that it doesn't work, but I don't usually work with ac3.

Your GstAmlHalAsink audio sink might be able to understand ac3, judging by the audio-indicator="AAC" parameter. First of all, do you have any way to set "AC3" (or something similar) on that parameter and see if the audio sink automatically chooses audio/x-eac3 caps for its sink (input) pads? It depends on your sink (we would need to see the source code to understand what it's capable of). If we're lucky, "convincing" the sink to expose the right audio/x-eac3 caps might force the selection of the ac3 stream in inputselector1.

Those audio/x-eac3 caps I talk about are the ones you can see in your pipeline dump in the data path that goes from multiqueue0:src_1 to proxypad4.

If that doesn't work, you might try to modify the code that creates your custom audio sink and use a bin holding a capsfilter element (set to accept audio/x-eac3) linked to your GstAmlHalAsink. That would force the audio/x-eac3 caps for sure. It would be nice to check it, at least as a test to see how it behaves.
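
Something along these lines (a rough, untested sketch; the "amlhalasink" factory name is an assumption based on the GstAmlHalAsink element visible in your dump):

```c
/* Sketch: wrap the platform sink in a bin that forces audio/x-eac3 on its
 * sink pad via a capsfilter, so negotiation cannot fall back to raw audio. */
#include <gst/gst.h>

static GstElement *
create_eac3_sink_bin (void)
{
  GstElement *bin = gst_bin_new ("eac3-sink-bin");
  GstElement *filter = gst_element_factory_make ("capsfilter", NULL);
  GstElement *sink = gst_element_factory_make ("amlhalasink", NULL); /* assumed factory name */
  GstCaps *caps = gst_caps_from_string ("audio/x-eac3");
  GstPad *pad;

  g_object_set (filter, "caps", caps, NULL);
  gst_caps_unref (caps);

  gst_bin_add_many (GST_BIN (bin), filter, sink, NULL);
  gst_element_link (filter, sink);

  /* Expose the capsfilter's sink pad as the bin's sink pad. */
  pad = gst_element_get_static_pad (filter, "sink");
  gst_element_add_pad (bin, gst_ghost_pad_new ("sink", pad));
  gst_object_unref (pad);

  return bin;
}
```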

An alternative thing that could be tried would be to add some JavaScript to the page (at least experimentally, or even via the web inspector) to query the available audio tracks. If you find that 2 audio tracks are available (so the ac3 one is detected as an audio track of its own, much like if it were a different language even if it's not), maybe you can manually switch to the second audio track (the ac3 one) in JavaScript and see how the pipeline tries to reconfigure.
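
For example, something like this could be pasted into the web inspector console (a sketch; the selector and the assumption that the ac3 track is the second one are guesses):

```js
// Sketch: list the audio tracks of the page's <video> element and enable the
// second one (assumed here to be the ac3/eac3 track) to force a reconfiguration.
const video = document.querySelector('video');
console.log('audio tracks:', video.audioTracks.length);

for (let i = 0; i < video.audioTracks.length; i++) {
  const track = video.audioTracks[i];
  console.log(i, track.id, track.language, track.kind, track.enabled);
  track.enabled = (i === 1); // enable only the second track
}
```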

pgorszkowski-igalia commented 1 year ago

@muralitharanperumal2 : can you provide some more logs from the problematic case?

muralitharanperumal2 commented 1 year ago

@pgorszkowski-igalia Can you please advise what sort of logging you are looking for?

  1. WPE webkit logging - can you please be specific about the log channel to use to get the relevant logging
  2. gstreamer logging
pgorszkowski-igalia commented 1 year ago

@muralitharanperumal2 :

  1. WEBKIT_DEBUG="Media, MediaSource, Network"
  2. GST_DEBUG="*:7"

I know that there will be a lot of logs in that case, but they will also provide a lot of information.

muralitharanperumal2 commented 1 year ago

@pgorszkowski-igalia Please find the attached console log with the requested logging. console_dolby_1.zip

Please let me know if you need any further info.

pgorszkowski-igalia commented 1 year ago

The main problem is in GStreamer: it does not handle the 'ec-3' and 'dec3' boxes in the stsd container:

```
0:01:13.458134042    97 0x7dbe0b90 INFO                 qtdemux qtdemux.c:13403:qtdemux_parse_trak:<qtdemux0> unhandled type ec-3
0:01:13.458164501    97 0x7dbe0b90 INFO                 qtdemux qtdemux.c:13408:qtdemux_parse_trak:<qtdemux0> type ec-3 caps audio/x-eac3, framed=(boolean)true
```

Your GStreamer version is 1.18.5:

```
0:00:00.000202417    41 0x7ec1ce00 INFO                GST_INIT gst.c:586:init_pre: Initializing GStreamer Core Library version 1.18.5
```

ffprobe handles these boxes, which is why the number of channels is 6 in the ffprobe output.

pgorszkowski-igalia commented 1 year ago

```
  [trak] size=8+3151
    [tkhd] size=12+80, flags=7
      enabled = 1
      id = 2
...
          [stsd] size=12+53
            entry_count = 1
            [ec-3] size=8+41
              data_reference_index = 1
              channel_count = 2
              sample_size = 16
              sample_rate = 48000
              [dec3] size=8+5
                data_rate = 192
                complexity_index_type_a = 0
                [00] = fscod=0, bsid=16, bsmod=0, acmod=7, lfeon=1, num_dep_sub=0, chan_loc=0
```

This is how the 'x-eac3' audio track looks in mp4_multi_codec.mp4. Even though the ec-3 box contains channel_count=2, that value should be adjusted with data from the [dec3] box. This is where ffmpeg parses the dec3 box and adjusts the channel_layout: https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/mov.c#L833
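
For reference, the adjustment boils down to deriving the channel count from the acmod and lfeon fields of the dec3 box instead of trusting channel_count in the ec-3 sample entry. A minimal sketch (illustrative names; the mapping follows the E-AC-3 spec, ETSI TS 102 366):

```c
/* Sketch of the adjustment qtdemux is missing: compute the channel count from
 * the dec3 (EC3SpecificBox) fields rather than the ec-3 channel_count field. */
#include <glib.h>

static guint
eac3_channels_from_dec3 (guint acmod, gboolean lfeon)
{
  /* Full-bandwidth channels for each audio coding mode (acmod 0..7). */
  static const guint acmod_channels[] = { 2, 1, 2, 3, 3, 4, 4, 5 };

  guint channels = acmod_channels[acmod & 0x7];
  if (lfeon)
    channels += 1;   /* LFE channel present */
  return channels;
}

/* For mp4_multi_codec.mp4: acmod=7, lfeon=1 -> 5 + 1 = 6 channels, matching
 * what ffprobe reports, instead of the 2 stored in the ec-3 box. */
```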

pgorszkowski-igalia commented 1 year ago

It seems that support for the 'ec-3' and 'dec3' boxes in the stsd container is not fully implemented even in the latest GStreamer, so we will need to add it there and then provide a patch for you to backport the missing functionality to your version of GStreamer.

muralitharanperumal2 commented 1 year ago

Thanks @pgorszkowski-igalia Is there any ETA for the proposed WPE webkit patch?

pgorszkowski-igalia commented 1 year ago

@muralitharanperumal2 : the patch (in gst-plugins-good) is almost ready, but I'm not sure how long the review will take.

muralitharanperumal2 commented 1 year ago

Thanks @pgorszkowski-igalia. Is there a way for me to test it locally by integrating it into our WPE WebKit 2.28?

pgorszkowski-igalia commented 1 year ago

@muralitharanperumal2 : the patch is in review: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5019

In the meantime I can prepare a patch which you can use for your local test, but keep in mind that the patch is not for WPEWebKit but for gst-plugins-good.

muralitharanperumal2 commented 1 year ago

Thanks @pgorszkowski-igalia

pgorszkowski-igalia commented 1 year ago

@muralitharanperumal2 : here is a diff for gst-plugins-good version 1.18.5: isomp4_fix_eac3_channel_count_handling_in_demuxer_gst_plugins_good_1.18.5.txt

pgorszkowski-igalia commented 1 year ago

Please let me know if it fixes your problem

muralitharanperumal2 commented 1 year ago

@pgorszkowski-igalia I couldn't apply the patch as-is, so I applied the changes manually (please find the attached gst_files.zip). The test still fails with the patch. Please find the attached log (console_d4.log). console_d4.zip gst_files.zip

pgorszkowski-igalia commented 1 year ago

@muralitharanperumal2 : and do you get the same error with the test case? From the logs I see that in the eac3 case the number of channels is now correct:

```
0:00:32.138226342    99 0x7e944130 DEBUG               GST_CAPS gstpad.c:2737:gst_pad_get_current_caps:<'':decodepad2> get current pad caps audio/x-eac3, framed=(boolean)true, rate=(int)48000, channels=(int)6, channel-mask=(bitmask)0x0000000000000000
```

I don't know how your code gathers information about audio streams. I think it is outside of WPEWebKit; I suspect it is done in your code responsible for implementing the "getCurrentActiveComponents" function of the OIPF specification.

pgorszkowski-igalia commented 1 year ago

@muralitharanperumal2 : is the code responsible for the HbbTV/OIPF part public, so that I can analyze it?

muralitharanperumal2 commented 1 year ago

Apologies @pgorszkowski-igalia for the delayed response. I am looking at the application code that retrieves this info. At the moment, that code is not public. I will check with the team concerned and update you with the findings.

pgorszkowski-igalia commented 1 year ago

@muralitharanperumal2 : do you have any feedback? can we close this ticket?

muralitharanperumal2 commented 1 year ago

@pgorszkowski-igalia Once again apologies. I have shared the logs with the patch with the partner who knows the code that retrieves the channel info. I will update you with their findings.

muralitharanperumal2 commented 1 year ago

@pgorszkowski-igalia In https://github.com/WebPlatformForEmbedded/WPEWebKit/blob/wpe-2.38/Source/WebCore/html/track/TrackBase.h I can see that only a few properties are set. Once we successfully parse the audio decoder caps and identify the number of channels for audio, can you please explain how these properties get populated in the AudioTrackList and are then retrieved via getCurrentActiveComponents?

We do the following when we call getCurrentActiveComponents to extract the audio properties:

```js
const audioTracks = videoElement.audioTracks;
if ((returnAllComponentTypes || componentType === this.COMPONENT_TYPE_AUDIO) && audioTracks) {
  for (let i = 0; i < audioTracks.length; ++i) {
    const audioTrack = audioTracks[i];
    if (!onlyActive || audioTrack.enabled) {
      components.push({
        // AVComponent properties
        componentTag: audioTrack.id,
        pid: parseInt(audioTrack.id),
        type: this.COMPONENT_TYPE_AUDIO,
        encoding: audioTrack.encoding ? audioTrack.encoding.split('"')[1] : undefined,
        encrypted: audioTrack.encrypted,
        // AVAudioComponent properties
        language: audioTrack.language ? audioTrack.language : 'und',
        audioDescription: audioTrack.kind === 'alternate' || audioTrack.kind === 'alternative',
        audioChannels: audioTrack.numChannels,
      });
    }
  }
}
```

This is what we do in the test that fails:

```js
function selectComponentsStage2(index, vc) {
  var shouldBe, activevc, intType = vid.COMPONENT_TYPE_AUDIO;
  try {
    activevc = vid.getCurrentActiveComponents(intType);
  } catch (e) {
    showStatus(false, 'error while calling getCurrentActiveComponents('+intType+')');
    return false;
  }
  if (activevc && activevc.length!==0 && (!hbbtv12 || index<0)) {
    showStatus(false, 'getCurrentActiveComponents returned a non-empty array after unselecting all components');
    return false;
  }
  if (index>=0 && index<vc.length) {
    shouldBe = vc[index];
    try {
      vid.selectComponent(shouldBe);
    } catch (e) {
      showStatus(false, 'cannot select component '+index+' = '+vc[index]);
      return false;
    }
    setTimeout(function() {
      var i = getActiveComponentIdx();
      if (i===-2) {
        showStatus(false, 'error while calling getCurrentActiveComponents('+intType+') after selecting component');
      } else if (i===-1) {
        showStatus(false, 'getCurrentActiveComponents returned invalid component after selecting desired component');
      } else if (i===false) {
        // error already displayed
      } else if (i===index) {
        showStatus(true, 'component should now be selected.');
      } else {
        showStatus(false, "Active component: "+expected[i].displayname+", expected: "+expected[index].displayname);
      }
    }, 2000);
    setInstr('Waiting for component selection to finish...');
    return;
  }
  showStatus(true, 'component should now be selected.');
}
```

Also, please let me know which log channel I need to enable in WebKit to get related info. Thanks.

pgorszkowski-igalia commented 1 year ago

The AudioTrack interface (https://html.spec.whatwg.org/multipage/media.html#audiotrack) does not include attributes such as numChannels or encoding. To get these data from AudioTrack you have to add and implement them in your clone of WebKit. They cannot be added to the official version because that would not be compliant with the specification.

In a newer version of WPE (2.38) there is a new experimental feature ("Track Configuration API") which makes it possible to get the number of channels from the track configuration: https://github.com/WebPlatformForEmbedded/WPEWebKit/blob/wpe-2.38/Source/WebCore/html/track/AudioTrack.idl#L36

So, in theory, in 2.38 the number of channels of an audio track can be taken from audioTrack.configuration.numberOfChannels.

I have checked in upstream and it seems to work there.
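
For example, something like this should work in 2.38 with the Track Configuration API experimental feature enabled (a sketch, untested against your build; the video selector is an assumption):

```js
// Sketch: read the channel count via the Track Configuration API (wpe-2.38).
const video = document.querySelector('video');
for (let i = 0; i < video.audioTracks.length; i++) {
  const track = video.audioTracks[i];
  const cfg = track.configuration;
  console.log(track.id, track.language,
              cfg ? cfg.numberOfChannels : 'no configuration available');
}
```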

muralitharanperumal2 commented 1 year ago

Thanks @pgorszkowski-igalia. We use version wpe-2.28.7. The vanilla wpe-2.28.7 doesn't populate the 'numChannels' property in the AudioTrackList (which the application relies on). From your explanation, what I understand is that we have to modify WPE WebKit 2.28.7 to add support for retrieving numChannels from GStreamer? For example, to get the number of audio tracks from GStreamer we do g_object_get(element, "n-audio", &numTracks, nullptr);. Can you please confirm?

muralitharanperumal2 commented 1 year ago

@pgorszkowski-igalia I captured the PAUSED->PLAYING pipeline with the patch. If we look at inputselector1 (audio), only the audio/x-raw stream is active and not the other one. Any insight? media_player_paused_playing

pgorszkowski-igalia commented 1 year ago

I think you can base it on how it is done in 2.38: https://github.com/WebPlatformForEmbedded/WPEWebKit/blob/wpe-2.38/Source/WebCore/platform/graphics/gstreamer/AudioTrackPrivateGStreamer.cpp#L88
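
Roughly, the idea there is to fill the track configuration from the caps of the GStreamer pad backing the track. A simplified sketch of that approach (not the literal 2.38 code; the function name is illustrative):

```cpp
// Sketch: derive the channel count from the current caps of the GStreamer pad
// that backs the audio track. This works for raw and encoded caps (e.g.
// audio/x-eac3) as long as the demuxer sets the "channels" field, which the
// gst-plugins-good patch above ensures.
#include <gst/gst.h>

static int channelCountFromPad(GstPad* pad)
{
    GstCaps* caps = gst_pad_get_current_caps(pad);
    if (!caps)
        return 0;

    int channels = 0;
    const GstStructure* structure = gst_caps_get_structure(caps, 0);
    gst_structure_get_int(structure, "channels", &channels);
    gst_caps_unref(caps);
    return channels;
}
```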

eocanha commented 1 year ago

I'm currently trying to port a subset of the patch that implements AudioTrackConfiguration to wpe-2.28, and looking for a way to fill the numberOfChannels in there when using playbin2 (the default in 2.28, while upstream it's only implemented for playbin3).

eocanha commented 1 year ago

@muralitharanperumal2, could you check whether commit https://github.com/WebPlatformForEmbedded/WPEWebKit/commit/9fbb8b9d44de87ceafb40f5950b2d67e75ece3cb from branch https://github.com/WebPlatformForEmbedded/WPEWebKit/commits/eocanha/eocanha-142 (based on wpe-2.28), which provides AudioTrackConfiguration (codec, sampleRate, numberOfChannels and bitrate), gives you the info that you require for your use case?

If that's the case, I can integrate it in the official wpe-2.28 branch.

muralitharanperumal2 commented 1 year ago

@eocanha Thanks so much for the patch. Recently (about a week ago) we moved to wpe-2.38, and I guess these changes are already present in that version, so we don't require this patch any more? Please advise.

eocanha commented 1 year ago

Yeah, in principle the patch shouldn't be required there, because the original https://github.com/WebPlatformForEmbedded/WPEWebKit/commit/6a9975246838 commit is already included in wpe-2.38.

Can we close this issue, or is anything still pending on it?

muralitharanperumal2 commented 1 year ago

@eocanha thanks. I think we can close this. If needed we can reopen it. Thanks.