WebPlatformForEmbedded / WPEWebKit

WPE WebKit port (downstream)
210 stars 135 forks source link

[wpe-2.38][MSE] Full TrackQueue leak in YT cert WidevineMultiMediaKeysSession #1290

Closed asurdej-comcast closed 2 months ago

asurdej-comcast commented 4 months ago

Continuation of https://github.com/WebPlatformForEmbedded/WPEWebKit/issues/1285 for memory related issues in YT cert tests

I've spotted that WidevineMultiMediaKeysSession (EME YT cert) test leaks memory still despite disabling media controls (hardcoded in media element). In this case logs show that ~HTMLMediaElement(), ~MediaSource(), ~SourceBuffer() are all destroyed correctly but SourceBufferPrivateGStreamer and its internal AppendPipeline are not. Only two appendpielines are destroyed (one audio and one video), and 15 video are not.

The reason is that MultiMediaKeysSession test appends 16 SourceBuffers with single video track each. Webkit creates 16 AppendPipeline(s) and every AppendPipeline emits single track called "V0" (all tracks are called the same). WebKitMediaSource keeps all tracks in unique key/value map: source->priv->streams.set(track->trackId(), WTFMove(stream)); so only one video track will be stored and all the rest is dropped. SourceBuffers are filled with video data that eventually makes MediaSourceTrackGStreamer's TrackQueue full that tirggers callback registration when ready for more data:

    track->notifyWhenReadyForMoreSamples([protector = Ref { *this }, this, trackId]() mutable {
        RunLoop::main().dispatch([protector = WTFMove(protector), this, trackId]() {
            if (!m_hasBeenRemovedFromMediaSource)
                provideMediaData(trackId);
        });
    });

Note that it increases SourceBufferPrivate ref (protector) creating circular ref between SourceBufferPrivateGStreamer and MediaSourceTrackGStreamer. WebKitMediaSource will never flush those tracks as it only keeps single video "V0" track. So the callback will be never triggered. As a result video TrackQueue full of data is never destroyed. Here is a place that webKitMeidaSource will flush its tracks/streams triggering "redyForMoreSamples" noti. But this happens only for stored streams (so single "A0" and single "V0"):

        while (!source->priv->streams.isEmpty())
            webKitMediaSrcTearDownStream(source, source->priv->streams.begin()->key);
        break; 

Trying to break above refs, I've also spotted that there is another circular ref in WebKitMediaSource created in webKitMediaSrcEmitStreams func: WebKitMediaSrcPad keep ref to Stream while Stream kees ref to WebKitMediaSrcPad:

struct WebKitMediaSrcPadPrivate {
    RefPtr<Stream> stream;
};

...
struct Stream : public ThreadSafeRefCounted<Stream> {
...
    GRefPtr<GstPad> const pad;
...
}

This one is not significant as all audio/video data is usually removed but still.

asurdej-comcast commented 4 months ago

@eocanha Can you please also check this one in addition to #1285

eocanha commented 4 months ago

Adding this to my to-do after #1285.

asurdej-comcast commented 2 months ago

Hi @eocanha, #1285 has been resolved but this one is still visible in YT tests. can you please take a look

calvaris commented 2 months ago

Enrique is not taking a look, I am.

modeveci commented 2 months ago

Hi @asurdej-comcast , For your information, this ticket requires a rework on keystore mechanism, @calvaris has been working on this in parallel of 1285 task. My mistake I did not reflect that to here, but @emutavchi is aware of the need of rework. As soon as there is a patch/commit, we will let you know. Kind regards, Ozgur

asurdej-comcast commented 2 months ago

@calvaris @modeveci Thank you for your updates. That's completely fine for me, just wanted to make sure if this ticket was not forgotten among others. Thanks

calvaris commented 2 months ago

Pushed to 2.38 as c329306775ac and to 2.42 as 80dcbbcf69a9 to the calvaris/wpe-2.42/upstream branch (that is pending to be merged to 2.42).

calvaris commented 2 months ago

@modeveci I noticed that test 6 systematically fails in the reference box because the framework is notifying twice after each update so the test thinks it has double number of keys than the expected. It works well in the pi, other than some network issue or stress in the main thread (too many things need to happen there in 15s).