microsoft / FFmpegInterop

This is a code sample to make it easier to use FFmpeg in Windows applications.
Apache License 2.0

MSFT: 46744581 - Add a byte stream handler for FFmpegInteropMSS #305

Closed. brbeec closed this 6 months ago.

brbeec commented 7 months ago

Why is this change being made?

We're adding a byte stream handler for FFmpegInteropMSS. The byte stream handler can be registered with the source resolver so that MF will attempt to create an FFmpegInteropMSS from a byte stream during source resolution.

What changed?

This change adds FFmpegInteropByteStreamHandler, a runtime class that implements the IMFByteStreamHandler interface.

I updated to stdcpplatest to use std::move_only_function, a C++23 feature, for the new MFWorkItem class. Unlike std::function, which requires a copy-constructible callable, it works with lambdas that capture move-only types. We should switch to stdcpp23 once it's available.

How was the change tested?

I validated FFmpegInteropByteStreamHandler using a prototype in-proc WME package in Ogg media playback and property handler scenarios.

brabebhin commented 7 months ago

Hi @brbeec

If my understanding is correct, this makes ffmpeginterop be magically created inside the MF topology when using things like

https://learn.microsoft.com/en-us/uwp/api/windows.media.core.mediasource.createfromstream?view=winrt-22621#windows-media-core-mediasource-createfromstream(windows-storage-streams-irandomaccessstream-system-string)

right?

brbeec commented 7 months ago

> Hi @brbeec
>
> If my understanding is correct, this makes ffmpeginterop be magically created inside the MF topology when using things like
>
> https://learn.microsoft.com/en-us/uwp/api/windows.media.core.mediasource.createfromstream?view=winrt-22621#windows-media-core-mediasource-createfromstream(windows-storage-streams-irandomaccessstream-system-string)
>
> right?

@brabebhin Yup! MF invokes the source resolver any time there's a need to resolve a URI or stream to an MF media source (IMFMediaSource).

Source resolution doesn't occur when a WinRT media source is created (e.g. by calling MediaSource::CreateFromStream), but rather when the WinRT media source is "opened" (e.g. by calling MediaSource::OpenAsync). MF will automatically open a WinRT media source when it becomes necessary to do so.

An app can register the FFmpegInteropMSS byte stream handler via MediaExtensionManager::RegisterByteStreamHandler, which internally calls MFCreateMediaExtensionActivate then MFRegisterLocalByteStreamHandler. Local byte stream handlers will take precedence over system byte stream handlers during source resolution.
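For reference, registration from an app might look roughly like the following C++/WinRT sketch (Windows-only, untested here). The activatable class id, file extension, and MIME type are illustrative assumptions, not values taken from this PR.

```cpp
#include <winrt/Windows.Media.h>

// Sketch only: registers a local byte stream handler so that MF's source
// resolver will consider it for streams matching the given associations.
void RegisterFFmpegHandler()
{
    winrt::Windows::Media::MediaExtensionManager manager;

    manager.RegisterByteStreamHandler(
        L"FFmpegInterop.FFmpegInteropByteStreamHandler", // activatable class id (assumed)
        L".ogg",                                         // file extension
        L"audio/ogg");                                   // MIME type
}
```

Per the comment above, this local registration then takes precedence over system byte stream handlers during source resolution for the associated extension and MIME type.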

brabebhin commented 7 months ago

Thank you @brbeec for the explanation.

@lukasf we should probably support this as well.

lukasf commented 7 months ago

First I thought that this would allow us to do things such as dynamically changing media types, which would have been awesome. The lack of media type switching is a major pain point in MediaStreamSource. But I see that the implementation just returns the MSS as IMFMediaSource. So this won't help us here. We'd need to go much more downlevel for that. Then I don't see any real benefit of implementing ByteStreamHandler.

Well, maybe this would allow us to delay the creation of the MSS in MediaPlaybackList scenarios. MediaBinder does not support MediaStreamSource, so all MSS sources need to be created right away before adding a MediaPlaybackItem. But MediaBinder has overloads for a stream with content type, which could be used with this implementation to delay-create the MSS.
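A rough C++/WinRT sketch of that idea (Windows-only, untested here), assuming the byte stream handler is already registered; GetStreamAsync() is a hypothetical helper that produces the IRandomAccessStream, and the content type is an assumption.

```cpp
#include <winrt/Windows.Media.Core.h>
#include <winrt/Windows.Media.Playback.h>

using namespace winrt::Windows::Media::Core;
using namespace winrt::Windows::Media::Playback;

// Sketch only: defer source creation until the item is actually played.
MediaPlaybackItem MakeDeferredItem()
{
    MediaBinder binder;
    binder.Binding([](const MediaBinder&, const MediaBindingEventArgs& args)
    {
        // Invoked only when the item is about to play. Supplying a stream
        // plus content type lets the registered byte stream handler create
        // the FFmpegInteropMSS during source resolution.
        auto stream = GetStreamAsync().get(); // hypothetical helper
        args.SetStream(stream, L"audio/ogg"); // content type is an assumption
    });

    return MediaPlaybackItem(MediaSource::CreateFromMediaBinder(binder));
}
```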

brabebhin commented 7 months ago

I think the primary advantage is making adoption of the library easier. If some folks already use the standard MediaSource APIs, they can switch to ffmpeginterop without having to rewrite their media handling code. The disadvantage is that it makes configuration and advanced features (of which we have a lot more compared to Microsoft) somewhat harder to use. But this may be enough for some apps.

brbeec commented 7 months ago

> First I thought that this would allow us to do things such as dynamically changing media types, which would have been awesome. The lack of media type switching is a major pain point in MediaStreamSource.

@lukasf I don't think it's documented, but MSS should support dynamic media type changes.

When a MediaStreamSourceSampleRequest is completed, the MSS checks whether the MediaStreamSourceSampleRequest.StreamDescriptor EncodingProperties were modified. If there are any changes, the stream sends an MEStreamFormatChanged event with the new media type, which is created by calling MFCreateMediaTypeFromProperties() with the IMediaEncodingProperties.Properties. This event is propagated to the downstream nodes, and MF will renegotiate the topology if needed.

We have the following logic to trigger dynamic format changes in SampleProvider::GetSample():

// Update the stream's encoding properties if there were any dynamic format changes
if (!formatChanges.empty())
{
    IMediaStreamDescriptor streamDescriptor{ request.StreamDescriptor() };
    IMediaEncodingProperties encProp{ nullptr };

    switch (m_stream->codecpar->codec_type)
    {
    case AVMEDIA_TYPE_AUDIO:
        encProp = streamDescriptor.as<IAudioStreamDescriptor>().EncodingProperties();
        break;

    case AVMEDIA_TYPE_VIDEO:
        encProp = streamDescriptor.as<IVideoStreamDescriptor>().EncodingProperties();
        break;

    case AVMEDIA_TYPE_SUBTITLE:
        encProp = streamDescriptor.as<ITimedMetadataStreamDescriptor>().EncodingProperties();
        break;

    default:
        WINRT_ASSERT(false);
        THROW_HR(E_UNEXPECTED);
    }

    MediaPropertySet encodingProperties{ encProp.Properties() };
    for (const auto& [key, value] : formatChanges)
    {
        encodingProperties.Insert(key, value);
    }

    FFMPEG_INTEROP_TRACE("Stream %d: Dynamic format change", m_stream->index);
}

I'll file a work item on our internal backlog to add some documentation for this (osgvsowi/47779291). Apologies if I misunderstood your problem and this doesn't help you.

lukasf commented 7 months ago

@brbeec That's super interesting, thank you for the information!

I think in the AudioEncodingProperties docs there once was a hint that the properties can be changed anytime. I tried back then, but it never seemed to have any effect. Later the docs were changed and the hint was removed, so I thought it was just wrong info.

Back then, I changed the EncodingProperties of the MediaStreamDescriptor we set in the MSS. I wonder why it did not work. I think I tried with both audio and video. The descriptor instance is the same that gets passed in for SampleRequested, so it should have worked? Anyways, I will definitely try this again when I have some time.

Right now, we could really use this to dynamically enable and disable HDR when the render target display changes. It would also be useful when adding FFmpeg effects during playback, which can cause a format change. Right now, we transform back to the old format using swscale or resampling, but that is of course a waste of resources.