Open olafal0 opened 1 month ago
Thanks for submitting this. Glad to see there is a way to make this work with decodebin3. The added logic to upscale the video to the largest layer would, however, break existing functionality: we drop layers that are bigger than the source, and match the biggest layer size to the source if it is smaller. You can see this logic in the filterAndSortLayersByQuality function.
It is important that this functionality is not lost in most cases, as upscaling is wasteful and can lead to degraded quality by decreasing the number of bits available per macroblock while encoding no extra detail.
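For readers following along, the behavior described above can be sketched roughly as follows. This is not the actual filterAndSortLayersByQuality implementation; the Layer type and the fitLayersToSource name are hypothetical and only illustrate "drop layers bigger than the source, then match the biggest remaining layer to the source".

```go
package main

import "sort"

// Layer is a hypothetical stand-in for a simulcast layer description.
type Layer struct {
	Name          string
	Width, Height int
}

// fitLayersToSource drops layers larger than the source and resizes the
// largest remaining layer to the source dimensions. The result is sorted
// from lowest to highest quality.
func fitLayersToSource(layers []Layer, srcW, srcH int) []Layer {
	if len(layers) == 0 {
		return nil
	}
	// Work on a copy so the caller's slice is not reordered.
	sorted := append([]Layer(nil), layers...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Width*sorted[i].Height < sorted[j].Width*sorted[j].Height
	})

	var out []Layer
	for _, l := range sorted {
		if l.Width <= srcW && l.Height <= srcH {
			out = append(out, l)
		}
	}
	if len(out) == 0 {
		// Every requested layer is larger than the source:
		// keep a single layer at the source size.
		return []Layer{{Name: sorted[0].Name, Width: srcW, Height: srcH}}
	}
	// Match the biggest surviving layer to the source (a no-op if equal).
	top := &out[len(out)-1]
	top.Width, top.Height = srcW, srcH
	return out
}
```

With requested layers of 320x180, 960x540 and 1920x1080 and a 1280x720 source, this sketch keeps two layers: 320x180 and a top layer resized to 1280x720.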
Does gstreamer with decodebin3 provide any way to get the expected list of variants from the manifest, anywhere in the pipeline? If not, we may want to ensure that the upscaling code is only triggered on multivariant sources.
Ah, makes sense. Unfortunately I didn't find a way to access the list of variants. The manifest is obviously being parsed, and I can see them in the gstreamer logs, but STREAM_COLLECTION messages didn't contain other variants in my tests. Same with the decodebin3 select-stream signal.
We could potentially recalculate layer sizes and change the caps property of the capsfilter when decodebin3 changes variants, since we definitely have that information. Then layer sizing can remain the same, just with updates when the source resolution changes. I'm not sure how the downstream elements will handle that, but I'll try it out.
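As a rough illustration of the idea, the caps string for a layer could be rebuilt whenever the source resolution changes and compared with the current one. The caps format is the one quoted in the PR description; the layerCaps and capsUpdate helpers are hypothetical, and actually setting the property on the capsfilter is left out because it depends on the GStreamer bindings in use.

```go
package main

import "fmt"

// layerCaps builds the raw-video caps string for one layer, using the same
// format string that the PR description quotes for the per-layer capsfilter.
func layerCaps(width, height int) string {
	return fmt.Sprintf("video/x-raw,width=%d,height=%d", width, height)
}

// capsUpdate returns the caps string for the new dimensions and reports
// whether it differs from the current one, so a caller could avoid touching
// the capsfilter when nothing changed.
func capsUpdate(current string, width, height int) (string, bool) {
	next := layerCaps(width, height)
	return next, next != current
}
```

A caller would only apply the returned string to the capsfilter when the second return value is true.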
Update: changing the caps on the capsfilter does work, and avoids upscaling. This does introduce a separate issue, however: we can't discard layers when the source is too small, since those layers need to exist for use later. As an example: an HLS stream is started, and defaults to 320x180. We then use the layers:
LOW: 320x180
MEDIUM: 320x180
HIGH: 320x180
Later, hlsdemux2 selects a higher-resolution stream, the queue in the video output bin is notified of the new caps, we recalculate layer sizes, and then change the caps of the capsfilter. Now, the layers are:
LOW: 480x270
MEDIUM: 960x540
HIGH: 1280x720
The downside is, of course, that if the video remains 320x180, then we're pushing 3 duplicate streams for the lifetime of the input.
It would be best if we could skip creating lower layers when they're duplicates, and then create them when needed. I'll look into that next. Maybe we could block output bins that aren't needed yet?
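To make the "skip duplicates" idea concrete, here is a minimal sketch of how duplicate layers might be detected after each recalculation. The Layer type and duplicateLayers function are hypothetical; how an output bin would actually be blocked or unblocked is not covered.

```go
package main

// Layer is a hypothetical stand-in for a simulcast layer description.
type Layer struct {
	Name          string
	Width, Height int
}

// duplicateLayers takes layers ordered from lowest to highest quality and
// reports, for each one, whether an earlier layer already has the same
// dimensions. Such layers are candidates for being held back until the
// source resolution grows.
func duplicateLayers(layers []Layer) []bool {
	seen := make(map[[2]int]bool)
	dup := make([]bool, len(layers))
	for i, l := range layers {
		key := [2]int{l.Width, l.Height}
		dup[i] = seen[key]
		seen[key] = true
	}
	return dup
}
```

For the 320x180 example above, MEDIUM and HIGH would be flagged as duplicates of LOW; once the source switches to 1280x720 and the layers are recalculated, none would be flagged.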
Thanks for looking into this further. The livekit protocol doesn't allow changing the layers after initial publication. However, it is possible to:
So, indeed, one approach would be to block the output of the layers that are duplicates of the smaller ones, and change the dimensions of layers dynamically as needed.
I'm also curious: what is the behavior of the x264enc gstreamer module when the caps change on its sink pad? The underlying x264 encoding library doesn't support changing video size on the fly. Does the gstreamer module recreate an encoder context as needed on caps change?
Fixes https://github.com/livekit/ingress/issues/310. Allows multi-variant HLS streams to work for URL ingresses. This fixes two issues:
1. decodebin3 handles variant selection automatically, but sends the notify::caps signal when the source resolution changes. This caused the pipeline to attempt to create a new video output bin and link it to the input bin's video src pad, which fails. Then, the whole pipeline fails.
2. WebRTCSink.AddTrack creates a capsfilter for each layer with caps: fmt.Sprintf("video/x-raw,width=%d,height=%d", layer.Width, layer.Height). The width and height used for these layers come from the video's initial resolution, which can be very low if it's a low-bitrate variant. So, even if a higher-resolution variant is selected, it will be scaled back down to whatever it was at first.
Changes:
I've tested with this HLS file: https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8 (Note that this may still fail on main, since it contains subtitle tracks, and fails with "unsupported mime type (application/x-subtitle-vtt) for the source media". As a workaround, adding application/x-subtitle-vtt to supportedMimeTypes in pkg/media/urlpull/source.go fixes this, and variant selection works correctly.)