membraneframework / membrane_core

The core of the Membrane Framework, an advanced multimedia processing framework
https://membrane.stream
Apache License 2.0

mp4_plugin: CMAF sample duration sometimes negative #864

Open samrat opened 4 days ago

samrat commented 4 days ago

I'm seeing HLS segments output by the HTTP Adaptive Stream plugin that contain huge PTS and DTS jumps between consecutive packets:

❯ cat muxed_header_video_track_part_0.mp4 muxed_segment_0_video_track.m4s > concat.mp4

❯ ffprobe -v quiet -select_streams v:0 -show_entries packet=pts_time,dts_time,duration_time,pos,time_base -of csv=p=0 concat.mp4
1.633529,1.600195,0.033333,2274
139811.800195,139811.666797,0.033333,3389 <---- big jump here
139811.733496,139811.700163,0.033333,3461
279621.833464,279621.766764,0.033333,3531 <---- and here
279621.900228,279621.800130,0.033333,3601

Looks like the problem comes up because the durations in https://github.com/membraneframework/membrane_mp4_plugin/blob/2c439e6332be36b6100c926a642ed1d34d0b4664/lib/membrane_mp4/muxer/cmaf.ex#L725 are sometimes negative.
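
One possible explanation for the size of the jumps, offered as an assumption rather than something confirmed in the plugin code: ISO BMFF stores per-sample durations in the trun box as unsigned 32-bit integers, so a negative duration written into such a field would be read back as a value just below 2^32 ticks. The sketch below only illustrates that arithmetic. The module name WrapDemo, the timescale of 30720 and the duration of -2048 ticks are all hypothetical and not taken from the issue.

```elixir
defmodule WrapDemo do
  # Write a (possibly negative) duration into a 32-bit field, then read the
  # same four bytes back as an unsigned integer, as an MP4 reader would.
  def read_back(duration_ticks) do
    <<unsigned::unsigned-integer-size(32)>> =
      <<duration_ticks::signed-integer-size(32)>>

    unsigned
  end
end

# Hypothetical values, chosen for illustration only.
timescale = 30_720
negative_duration = -2_048

wrapped = WrapDemo.read_back(negative_duration)

# 2^32 - 2048 = 4294965248 ticks
IO.inspect(wrapped, label: "ticks read back")
# Roughly 139810 seconds at the assumed timescale
IO.inspect(wrapped / timescale, label: "seconds")
```

Under these assumed values a single wrapped duration is worth roughly 139810 seconds, which is the same order of magnitude as the jumps in the ffprobe output above. That would fit the negative-duration theory, but it does not prove it.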

The video is coming from an HLS stream with MPEG-TS segments so the pipeline is:

HLS -> MPEG-TS demuxer -> H264 parser -> ffmpeg decoder -> swscale -> ffmpeg encoder -> HLS sink

The problem seems to originate in the ffmpeg decoder/encoder part of the pipeline: when I remove it, the problem goes away.

Here is the relevant section of my pipeline:

        bin_input(:video)
        |> child({:parser, "video_track"}, %Membrane.H264.Parser{
          output_stream_structure: if(transcode?, do: :annexb, else: :avc1),
          repeat_parameter_sets: true
        })
        |> child(:parsed_video_tee, Membrane.Tee.Parallel)
        |> then(fn parser ->
          if transcode? do
            parser
            |> child(:decoder, %Membrane.H264.FFmpeg.Decoder{use_shm?: true})
            |> child(:converter, %Membrane.FFmpeg.SWScale.Converter{
              output_height: @max_video_height,
              use_shm?: true
            })
            |> child(:encoder, %Membrane.H264.FFmpeg.Encoder{})
          else
            parser
          end
        end)
        |> via_in(Pad.ref(:input, :video),
          options: [
            encoding: :H264,
            track_name: "video_track",
            segment_duration: segment_duration_for(:H264),
            partial_segment_duration: nil
          ]
        )
        |> get_child(:hls_sink_bin)

varsill commented 1 day ago

Hello @samrat! Could you provide the input HLS stream that causes the trouble? I tried to reproduce it with an example HLS stream and didn't encounter the problem. I suspect there might be a problem with the timestamps in the input HLS stream (though I am not sure why the problem goes away when you remove the transcoding part). As a blind guess, you could try setting the max_b_frames: 0 option in Membrane.H264.FFmpeg.Encoder to disable B-frames (and therefore make sure that PTS always matches DTS), since I suspect that a mismatch between the two might be the cause of the problem.
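
For reference, the experiment suggested above could look roughly like this in the transcoding branch of the posted pipeline. This is a sketch of a config fragment, not a verified fix: only the max_b_frames: 0 option is new, and everything else is copied from the original post.

```elixir
# Fragment of the transcoding branch from the pipeline in the original post.
parser
|> child(:decoder, %Membrane.H264.FFmpeg.Decoder{use_shm?: true})
|> child(:converter, %Membrane.FFmpeg.SWScale.Converter{
  output_height: @max_video_height,
  use_shm?: true
})
|> child(:encoder, %Membrane.H264.FFmpeg.Encoder{
  # Disable B-frames so that decode order equals presentation order,
  # which should make every frame's PTS equal to its DTS.
  max_b_frames: 0
})
```

If the jumps disappear with this setting, that would point at the PTS/DTS mismatch introduced by frame reordering rather than at the input stream's timestamps.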