PyAV-Org / PyAV

Pythonic bindings for FFmpeg's libraries.
https://pyav.basswood-io.com/
BSD 3-Clause "New" or "Revised" License

Functional Difference In Data Stream between AV 11.0.0 and AV 12.0.0 #1376

Closed RaubCamaioni closed 4 months ago

RaubCamaioni commented 4 months ago

Overview

The bytes returned for data-stream packets changed between PyAV 11.0.0 and PyAV 12.0.0.
I am using PyAV to separate KLV data from an MPEG-TS stream.
PyAV 12.0.0 returns different start bytes than PyAV 11.0.0.

Expected behavior

Consistent parsing behavior between PyAV versions unless a change is noted in the changelog.

Actual behavior

Data-stream parsing behavior changed, so packets from the data stream now begin with different bytes.

Investigation

Uninstall PyAV 12.0.0 (the latest version) and install PyAV 11.0.0. Under each version, run code that prints the first 20 bytes and the last 20 bytes of every data-stream packet.

Reproduction

Compare the start bytes of the data-stream packets in an MPEG-TS file under both versions.

import av
from pathlib import Path

def main(video: Path):
    with av.open(str(video)) as container:
        # demux() with no arguments yields packets from every stream in the container
        for packet in container.demux():
            if packet.stream.type == "data":
                print(f"packet pts: {packet.pts}")
                print(f"start bytes: {bytes(packet)[:20]}")  # different start bytes
                print(f"end bytes: {bytes(packet)[-20:]}")  # same end bytes

            if packet.stream.type == "video":
                pass

if __name__ == "__main__":
    from argparse import ArgumentParser

    parser = ArgumentParser()
    # required=True: without it, a missing -i makes Path(None) raise TypeError
    parser.add_argument("-i", "--input", type=str, required=True)
    args = parser.parse_args()
    main(Path(args.input))

Example output:

pyav-11.0.0:
packet pts: 42811911
start bytes: b'\xfa\xab\x941\xbb\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82'
end bytes: b'\x82P2\x02\x0c\xfd\x82P6\x01\x01\x82P7\x01\x01\x01\x02\x8f:'

pyav-12.0.0:
packet pts: 42811911
start bytes: b'\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82P&\x08\x00\x05'
end bytes: b'\x82P2\x02\x0c\xfd\x82P6\x01\x01\x82P7\x01\x01\x01\x02\x8f:'
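
Comparing the two dumps above suggests that, for this packet at least, the 12.0.0 output is the 11.0.0 output with its first 5 bytes missing. A minimal sketch of that check follows. `leading_bytes_dropped` is a hypothetical helper, not part of PyAV, and the byte strings are copied from the output above. Because only 20-byte prefixes are compared, a match on a very short tail could be a coincidence, so treat the result as a hint rather than proof.

```python
def leading_bytes_dropped(old: bytes, new: bytes):
    """Return the smallest k such that `new` starts with old[k:], else None."""
    for k in range(len(old)):
        if new.startswith(old[k:]):
            return k
    return None


# First 20 bytes of the same packet (pts 42811911), copied from the issue output.
V11_START = b'\xfa\xab\x941\xbb\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82'
V12_START = b'\x11J\xaa\xbd\xc6\xfa\x9e\xb1\xb6$Z\x82\x01s\x82P&\x08\x00\x05'

print(leading_bytes_dropped(V11_START, V12_START))  # 5
```

Running the same comparison over several packets would show whether the offset is constant or varies per packet.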

Versions

PyAV v12.0.0
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --disable-alsa --disable-doc --disable-libtheora --disable-mediafoundation --disable-videotoolbox --enable-fontconfig --enable-gmp --enable-gnutls --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libspeex --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libxcb --enable-libxml2 --enable-lzma --enable-zlib --enable-version3 --enable-libx264 --disable-libopenh264 --enable-libx265 --enable-libxvid --enable-gpl
library license: GPL version 3 or later
libavcodec 60. 31.102
libavdevice 60. 3.100
libavfilter 9. 12.100
libavformat 60. 16.100
libavutil 58. 29.100
libswresample 4. 12.100
libswscale 7. 5.100



Research

Assuming this change was introduced in 12.0.0, I could not find any documentation of it.
The data-stream packets used to be directly parsable when passed to the KLV parser at https://github.com/paretech/klvdata
Now a variable number of start bytes must be removed before the data stream parses correctly.
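
The KLV layout itself shows why missing or extra leading bytes break parsing. The sketch below assumes the common framing of a 16-byte key, a BER-encoded length, and then the value. `split_klv` is a hypothetical helper (not part of PyAV or klvdata) and the sample packet is synthetic. Under that assumption, the `\x82\x01s` bytes at offset 16 of the 11.0.0 dump would read as a long-form length of 0x0173 = 371.

```python
def split_klv(data: bytes, key_len: int = 16):
    """Split one KLV triplet into (key, value), assuming a fixed-size key."""
    if len(data) <= key_len:
        raise ValueError("packet too short to hold a key and a length")
    key = data[:key_len]
    first = data[key_len]
    if first < 0x80:
        # Short form: the byte itself is the length.
        length, offset = first, key_len + 1
    else:
        # Long form: the low 7 bits give the number of length bytes that follow.
        n = first & 0x7F
        if n == 0:
            raise ValueError("indefinite BER length is not supported here")
        length = int.from_bytes(data[key_len + 1:key_len + 1 + n], "big")
        offset = key_len + 1 + n
    value = data[offset:offset + length]
    if len(value) != length:
        raise ValueError("declared length does not match the available bytes")
    return key, value


# Synthetic packet: placeholder key, long-form length 0x0173, 371 value bytes.
key = bytes(range(16))
packet = key + b"\x82\x01\x73" + b"\xaa" * 371

parsed_key, value = split_klv(packet)
print(len(value))  # 371

# Dropping leading bytes shifts the length field, so parsing fails.
try:
    split_klv(packet[5:])
except ValueError as exc:
    print("shifted packet rejected:", exc)
```

A parser that expects the key at offset 0 has no way to recover once the packet start moves, which matches the symptom described here.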

WyattBlue commented 4 months ago

This is a valid change for a major release. We don't (and can't) guarantee that a data stream contains the same data when we change the FFmpeg version.