dholroyd / h264-reader

Rust reader for H264 bitstream syntax
Apache License 2.0

MVP example parsing NALs #60

Closed: ralfbiedert closed this 6 months ago

ralfbiedert commented 1 year ago

I tried using your lib to parse some H.264 stream. Let's say I have these NALs:

[image: the NAL units in the stream]

And I have a function which splits them into NAL units like so:

let h264_data = include_bytes!("videos/multi_512x512.h264");

// each `nal` will be [ 00 00 01 xx xx xx ... ]
for nal in nal_units(h264_data) { }
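For readers following along, here is a minimal sketch of what such a splitter might look like. The `nal_units` function in the question is not shown, so this body is an assumption: it only looks for 3-byte start codes (`00 00 01`) and returns each unit with its start code attached, as the comment above describes. With 4-byte start codes the extra leading zero ends up at the tail of the preceding slice.

```rust
/// Hypothetical helper (not part of h264-reader): split an Annex B byte
/// stream into slices that each begin with a 3-byte start code (00 00 01).
fn nal_units(data: &[u8]) -> Vec<&[u8]> {
    // Record the offset of every 00 00 01 start code.
    let mut starts = Vec::new();
    let mut i = 0;
    while i + 3 <= data.len() {
        if data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1 {
            starts.push(i);
            i += 3; // start codes cannot overlap, so skip past this one
        } else {
            i += 1;
        }
    }

    // Each unit runs from its start code to the next one (or to the end).
    let mut units = Vec::with_capacity(starts.len());
    for (n, &start) in starts.iter().enumerate() {
        let end = starts.get(n + 1).copied().unwrap_or(data.len());
        units.push(&data[start..end]);
    }
    units
}
```

Bytes before the first start code are dropped, and an input without any start code yields an empty list.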

I'd now like to get all encoded SPS / PPS / IDR / ... information bits, for example like so:

let sps = SeqParameterSet::from_bits(bits)?;
dbg!(sps.log2_max_frame_num_minus4);
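Telling SPS, PPS and IDR units apart comes down to the one-byte NAL header that follows the start code. The sketch below decodes that byte by hand, purely to illustrate the layout (1 bit `forbidden_zero_bit`, 2 bits `nal_ref_idc`, 5 bits `nal_unit_type`). The struct and function names are made up for this example and are not the h264-reader API.

```rust
/// Fields of the one-byte H.264 NAL unit header (illustrative type).
#[derive(Debug, PartialEq)]
struct NalHeaderFields {
    forbidden_zero_bit: bool,
    nal_ref_idc: u8,
    nal_unit_type: u8,
}

/// Hypothetical helper: decode the header byte that follows the start code.
fn parse_nal_header(byte: u8) -> NalHeaderFields {
    NalHeaderFields {
        forbidden_zero_bit: (byte & 0x80) != 0, // top bit, must be 0 in a valid stream
        nal_ref_idc: (byte >> 5) & 0x03,        // next two bits
        nal_unit_type: byte & 0x1f,             // low five bits
    }
}

/// Name a few common nal_unit_type values from the H.264 specification.
fn unit_type_name(nal_unit_type: u8) -> &'static str {
    match nal_unit_type {
        1 => "non-IDR slice",
        5 => "IDR slice",
        6 => "SEI",
        7 => "sequence parameter set",
        8 => "picture parameter set",
        9 => "access unit delimiter",
        _ => "other",
    }
}
```

For instance, a header byte of `0x67` decodes to `nal_unit_type` 7, a sequence parameter set.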

Would it be possible to add some absolute MVP examples to the project (and / or post some code here) that outline the preferred way of getting this data? I made a few attempts with AnnexBReader::accumulate, but struggled to get an example to run without panicking. For example, depending on what code I actually run, I would get

called `Result::unwrap()` on an `Err` value: UnknownSeqParamSetId(ParamSetId(0))
thread 'f' panicked at 'called `Result::unwrap()` on an `Err` value: UnknownSeqParamSetId(ParamSetId(0))', tests\parse_nal.rs:32:93

or

called `Result::unwrap()` on an `Err` value: RbspReaderError(ReaderErrorFor("finish", Custom { kind: WouldBlock, error: "reached end of partially-buffered NAL" }))
thread 'f' panicked at 'called `Result::unwrap()` on an `Err` value: RbspReaderError(ReaderErrorFor("finish", Custom { kind: WouldBlock, error: "reached end of partially-buffered NAL" }))', tests\parse_nal.rs:19:60
stack backtrace:

but don't quite understand why. For reference, here's an example I used:


#[test]
fn f() {
    let h264_data = include_bytes!("videos/multi_512x512.h264");

    let mut reader = AnnexBReader::accumulate(|nal: RefNal<'_>| {
        let context = Context::new();
        let nal_unit_type = nal.header().unwrap().nal_unit_type();
        let bits = nal.rbsp_bits();

        match nal_unit_type {
            UnitType::SeqParameterSet => {
                let sps = SeqParameterSet::from_bits(bits).unwrap();
                dbg!(sps.log2_max_frame_num_minus4);
            }
            _ => {} // _ => NalInterest::Ignore,
        }

        NalInterest::Ignore
    });

    for nal in nal_units(h264_data) {
        reader.push(nal);
    }
}

karcsesz commented 7 months ago

I've managed to make it work like this:

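// Imports this snippet relies on. The module paths are assumed from the crate's
// layout; check them against the h264-reader version you are using.
use h264_reader::annexb::AnnexBReader;
use h264_reader::nal::pps::PicParameterSet;
use h264_reader::nal::slice::SliceHeader;
use h264_reader::nal::sps::SeqParameterSet;
use h264_reader::nal::{Nal, RefNal, UnitType};
use h264_reader::push::NalInterest;
use h264_reader::Context;

// First we have to create a context to keep track of SPS and PPS NALs that we receive. It *needs* to be persistent through all parsing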
let mut stream_context = Context::new();

// Then we prepare an AnnexBReader to handle the parsed data
let mut reader = AnnexBReader::accumulate(|nal: RefNal<'_>| {
    // We only ever want to parse complete NALs.
    // You can filter for the specific types of NALs you're
    // interested in and NalInterest::Ignore the rest here.
    //
    // If a NAL is incomplete, trying to read its data will result in a WouldBlock.
    if !nal.is_complete() {
        return NalInterest::Buffer;
    }

    // Parse the NAL header, so we know what the NAL type is
    let nal_header = nal.header().unwrap();
    let nal_unit_type = nal_header.nal_unit_type();

    // Decode the NAL types that we're interested in
    match nal_unit_type {
        UnitType::SeqParameterSet => {
            let data = SeqParameterSet::from_bits(nal.rbsp_bits()).unwrap();
            // Don't forget to tell stream_context that we have a new SPS.
            // If you want to handle it separately, you can clone the struct before passing along,
            // But if you only care about it when a slice calls for it, you don't have to handle it here.
            stream_context.put_seq_param_set(data);
        }
        UnitType::PicParameterSet => {
            // Same as when parsing an SPS, except it borrows the stream context so it can pick out
            // the SPS that this PPS references
            let data = PicParameterSet::from_bits(&stream_context, nal.rbsp_bits()).unwrap();
            // Same as with an SPS, tell the context that we've found a PPS
            stream_context.put_pic_param_set(data);
        }
        // Let's handle a random slice type too to see how that works
        UnitType::SliceLayerWithoutPartitioningIdr => {
            let mut bits = nal.rbsp_bits();
            // We can parse the slice header, and it will give us:
            let (header, // The header of the slice
                seq_params, // A borrow of the SPS...
                pic_params // ...and PPS activated by the header
            ) = SliceHeader::from_bits(&stream_context,
                                       &mut bits, // takes a mutable borrow so the body parser can continue from where this ended
                                       nal_header).unwrap();
            // I don't think any slice data parsers are implemented right now.
        }
        other => {println!("Unhandled {other:?}")}
    }
    NalInterest::Ignore
});

// Push data. Doesn't have to be aligned in any way. You can push multiple times for a single NAL, or send an entire file in at once.
reader.push(data);
// If we're sure that the entire current NAL has been pushed, then we can call this to signal
// that the parser should immediately stop waiting for a new NAL marker.
reader.reset();
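To see why the context has to outlive a single callback, here is a toy model of the dependency. It is not the crate's implementation: `ToyContext`, `ToyError` and their methods are invented for illustration, and the assumption is only that a PPS refers to an SPS by id, as the comments above describe. A context created fresh for every NAL has forgotten the SPS by the time the PPS arrives, which mirrors the `UnknownSeqParamSetId(ParamSetId(0))` panic in the question.

```rust
use std::collections::{HashMap, HashSet};

/// Illustrative error type, loosely mirroring the error in the question.
#[derive(Debug, PartialEq)]
enum ToyError {
    UnknownSeqParamSetId(u8),
}

/// Toy stand-in for a parsing context: remembers parameter sets by id.
#[derive(Default)]
struct ToyContext {
    sps: HashSet<u8>,     // ids of the SPSs seen so far
    pps: HashMap<u8, u8>, // pps_id -> the sps_id it refers to
}

impl ToyContext {
    /// Remember an SPS so that later PPSs can refer to it.
    fn put_sps(&mut self, sps_id: u8) {
        self.sps.insert(sps_id);
    }

    /// A PPS names the SPS it depends on, so it is rejected if that SPS
    /// has not been stored in this context.
    fn put_pps(&mut self, pps_id: u8, sps_id: u8) -> Result<(), ToyError> {
        if !self.sps.contains(&sps_id) {
            return Err(ToyError::UnknownSeqParamSetId(sps_id));
        }
        self.pps.insert(pps_id, sps_id);
        Ok(())
    }
}
```

A context shared across all NALs accepts the PPS once the SPS has been stored, while a context created just for the PPS rejects it.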

dholroyd commented 6 months ago

Thanks for the code @karcsesz - I expanded on it a bit to create examples/dump.rs.

groovybits commented 6 months ago

Thank you for this! I need to integrate it into my attempt, which I think works, but I suspect I didn't do it 100% accurately...

https://github.com/groovybits/rsllm/blob/main/src/mpegts.rs#L432

I haven't hooked it up yet. It attempts to combine AI with the MpegTS "probe" for higher-level decisions / analysis, potentially with custom models. I implemented it first here, where it is hooked up and feeds a pcap into this and your mpegts/scte35 readers...

https://github.com/groovybits/rscap/blob/main/src/bin/probe.rs#L834

This was tricky to figure out, but it seems to get the data. I haven't finished these to the point of using the data, but it does print it out and seems correct for the most part (I think a few issues come from needing to refactor it with the example logic in the dump.rs you posted!)

Thank you!!!