oneam / h264bsd

A simple h264 software decoding library

Modify Stream buffer over time for big input video files #22

Open MarkusNoll opened 2 years ago

MarkusNoll commented 2 years ago

Hi there!

Thanks for this great project. I ported it to a microcontroller platform yesterday, and performance is quite acceptable. However, I have tight RAM constraints here.

The examples malloc() a buffer the full size of the test video, so the whole H.264 stream basically sits in RAM.

How can I use this library efficiently with large videos while buffering only small amounts in RAM?

I double-checked this call: u32 result = h264bsdDecode(&dec, byteStrm, len, 0, &readBytes); It reports how many bytes it read from the byte stream, but it doesn't tell me in advance how much data it will consume. I guess that depends on the frame type being processed. This makes it very hard to rebuffer data efficiently, e.g. from an SD card into the stream buffer.

Do you have any hints for this?

Thanks!

oneam commented 2 years ago

The library will either read a portion of an H.264 Annex B byte stream, or a single NAL (network abstraction layer) unit. That second mode means it's possible to feed it a byte buffer with a single NAL unit (excluding any start code) and expect it to consume the full buffer.

This Stack Overflow answer has some details on NAL units and start codes: https://stackoverflow.com/a/24890903/660982

In order to feed individual NAL units, you need some other code that keeps track of them. That job is usually left up to the container library (like MP4 file, RTP, or something custom).
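If the input is a raw Annex B file rather than a container, the "other code" can be a small start-code scanner. Here is a minimal sketch of one. The helper names find_start_code and next_nal are made up for illustration and are not part of h264bsd. It assumes 0x000001 start codes, with any extra leading zero byte (the 4-byte form) treated as padding and trimmed from the previous NAL unit. Scanning for the pattern is safe because emulation prevention bytes keep 0x000001 from appearing inside a NAL payload.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of h264bsd): offset of the next 0x000001
 * start code at or after `from`, or `len` if there is none. */
static size_t find_start_code(const uint8_t *buf, size_t len, size_t from)
{
    for (size_t i = from; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i;
    }
    return len;
}

/* Hypothetical helper: locate the next NAL unit payload (start code
 * excluded) in buf[from..len). Returns 1 and fills nal_off/nal_len on
 * success. Returns 0 if no complete NAL unit is buffered yet, i.e. the
 * caller should load more data. Pass at_eof != 0 once the input is
 * exhausted so the final NAL unit, which has no following start code,
 * is still reported. */
static int next_nal(const uint8_t *buf, size_t len, size_t from, int at_eof,
                    size_t *nal_off, size_t *nal_len)
{
    size_t sc = find_start_code(buf, len, from);
    if (sc == len)
        return 0;                       /* no start code buffered */

    size_t begin = sc + 3;              /* first payload byte */
    size_t end = find_start_code(buf, len, begin);
    if (end == len && !at_eof)
        return 0;                       /* payload may be cut off */

    /* Drop trailing zero bytes: they belong to a 4-byte start code or
     * to padding, not to the NAL unit itself. */
    while (end > begin && buf[end - 1] == 0)
        end--;

    *nal_off = begin;
    *nal_len = end - begin;
    return 1;
}
```

Each (nal_off, nal_len) pair can then be passed to the decoder as a single NAL unit, and the next search starts at nal_off + nal_len.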

Heath123 commented 1 year ago

Is there not a way to just have the library request more data when it runs out?

oneam commented 1 year ago

Not really. The library is low level: it decodes pictures from the stream. It doesn't handle timing, wrapping each NAL (network abstraction layer) unit in some other format, or the other things a streaming library would provide. It only handles decoding one image at a time.

So you give it a NAL unit and it will either give you an image or nothing (if there is no new image).

The Annex B decoder is only slightly higher level: you can give it a bunch of NAL units (each preceded by a start code, 0x000001 or 0x00000001), and it will decode the first NAL unit and return the location where it stopped.
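Because the decoder reports where it stopped, a small fixed buffer can in principle be reused as a sliding window: decode, shift the unconsumed tail to the front, then top the buffer up from storage. Below is a rough sketch of that bookkeeping. The stream_window type and its helpers are hypothetical and not part of h264bsd. It also assumes the buffer is large enough to hold the biggest NAL unit plus the following start code, since I have not checked how the decoder behaves when the first NAL unit in the buffer is cut off.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sliding window over caller-provided storage. */
typedef struct {
    uint8_t *data;  /* storage owned by the caller */
    size_t   cap;   /* total size of the storage */
    size_t   fill;  /* valid bytes at the start of data */
} stream_window;

/* Where freshly read bytes should be written. */
static uint8_t *window_tail(stream_window *w) { return w->data + w->fill; }

/* How many more bytes fit into the window. */
static size_t window_space(const stream_window *w) { return w->cap - w->fill; }

/* Record that n bytes were written at window_tail(); n is clamped so
 * fill never exceeds cap. */
static void window_commit(stream_window *w, size_t n)
{
    size_t space = window_space(w);
    w->fill += (n < space) ? n : space;
}

/* Drop `used` bytes from the front and move the rest down.
 * Returns 0 on success, -1 if `used` exceeds the buffered amount. */
static int window_consume(stream_window *w, size_t used)
{
    if (used > w->fill)
        return -1;
    memmove(w->data, w->data + used, w->fill - used);
    w->fill -= used;
    return 0;
}

/* Intended use, with the call shape quoted earlier in this thread
 * (error handling and picture output omitted, `f` is an open FILE *):
 *
 *   window_commit(&w, fread(window_tail(&w), 1, window_space(&w), f));
 *   result = h264bsdDecode(&dec, w.data, w.fill, 0, &readBytes);
 *   window_consume(&w, readBytes);
 *
 * repeated until the input is exhausted and w.fill reaches 0. */
```

The memmove after every call costs some cycles. On a microcontroller a ring buffer would avoid the copy, but the decoder wants a contiguous byte range, so the shift is the simpler option.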

If you want something higher level for streaming, you should look at FFmpeg. It provides the unwrapping, timing, and other functionality needed for streaming.