jelmansouri-legion closed this 2 years ago
This is great, thank you so much.
I need some time later in my IDE to get a feeling for the typed API, but I'll merge this since those are minor issues (if at all) which we can address afterwards.
I updated a few things and made a new release (0.1.17):
Changes:
- `formats` module, which ideally should contain all decoder / encoder conversions
- `EVideoFrameType` is now an actual enum, as FFI `c_int` types are 'infectious' and make cross-compilation more brittle

Things I noticed but haven't touched:
I will do a what_goes_around_comes_back_around the other way to validate that the minimal impl works both ways.

For RGB -> YUV I tried to make it allocate at least only once; the idea is to reuse the buffer. (If threading is needed, a converter per thread would be necessary, but it's obviously still quite costly.)

To give you a bit of context, my use case is the following: I wanted a CPU fallback / debugging path to hardware encoding in a realtime 3D streaming application. Overall I'm still wrapping my head around H.264, so I'm really glad I stumbled onto your crate. Thanks!
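The allocate-once idea above could look roughly like this. This is only a hypothetical sketch; `RgbToYuvConverter` and its methods are illustrative names, not the crate's actual API:

```rust
// Hypothetical sketch of a reusable RGB -> YUV converter that allocates its
// output buffer once and reuses it across frames. Not the crate's real API.
struct RgbToYuvConverter {
    yuv: Vec<u8>, // reused across frames to avoid per-frame allocation
    width: usize,
    height: usize,
}

impl RgbToYuvConverter {
    fn new(width: usize, height: usize) -> Self {
        // I420: full-size Y plane plus two quarter-size chroma planes.
        let size = width * height * 3 / 2;
        Self { yuv: vec![0; size], width, height }
    }

    /// Converts a packed RGB frame into the internal buffer and returns a view of it.
    fn convert(&mut self, rgb: &[u8]) -> &[u8] {
        assert_eq!(rgb.len(), self.width * self.height * 3);
        // ... the actual per-pixel conversion would fill self.yuv here ...
        &self.yuv
    }
}
```

As noted, this design is not thread-safe by itself; sharing it across threads would mean one converter per thread.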
One thing I wanted to ask about is the YUV color space used; I know the color space I currently use doesn't match the reverse transformation. Should we align on BT.601 or BT.709?
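For concreteness, the full-range BT.601 forward transform for a single pixel could be sketched like this (BT.709 differs in its luma coefficients: 0.2126 / 0.7152 / 0.0722). The function name is illustrative:

```rust
// Full-range BT.601 RGB -> YUV for one pixel; a sketch for the color space
// question, not the crate's conversion code.
fn rgb_to_yuv_bt601(r: u8, g: u8, b: u8) -> (u8, u8, u8) {
    let (r, g, b) = (r as f32, g as f32, b as f32);
    let y = 0.299 * r + 0.587 * g + 0.114 * b;
    let u = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b;
    let v = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b;
    (
        y.round().clamp(0.0, 255.0) as u8,
        u.round().clamp(0.0, 255.0) as u8,
        v.round().clamp(0.0, 255.0) as u8,
    )
}
```

Whichever matrix the encoder side uses, the decoder side has to apply the inverse of the same one, which is why aligning on one of the two matters.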
Disclaimer, my knowledge of H.264 is limited and I'm on my way to work, but some tips / thoughts I found helpful:
Inspect your bitstream with H264BSAnalyzer.exe or friends, as encoders sometimes emit surprising packets when configured wrong, which then break decoding.

About the encoder output, I think there should actually be 3 API calls: 1) getting the first IDR including SPS + PPS, 2) only getting the SPS + PPS headers, and 3) only getting the IDR (which is what you've done). The reason is that depending on your use case you might want to store / use / emit the headers separately, as they can (AFAIK) be passed to the decoder between random packets for stream recovery. Most people would probably just want 1) though, which is also how NVIDIA and most Android encoders seem to return encoded frames, at least the ones I've tested.
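Splitting SPS/PPS from IDR data comes down to classifying NAL units in the Annex B stream: SPS is type 7, PPS is type 8, and an IDR slice is type 5, taken from the low 5 bits of the byte after each start code. A minimal sketch (it ignores emulation-prevention bytes and is not the crate's API):

```rust
// Sketch: list the nal_unit_type of each NAL in an Annex B byte stream, so
// SPS (7) / PPS (8) could be emitted separately from IDR slices (5).
// Start codes are 00 00 01 or 00 00 00 01; the type is the first NAL
// header byte & 0x1F. Emulation-prevention bytes are ignored here.
fn nal_types(stream: &[u8]) -> Vec<u8> {
    let mut types = Vec::new();
    let mut i = 0;
    while i + 3 <= stream.len() {
        if stream[i] == 0 && stream[i + 1] == 0 {
            let offset = if stream[i + 2] == 1 {
                3
            } else if i + 4 <= stream.len() && stream[i + 2] == 0 && stream[i + 3] == 1 {
                4
            } else {
                0
            };
            if offset > 0 && i + offset < stream.len() {
                types.push(stream[i + offset] & 0x1F);
                i += offset + 1;
                continue;
            }
        }
        i += 1;
    }
    types
}
```

With something like this, API 2) would return only the units with types 7 and 8, and API 3) only type 5.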
Threading in OpenH264 is brittle. I've tried it on the decoder, and it immediately crashed on some platforms, which is why I marked the call `unsafe`. I'm torn about threading in this crate: I would like to avoid any Rust-specific threading if possible and only expose OpenH264's, as it's unclear whether one is even allowed to call them multi-threaded anyway. That said, there might be some tricks to improve threading compatibility in our structs, but I'd have to think about that a bit more.
Color space conversions: Haven't thought about that unfortunately.
This is still considered WIP, but the emitted NALs seem valid.