ralfbiedert / openh264-rs

Idiomatic Rust wrappers around OpenH264.
66 stars 32 forks source link

Question: 4-byte NAL prefixes after the first #9

Closed repnop closed 2 years ago

repnop commented 2 years ago

Hi again :D

So related to the issue that we had #7, we've discovered that indeed the encoder from the Android phone that @BlackHoleFox and I have (Galaxy S9+) is sending us two NALs per TCP packet: the first seems to be an Access Unit Delimiter (AUD) packet, and the second contains the actual image data. This seems like it may be relegated to Samsung's phone model(s?) as the Pixel we tested on doesn't seem to send the AUD NAL, so for our application its something we need to take into account on our application's end and filter out the non-{image, SPS, PPS} NALs so that openh264 doesn't explode. Which is simple enough with the nal_units function you've (thankfully!) provided us, except that the second NAL in the TCP packet also has a 4-byte 00 00 00 01 NAL prefix, instead of a 3-byte, which the nal_units function doesn't handle quite correctly since it only expects a 3-byte 00 00 01 NAL header after the first one. So my question is: do you know if the second NAL header we're receiving is invalid as per the format specification, or does nal_units need adjusted to correctly handle multiple 4-byte 00 00 00 01 NAL prefixes? In the case of the latter, I'm more than happy to submit a PR fixing it :+1:

Thanks again for all your help, this crate has simplified our build process a ton and we're really grateful to have these bindings available :)

ralfbiedert commented 2 years ago

the second NAL in the TCP packet also has a 4-byte 00 00 00 01 NAL prefix, instead of a 3-byte, which the nal_units function doesn't handle quite correctly since it only expects a 3-byte 00 00 01 NAL header after the first one.

Maybe I misunderstand, but are you saying that nal_units does 1) not recognize your second 00 00 01 block, or 2) it recognizes it but you think that's wrong and / or the decoder produces a subsequent error?

Case 1) would be an issue; case 2), to the degree I understand H264, should be fine. Looking at Section B.1.1 of the H264 specification (see below), and comparing this Stackoverflow answer, 00 00 01 seems to be the NAL separator ("start code prefix"), which sometimes might have a leading zero (which could as well be a trailing zero from a previous packet).

In other words, for the purpose of feeding NALs into OpenH264, segmenting by 00 00 01 should be sufficient as the worst that could happen is OpenH264 sees a trailing 00 on the isolated package, which AFAICT it seems to handle quite well.

So my question is: do you know if the second NAL header we're receiving is invalid as per the format specification,

Without having seen the stream and if the first packet got handled appropriately, it would appear the initial 00 of that block is the first packet's trailing_zero_8bits, while the remaining 00 00 01 are the second packet's start_code_prefix_one_3bytes.

or does nal_units need adjusted to correctly handle multiple 4-byte 00 00 00 01 NAL prefixes?

See above, I'd argue segmenting for 00 00 01 for the purpose of feeding into OpenH264 seems to be ok, as that is the start_code_prefix_one_3bytes the decoder actually expects.

If you still get issues despite proper splits along 00 00 01 my guess is the NAL itself is unusual and OpenH264 doesn't know how to handle it, which I've also seen a few times. Usually that fixes itself if you just keep sending more NALs, ignoring a limited series of decode errors. I agree though from a Rust / correctness perspective it's not ideal, so filtering them manually might be a better option.

Appendix B.1.1

image

repnop commented 2 years ago

In other words, for the purpose of feeding NALs into OpenH264, segmenting by 00 00 01 should be sufficient as the worst that could happen is OpenH264 sees a trailing 00 on the isolated package, which AFAICT it seems to handle quite well.

This was my main concern, I wasn't sure if the trailing zero would cause any irregularities in the decoding process, but if openh264 is fine with it, then I don't have any issues :+1: I'll close this for the time being, and reopen if I encounter any issues where the trailing zero actually matters, thanks for the quick response!