ThibaultBee / StreamPack

SRT/RTMP/RTMPS live streaming libraries for Android
https://thibaultbee.github.io/StreamPack/index.html
Apache License 2.0
175 stars 67 forks source link

Add AUD before every nal unit #67

Closed jksiezni closed 1 year ago

jksiezni commented 1 year ago

This change is to improve compatibility with the MPEGTS spec, that says:

2.14.1 "Each AVC access unit shall contain an access unit delimiter NAL Unit"

From what I tested, this change does not appear to fix anything, but I only checked it with ffmpeg and OBS. There might be other software, that take those AUDs into consideration. For example, AUDs can be used to count frames in TS: https://www.ramugedia.com/how-count-avc-h-264-video-frames-in-transport-stream-file-

ThibaultBee commented 1 year ago

Hi, Thanks for the PR 🙏

Just a quick answer before a deeper analyze, the AUD already prefixes key frame (IDR) (a bit further in the code), so you are adding it twice. AUD does not prefix the other frames. So I think adding AUD is the correct way but not with this implementation.

ThibaultBee commented 1 year ago

From the AVC spec, an AVC access unit is defined as:

access unit: A set of NAL units always containing exactly one primary coded picture. In addition to the primary coded picture, an access unit may also contain one or more redundant coded pictures or other NAL units not containing slices or slice data partitions of a coded picture. The decoding of an access unit always results in a decoded picture.

As far as I understand, a full gop (with I, B and P frames) is an AVC access unit. So adding the AUD before each key frame is valid (as it was already the case). There is no need to add AUD before the other frames.

Do you agree with the behavior before your PR? If yes, I won't unfortunately merge your PR.

jksiezni commented 1 year ago

I am not an expert in this domain, but just compared an output of the StreamPack with ffmpeg. That is how I learned about missing AUDs in the stream. The ffmpeg inserts AUD units for every frame not only IDR, thus I came to a conclusion that StreamPack is lacking something here.

ffmpeg:

$ ffplay -v debug ffmpeg.ts
[h264 @ 0x7f43c803bb00] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c803bb00] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 @ 0x7f43c803bb00] nal_unit_type: 8(PPS), nal_ref_idc: 3
[h264 @ 0x7f43c803bb00] nal_unit_type: 5(IDR), nal_ref_idc: 3
[h264 @ 0x7f43c8ab3f00] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c8ab3f00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7f43c8aae880] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c8aae880] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7f43c8ab8940] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c8ab8940] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7f43c868c480] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c868c480] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7f43c8685940] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c8685940] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7f43c810fa40] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c810fa40] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7f43c89122c0] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c89122c0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7f43c892ef80] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c892ef80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7f43c894bc40] nal_unit_type: 9(AUD), nal_ref_idc: 0
[h264 @ 0x7f43c894bc40] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1

StreamPack:

$ ffplay -v debug StreamPackTest.ts
[h264 @ 0x7fa35010a840] nal_unit_type: 9(AUD), nal_ref_idc: 0f=0/0   
[h264 @ 0x7fa35010a840] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 @ 0x7fa35010a840] nal_unit_type: 8(PPS), nal_ref_idc: 3
[h264 @ 0x7fa35010a840] nal_unit_type: 5(IDR), nal_ref_idc: 3
[h264 @ 0x7fa35060bd80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa3506c4800] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35068ba00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350650700] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35069e080] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35064ca00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350047ec0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa3506228c0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350692740] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa3506180c0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350040600] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35010c880] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35010a840] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35060bd80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa3506c4800] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35068ba00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350650700] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35069e080] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35064ca00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350047ec0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa3506228c0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350692740] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa3506180c0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350040600] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35010c880] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35010a840] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35060bd80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa3506c4800] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa35068ba00] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x7fa350650700] nal_unit_type: 9(AUD), nal_ref_idc: 0f=0/0   
[h264 @ 0x7fa350650700] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 @ 0x7fa350650700] nal_unit_type: 8(PPS), nal_ref_idc: 3
[h264 @ 0x7fa350650700] nal_unit_type: 5(IDR), nal_ref_idc: 3

As I noted earlier, all software I know had no issues with StreamPack's stream, but I wanted it to be on par with ffmpeg. If you find it wrong, then you can close my PR. Frankly speaking, I would prefer the current approach which is simply quicker, if only it is ok with the spec.