onvif / specs

ONVIF Network Interface Specifications

Media signing, support for B frames is not consistent #413

Closed ocampana-videotec closed 3 months ago

ocampana-videotec commented 3 months ago

Section 5.2 states

The procedure of signing a video stream and producing SEI frames, can be described by a set of rules.

  • The NAL Unit types below must be hashed:
      • Picture NAL Units, that is, slices of IDR-, P- and B-frames.
      • All SEI frames generated according to this specification, if not including a signature.

Section 5.4 states

A GOP is defined as all frames between two IDR frames, including the first IDR. Frames between these IDRs are prediction frames and can be either P- or B-frames.

A reader would therefore assume that B-frames are supported. But Section 5.7.7 states

This optional field contains the hash list, for a complete GOP; I- through P-frames.

I think this last statement should be changed to "I- through P- and B-frames". I also suggest adding an example with B-frames and hashing. I would assume that successive frames are hashed in the order in which the encoder emits NAL units, but this is never stated explicitly.

Suppose we have a GOP of 4 frames in which the first is encoded as I, the second and third as B, and the fourth as P. The hash of the I-frame would be used to rehash the hash of the P-frame, that result would be used to rehash the second frame, and that result in turn would be used to rehash the third frame. But this is not written or shown as an example anywhere.
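To make the assumed ordering concrete, here is a minimal Python sketch of the chaining described above. It is only an illustration of my reading, not the algorithm from the specification: the SHA-256 choice, the `prev + frame_hash` concatenation, the placeholder payloads, and the names `emitted` and `chain_hashes` are all assumptions.

```python
import hashlib

# Hypothetical 4-frame GOP with placeholder payloads (not real NAL units).
# Display order is I, B1, B2, P; the assumed emission (decoding) order is I, P, B1, B2.
emitted = [
    ("I", b"idr-slice"),
    ("P", b"p-slice"),
    ("B1", b"b-slice-1"),
    ("B2", b"b-slice-2"),
]

def chain_hashes(frames):
    """Chain frame hashes in the order given.

    The first link is the hash of the first frame; every later link is
    SHA-256(previous link + hash of the current frame). The concatenation
    order is an assumption made for this sketch.
    """
    links = []
    prev = None
    for name, payload in frames:
        frame_hash = hashlib.sha256(payload).digest()
        prev = frame_hash if prev is None else hashlib.sha256(prev + frame_hash).digest()
        links.append((name, prev))
    return links

links = chain_hashes(emitted)
for name, link in links:
    print(name, link.hex()[:16])
```

Chaining the same frames in display order (I, B1, B2, P) yields a different final value, which is why the order needs to be stated explicitly.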

bjornvolcker commented 3 months ago

I agree that one can never be too explicit. Even though Section 5.4 states

A GOP is defined as all frames between two IDR frames, including the first IDR. Frames between these IDRs are prediction frames and can be either P- or B-frames. Without loss of generality, both of these are throughout the text denoted P-frames.

I agree that by Section 5.7.7 this convention is easily forgotten, so adding an extra "B-frame" to the sentence makes sense.

It is also a good suggestion to add an example with B-frames to explicitly show that the decoding order decides the hashing order. I will make the necessary changes to the specification.
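As a rough illustration of that point, the sketch below builds a per-GOP hash list in decoding order rather than display order. The `Frame` type, its index fields, the placeholder payloads, and the use of SHA-256 are hypothetical and chosen only for this example; the specification remains the authority on the actual hash list format.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Frame:
    name: str
    display_index: int  # position in presentation order
    decode_index: int   # position in the order the encoder emits NAL units
    payload: bytes      # placeholder bytes standing in for the slice data

# The 4-frame GOP from the example above, listed in display order.
gop = [
    Frame("I", 0, 0, b"idr-slice"),
    Frame("B1", 1, 2, b"b-slice-1"),
    Frame("B2", 2, 3, b"b-slice-2"),
    Frame("P", 3, 1, b"p-slice"),
]

def hash_list(frames):
    """Return (name, hex digest) pairs sorted by decoding order."""
    ordered = sorted(frames, key=lambda f: f.decode_index)
    return [(f.name, hashlib.sha256(f.payload).hexdigest()) for f in ordered]

gop_hash_list = hash_list(gop)
print([name for name, _ in gop_hash_list])  # ['I', 'P', 'B1', 'B2']
```

The P-frame is hashed before both B-frames even though it is displayed last, because the B-frames cannot be decoded until it has arrived.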

sujithhanwha commented 3 months ago

Closing this issue based on 4/18 VEWG telco discussion.