ReFormationPro / Gstreamer-H264-Encryption

Experimental GStreamer plugin for encrypting / decrypting H264 streams with AES
https://medium.com/@oguzhanoztaskin/gstreamer-h264-encryption-plugin-cf2c7b43c383
GNU General Public License v3.0
5 stars 1 forks source link

If `x264enc`::`threads` > 1 - `h264decrypt` failed to decrypt slice nal unit #4

Closed inobelar closed 2 months ago

inobelar commented 2 months ago

Greetings! I'm trying to use 'slightly tuned for performance' x264enc, and noticed, that setting thread property value greater then 1, causes (after ~3-15 seconds) crash of h264decrypt. Single-threaded x264enc works well for a long time.

Here is my gstreamer pipeline:

$ GST_PLUGIN_PATH=/home/vuser/work/Gstreamer-H264-Encryption/build \
  DISPLAY=:0.0 \
  GST_DEBUG=3 \
  gst-launch-1.0 --verbose \
    videotestsrc do-timestamp=true pattern=ball ! \
        "video/x-raw, width=640, height=480, framerate=30/1" ! \
    timeoverlay ! \
    x264enc \
        tune=zerolatency \
        speed-preset=ultrafast \
        bitrate=1000 \
        byte-stream=true \
        threads=2 ! \
    h264encrypt \
        iv-seed=1869052520 \
        key=01234567012345670123456701234567 \
        encryption-mode=aes-cbc ! \
    h264decrypt \
        key=01234567012345670123456701234567 \
        encryption-mode=aes-cbc ! \
    avdec_h264 ! \
    videoconvert ! \
    ximagesink \
        sync=false

Environment:

STDOUT LOG (Execution ended after 0:00:15 seconds) Setting pipeline to PAUSED ... 0:00:00.112587847 2439 0x7bae10000b70 FIXME default gstutils.c:4088:gst_element_decorate_stream_id_internal: Creating random stream-id, consider implementing a deterministic way of creating a stream-id Pipeline is PREROLLING ... /GstPipeline:pipeline0/GstVideoTestSrc:videotestsrc0.GstPad:src: caps = video/x-raw, format=(string)Y444_10LE, width=(int)640, height=(int)480, framerate=(fraction)30/1, multiview-mode=(string)mono, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive /GstPipeline:pipeline0/GstCapsFilter:capsfilter0.GstPad:src: caps = video/x-raw, format=(string)Y444_10LE, width=(int)640, height=(int)480, framerate=(fraction)30/1, multiview-mode=(string)mono, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive /GstPipeline:pipeline0/GstTimeOverlay:timeoverlay0.GstPad:src: caps = video/x-raw, format=(string)Y444_10LE, width=(int)640, height=(int)480, framerate=(fraction)30/1, multiview-mode=(string)mono, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive Redistribute latency... /GstPipeline:pipeline0/GstX264Enc:x264enc0.GstPad:sink: caps = video/x-raw, format=(string)Y444_10LE, width=(int)640, height=(int)480, framerate=(fraction)30/1, multiview-mode=(string)mono, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive /GstPipeline:pipeline0/GstTimeOverlay:timeoverlay0.GstPad:video_sink: caps = video/x-raw, format=(string)Y444_10LE, width=(int)640, height=(int)480, framerate=(fraction)30/1, multiview-mode=(string)mono, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive /GstPipeline:pipeline0/GstCapsFilter:capsfilter0.GstPad:sink: caps = video/x-raw, format=(string)Y444_10LE, width=(int)640, height=(int)480, framerate=(fraction)30/1, multiview-mode=(string)mono, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive /GstPipeline:pipeline0/GstX264Enc:x264enc0.GstPad:src: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, level=(string)3, profile=(string)high-4:4:4, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)30/1, interlace-mode=(string)progressive, colorimetry=(string)bt601, chroma-site=(string)jpeg, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono /GstPipeline:pipeline0/GstH264Encrypt:h264encrypt0.GstPad:src: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, level=(string)3, profile=(string)high-4:4:4, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)30/1, interlace-mode=(string)progressive, colorimetry=(string)bt601, chroma-site=(string)jpeg, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono /GstPipeline:pipeline0/GstH264Decrypt:h264decrypt0.GstPad:src: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, level=(string)3, profile=(string)high-4:4:4, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)30/1, interlace-mode=(string)progressive, colorimetry=(string)bt601, chroma-site=(string)jpeg, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono Redistribute latency... /GstPipeline:pipeline0/avdec_h264:avdec_h264-0.GstPad:sink: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, level=(string)3, profile=(string)high-4:4:4, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)30/1, interlace-mode=(string)progressive, colorimetry=(string)bt601, chroma-site=(string)jpeg, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono /GstPipeline:pipeline0/GstH264Decrypt:h264decrypt0.GstPad:sink: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, level=(string)3, profile=(string)high-4:4:4, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)30/1, interlace-mode=(string)progressive, colorimetry=(string)bt601, chroma-site=(string)jpeg, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono /GstPipeline:pipeline0/GstH264Encrypt:h264encrypt0.GstPad:sink: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, level=(string)3, profile=(string)high-4:4:4, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)30/1, interlace-mode=(string)progressive, colorimetry=(string)bt601, chroma-site=(string)jpeg, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono /GstPipeline:pipeline0/avdec_h264:avdec_h264-0.GstPad:src: caps = video/x-raw, format=(string)Y444_10LE, width=(int)640, height=(int)480, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)jpeg, colorimetry=(string)bt601, framerate=(fraction)30/1 /GstPipeline:pipeline0/GstVideoConvert:videoconvert0.GstPad:src: caps = video/x-raw, width=(int)640, height=(int)480, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)30/1, format=(string)BGRx, colorimetry=(string)1:1:16:4 /GstPipeline:pipeline0/GstXImageSink:ximagesink0.GstPad:sink: caps = video/x-raw, width=(int)640, height=(int)480, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)30/1, format=(string)BGRx, colorimetry=(string)1:1:16:4 /GstPipeline:pipeline0/GstVideoConvert:videoconvert0.GstPad:sink: caps = video/x-raw, format=(string)Y444_10LE, width=(int)640, height=(int)480, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)jpeg, colorimetry=(string)bt601, framerate=(fraction)30/1 Redistribute latency... Pipeline is PREROLLED ... Setting pipeline to PLAYING ... New clock: GstSystemClock Redistribute latency... 0:00:15.458054239 2439 0x7bae10000b70 ERROR h264decrypt h264_decrypt.c:276:gst_h264_decrypt_decrypt_slice_nalu: Encrypted block size (3439) is not a multiple of AES_BLOCKLEN (16). Not attempting to decrypt. 0:00:15.458641773 2439 0x7bae10000b70 ERROR h264decrypt h264_decrypt.c:200:gst_h264_decrypt_process_slice_nalu: Failed to decrypt slice nal unit 0:00:15.459033570 2439 0x7bae10000b70 ERROR h264encryptionbase h264_encryption_base.c:309:gst_h264_encryption_base_transform: Subclass failed to parse slice nalu 0:00:15.459609642 2439 0x7bae10000b70 WARN basesrc gstbasesrc.c:3177:gst_base_src_loop: error: Internal data stream error. 0:00:15.459902516 2439 0x7bae10000b70 WARN basesrc gstbasesrc.c:3177:gst_base_src_loop: error: streaming stopped, reason error (-5) ERROR: from element /GstPipeline:pipeline0/GstVideoTestSrc:videotestsrc0: Internal data stream error. Additional debug info: ../libs/gst/base/gstbasesrc.c(3177): gst_base_src_loop (): /GstPipeline:pipeline0/GstVideoTestSrc:videotestsrc0: streaming stopped, reason error (-5) Execution ended after 0:00:15.266864337 Setting pipeline to NULL ... Freeing pipeline ...
ReFormationPro commented 2 months ago

Odd. I checked for race conditions but it seems that my plugin's buffer processing functions are always called from the same thread. Furthermore, I think there is a lock present at a higher level - GST_OBJECT_LOCK - so it should not have been an issue.

I checked for payload sizes and saw that encryption payloads sizes are multiples of 16 but decryption payload size is 1 byte lesser than a multiple of 16 when the error occurs. This occurs at ciphertext level. Only variable that affects ciphertext length is emulation prevention bytes and there was none of them in the ciphertext in my tests. So I do not suspect them.

Perhaps there is some off-by-one error somewhere and setting thread count to a higher value enables it.

It is odd why it only happens in this condition so far though.

ReFormationPro commented 2 months ago

Hi there, can you check "pr4-test-fix" branch? I applied a test fix for the issue. It should help with pr5 as well.

The issue was when ciphertext ends with a null byte, it was not parsed / it was ignored by the decryptor's nal parser. H.264 RFC section B.1.1 says this is "trailing_zero_8bits" at the end of NAL unit and they are discarded.

I added a 0x80 byte to the end of each ciphertext and removed it at the decryptor before decrypting to fix it.

inobelar commented 2 months ago

Hi! I tested your branch and I can say that the fixes solve the problems in both issues (https://github.com/ReFormationPro/Gstreamer-H264-Encryption/issues/4, https://github.com/ReFormationPro/Gstreamer-H264-Encryption/issues/5)!

Can you also add accompanying comments in new branch, with useful "why this and not otherwise" information? For example:

  1. https://github.com/ReFormationPro/Gstreamer-H264-Encryption/blob/13f1039bfb038677983c962711224a674d575aa3/gst-h264-encryption/src/h264_encryption_base.h#L36

    /* The ciphertext may end with a null byte, so we always add a non-null 
     * marker-byte at the end. (0x80 value is arbitrary)
     *
     * Without it (when ciphertext ends with a null byte), it was not 
     * parsed / it was ignored by the decryptor's nal parser. 
     * H.264 RFC section B.1.1 says this is "trailing_zero_8bits" at the end 
     * of NAL unit and they are discarded.
     */
    #define CIPHERTEXT_END_MARKER 0x80 
  2. https://github.com/ReFormationPro/Gstreamer-H264-Encryption/blob/13f1039bfb038677983c962711224a674d575aa3/gst-h264-encryption/src/h264_encryption_base.c#L329

    // Enlarge buffer by 1 byte, its needed to store extra 'non-null' marker-byte
    // NOTE: Is it correct explanation ? :)
    if (G_UNLIKELY(!gst_buffer_resize_range(outbuf, 0, -1, 0, dest_offset))) { 
    • It fount it very 'tricky' (to use -1 as length), after looking into gst_buffer_resize_range signature:
      gboolean gst_buffer_resize_range (GstBuffer * buffer, guint idx, gint length, gssize offset, gssize size)
  3. https://github.com/ReFormationPro/Gstreamer-H264-Encryption/blob/13f1039bfb038677983c962711224a674d575aa3/gst-h264-encryption/src/h264_encrypt.c#L273-L274

    • It was input_size + 40 + AES_BLOCKLEN + 30, now input_size + 40 + AES_BLOCKLEN + 210 Can you add clarification comments about 40 and 210 constants ? (and why 30 is not enough, and must used 210, or explain heuristics)
  4. Also, as I can see, similar to CIPHERTEXT_END_MARKER constant 0x80 used in the next cases:

  5. Additionally (its not related to commits in the new branch), I found unclear the next section: https://github.com/ReFormationPro/Gstreamer-H264-Encryption/blob/697e19eaa0b438bf8c664b2675a7013146ae4d47/gst-h264-encryption/src/h264_encryption_plugin.h#L52-L56

Thanks for your amazing work & quick responses!

ReFormationPro commented 2 months ago

Thank you for your kind comments.

  1. Choice of 0x80 is arbitrary. 0x00 cannot be used and I think under certain conditions 0x01-0x03 might be parsed as special byte sequences thus those values shall not be used. H.264 uses bit padding for alignment and this plugin uses bit padding for encryption. Since I apply bit padding at byte level, all my paddings start with 0x80 byte and I chose to use that. Perhaps I should have chosen another value to disambiguate which is padding and which is a marker or maybe this is a good choice because both padding and marker and non-data.
  2. I wanted to use gst_buffer_resize but it did not return success / failure so I replaced it with gst_buffer_resize_range. Reference: https://github.com/Xilinx/gstreamer/blob/master/gst/gstbuffer.c#L1572
  3. I need to allocate an output buffer before parsing the input buffer. Padding, insertion of emulation prevention bytes, IV SEI insertion, ending marker takes space. IIRC, one IV SEI, SLICE NAL unit many ending markers and paddings, expectedly many emulation prevention bytes where expectation is directly proportional to total payload size (or input size for ease) are inserted per buffer. I needed to increase the allocation size because using x264enc with threads > 1 demanded larger buffer output buffer size. I just increased it as needed.
  4. It is the aforomentioned bit padding. Since I apply byte-level padding, only the first bit of first padding byte is 1 thus a sequence of 0x80 0x00... is produced. No need to name the constant but a comment for "bit padding" can be added to the function.
  5. GST_H264_ENCRYPT_IV_SEI_UUID requires 16 bytes because of SEI format. AES_BLOCKLEN defines the size of IV, and IV is the payload of this SEI, hence the size of SEI depends on AES_BLOCKLEN. Signature is used to bypass parsing the SEI for speed-up. GStreamer API parses the entire SEI and loads its members into dynamic array and that takes time and space. To quickly understand if a SEI NAL unit is of interest to the plugin, I check some of the first bytes. Those bytes are NAL unit type (0x06 SEI NAL unit), SEI payload type (0x05 Unregistered User Data), SEI payload size (0x20 = 16 bytes for SEI UUID and 16 bytes for IV when AES_BLOCKLEN = 16). SEI UUID follows SEI payload size. AES blocklen cannot be changed for now, so I hard coded 0x20 there.

I can add comments, sure. I am looking at other projects right now and as there is not a bug here, I am giving a break to this.

ReFormationPro commented 2 months ago

Closing this issue as pr4-test-fix is merged to the main. I also added some comments based on your remarks. Thanks for your help!