wbond / asn1crypto

Python ASN.1 library with a focus on performance and a pythonic API
MIT License

Wrong encoding for EncryptedContentInfo.encrypted_content field #222

Closed: JRomainG closed this issue 2 years ago

JRomainG commented 2 years ago

Description

Generating ASN.1 using the .dump function of an EncryptedContentInfo object seems to yield invalid data.

Version

I tried with the latest version from pip:

>>> import asn1crypto
>>> asn1crypto.__version__
'1.4.0'

as well as by cloning the master branch from this repository. The bug seems to be present in both versions.

Reproducing

I encountered this bug when using EncryptedContentInfo, and derived a minimal example using this class. I did not check if the same bug happens for other classes though.

from asn1crypto.cms import EncryptedContentInfo

from cryptography.hazmat.primitives.padding import PKCS7
from cryptography.hazmat.primitives.ciphers import modes, Cipher
from cryptography.hazmat.primitives.ciphers.algorithms import AES

def encrypt(data):
    # Fixed all-zero key and IV: for reproducing the issue only, never for real use.
    symkey = AES(b"\0" * 16)
    iv = b"\0" * 16

    cipher = Cipher(symkey, modes.CBC(iv))
    encryptor = cipher.encryptor()

    # CBC needs whole blocks, so pad the plaintext with PKCS#7 first.
    padder = PKCS7(AES.block_size).padder()
    padded = padder.update(data) + padder.finalize()

    ciphertext = encryptor.update(padded) + encryptor.finalize()
    return symkey, iv, ciphertext

symkey, iv, ciphertext = encrypt(b"Hello, World!")
eci = EncryptedContentInfo({
    'content_type': 'data',
    'content_encryption_algorithm': {
        'algorithm': 'aes128_cbc',
        'parameters': iv,
    },
    'encrypted_content': ciphertext,
})

print("Hex dump:", eci.dump().hex())

print("Debug:")
eci.debug()

The output of this script is the following:

Hex dump: 303c06092a864886f70d010701301d060960864801650304010204100000000000000000000000000000000080108652626463653fecd2edf9db746a27f3
Debug:
  asn1crypto.cms.EncryptedContentInfo Object #140707195704176
    Header: 0x303c
      constructed universal tag 16
    Data: 0x06092a864886f70d010701301d060960864801650304010204100000000000000000000000000000000080108652626463653fecd2edf9db746a27f3
      Field "content_type"
        asn1crypto.cms.ContentType Object #140707196890128
          Header: 0x0609
            primitive universal tag 6
          Data: 0x2a864886f70d010701
            Native: data
      Field "content_encryption_algorithm"
        asn1crypto.algos.EncryptionAlgorithm Object #140707189407216
          Header: 0x301d
            constructed universal tag 16
          Data: 0x0609608648016503040102041000000000000000000000000000000000
            Field "algorithm"
              asn1crypto.algos.EncryptionAlgorithmId Object #140707182449376
                Header: 0x0609
                  primitive universal tag 6
                Data: 0x608648016503040102
                  Native: aes128_cbc
            Field "parameters"
              asn1crypto.core.OctetString Object #140707182536928
                Header: 0x0410
                  primitive universal tag 4
                Data: 0x00000000000000000000000000000000
                  Native: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
      Field "encrypted_content"
        asn1crypto.core.OctetString Object #140707182449232
          Header: 0x8010
            primitive context tag 0 (implicitly tagged)
          Data: 0x8652626463653fecd2edf9db746a27f3
            Native: b"\x86Rbdce?\xec\xd2\xed\xf9\xdbtj'\xf3"

Parsing the dumped value using openssl asn1parse seems to show a missing field: the content of encrypted_content is not displayed, only a bare [0] element:

$ echo '303c06092a864886f70d010701301d060960864801650304010204100000000000000000000000000000000080108652626463653fecd2edf9db746a27f3' | xxd -r -p | openssl asn1parse -inform DER
    0:d=0  hl=2 l=  60 cons: SEQUENCE          
    2:d=1  hl=2 l=   9 prim: OBJECT            :pkcs7-data
   13:d=1  hl=2 l=  29 cons: SEQUENCE          
   15:d=2  hl=2 l=   9 prim: OBJECT            :aes-128-cbc
   26:d=2  hl=2 l=  16 prim: OCTET STRING      [HEX DUMP]:00000000000000000000000000000000
   44:d=1  hl=2 l=  16 prim: cont [ 0 ]        

Another ASN.1 parser also seems to show that the field is not properly encoded:

SEQUENCE (3 elem)
  OBJECT IDENTIFIER 1.2.840.113549.1.7.1 data (PKCS #7)
  SEQUENCE (2 elem)
    OBJECT IDENTIFIER 2.16.840.1.101.3.4.1.2 aes128-CBC (NIST Algorithm)
    OCTET STRING (16 byte) 00000000000000000000000000000000
  [0] (16 byte) 8652626463653FECD2EDF9DB746A27F3

whereas it should probably be:

SEQUENCE (3 elem)
  OBJECT IDENTIFIER 1.2.840.113549.1.7.1 data (PKCS #7)
  SEQUENCE (2 elem)
    OBJECT IDENTIFIER 2.16.840.1.101.3.4.1.2 aes128-CBC (NIST Algorithm)
    OCTET STRING (16 byte) 00000000000000000000000000000000
  [0] (1 elem)
    OCTET STRING (16 byte) 8652626463653FECD2EDF9DB746A27F3

Would that be accurate or is there a misunderstanding on my part?

MatthiasValvekens commented 2 years ago

That output actually looks correct to me. This is the definition in RFC 5652:

 EncryptedContentInfo ::= SEQUENCE {
        contentType ContentType,
        contentEncryptionAlgorithm ContentEncryptionAlgorithmIdentifier,
        encryptedContent [0] IMPLICIT EncryptedContent OPTIONAL }

The tag on the encryptedContent field is marked as IMPLICIT, which means that the "inner" universal OCTET STRING tag on EncryptedContent is not encoded at all in BER/DER. The context-specific [0] tag takes its place, and the 16 content bytes follow it directly.

But I'm mostly going off my memory of X.690 here, so take that with a grain of salt.
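To make the point about the [0] tag concrete, here is a minimal sketch that walks the hex dump from the report by hand, using only the standard library. The read_tlv helper is a hypothetical name written for this illustration, not part of asn1crypto, and it only handles low tag numbers and definite lengths.

```python
CLASS_NAMES = ("universal", "application", "context", "private")

def read_tlv(buf, offset=0):
    """Decode one BER/DER element; returns (class, constructed, number, value, next_offset)."""
    first = buf[offset]
    tag_class = first >> 6            # top two bits select the tag class
    constructed = bool(first & 0x20)  # bit 6 marks a constructed encoding
    tag_number = first & 0x1F
    if tag_number == 0x1F:
        raise ValueError("high tag numbers are not handled in this sketch")
    length = buf[offset + 1]
    pos = offset + 2
    if length & 0x80:                 # long-form length
        n = length & 0x7F
        if n == 0:
            raise ValueError("indefinite lengths are not handled in this sketch")
        length = int.from_bytes(buf[pos:pos + n], "big")
        pos += n
    return tag_class, constructed, tag_number, buf[pos:pos + length], pos + length

# The dump from the report, split by field for readability.
DER = bytes.fromhex(
    "303c"
    "06092a864886f70d010701"
    "301d" "0609608648016503040102" "0410" "0000000000000000" "0000000000000000"
    "8010" "8652626463653fecd2edf9db746a27f3"
)

# Unwrap the outer SEQUENCE, then read its direct children.
_, _, _, body, _ = read_tlv(DER)
fields = []
pos = 0
while pos < len(body):
    cls, cons, num, val, pos = read_tlv(body, pos)
    fields.append((cls, cons, num, val))

for cls, cons, num, val in fields:
    kind = "constructed" if cons else "primitive"
    print(CLASS_NAMES[cls], kind, "tag", num, "length", len(val))
```

The third child has identifier octet 0x80, i.e. a primitive, context-specific tag 0 whose 16 content bytes are the ciphertext itself. No 0x04 OCTET STRING header appears inside it, which matches the IMPLICIT definition.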

JRomainG commented 2 years ago

Thanks for the answer @MatthiasValvekens! I performed some further tests, and you are indeed right. It looks like openssl asn1parse simply doesn't display the content of the implicitly tagged field. Other commands, such as openssl pkcs7, do show it correctly though.
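For anyone landing here with the same confusion, the difference between the two encodings can be sketched by hand with the standard library. The der helper below is a hypothetical name for this illustration and only supports short-form lengths. A schema-less dumper cannot know the underlying type of an implicitly tagged field, since that information is not on the wire, which is presumably why some tools show only a bare [0].

```python
CIPHERTEXT = bytes.fromhex("8652626463653fecd2edf9db746a27f3")

def der(tag_byte, content):
    """Encode one element with a single identifier octet and a short-form length."""
    if len(content) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([tag_byte, len(content)]) + content

# Plain universal OCTET STRING (tag 0x04).
universal = der(0x04, CIPHERTEXT)

# [0] IMPLICIT: the context tag replaces the universal tag (what asn1crypto emitted).
implicit = der(0x80, CIPHERTEXT)

# [0] EXPLICIT: a constructed context tag wraps the complete OCTET STRING
# (the structure the report expected, which RFC 5652 does not specify here).
explicit = der(0xA0, universal)

print("implicit:", implicit.hex())
print("explicit:", explicit.hex())

# Swapping the identifier octet back to 0x04 recovers an ordinary OCTET STRING
# that a generic dumper can display.
retagged = bytes([0x04]) + implicit[1:]
```

The implicit form starts with 8010 and the explicit form with a0120410, so the explicit encoding is two bytes longer. Only the former matches the EncryptedContentInfo definition quoted above.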