P1sec / pycrate

A Python library to ease the development of encoders and decoders for various protocols and file formats; contains ASN.1 and CSN.1 compilers.
GNU Lesser General Public License v2.1
381 stars 132 forks source link

BIT STR CONTAINING constraint referencing OPTIONAL info object field #139

Closed benmaddison closed 3 years ago

benmaddison commented 3 years ago

I am seeing an issue where an X.509 cert created and serialized to DER using the cryptography library and then de-serialized by pycrate is broken.

I think that the behaviour I am seeing is a bug, but my ASN.1/DER knowledge is not good enough to be sure!

In particular, I am using the Certificate type that ships with pycrate in the PKIX1Explicit-2009 module.

The signature component of Certificate is a BIT STRING with a containing constraint on SIGNATURE-ALGORITHM.&Value:

Certificate  ::=  SIGNED{TBSCertificate}

SIGNED{ToBeSigned} ::= SEQUENCE {
 toBeSigned           ToBeSigned,
 algorithmIdentifier  SEQUENCE {
     algorithm        SIGNATURE-ALGORITHM.&id({SignatureAlgorithms}),
     parameters       SIGNATURE-ALGORITHM.&Params({SignatureAlgorithms}
                          {@algorithmIdentifier.algorithm}) OPTIONAL
 },
 signature BIT STRING (CONTAINING SIGNATURE-ALGORITHM.&Value(
                          {SignatureAlgorithms}
                          {@algorithmIdentifier.algorithm}))
}

The &Value field is marked OPTIONAL on the SIGNATURE-ALGORITHM class definition, and is only present for algorithms where the signature has an complex structure (i.e. not just a simple bit-string):

SIGNATURE-ALGORITHM ::= CLASS {
    &id             OBJECT IDENTIFIER UNIQUE,
    &Value          OPTIONAL,
    &Params         OPTIONAL,
    &paramPresence  ParamOptions DEFAULT absent,
    &HashSet        DIGEST-ALGORITHM OPTIONAL,
    &PublicKeySet   PUBLIC-KEY OPTIONAL,
    &smimeCaps      SMIME-CAPS OPTIONAL
} WITH SYNTAX {
    IDENTIFIER &id
    [VALUE &Value]
    [PARAMS [TYPE &Params] ARE &paramPresence ]
    [HASHES &HashSet]
    [PUBLIC-KEYS &PublicKeySet]
    [SMIME-CAPS &smimeCaps]
}

All the certificates that I am currently working with use sha256WithRSAEncryption, which doesn't contain the &Value field in its info object instance definition:

sa-sha256WithRSAEncryption SIGNATURE-ALGORITHM ::= {
   IDENTIFIER sha256WithRSAEncryption
   PARAMS TYPE NULL ARE required
   HASHES { mda-sha256 }
   PUBLIC-KEYS { pk-rsa }
   SMIME-CAPS { IDENTIFIED BY sha256WithRSAEncryption }
}
sha256WithRSAEncryption  OBJECT IDENTIFIER  ::=  { pkcs-1 11 }

I think that the correct behaviour in this case should be to de-serialize the signature component as a simple BIT STRING.

However, from the debugging I have done, it appears that pycrate does the following:

Sometimes the buffer cannot be decoded successfully, and the exception handler falls back to storing the value as a plain BIT STRING. Everything works fine in this case.

However, sometimes the buffer does contain legal DER bytes. This is the case where things break:

When this occurs, the object can no longer be re-serialized again. This might be an issue with serialization of (_unk_xyz, ...) values.

In this case, I think that the correct behaviour should be to distinguish between the cases where: a) A matching entry was not found on the lookup-table; vs. b) A matching entry was found, but the required field was not present.

In a) the current behaviour is the only real option. In b) this should raise an exception in _decode_ber_cont(...) in order to trigger the exception handler __from_ber_buf

@p1-bmu, please let me know if you agree, and I will attempt a fix.

p1-bmu commented 3 years ago

Could you please provide me with 2 or 3 certificates that exhibit this issue, so that I can test it by myself to figure it out more easily ?

benmaddison commented 3 years ago

Sure.

Attaching 3 files:

certs.tar.gz

The difference is that the contents of signature in the "bad" ones is legal (but meaningless) DER, whereas the "good" one isn't.

To recreate, you can do something like:

>>> import glob
>>> from rpkimancer.asn1.mod import PKIX1Explicit_2009
>>> C = PKIX1Explicit_2009.Certificate
>>>
>>> for path in glob.glob("certs/ca_*.cer"):
...   with open(path, "rb") as f:
...     der = f.read()
...   C.from_der(der)
...   print(f"{path}: {der == C.to_der()}")
...   C.reset_val()
...
OPEN._decode_ber_cont: Certificate.toBeSigned.signature.parameters, unable to retrieve a table-looked up object
OPEN._decode_ber_cont: Certificate.algorithmIdentifier.parameters, unable to retrieve a table-looked up object
OPEN._decode_ber_cont: Certificate._cont_signature, unable to retrieve a table-looked up object
certs/ca_bad_2.cer: False
OPEN._decode_ber_cont: Certificate.toBeSigned.signature.parameters, unable to retrieve a table-looked up object
OPEN._decode_ber_cont: Certificate.algorithmIdentifier.parameters, unable to retrieve a table-looked up object
OPEN._decode_ber_cont: Certificate._cont_signature, unable to retrieve a table-looked up object
certs/ca_bad_1.cer: False
OPEN._decode_ber_cont: Certificate.toBeSigned.signature.parameters, unable to retrieve a table-looked up object
OPEN._decode_ber_cont: Certificate.algorithmIdentifier.parameters, unable to retrieve a table-looked up object
BIT_STR.__from_ber_buf: signature, CONTAINING object decoding failed
certs/ca_good.cer: True

Let me know if you have questions.

p1-bmu commented 3 years ago

Is there any reason why you use from_der() and to_der() while your files seem to be cer encoded (have a .cer suffix) ?

benmaddison commented 3 years ago

They're DER encoded. .cer is the assigned file extension for resource certificates in the RPKI, so that's what my code writes to.

You can check that they are well formed with openssl x509 -inform DER -text -in <path>.

Depending on your openssl version you might not be able to decode some of the RFC3779 extensions.

p1-bmu commented 3 years ago

OK, thanks for the examples. https://github.com/P1sec/pycrate/commit/7f3647e4bfa166c9eca3f8c5bb168bf7da0c39ad : this should fix your issue, this only applies to BIT STRING and OCTET STRING with CONTAINING constraint, without touching the OPEN type and other part of the runtime.

Let me know if this is OK.

benmaddison commented 3 years ago

OK, thanks for the examples.

No problem

7f3647e : this should fix your issue, this only applies to BIT STRING and OCTET STRING with CONTAINING constraint, without touching the OPEN type and other part of the runtime.

Let me know if this is OK.

Thanks! Will test properly this morning.

Looking through the changes, it looks like this should work. The reason that I suggested using custom exceptions in #140 was so that the calling function could tell the difference between the following cases:

  1. No entry found in the LUT - maybe an error (depending on the extensibility of the set)
  2. An entry was found, but did not contain the (mandatory) field - error
  3. An entry was found, but did not contain the OPTIONAL field - not an error

Mostly this is just used for logging, so maybe not such a big deal...

benmaddison commented 3 years ago

I have tested, and can confirm that this appears to fix the issue.

The logging issue is a bit annoying, but not a big deal for now.

Are you open to moving to the logging module, rather than printing to STDOUT, or is there a specific reason you chose this method? It shouldn't be hard to replace the asnlog helper with something a little more friendly.

p1-bmu commented 3 years ago

I never checked how the logging module works, and was never really interested in spending time to implement it library wide (unfortunately). There are few logging facilities in pycrate which are just print()ing things, and not only in the ASN.1 runtime I suppose. This is enough to me, but if you feel a more complete / robust solution for logging throughout the library is required, supported by 2 or 3 well-defined and use-cases, this may change my mind :)

For ASN.1 objects in the ASN.1 runtime, you can use ASN1Obj._SILENT = True (with ASN1Obj imported from pycrate_asn1rt.asnobj) in order to silent all warnings.

benmaddison commented 3 years ago

I never checked how the logging module works, and was never really interested in spending time to implement it library wide (unfortunately).

I assumed that was the reason. In a library, it's mostly as simple as:

import logging

log = logging.getLogger(__name__)

In each module, and then calling log(...) instead of print(...)

There are few logging facilities in pycrate which are just print()ing things, and not only in the ASN.1 runtime I suppose. This is enough to me, but if you feel a more complete / robust solution for logging throughout the library is required, supported by 2 or 3 well-defined and use-cases, this may change my mind :)

The biggest issue that it creates is that the messages end up on STDOUT. For example, if a CLI tool prints an object in JER, and a user wants to pipe the output into a JSON query tool like jq, then the logging output will break it.

At the moment, I am wrapping all my calls into the pycrate API, so that any writes to STDOUT are redirected into my logger.

This has a few disadvantages:

Perhaps you could point me towards the other logging functions in pycrate, so that I can get an idea of the amount of work involved?

For ASN.1 objects in the ASN.1 runtime, you can use ASN1Obj._SILENT = True (with ASN1Obj imported from pycrate_asn1rt.asnobj) in order to silent all warnings.

Yup, I saw that. I don't want to loose the logs altogether, which is why I went with redirection wrapper.

Perhaps we should move this to a new issue?

p1-bmu commented 3 years ago

Yes, I am realizing I have various print() throughout the library that could be normalized into a proper logging. I will create a new issue for this.