aws / aws-lc

AWS-LC is a general-purpose cryptographic library maintained by the AWS Cryptography team for AWS and their customers. It іs based on code from the Google BoringSSL project and the OpenSSL project.
Other
398 stars 118 forks source link

Remove custom PKCS7 ASN1 functions, add new structs #1726

Closed WillChilds-Klein closed 2 months ago

WillChilds-Klein commented 3 months ago

Issues:

Resolves #CryptoAlg-2493

Description of changes:

This PR is the first in a series implementing Ruby's supported subset of the PKCS7 standard. To do this, we remove AWS-LC's customized ASN.1 serialization logic and delegate to asn1.h and asn1t.h. We also update the various PKCS_type_is_* no-op functions to actually check the input's type.

The PKCS7 tests currently contain a test signedData structure that is encoded using BER. Because our ASN1_get_object currently disallows indefinite-length BER, we need to modify d2i_PKCS7 to detect and convert BER (if present) into DER before parsing. This allows us to retain backwards compatibility with BER-encoded PKCS7 objects. Please see the appendix for more discussion on this topic.

The next PR in this series will implement the various getters, setters, and allocation functions needed to support CRruby with testing for each. Please see PR #1780 for an idea of what that will look like.

Call-outs:

Source edits conform to OpenSSL's implementation, inlining some of OpenSSL's common subroutines.

Prior struct definitions in public pkcs7.h are preserved for backwards compatibility and expanded for the rest of this series. Whether we can make these existing definitions opaque should be a point of discussion. New structure definitions are made opaque.

Also, because AWS-LC previously implemented BER/DER parsing semi-manually and cached the encoded value in PKCS7 structs, there may exist performance differences due to the new lack of caching. Similarly, BER parsing is now slightly less efficient because we need to make 2 passes over the input (once to convert from BER to DER, once to parse the DER).

Testing:


Appendix: Indefinite-length BER in PKCS7

TL;DR -- Reading RFC 2315, it looks like encrypted is the only content type should be affected, but the RFC imposes no restrictions around how the top-level ContentInfo SEQUENCE shall be encoded.

Support for indefinite-length BER was removed (see also here) in the general ASN.1 parser macros (although BoringSSL has retained support for it in their minimal PKCS7 implementation, as it used the CBS API for parsing).

Per section 5 of RFC 2315 defining PKCS7:

   The syntax is general enough to support many different content types.
   This document defines six: data, signed data, enveloped data,
   signed-and-enveloped data, digested data, and encrypted data. ...

   There are two classes of content types: base and enhanced.  Content
   types in the base class contain "just data," with no cryptographic
   enhancements. Presently, one content type is in this class, the data
   content type. ...

   The document is designed such that the enhanced content types can be
   prepared in a single pass using indefinite-length BER encoding, and
   processed in a single pass in any BER encoding. Single-pass operation
   is especially helpful if content is stored on tapes, or is "piped"
   from another process. One of the drawbacks of single-pass operation,
   however, is that it is difficult to output a DER encoding in a single
   pass, since the lengths of the various components may not be known in
   advance. Since DER encoding is required by the signed-data, signed-
   and-enveloped data, and digested-data content types, an extra pass
   may be necessary when a content type other than data is the inner
   content of one of those content types.

Since indefinite length is forbidden in DER and data is not an enhanced content type, the last paragraph excerpted above implies that only enveloped and encrypted content types can utilize indefinite BER. However, [section 7]() states:

        2.   When a ContentInfo value is the inner content of
             signed-data, signed-and-enveloped-data, or digested-data
             content, a message-digest algorithm is applied to the
             contents octets of the DER encoding of the content field.
             When a ContentInfo value is the inner content of
             enveloped-data or signed-and-enveloped-data content, a
             content-encryption algorithm is applied to the contents
             octets of a definite-length BER encoding of the content
             field.

So, it would seem that the only content type that can be encoded in indefinite length BER is the encrypted type.

Interestingly, the NSS certificate previously included in our PKCS7 tests contains a signed content encoded in indefinite-length BER (note that length value 0x80 indicates indefinite length).

Screenshot 2024-07-30 at 1 05 29 PM

Unfortunately, this appears to be a valid PKCS7 object per the RFC because the RFC only imposes definite-length encoding on the contents that are operated on cryptographically, not the top-level structures themselves.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

codecov-commenter commented 3 months ago

Codecov Report

Attention: Patch coverage is 51.47059% with 33 lines in your changes missing coverage. Please review.

Project coverage is 78.36%. Comparing base (353228b) to head (578d967).

Files Patch % Lines
crypto/pkcs7/pkcs7_asn1.c 58.53% 17 Missing :warning:
crypto/pkcs7/pkcs7_x509.c 20.00% 16 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1726 +/- ## ========================================== + Coverage 78.34% 78.36% +0.01% ========================================== Files 581 582 +1 Lines 97343 97330 -13 Branches 13960 13951 -9 ========================================== + Hits 76268 76273 +5 + Misses 20455 20436 -19 - Partials 620 621 +1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.