openssl / openssl

TLS/SSL and crypto library
https://www.openssl.org
Apache License 2.0
24.98k stars 9.95k forks source link

OpenSSL misinterprets BER constructed BIT STRINGs #12810

Open davidben opened 3 years ago

davidben commented 3 years ago

(ASN.1 snippets use der-ascii syntax, from https://github.com/google/der-ascii.)

Summary

OpenSSL's decoding of constructed BIT STRINGs does not interoperate with the specification or other implementations. It injects extra bits into the resulting value due to flipping the order of padding and concatenation.

Details

BER, unlike DER, has a number of overcomplicated encodings for streaming, notably constructed strings. String types are normally encoded as primitive elements, such as OCTET_STRING { "hello, world" }. BER additionally allows a constructed string where the outer element is constructed and inner elements are concatenated:

[OCTET_STRING CONSTRUCTED] {
  OCTET_STRING { "hello, " }
  OCTET_STRING { "world" }
}

(This, when combined with indefinite-length encoding, lets the encoder stream a large string without knowing the length ahead of time, though none of the uses in OpenSSL actually rely on streaming.) The rules for most strings are described in 8.7.3 of X.690 (08/2015). OpenSSL implements this in asn1_collect.

BIT STRINGs are different. A BIT STRING is a sequence of bits, possibly not a multiple of 8 of them. They are encoded by appending 0 <= N <= 7 bits up to multiple of 8. N is then prepended as a byte. Thus BIT_STRING { `00abcd` } is a 16-bit BIT STRING (N = 0), and BIT_STRING { `04abc0` } is a 12-bit BIT STRING (N = 4).

BIT STRINGs can also be constructed, and OpenSSL runs the same asn1_collect logic for them. https://github.com/openssl/openssl/blob/master/crypto/asn1/tasn_dec.c#L731-L756

But constructed BIT STRINGs work differently. See 8.6.3 and 8.6.4 of X.690 (08/2015). Each segment has its own leading byte, so a streaming encoder does not need to commit to the length mod 8. (Although only the final segment can have padding bits. This means those bytes are redundant, but a streaming receiver doesn't know the segment is the last one.) OpenSSL instead flips the order: it concatenates first, then resolves padding. That is, the 12-bit string BIT_STRING { `04abc0` } can be encoded as:

[BIT_STRING CONSTRUCTED] {
  BIT_STRING { `00ab` }
  BIT_STRING { `04c0` }
}

But OpenSSL misinterprets it as the 24-bit string BIT_STRING { `00ab04c0` }. Instead, OpenSSL accepts the following as the 12-bit input:

[BIT_STRING CONSTRUCTED] {
  BIT_STRING { `04ab` }
  BIT_STRING { `c0` }
}

This is not a valid BER BIT STRING encoding at all because c0 is greater than 7. See also the test program below. It compares OpenSSL's behavior with that of NSS, which seems to handle it correctly.

Recommendation

Since this BIT STRING handling doesn't match spec or other implementations, and will misinterpret valid inputs (quite risky in a security-sensitive library), OpenSSL should either switch to the correct handling, or reject constructed BIT STRINGs altogether. I would suggest the latter (i.e. don't support constructed BIT STRINGs). BER is unnecessarily complicated, and this bug means constructed BIT STRINGs aren't usable in anything which needs to work with existing OpenSSLs anyway.

As far as I can tell, OpenSSL will never produce constructed BIT STRINGs, so I don't think the current behavior is particularly indicative compatibility-wise. Indeed most uses of cryptography like X.509 require DER, so using a BER parser at all is too loose. PKCS#7 and PKCS#12 blobs sadly do often use BER, but BIT STRINGs only carry public keys and signature values, neither of which are large enough to bother streaming. (I would argue the entirety of PKCS#7 and PKCS#12 aren't worth streaming in the first place. Indeed OpenSSL doesn't stream them.) This is probably why no one's noticed this before.

Test program

``` /* Input: 030300abcd OpenSSL: num_bits=16 abcd NSS: num_bits=16 abcd Input: 2308030200ab030200cd OpenSSL: num_bits=24 ab00cd NSS: num_bits=16 abcd Input: 2307030200ab0301cd OpenSSL: num_bits=16 abcd NSS: invalid input Input: 030304abc0 OpenSSL: num_bits=12 abc0 NSS: num_bits=12 abc0 Input: 2308030200ab030204c0 OpenSSL: num_bits=24 ab04c0 NSS: num_bits=12 abc0 Input: 2307030204ab0301cd OpenSSL: num_bits=12 abc0 NSS: invalid input Input: 0303000000 OpenSSL: num_bits=16 0000 NSS: num_bits=16 0000 Input: 23080302000003020000 OpenSSL: num_bits=24 000000 NSS: num_bits=16 0000 Input: 230703020000030100 OpenSSL: num_bits=16 0000 NSS: num_bits=8 00 */ #include #include #include #include #include static void print_bytes(const unsigned char *in, size_t len) { for (size_t i = 0; i < len; i++) { printf("%02x", in[i]); } } static void test_openssl(const unsigned char *in, size_t len) { ASN1_BIT_STRING *str = d2i_ASN1_BIT_STRING(NULL, &in, len); if (!str) { printf("OpenSSL: invalid input\n"); return; } assert(str->flags & ASN1_STRING_FLAG_BITS_LEFT); int num_bits = 8 * ASN1_STRING_length(str) - (str->flags & 0x7); printf("OpenSSL: num_bits=%d ", num_bits); print_bytes(ASN1_STRING_get0_data(str), ASN1_STRING_length(str)); printf("\n"); ASN1_STRING_free(str); } static void test_nss(const unsigned char *in, size_t len) { SECItem input; input.type = siBuffer; input.data = (unsigned char *)in; input.len = len; SECItem output; output.type = siBuffer; output.data = NULL; output.len = 0; if (SEC_ASN1DecodeItem(NULL, &output, SEC_ASN1_GET(SEC_BitStringTemplate), &input) != SECSuccess) { printf("NSS: invalid input\n"); return; } /* NSS returns BIT STRINGs as SECItems whose length is measured in bits * rather than bytes. */ printf("NSS: num_bits=%u ", output.len); print_bytes(output.data, (output.len + 7) / 8); printf("\n"); PORT_Free(output.data); } static void test(const unsigned char *in, size_t len) { printf("Input: "); print_bytes(in, len); printf("\n"); test_openssl(in, len); test_nss(in, len); printf("\n"); } int main() { // BIT_STRING { `00abcd` } static const unsigned char kDER0[] = {0x03, 0x03, 0x00, 0xab, 0xcd}; test(kDER0, sizeof(kDER0)); // Valid BER constructed version of the above: // [BIT_STRING CONSTRUCTED] { // BIT_STRING { `00ab` } // BIT_STRING { `00cd` } // } static const unsigned char kBER0[] = {0x23, 0x08, 0x03, 0x02, 0x00, 0xab, 0x03, 0x02, 0x00, 0xcd}; test(kBER0, sizeof(kBER0)); // Invalid BER constructed version of the above: // [BIT_STRING CONSTRUCTED] { // BIT_STRING { `00ab` } // BIT_STRING { `cd` } // } static const unsigned char kInvalidBER0[] = {0x23, 0x07, 0x03, 0x02, 0x00, 0xab, 0x03, 0x01, 0xcd}; test(kInvalidBER0, sizeof(kInvalidBER0)); // BIT_STRING { `04abc0` } static const unsigned char kDER1[] = {0x03, 0x03, 0x04, 0xab, 0xc0}; test(kDER1, sizeof(kDER1)); // Valid BER constructed version of the above: // [BIT_STRING CONSTRUCTED] { // BIT_STRING { `00ab` } // BIT_STRING { `04c0` } // } static const unsigned char kBER1[] = {0x23, 0x08, 0x03, 0x02, 0x00, 0xab, 0x03, 0x02, 0x04, 0xc0}; test(kBER1, sizeof(kBER1)); // Invalid BER constructed version of the above: // [BIT_STRING CONSTRUCTED] { // BIT_STRING { `04ab` } // BIT_STRING { `cd` } // } static const unsigned char kInvalidBER1[] = {0x23, 0x07, 0x03, 0x02, 0x04, 0xab, 0x03, 0x01, 0xcd}; test(kInvalidBER1, sizeof(kInvalidBER1)); // BIT_STRING { `000000` } static const unsigned char kDER2[] = {0x03, 0x03, 0x00, 0x00, 0x00}; test(kDER2, sizeof(kDER2)); // Valid BER constructed version of the above: // [BIT_STRING CONSTRUCTED] { // BIT_STRING { `0000` } // BIT_STRING { `0000` } // } static const unsigned char kBER2[] = {0x23, 0x08, 0x03, 0x02, 0x00, 0x00, 0x03, 0x02, 0x00, 0x00}; test(kBER2, sizeof(kBER2)); // Invalid BER constructed version of the above: // [BIT_STRING CONSTRUCTED] { // BIT_STRING { `0000` } // BIT_STRING { `00` } // } static const unsigned char kInvalidBER2[] = {0x23, 0x07, 0x03, 0x02, 0x00, 0x00, 0x03, 0x01, 0x00}; test(kInvalidBER2, sizeof(kInvalidBER2)); } ```

levitte commented 3 years ago

(I would argue the entirety of PKCS#7 and PKCS#12 aren't worth streaming in the first place. Indeed OpenSSL doesn't stream them.)

The history is that we implemented streaming exactly because there were reports of very large PKCS#7 messages

davidben commented 3 years ago

Ah, I see, there's some S/MIME-specific streaming filtering logic in some separate BIO and only does streaming encode of particular fields via callbacks. I don't see a streaming decode, though perhaps I've missed it.

Regardless, this bug report is about the core ASN.1 parser decoder, which isn't resumable, and the S/MIME-specific logic does not appear to select any of the BIT STRING fields, just a few OCTET STRINGs, as one would expect.

nhorman commented 1 month ago

closing for inactivity. If there is any desire to pick this back up, please feel free to reopen

davidben commented 1 month ago

This bug is still present, and I see you all previously judged it "severity: important". Is there more information you need on it? The information I wrote in September 2020 was pretty detailed, and included a suggested remediation. The fix should be quite simple, as it's just a matter of making the parser reject constructed BIT STRINGs. A fuller fix would be to parse them correctly, but seeing as OpenSSL has never parsed them correctly (i.e. it would misinterpret them with the wrong value), rejecting it is the more secure option.

davidben commented 1 month ago

(Note that reopening bugs is not something that external folks, myself included, can do.)

nhorman commented 1 month ago

I'm happy to re-open the issue for you. I'm honestly not sure if there is any additional information needed. The only thing I'm sure of is that, even though the issue might be present, and marked as important, no one has considered it important enough to address in the last 4 years, which makes me question the label. But if you're interested in continuing it, I'll gladly add it to our review queue to consider

davidben commented 1 month ago

I mean, I filed the bug because I noticed it while working on our parser, and fixed BoringSSL accordingly. I don't have any immediate plans to fix it in OpenSSL, but I figured it was worth taking the time to write it up for you all. Completely misinterpreting some input seemed worth passing along.

If you all now think this is working as intended and not worth fixing, I mean, I obviously think that's not a good stance for a security-critical library, but ah well. 😛 I replied because it sounds like this was closed not because you all took a stance here, and more out of closing bugs that you all forgot about.

nhorman commented 1 month ago

I can understand and appreciate you taking the time to open these bugs, its a service I thank you for.

that said, its my job here to figure out where to best spend our limited time and effort. We currently have over 2000 issues in this repository, many of which have been largely dormant for multiple years. That clearly doesn't mean they aren't important (this one pretty clearly is), but it does imply that, independent of importance, they aren't receiving any attention for any of a number of possible reasons.

Closing the issue I think is a great way to test the relative importance to both reporters and developers. If an issue has sat without update for a long time (for some definition thereof), closing it acts as a test to differentiate those issues that people actually care about. In this case, the test worked quite well (IMO). i closed it and you immediately indicated that you were still watching this issue. Even though you don't intend to fix it in openssl, you've indicated that you care about it, and its on our radar again, so I think its a win. @t8m has just marked it a help wanted issue, to garner more community involvement, and I've got it on my list for our next planning session, so....good :).

In the inverse case, if theres a similar issue which has been dormant, and a closure results in no response, well, the issue might still be important philosophically, but no one is paying enough attention to it to make forward progress. As such closing the issue makes our backlog much easier to manage, as it clarifies those issues for which there is active participation.

I suppose I can do this in two phases, since you noted that you were unable to reopen the issue, applying an inactive label to the issue, and closing it at a later date if there is no response. That might be a better approach.