gilles-peskine-arm commented 1 year ago

The mbedtls fuzzer running on OSS-Fuzz has found a memory leak in X.509 code (private link). It was introduced between aaa26f25be091f482eff8698c9619293acfc27e3 and 97edeb4fb8a9890a8511363f55d5f27f29c4577a, and the only pull request that touches X.509 in that range is #6866.

To reproduce: build with Asan. The default configuration works. Run:

programs/fuzz/fuzz_x509crt clusterfuzz-testcase-minimized-fuzz_x509crt-6666050834661376

=================================================================
==29797==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x4a0d5e in __interceptor_calloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:77:3
    #1 0x58586b in mbedtls_x509_get_subject_alt_name_ext mbedtls/library/x509.c:1259:25
    #2 0x57ef23 in x509_get_authority_key_id mbedtls/library/x509_crt.c:676:15
    #3 0x57dea3 in x509_get_crt_ext mbedtls/library/x509_crt.c:1010:28
    #4 0x576775 in x509_crt_parse_der_core mbedtls/library/x509_crt.c:1270:15
    #5 0x576775 in mbedtls_x509_crt_parse_der_internal mbedtls/library/x509_crt.c:1362:11
    #6 0x5771b8 in mbedtls_x509_crt_parse_der mbedtls/library/x509_crt.c:1399:12
    #7 0x5771b8 in mbedtls_x509_crt_parse mbedtls/library/x509_crt.c:1433:16
    #8 0x4de101 in LLVMFuzzerTestOneInput mbedtls/programs/fuzz/fuzz_x509crt.c:20:11
    #9 0x505ab3 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
    #10 0x4f1212 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
    #11 0x4f6abc in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9
    #12 0x51fff2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #13 0x79ed253f5082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/libc-start.c:308:16

clusterfuzz-testcase-minimized-fuzz_x509crt-6666050834661376.gz

I can reproduce the failure on my machine. I haven't tried to analyze it.

Goals of this issue:

[x] Investigate and fix the leak.
[x] There won't be a changelog message since the bug is in unreleased code. (Unless the bug turns out to be in older code, and was just made easier to find by the new code.) But we should acknowledge OSS-Fuzz by including the sentence “Credit to OSS-Fuzz.” in a commit message.
[x] Add a unit test that is a non-regression test for this leak. It can be the test case found by the fuzzer, or preferably a simpler test case constructed manually.
[x] Review the code for any similar issue (e.g. if the bug is that return should have been goto cleanup in one function, check other return statements introduced around the same time) and fix and test them.

mprse commented 1 year ago

I have a problem with reproducing this bug. I'm not familiar with OSS-Fuzz. fuzz_x509crt app provides only a fuzz target.

I started with development branch and default config:

przemek@przemek-VirtualBox:~/mbedtls$ ASAN_CFLAGS='-O2 -Werror -fsanitize=address,undefined -fno-sanitize-recover=all'
przemek@przemek-VirtualBox:~/mbedtls$ echo $ASAN_CFLAGS
-O2 -Werror -fsanitize=address,undefined -fno-sanitize-recover=all
przemek@przemek-VirtualBox:~/mbedtls$ make CFLAGS="$ASAN_CFLAGS" LDFLAGS="$ASAN_CFLAGS"
przemek@przemek-VirtualBox:~/mbedtls$ programs/fuzz/fuzz_x509crt clusterfuzz-testcase-minimized-fuzz_x509crt-6666050834661376
przemek@przemek-VirtualBox:~/mbedtls$

I read readme file and I was able to run oss-fuzz tests for mbedtls, but to reproduce this issue I suspect that this part is relevant:

To run the fuzz targets without oss-fuzz, you first need to install one libFuzzingEngine (libFuzzer for instance). Then you need to compile the code with the compiler flags of the wished sanitizer.

So I installed libFuzzer:

przemek@przemek-VirtualBox:~/oss-fuzz$ pip install libFuzzer
Collecting libFuzzer
  Downloading libfuzzer-0.0.2-py3-none-manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (229 kB)
     |████████████████████████████████| 229 kB 2.4 MB/s 
Installing collected packages: libFuzzer
Successfully installed libFuzzer-0.0.2

And tried to rerun przemek@przemek-VirtualBox:~/mbedtls$ programs/fuzz/fuzz_x509crt clusterfuzz-testcase-minimized-fuzz_x509crt-666605083466137 but still nothing happened.

BTW it seems that there is a wrong location in readme file: Finally, you can run the targets like ./test/fuzz/fuzz_client it seems this should refer to programs not test ?

gilles-peskine-arm commented 1 year ago

You don't need oss-fuzz to use the fuzzing that are in the mbedtls source tree. Just build the library. Since the bug is a memory leak, you need to build with Asan enabled: otherwise nothing tries to detect the memory leak. The options you passed to make should be enough to reproduce the error. Did you try running make clean and then make CFLAGS=... LDFLAGS=... again?

mprse commented 1 year ago

Yes, I rebuild several times with make clean before each build (with default and full configs) and development and https://github.com/Mbed-TLS/mbedtls/commit/97edeb4fb8a9890a8511363f55d5f27f29c4577a branches and ensured that flags are set CFLAGS by echoing $ASAN_CFLAGS.

programs/fuzz/fuzz_x509crt does not provide main function (there is only fuzz target) and I'm not sure how can this work?

Edit: The main is defined in onefile.c .

davidhorstmann-arm commented 1 year ago

This is unrelated to reproducing the bug but I think I have an idea what could be causing this.

Looking at the stack trace, it looks like:

x509_get_authority_key_id() is getting an allocated sequence for the subject alt name here which it puts into crt.authority_key_id.authorityCertIssuer.
If there is an error with parsing, this is passed back up the call chain to x509_crt_der_parse_core() which sees it and calls mbedtls_x509_crt_free().
But mbedtls_x509_crt_free() does not free the crt.authority_key_id.authorityCertIssuer member.

It is caused in part because mbedtls_x509_get_subject_alternative_name_ext() doesn't free the allocated sequence on error. Designing the function this way has lead to at least one previous memory leak and I have coincidentally put a fix up as #7581.

However, this case is really supposed to be dealt with by the call to mbedtls_x509_crt_free() whenever an error happens, so freeing the authority_key_id member properly there would be the true fix.

davidhorstmann-arm commented 1 year ago

If the above is correct, it should memory-leak when parsing any authorityCertIssuer with more than one GeneralName unless there's something I'm missing..

mprse commented 1 year ago

Thanks @davidhorstmann-arm. Can you paste commands that that you have used to build library and reproduce the memory leak failure?

davidhorstmann-arm commented 1 year ago

Actually I haven't reproduced it locally, I was just looking at the code and the stacktrace in the issue description. I can have a go at reproducing it this afternoon if that would help.

athoelke commented 1 year ago

It is caused in part because mbedtls_x509_get_subject_alternative_name_ext() doesn't free the allocated sequence on error. Designing the function this way has lead to at least one previous memory leak and I have coincidentally put a fix up as #7581.

However, this case is really supposed to be dealt with by the call to mbedtls_x509_crt_free() whenever an error happens, so freeing the authority_key_id member properly there would be the true fix.

7581 is incomplete, because the sequence has to be freed as part of `mbedtls_x509_crt_free()`, even on success.

Depending on how mbedtls_x509_get_subject_alt_name_ext() is called, cleaning the sequence up in that function on error may be strictly unnecessary, if all paths involve a crt object and a call to mbedtls_x509_crt_free() on error paths.

athoelke commented 1 year ago

[I identified the issue just from code review and the asan report - as per @davidhorstmann-arm]

mprse commented 1 year ago

Yes I finally reproduced it locally. I was missing input file. @davidhorstmann-arm is right mbedtls_asn1_sequence_free(authority_key_id->authorityCertIssuer.next); is missing on mbedtls_x509_get_subject_alt_name_ext() failure.

davidhorstmann-arm commented 1 year ago

7581 is incomplete, because the sequence has to be freed as part of mbedtls_x509_crt_free(), even on success.

Ah, you're right, thanks!

Depending on how mbedtls_x509_get_subject_alt_name_ext() is called, cleaning the sequence up in that function on error may be strictly unnecessary, if all paths involve a crt object and a call to mbedtls_x509_crt_free() on error paths.

That's correct, and in the past we did not have these functions clean up their objects upon error until we changed one in #6391 after it was misused.

davidhorstmann-arm commented 1 year ago

@mprse I wasn't sure if you were working on a fix already, but I've drafted one (untested) in #7593. It seems to be a one-line fix, feel free to supersede it if you have one already, but I continue with it if that would help.

mprse commented 1 year ago

I started working on it. I reproduced the memory leak in test and confirmed that adding mbedtls_asn1_sequence_free(authority_key_id->authorityCertIssuer.next); on mbedtls_x509_get_subject_alt_name_ext() failure fixes the problem. I still need to analyse the code as this might not be a complete fix as @athoelke suggested, but I wonder why we don't have memory leak on successful parsing if mbedtls_x509_crt_free does not free authorityCertIssuer sequence. Also need to generate certificate with more than one authorityCertIssuer (not sure if this is even possible) and corrupt the second entry to cover this case.

mprse commented 1 year ago

Looking at rfc5280 authorityCertIssuer is of GeneralNames which is the sequence of GeneralName type. In that case I agree that we need https://github.com/Mbed-TLS/mbedtls/pull/7581 + freeing authorityCertIssuer sequence in mbedtls_x509_crt_free().

We currently don't have memory leak in positive tests, because we have only test cases with single authorityCertIssuer name. I'm not sure if there can be more than one name or how to generate such certificate using open_ssl. Now the following extension rule is used:

authorityKeyId_subjectKeyId.crt.der:
    $(OPENSSL) req -x509 -nodes -days 7300 -key server2.key -outform DER -out $@ -config authorityKeyId_subjectKeyId.conf -extensions 'v3_req'

[v3_req]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer:always

Can someone more familiar with certificates give a little help here?

mprse commented 1 year ago

I analysed the certificate used in fuzz test that caused memory leak. It is built as follows (AuthorityKeyIdentifier part):

Bytes	Tags	Len	Comment
304A	MBEDTLS_ASN1_CONSTRUCTED \| MBEDTLS_ASN1_SEQUENCE	4A(74)	AuthorityKeyIdentifier sequence
8014	MBEDTLS_ASN1_CONTEXT_SPECIFIC	14(20)	keyIdentifier
2020202020202020202020202020202020202020	-	-	keyIdentifier data
A12A	MBEDTLS_ASN1_CONTEXT_SPECIFIC \| MBEDTLS_ASN1_CONSTRUCTED \| 1	2A(42)	authorityCertIssuer(General Names)
A425	MBEDTLS_ASN1_CONSTRUCTED \| MBEDTLS_ASN1_CONTEXT_SPECIFIC \| MBEDTLS_X509_SAN_DIRECTORY_NAME	25(37)	First name of directory name type
30233110300E0603202020130720202020202020310F300D06032020201306202020202020	-	-	First name data
8200	MBEDTLS_ASN1_CONTEXT_SPECIFIC \| MBEDTLS_X509_SAN_DNS_NAME	0	Second name of DNS name type (alloc memory for structure for the new name)
2020	_MBEDTLS_ASN1CONSTRUCTED	20(32)	Third name of unknown type and length exceeding space for authorityCertIssuer
20202020202020...	-	-	Remaining data

Can I use this corrupted certificate used in fuzz tests as our test input?

gilles-peskine-arm commented 1 year ago

We can use the data from the fuzzer as a non-regression test. But it's better to also have a test case that is specifically constructed to trigger the specific error path. Data from a fuzzer often matches several error conditions, so it can be fragile: maybe later it won't trigger that particular error path anymore. Also data from a fuzzer can be harder to debug if the test case fails after some refactoring attempt.

mprse commented 1 year ago

authorityCertIssuer field in the certificate generated using open ssl holds the copy of the certificate Issuer (of Name type) represented as sequence of directoryNames (of Name type):

In the malformed certificate from fuzz test authorityCertIssuer consists of 3 names:

directoryName
dNSName
undefined type

Such malformation is very hard to achieve. It is not just single byte malformation but rewriting cert with some additional names in authorityCertIssuer with compliance in the sizes of individual elements in the tree.

I'm wondering if this patch is even needed for this particular case for authorityCertIssuer if only single entry of directoryName sequence is possible, but as mbedtls_x509_get_subject_alt_name_ext is universal function for parsing generalNames it is needed. I will use the fuzz cert for a non-regression test.

mprse commented 1 year ago

It looks like freeing authorityCertIssuer sequence in mbedtls_x509_crt_free is sufficient. It is always called on failuer while parsing AuthorityKeyIdentifier. mbedtls_x509_get_subject_alt_name_ext is also used to parse subject altnames for both crt and csr and in both cases on failure free routine is called and releases sequence of subject altnames.

Mbed-TLS / mbedtls

Memory leak in SubjectKeyId/AuthorityKeyId extraction #7576

7581 is incomplete, because the sequence has to be freed as part of `mbedtls_x509_crt_free()`, even on success.

7581 is incomplete, because the sequence has to be freed as part of `mbedtls_x509_crt_free()`, even on success.

Mbed-TLS / mbedtls

Memory leak in SubjectKeyId/AuthorityKeyId extraction #7576

7581 is incomplete, because the sequence has to be freed as part of mbedtls_x509_crt_free(), even on success.

7581 is incomplete, because the sequence has to be freed as part of mbedtls_x509_crt_free(), even on success.

7581 is incomplete, because the sequence has to be freed as part of `mbedtls_x509_crt_free()`, even on success.

7581 is incomplete, because the sequence has to be freed as part of `mbedtls_x509_crt_free()`, even on success.