intel / tinycbor

Concise Binary Object Representation (CBOR) Library
MIT License

[Bug] Use after free in TinyCBOR -> json2cbor() #259

Closed Samsung-PSIRT closed 1 week ago

Samsung-PSIRT commented 1 week ago

Issue: a pointer to the buffer is stored in two places. The memory is then reallocated through one of them, but the old pointer survives in the other and is used afterwards. This causes a read of memory that is no longer valid. Source code: tinycbor/tools/json2cbor/json2cbor.c

Sanitizer says:

==3238731==ERROR: AddressSanitizer: heap-use-after-free on address 0x502000004cd0 at pc 0x5555556f2d75 bp 0x7fffffffd500 sp 0x7fffffffd4f8
READ of size 1 at 0x502000004cd0 thread T0
    #0 0x5555556f2d74 in extract_number tinycbor/tinycbor/src/extract_number_p.h:55:38
    #1 0x5555556f2a19 in cbor_encoder_close_container_checked tinycbor/tinycbor/src/cborencoder_close_container_checked.c:69:11
    #2 0x5555556f0427 in decode_json tinycbor/tinycbor/tools/json2cbor/json2cbor.c:361:16
    #3 0x5555556deb3a in LLVMFuzzerTestOneInput fuzz/TinyCBOR/harness_TinyCBOR.cpp:261:21
    #4 0x5555555edfda in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) crtstuff.c
    #5 0x5555555ed5e9 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) crtstuff.c
    #6 0x5555555ef1d2 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) crtstuff.c
    #7 0x5555555ef6f0 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) crtstuff.c
    #8 0x5555555dc0c5 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) crtstuff.c
    #9 0x555555607676 in main (TinyCBOR/harness_TinyCBOR_libFuzzer+0xb3676) (BuildId: 2e42cc6752f939bc4e9c815a57c26ef09514c0cb)
    #10 0x7ffff7a261c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #11 0x7ffff7a2628a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #12 0x5555555d0984 in _start (fuzz/TinyCBOR/harness_TinyCBOR_libFuzzer+0x7c984) (BuildId: 2e42cc6752f939bc4e9c815a57c26ef09514c0cb)

0x502000004cd0 is located 0 bytes inside of 5-byte region [0x502000004cd0,0x502000004cd5)
freed by thread T0 here:
    #0 0x5555556a58b0 in realloc (fuzz/TinyCBOR/harness_TinyCBOR_libFuzzer+0x1518b0) (BuildId: 2e42cc6752f939bc4e9c815a57c26ef09514c0cb)
    #1 0x5555556f005a in decode_json tinycbor/tinycbor/tools/json2cbor/json2cbor.c:334:34
    #2 0x5555556f036c in decode_json tinycbor/tinycbor/tools/json2cbor/json2cbor.c:357:19
    #3 0x5555556deb3a in LLVMFuzzerTestOneInput fuzz/TinyCBOR/harness_TinyCBOR.cpp:261:21
    #4 0x5555555edfda in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) crtstuff.c
    #5 0x5555555ed5e9 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) crtstuff.c
    #6 0x5555555ef1d2 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) crtstuff.c
    #7 0x5555555ef6f0 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) crtstuff.c
    #8 0x5555555dc0c5 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) crtstuff.c
    #9 0x555555607676 in main (fuzz/TinyCBOR/harness_TinyCBOR_libFuzzer+0xb3676) (BuildId: 2e42cc6752f939bc4e9c815a57c26ef09514c0cb)
    #10 0x7ffff7a261c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #11 0x7ffff7a2628a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #12 0x5555555d0984 in _start (fuzz/TinyCBOR/harness_TinyCBOR_libFuzzer+0x7c984) (BuildId: 2e42cc6752f939bc4e9c815a57c26ef09514c0cb)

previously allocated by thread T0 here:
    #0 0x5555556a5493 in malloc (fuzz/TinyCBOR/harness_TinyCBOR_libFuzzer+0x151493) (BuildId: 2e42cc6752f939bc4e9c815a57c26ef09514c0cb)
    #1 0x5555556deaaf in LLVMFuzzerTestOneInput fuzz/TinyCBOR/harness_TinyCBOR.cpp:259:24
    #2 0x5555555edfda in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) crtstuff.c
    #3 0x5555555ed5e9 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) crtstuff.c
    #4 0x5555555ef1d2 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) crtstuff.c
    #5 0x5555555ef6f0 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) crtstuff.c
    #6 0x5555555dc0c5 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) crtstuff.c
    #7 0x555555607676 in main (fuzz/TinyCBOR/harness_TinyCBOR_libFuzzer+0xb3676) (BuildId: 2e42cc6752f939bc4e9c815a57c26ef09514c0cb)
    #8 0x7ffff7a261c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #9 0x7ffff7a2628a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #10 0x5555555d0984 in _start (fuzz/TinyCBOR/harness_TinyCBOR_libFuzzer+0x7c984) (BuildId: 2e42cc6752f939bc4e9c815a57c26ef09514c0cb)

SUMMARY: AddressSanitizer: heap-use-after-free tinycbor/tinycbor/src/extract_number_p.h:55:38 in extract_number
Shadow bytes around the buggy address:
  0x502000004a00: fa fa 05 fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x502000004a80: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x502000004b00: fa fa fd fa fa fa fd fa fa fa fd fa fa fa 05 fa
  0x502000004b80: fa fa 04 fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x502000004c00: fa fa fd fa fa fa fd fa fa fa 05 fa fa fa 04 fa
=>0x502000004c80: fa fa 05 fa fa fa 05 fa fa fa[fd]fa fa fa fa fa
  0x502000004d00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000004d80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000004e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000004e80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x502000004f00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==3238731==LeakSanitizer has encountered a fatal error.
==3238731==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
==3238731==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
MS: 0 ; base unit: 0000000000000000000000000000000000000000
0x5b,0x36,0x2e,0x34,0x5d,
[6.4]
artifact_prefix='./'; Test unit written to ./crash-9a7c570f506fb55b4166b7dcfada2e08cee83d79

The file crash-9a7c570f506fb55b4166b7dcfada2e08cee83d79 contains the string "[6.4]", which is a JSON array with one element of type double and value 6.4.
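
For reference, the size mismatch behind this crash can be checked by hand. The sketch below is not TinyCBOR code: encode_array_of_one_double is a hypothetical helper that writes the CBOR form of a one-element array holding a double, assuming a 64-bit IEEE 754 double. It needs 10 bytes (1 for the array header, 9 for the double), which is twice the 5 bytes of the JSON text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of TinyCBOR: writes the CBOR encoding of a
 * one-element array holding a double. Returns the number of bytes written,
 * or 0 if 'cap' is too small. Assumes double is 64-bit IEEE 754. */
static size_t encode_array_of_one_double(double value, uint8_t *out, size_t cap)
{
    uint64_t bits;

    assert(out != NULL);
    if (cap < 10)
        return 0;                           /* 1 header byte + 9 bytes for the double */

    memcpy(&bits, &value, sizeof bits);     /* raw IEEE 754 bit pattern */
    out[0] = 0x81;                          /* major type 4 (array), length 1 */
    out[1] = 0xfb;                          /* major type 7, 64-bit float follows */
    for (int i = 0; i < 8; ++i)             /* payload in network byte order */
        out[2 + i] = (uint8_t)(bits >> (56 - 8 * i));
    return 10;
}
```

For 6.4 this produces 81 fb 40 19 99 99 99 99 99 9a, the same bytes as the xxd output further down in this thread. A 5-byte buffer is rejected.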

To reproduce:

std::string json = "[6.4]";
// Side note: Memory Sanitizer always complains about use of uninitialized memory in strlen() here,
// which happens because of the std::string implementation (which we cannot change).
cJSON *doc = cJSON_ParseWithOpts(json.c_str(), NULL, true);

extern "C" CborError decode_json(cJSON *json, CborEncoder *encoder);
extern uint8_t* buffer;
extern size_t buffersize;

    // encode as CBOR
    // CBOR_API void cbor_encoder_init(CborEncoder *encoder, uint8_t *buffer, size_t size, int flags);
    CborEncoder encoder;
    buffersize = json.size();
    buffer = (uint8_t*)malloc(buffersize);
    cbor_encoder_init(&encoder, buffer, buffersize, 0);
    CborError err = decode_json(doc, &encoder); // may re-allocate 'buffer'
    cJSON_Delete(doc);
    if (CborNoError != err) {
        free(buffer);
        return -1;
    }

Further logic is a bit complicated:

CborError decode_json(cJSON *json, CborEncoder *encoder)
{
    CborEncoder container;
    CborError err;
    cJSON *item;

    switch (json->type) {
    case cJSON_False:
    case cJSON_True:
        return cbor_encode_boolean(encoder, json->type == cJSON_True);

    case cJSON_NULL:
        return cbor_encode_null(encoder);

    case cJSON_Number:
        if ((double)json->valueint == json->valuedouble)
            return cbor_encode_int(encoder, json->valueint);
encode_double:
        // the only exception that JSON is larger: floating point numbers
        container = *encoder;   // save the state
        err = cbor_encode_double(encoder, json->valuedouble);

        if (err == CborErrorOutOfMemory) {
            buffersize += 1024;
            uint8_t *newbuffer = realloc(buffer, buffersize);
            if (newbuffer == NULL)
                return err;

            *encoder = container;   // restore state
            encoder->ptr = newbuffer + (container.ptr - buffer);
            encoder->end = newbuffer + buffersize;
            buffer = newbuffer;
            goto encode_double;
        }
        return err;

    case cJSON_String:
        return cbor_encode_text_stringz(encoder, json->valuestring);

    default:
        return CborErrorUnknownType;

    case cJSON_Array:
        /* SAVE OLD POINTER: container.ptr and encoder->ptr point to the same memory area */
        err = cbor_encoder_create_array(encoder, &container, get_cjson_size_limited(json));
        if (err)
            return err;
        for (item = json->child; item; item = item->next) {
            /* HERE OLD MEMORY IS FREED AND NEW ONE GOES TO container */
            err = decode_json(item, &container);
            if (err)
                return err;
        }
        /* BUG HERE: THIS FUNCTION USES OLD POINTER */
        return cbor_encoder_close_container_checked(encoder, &container);

Here decode_json() calls itself recursively to encode the contents of the array, which is the double value 6.4. Encoding the double requires 9 bytes, which is more than the 5 bytes initially allocated for the "[6.4]" string, so the buffer is extended:

CborError decode_json(cJSON *json, CborEncoder *encoder)
{
    CborEncoder container;
    CborError err;
    cJSON *item;

    switch (json->type) {
    case cJSON_False:
    case cJSON_True:
        return cbor_encode_boolean(encoder, json->type == cJSON_True);

    case cJSON_NULL:
        return cbor_encode_null(encoder);

    case cJSON_Number:
        if ((double)json->valueint == json->valuedouble)
            return cbor_encode_int(encoder, json->valueint);
encode_double:
        // the only exception that JSON is larger: floating point numbers
        container = *encoder;   // save the state
        err = cbor_encode_double(encoder, json->valuedouble);

        if (err == CborErrorOutOfMemory) {
            buffersize += 1024;
            /* THIS realloc INVALIDATES OLD MEMORY */
            uint8_t *newbuffer = realloc(buffer, buffersize);
            if (newbuffer == NULL)
                return err;

            *encoder = container;   // restore state
            encoder->ptr = newbuffer + (container.ptr - buffer);
            encoder->end = newbuffer + buffersize;
            buffer = newbuffer;
            goto encode_double;
        }
        return err;

Thus, at the line return cbor_encoder_close_container_checked(encoder, &container); the variable container already holds a pointer into the new memory, while encoder still holds the old one. The next call is:

CborError cbor_encoder_close_container_checked(CborEncoder *encoder, const CborEncoder *containerEncoder)
{
    /* USES POINTER THAT IS ALREADY INVALID */
    const uint8_t *ptr = encoder->ptr;
    CborError err = cbor_encoder_close_container(encoder, containerEncoder);
    if (containerEncoder->flags & CborIteratorFlag_UnknownLength || encoder->end == NULL)
        return err;

    /* check what the original length was */
    uint64_t actually_added;
    /* ACCESS TO DE-ALLOCATED MEMORY HERE */
    err = extract_number(&ptr, encoder->ptr, &actually_added);
    if (err)
        return err;

    if (containerEncoder->flags & CborIteratorFlag_ContainerIsMap) {
        if (actually_added > SIZE_MAX / 2)
            return CborErrorDataTooLarge;
        actually_added *= 2;
    }
    return actually_added == containerEncoder->added ? CborNoError :
           actually_added < containerEncoder->added ? CborErrorTooManyItems : CborErrorTooFewItems;
}

and it finally crashes here:

static CborError extract_number(const uint8_t **ptr, const uint8_t *end, uint64_t *len)
{
    /* **ptr READS MEMORY THAT IS ALREADY UNAVAILABLE */
    uint8_t additional_information = **ptr & SmallValueMask;
    ++*ptr;
    if (additional_information < Value8Bit) {
        *len = additional_information;
        return CborNoError;
    }
    if (unlikely(additional_information > Value64Bit))
        return CborErrorIllegalNumber;

    size_t bytesNeeded = (size_t)(1 << (additional_information - Value8Bit));
    if (unlikely(bytesNeeded > (size_t)(end - *ptr))) {
        return CborErrorUnexpectedEOF;
    } else if (bytesNeeded == 1) {
        *len = (uint8_t)(*ptr)[0];
    } else if (bytesNeeded == 2) {
        *len = get16(*ptr);
    } else if (bytesNeeded == 4) {
        *len = get32(*ptr);
    } else {
        *len = get64(*ptr);
    }
    *ptr += bytesNeeded;
    return CborNoError;
}
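
To make the role of this function easier to follow, here is a standalone sketch of the same header decoding. It is not the library's code: read_cbor_length and the constant names are made up for illustration, and the numeric values (mask 0x1f, additional information 24 to 27 for 1, 2, 4 and 8 byte arguments) are taken from the CBOR encoding rules. It also adds a bounds check before the first read, which the function above does not have. That check would not help with this report, because the pointer passed in is stale, not out of range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative names; the values follow the CBOR initial-byte layout. */
enum { SMALL_VALUE_MASK = 0x1f, VALUE_8BIT = 24, VALUE_64BIT = 27 };

/* Reads the argument of a CBOR item header (for an array, its element count).
 * Returns 0 on success, -1 on malformed or truncated input. */
static int read_cbor_length(const uint8_t **ptr, const uint8_t *end, uint64_t *len)
{
    assert(ptr != NULL && *ptr != NULL && end != NULL && len != NULL);
    if (*ptr >= end)
        return -1;                          /* nothing to read */

    uint8_t info = **ptr & SMALL_VALUE_MASK;
    ++*ptr;
    if (info < VALUE_8BIT) {                /* value stored in the initial byte */
        *len = info;
        return 0;
    }
    if (info > VALUE_64BIT)
        return -1;                          /* reserved or indefinite length */

    size_t needed = (size_t)1 << (info - VALUE_8BIT);   /* 1, 2, 4 or 8 */
    if (needed > (size_t)(end - *ptr))
        return -1;                          /* truncated input */

    uint64_t value = 0;
    for (size_t i = 0; i < needed; ++i)     /* big-endian argument */
        value = (value << 8) | (*ptr)[i];
    *ptr += needed;
    *len = value;
    return 0;
}
```

For the crashing input, the array header is the single byte 0x81, so the decoded count is 1 and the pointer advances by one byte.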

Fix recommendation: synchronize the pointers everywhere decode_json() is called, for example here:

    case cJSON_Array:
        err = cbor_encoder_create_array(encoder, &container, get_cjson_size_limited(json));
        if (err)
            return err;
        for (item = json->child; item; item = item->next) {
            size_t offset = encoder->ptr - buffer;  /* ADD */
            err = decode_json(item, &container);
            encoder->ptr = buffer + offset;         /* ADD */
            encoder->end = buffer + buffersize;     /* ADD */
            if (err)
                return err;
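
The idea behind this recommendation can be modelled without TinyCBOR. In the sketch below everything is hypothetical: Cursor stands in for an encoder, g_buf and g_size for the global buffer and buffersize, and child_write for the recursive decode_json() call. The parent records its position as an offset before the child runs and re-derives its pointers from the (possibly moved) buffer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an encoder: a write cursor and a limit. */
typedef struct {
    uint8_t *ptr;
    uint8_t *end;
} Cursor;

static uint8_t *g_buf;   /* plays the role of the global 'buffer' */
static size_t g_size;    /* plays the role of 'buffersize' */

/* Grows the global buffer and rebases the one cursor it is given. */
static int grow(Cursor *c, size_t extra)
{
    size_t used = (size_t)(c->ptr - g_buf);     /* offset, taken before realloc */
    uint8_t *nb = realloc(g_buf, g_size + extra);
    if (nb == NULL)
        return -1;
    g_buf = nb;
    g_size += extra;
    c->ptr = g_buf + used;
    c->end = g_buf + g_size;
    return 0;
}

/* Child step: appends n filler bytes, growing the buffer when needed. */
static int child_write(Cursor *child, size_t n)
{
    if ((size_t)(child->end - child->ptr) < n && grow(child, n + 1024) != 0)
        return -1;
    for (size_t i = 0; i < n; ++i)
        *child->ptr++ = 0xAA;
    return 0;
}

/* Parent step: writes a one-byte header, runs the child, then rebases itself. */
static int parent_step(Cursor *parent, size_t payload)
{
    assert(parent->ptr < parent->end);
    size_t offset = (size_t)(parent->ptr - g_buf);  /* remembered as an offset */
    Cursor child = { parent->ptr + 1, parent->end };
    *parent->ptr = 0x81;

    int rc = child_write(&child, payload);

    parent->ptr = g_buf + offset;   /* valid even if the buffer moved */
    parent->end = g_buf + g_size;
    return rc;
}
```

With a 5-byte initial buffer and a 9-byte payload, the child forces a reallocation, and the parent's cursor still points at its header byte in the new block afterwards.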

Note:

thiagomacieira commented 1 week ago

json2cbor is just a tool and therefore not subject to the same security guarantees as the library itself. Therefore, I consider this a regular bug.

I can't reproduce with a simple file and valgrind:

$ cat /tmp/src.json 
[6.4]
$ valgrind ./bin/json2cbor /tmp/src.json | xxd 
==1142061== Memcheck, a memory error detector
==1142061== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==1142061== Using Valgrind-3.23.0 and LibVEX; rerun with -h for copyright info
==1142061== Command: ./bin/json2cbor /tmp/src.json
==1142061== 
==1142061== 
==1142061== HEAP SUMMARY:
==1142061==     in use at exit: 0 bytes in 0 blocks
==1142061==   total heap usage: 8 allocs, 8 frees, 83,558 bytes allocated
==1142061== 
==1142061== All heap blocks were freed -- no leaks are possible
==1142061== 
==1142061== For lists of detected and suppressed errors, rerun with: -s
==1142061== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
00000000: 81fb 4019 9999 9999 999a                 ..@.......

thiagomacieira commented 1 week ago

I don't understand your logic. You're saying:

Here decode_json() recursively calls itself to decode the contents of array, which is double value 6.4. While doing this, 9 bytes are required, which is less than 6 bytes initially allocated for "[6.4]" string, thus, memory extension happens:

Indeed and that's clearly marked in the code you pasted: https://github.com/intel/tinycbor/blob/26c63e3d5977f77a6483edde4519489254670375/tools/json2cbor/json2cbor.c#L325-L341

Thus in the line return cbor_encoder_close_container_checked(encoder, &container); variable container already hosts a new memory pointer, while encoder still keeps the old one.

Well, no, that's impossible. As you can see in the code above, we restore encoder from the old state that was saved in container, then loop back in the goto statement. When that happens, container is saved from encoder, so at that point both variables are identical again.

What's more, we do not call cbor_encoder_close_container_checked at all in this recursion level. We do return CborNoError and the previous level does call that function.
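
The save, restore and retry sequence discussed here can be modelled in isolation. This is a sketch under stated assumptions, not the tool's code: MiniEncoder, mini_put and put_with_retry are made-up names, and mini_put deliberately leaves the encoder modified on failure to show why the snapshot is restored before the retry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an encoder: write cursor, limit, item count. */
typedef struct {
    uint8_t *ptr;
    uint8_t *end;
    size_t added;
} MiniEncoder;

/* Appends len bytes. On failure the encoder is left modified on purpose
 * (an assumption of this sketch), so the caller must restore a snapshot. */
static int mini_put(MiniEncoder *e, const void *src, size_t len)
{
    ++e->added;
    if ((size_t)(e->end - e->ptr) < len) {
        e->end = NULL;
        return -1;
    }
    memcpy(e->ptr, src, len);
    e->ptr += len;
    return 0;
}

/* Snapshot, attempt, and on failure: grow, restore, rebase, retry. */
static int put_with_retry(MiniEncoder *e, uint8_t **buf, size_t *size,
                          const void *src, size_t len)
{
    for (;;) {
        MiniEncoder saved = *e;                     /* save the state */
        if (mini_put(e, src, len) == 0)
            return 0;

        size_t used = (size_t)(saved.ptr - *buf);   /* offset into the old block */
        size_t newsize = *size + len + 1024;
        uint8_t *nb = realloc(*buf, newsize);
        if (nb == NULL)
            return -1;

        *e = saved;                                 /* restore state */
        e->ptr = nb + used;                         /* rebase onto the new block */
        e->end = nb + newsize;
        *buf = nb;
        *size = newsize;
    }
}
```

After the retry, the encoder passed in points into the new block and its item count reflects a single successful write. Any other copy of the old pointers held elsewhere is not touched by this function.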

thiagomacieira commented 1 week ago

Modifying the section of code to ensure the buffers always change:

        if (err == CborErrorOutOfMemory) {
            buffersize += 1024;
            uint8_t *newbuffer = malloc(buffersize);
            if (newbuffer == NULL)
                return err;
            memcpy(newbuffer, buffer, buffersize - 1024);
            free(buffer);

Before that free, in my debugger, I see:

(gdb) p buffer
$1 = (uint8_t *) 0x4204a0 "\2016.4]\n"
(gdb) p newbuffer
$2 = (uint8_t *) 0x41f530 "\2016.4]\n"

By the time we reach the call to cbor_encode_double again, encoder->data.ptr and container.data.ptr are 0x41f531, so they're pointing to the new buffer. After the return, when we get back to the outer level, the caller's container.data.ptr is 0x41f53a, so it is pointing to the new buffer too.

It calls cbor_encoder_close_container_checked, which does not read the first parameter, only writes to it: https://github.com/intel/tinycbor/blob/26c63e3d5977f77a6483edde4519489254670375/src/cborencoder.c#L574-L590

thiagomacieira commented 1 week ago

Uh, the code you pasted for cbor_encoder_close_container_checked is very different than what's in the library today. Looks like you're missing cc2bfbb20954f0be9237624b53feca0e27e88f72, which was included in release 0.5.0.

Please upgrade, your version is too old.