ludocode / mpack

MPack - A C encoder/decoder for the MessagePack serialization format / msgpack.org[C]
MIT License
521 stars 82 forks source link

mpack crashes without a matching call to mpack_complete_* #88

Open cataphract opened 2 years ago

cataphract commented 2 years ago

Taking the example from the README:

#include <stdlib.h>
#include <stdio.h>
#include <mpack.h>

int main()
{
    char* data;
    size_t size;
    mpack_writer_t writer;
    mpack_writer_init_growable(&writer, &data, &size);

    // write the example on the msgpack homepage
    mpack_build_map(&writer);
    mpack_write_cstr(&writer, "compact");
    mpack_write_bool(&writer, true);
    mpack_write_cstr(&writer, "schema");
    mpack_write_uint(&writer, 0);
    //mpack_complete_map(&writer);

    // finish writing
    if (mpack_writer_destroy(&writer) != mpack_ok) {
        fprintf(stderr, "An error occurred encoding the data!\n");
        return 1;
    }

    // use the data
    printf("%.*s", (int)size, data);
    free(data);
    return 0;
}
glopes /tmp $ gcc -o t -g -I/tmp a.c mpack.c && valgrind -q ./t
==14417== Invalid free() / delete / delete[] / realloc()
==14417==    at 0x4C37DAD: realloc (vg_replace_malloc.c:1192)
==14417==    by 0x108D37: mpack_realloc (mpack.h:1842)
==14417==    by 0x10ABAB: mpack_growable_writer_teardown (mpack.c:1208)
==14417==    by 0x10B420: mpack_writer_destroy (mpack.c:1499)
==14417==    by 0x108C95: main (a.c:21)
==14417==  Address 0x52360b0 is 48 bytes inside a block of size 4,096 alloc'd
==14417==    at 0x4C32FB5: malloc (vg_replace_malloc.c:380)
==14417==    by 0x10DC1F: mpack_builder_begin (mpack.c:2470)
==14417==    by 0x10DCDF: mpack_builder_build (mpack.c:2495)
==14417==    by 0x10E1EC: mpack_build_map (mpack.c:2711)
==14417==    by 0x108C32: main (a.c:13)
An error occurred encoding the data!

This makes it impossible to use error chaining. In this case, I omitted the call to mpack_complete_map altogether, but it can also be skipped because of an earlier error. In fact, it's a problem even with:

mpack_build_map(&writer);
mpack_write_cstr(&writer, "foo");
mpack_complete_map(&writer);
mpack_writer_destroy(&writer);
ludocode commented 2 years ago

Thanks for the bug report. It should be fixed now. The writer will clean up properly when destroyed with an open builder. I don't have great tests for it at the moment (or for any of the builder code for that matter.) This will hopefully improve later. Thanks for using the builder, and please report any more bugs you find.

If you run your test with the latest code, the call to mpack_writer_destroy() returns mpack_error_bug after cleaning everything up. This indicates there is a bug in the calling code, in this case the missing call to mpack_complete_map(). You can print the error like this:

    // finish writing
    mpack_error_t error = mpack_writer_destroy(&writer);
    if (error != mpack_ok) {
        fprintf(stderr, "An error occurred encoding the data! %s (%i)\n",
                mpack_error_to_string(error), error);
        return 1;
    }

This will now print:

An error occurred encoding the data! mpack_error_bug (8)

Valgrind will report no leaks or other problems.

If you additionally build with -DDEBUG, the write tracking will know that you have an unclosed map so it will assert in the call to mpack_writer_destroy() and print:

mpack breakpoint hit at src/mpack/mpack-common.c:583
unclosed mpack_type_map

Valgrind will print the abort and may also print leaks since it doesn't clean up when it aborts.

You can build a new amalgamation package from the latest source with tools/amalgamate.sh. I might do a point release at some point but I'm not in a rush to do so because this bug can only occur due to another bug in the calling code.

cataphract commented 2 years ago

I might do a point release at some point but I'm not in a rush to do so because this bug can only occur due to another bug in the calling code.

Thanks for fixing this. I'd just push back a bit on this description. IIUC, an error can occur doing the serialization (e.g. an I/O error), that will short-circuit mpack_complete_map even if it's subsequently called and in that case the call to destroy would still issue an invalid free.

ludocode commented 2 years ago

Hmm. You're partially right. An error can't happen from I/O; it can only happen from a malloc failure. When a builder is open all writes are diverted to a buffer because we can't output anything until we figure out the size of the container being built. If a malloc does fail though, it will indeed flag an error and short circuit subsequent calls to mpack_complete_map(), leaking the builder and causing the growable writer to crash. I am intending to fully support malloc failures so this is a serious bug; I'll need to make a patch release soon.

There is another bug that can occur though if the caller is using longjmp for error handling. The I/O happens in mpack_complete_map() (via mpack_builder_resolve()) when closing the final container. This can flag an error. Nothing in mpack_builder_resolve() is short circuited on an error though, so it will still walk through all of its builder pages, attempt to write them (ignored due to error), then free them... unless the error handler longjmps out. This is also something I am intending to support properly.

I'm not sure of the proper fix here yet, but most likely I will just defer the callback until after the writer has walked all its pages. The reader does something similar when reading into an allocation.

In any case I am reopening this issue. I need to fix this longjmp bug, plus I need a lot more unit tests to make sure these bugs don't crop up again. At the very least the builder needs tests to make sure it handles any malloc failure properly.