ludocode / mpack

MPack - A C encoder/decoder for the MessagePack serialization format / msgpack.org[C]
MIT License
521 stars 82 forks source link

builder page allocation leak and segfault #94

Open momentarylapse opened 2 years ago

momentarylapse commented 2 years ago

hi, I ran into a segfault with the following code:

mpack_writer_t writer;

char *output_data = nullptr;
size_t output_size = 0;
mpack_writer_init_growable(&writer, &output_data, &output_size);

const int N_OUTER = 10;
const int N_INNER = 100;

mpack_start_array(&writer, N_OUTER);
for (int k=0; k<N_OUTER; k++) {
    mpack_build_map(&writer);
    mpack_write_str(&writer, "key", 3);
    mpack_write_u64(&writer, k*k);
    mpack_write_str(&writer, "value", 5);
    mpack_start_array(&writer, N_INNER);
    for (int i=0; i<N_INNER; i++) {
        mpack_build_map(&writer);
        mpack_write_str(&writer, "value0", 6);
        mpack_write_i64(&writer, i);
        mpack_write_str(&writer, "value1", 6);
        mpack_write_i64(&writer, i*k);
        mpack_complete_map(&writer);
    }
    mpack_finish_array(&writer);
    mpack_complete_map(&writer); // <- segfault here
}
mpack_finish_array(&writer);

mpack_writer_destroy(&writer);

At the end of the 5th outer loop, the call to mpack_complete_map() will corrupt the internal memory of the builder, resulting in a segfault. Some mild debugging traced the issue back to the page allocation mechanism. According to the logs, we might be freeing one page and using the next without initializing it.

The problem vanishes when writing different integer values. It seems to be triggered only by very specific alignment conditions.

I would appreciate it ,if you could confirm the findings or give me a clue of any mismatching calls in my code. Thanks!

ludocode commented 2 years ago

Fixed, sorry about this. It's an unfortunate off-by-one bug which is triggered on the rare occasion that a build ends up exactly at the end of a 4096 page boundary. I'm a little surprised I never encountered this bug in my own projects; I guess I tend to keep my builds fairly small so my nested builds never roll over a page.

I'm having trouble crafting a unit test that will reproduce this. The tests currently set a very small page size to test various wrapping behaviors but this size makes it impossible for a build to be aligned at the end. I'll have to make the page size vary by unit test. I will keep this open until it's properly unit tested.

pizzard commented 2 years ago

Thanks for the very quick fix! can't we add the current example above to the unit tests as a regression test, at least until a more specific way to test this has been implemented?

ludocode commented 2 years ago

Your example doesn't trigger the bug in the unit test suite because the unit tests use small page sizes to test various wrapping behaviors. I need to change the unit test runner to make the page size configurable to properly test this. It also needs to be optimized and simplified. The unit test suite is run under Valgrind so unit tests need to be as fast as possible.

momentarylapse commented 2 years ago

Yes, that was quick. And now, everything works perfectly. Thanks a lot for your great work!

ollbx commented 1 year ago

I think I stumbled over this, using v1.1. I haven't confirmed it, but the problem seems very similar and it was fixed by switching to the develop branch. It would be great, if this could be fixed with a new stable release.

ludocode commented 1 year ago

Release v1.1.1 is out. Sorry for the delay. I'll leave this open because I still want to improve the unit tests for this.