Size prefixed buffers with large alignment (e.g. vectors of doubles) do not pass verifier

bjornharrtell commented 2 years ago

I'm unable to find this case exercised in the monster_test suite and I'm unsure about what is the correct way to use the API.

I've tried these two variants:

int gen_flatgeobuf1(flatcc_builder_t *B)
{
    double xy[] = { 0, 1 };
    //double z[] = { 0 };

    FlatGeobuf_Geometry_ref_t geometry;

    flatcc_builder_reset(B);

    flatbuffers_double_vec_ref_t xyRef = flatbuffers_double_vec_create(B, xy, 2);
    //flatbuffers_double_vec_ref_t zRef = flatbuffers_double_vec_create(B, z, 1);

    ns(Geometry_start(B));
    ns(Geometry_xy_add(B, xyRef));
    //ns(Geometry_z_add(B, zRef));
    geometry = ns(Geometry_end(B));

    ns(Feature_start_as_root_with_size(B));
    ns(Feature_geometry_add(B, geometry));
    ns(Feature_end_as_root(B));
    return 0;
}

int gen_flatgeobuf2(flatcc_builder_t *B)
{
    FlatGeobuf_Geometry_ref_t geometry;

    flatcc_builder_reset(B);

    ns(Geometry_start(B));
    ns(Geometry_xy_start(B));
    ns(Geometry_xy_push_create(B, 1));
    ns(Geometry_xy_push_create(B, 2));
    ns(Geometry_xy_end(B));
    //ns(Geometry_z_start(B));
    //ns(Geometry_z_push_create(B, 1));
    //ns(Geometry_z_end(B));
    geometry = ns(Geometry_end(B));

    ns(Feature_start_as_root_with_size(B));
    ns(Feature_geometry_add(B, geometry));
    ns(Feature_end_as_root(B));
    return 0;
}

Commenting in the lines in any of the two methods results in verifier error "vector header out of range or unaligned".

I have the full test case added in this branch https://github.com/bjornharrtell/flatcc/tree/flatgeobuftest.

(the schema etc is from my project https://github.com/flatgeobuf/flatgeobuf)

mikkelfj commented 2 years ago

First of all, make sure the buffer you verify is on an sufficiently aligned memory location - it likely is given your findings from your previous issue.

Try to move content creation to after start_as_root, and try to check return codes, e.g. null refs.

Flatcc actually has a start buffer call, but it is included in the _as_root call. But technically you should not create any data like subtables and vectors before the buffer has been started, although I think it usually works, at least in simple cases, though not for nested buffers. As such I am not sure if this is the problem. You can also start the buffer, then create the vectors and subtables, then the table, then using the table ref in the buffer end call to make it root.

BTW: I have not read the schema, but if xy is a vector, how does push_create work without an external vector argument? Is is supposed to be filled in afterwards (I know I wrote the code, but that was before several world changing events).

EDIT: never mind the the BTW: it's past bedtime. You of course push elements the vector, not vectors.

bjornharrtell commented 2 years ago

Hmm... I tried fiddling around with the order but to no avail. I had assumed it was depth first like C++ but you are saying it's not? Fx. just modifying the second example to the following:

int gen_flatgeobuf2(flatcc_builder_t *B)
{
    FlatGeobuf_Geometry_ref_t geometry;

    flatcc_builder_reset(B);

    ns(Feature_start_as_root_with_size(B));

    ns(Geometry_start(B));
    ns(Geometry_xy_start(B));
    ns(Geometry_xy_push_create(B, 1));
    ns(Geometry_xy_push_create(B, 2));
    ns(Geometry_xy_end(B));
    //ns(Geometry_z_start(B));
    //ns(Geometry_z_push_create(B, 1));
    //ns(Geometry_z_end(B));
    geometry = ns(Geometry_end(B));

    ns(Feature_geometry_add(B, geometry));
    ns(Feature_end_as_root(B));
    return 0;
}

Makes it fail verification with "vector header out of range or unaligned" even without attempting to add the second vector.

I feel completely stuck so if you got the time please checkout my branch at https://github.com/bjornharrtell/flatcc/tree/flatgeobuftest, it has the full setup including schema and generation (I just added a new testcase to flatcc and made it so that it is the only that is running on ./scripts/test.sh.

mikkelfj commented 2 years ago

I'll have a look, but not right away, maybe sometime next week. It is probably something trivial that I am not seeing right now. The only thing that rubs me a bit is the push_create with scalar elements. I thought create was only for structs and tables, but I probably made it generic, I don't recall.

Are you verifying "with_size", otherwise the problem might be that the size prefix shifts the offset 4 bytes and the double element is 8 bytes and then no longer aligned as the verifier sees it.

As to depth first, flatcc has a stack so you don't have to use depth first, but you can. What confuses this is a bit is that everything must be created inside a buffer context that must be started and ended. So while you can create a vector before or inside a table, you cannot do that before a buffer has been started (in the general case). So why is reset not enough between buffers? Because of nested buffers that can also be started and ended. It is bad to create things outside nested buffers and put them inside. However, in most cases you only have a top-level buffer and then start_as_root / end_as_root both serve to start the buffer and the root table. While technically a violation, I don't think something bad happens if you create data like vectors before start_as_root, and I believe many users do, but I have not formally verified that this is always valid at top-level. That said, it is likely not the problem that you have.

mikkelfj commented 2 years ago

Also, try to run the code without a size prefix and see if that makes a difference.

mikkelfj commented 2 years ago

Also, you do not end_as_root_with_size. The end call should mirror that start call.

bjornharrtell commented 2 years ago

Thanks for the tips @mikkelfj I'll see if I can get anywhere meanwhile... but one thing I immediately note is that there is no end_as_root AFAIK.

mikkelfj commented 2 years ago

What do you mean there is no end as root? In your source, or in the library?

This is copied from your last code example:

  ns(Feature_end_as_root(B));

bjornharrtell commented 2 years ago

Sorry, I meant there is no end_as_root_with_size in the library.

mikkelfj commented 2 years ago

Hmm that could be a problem. I suggest you figure out how start and end buffers explicitly. There should be some examples.

bjornharrtell commented 2 years ago

Doesn't look like it - I get the same issues when rewriting to not use with size i.e Feature_start_as_root - Feature_end_as_root.

bjornharrtell commented 2 years ago

I think I have narrowed down the issue to have something do with arrays of doubles (or possibly scalars with bigger size than bytes?). See https://github.com/dvidelabs/flatcc/pull/211.

mikkelfj commented 2 years ago

Thats interesting.

Would you be able to investigate and possibly recommend a solution? Likely the magic happens either in builder.c or in bad arguments to same.

You can also call the low-level builder commands directly instead of the generated functions. For empty vectors of double that should be relatively straightforward.

bjornharrtell commented 2 years ago

I will surely try as this is blocking https://github.com/postgis/postgis/pull/583.. but unfortunately I lack knowledge of flatbuffers internals so it will likely not be easy for me. Also I find C (especially with lots of macros) difficult to fully grasp.

mikkelfj commented 2 years ago

Yes the macros are opaque. However, the builder.c calls are fairly straightforward. The macros mainly generate the inline functions you call. These typically contain two or three lines of call to builder.c with the proper argument. So if you place a breakpoint in builder.c you can see the arguments given (outside of PostgreSQL of course). include/flatcc/builder.h is commented and explains basic usage. There are vector calls there. The alignment argument should be telling. If alignment is right in the call, but the buffer is badly aligned anyway, then it is a bit complex because alignment is complex.

mikkelfj commented 2 years ago

Note that it could also be a bug in the verifier.

bjornharrtell commented 2 years ago

Hmm ok. I'm very far from knowing what I'm getting into.. :) but.. an immediate reaction is to this part:

https://github.com/dvidelabs/flatcc/blob/a889e487609ec0f2b6aa9b6a920bc8129a8b547f/src/runtime/builder.c#L1380-L1382

... in that the get_min_align and set_min_align does not take elem_size into account.

Update: Hmm nevermind, I see now that it only sets align if it's smaller and I guess align is coming in the call correctly.. but looking into that.

mikkelfj commented 2 years ago

You found the right spot, and debug output could tell if align is correct.

Buffers are written back to front so there is a virtual zero reference last, and then data is added at progressively more negative offsets. It is not necessary to track this in detail because the pushed iov data is just prepended. The only issue is tracking alignment along the way.

The size is the value stored in the elem_size field you refer to, but it is not related to space consumed within the buffer because the push_iov calls prepend data back to front as needed, and this decides the space used.

Essentially the logic works like this: first tell the buffer context what alignment the overall buffer needs because that affects adding zero padding between the buffer header and the following data - and this is where "with_size" makes a difference since the heading becomes larger.

Next the elem_size is computed as the number of bytes that vector payload consumes because that is the FlatBuffer format standard. The write call converts the native value into endian adjusted raw bytes in a local variable. The variable is later added to an iov vector the cause the emitter backend to prepend the data to the temporary buffer structure.

Next front padding is computed. The current start of the buffer is behind the vector we are creating. We need to zero pad in front of the current start so the vector is placed correctly. The front_pad function computes that size. The padding is always limited in size so a prepared zero-buffer is added to the iov vector with the padding size. Then vector content as added. This content should be endian converted already via a temporary stack copy, or a fast track direct copy on little endian platforms (advanced: and also big endian platforms if the users has a little endian copy and make the right calls(. Once the vector content is added, the element size field is added to the iov vector. emit_front then sends all the prepared data to the emitter that copies the data in front of the existing buffer data. The payload can be copied directly from user space when little endian.

There is the possibility that the the element size field will not be properly aligned if the vector elements are smalled the uoffset_t, but min aligment ensures a vector is always aligned to at least uoffset_t, i.e. 4 bytes. This is handled by adjusting the align value before giving it to front_pad.

mikkelfj commented 2 years ago

I think I understand your concern better now.

get_min_align(&align, field_size);

field_size is a local shortcut for sizeof(flatbuffers_uoffset_t). It makes sure the alignment is no less than 4 bytes. The adjusted value is important for both set_min_align and front_pad later on.

set_min_align(B, align);

This only affects the global buffer alignment requirement and does not affect how we create the vector since that global padding is added last, just before writing the buffer header.

vec_pad = front_pad(B, vec_size, align);

This computes how many zeros must follow the vector so the vector is properly aligned. If the vector is aligned to 1, 2, or 3 bytes, then field_size will override and make the alignment of the first element happen on a 4-byte boundary so the preceeding size field will be valid. If the alignment is larger than 4, it it must be a power of two, and thus the size field will also be aligned correctly.

mikkelfj commented 2 years ago

Note that for element sizes up to 4, any bug in the align argument will not be visible due to the above. But for doubles, it will be, and there is a 50% chance the vector will be misaligned, if the align argument is too small.

bjornharrtell commented 2 years ago

@mikkelfj thanks for trying to explain - but it will take a long while before I understand all of it. :)

Meanwhile, I have gotten closer to find out more specifically when this problem occurs. See #213 for a more isolated and minimal example.

mikkelfj commented 2 years ago

Hi @bjornharrtell I think you provided enough date to find the likely cause of the problem. See my comment in #213.

If your specific usecase does not care about storing the buffer aligned, I suggest using the the non-size prefixed version and manually prepend the size of the buffer. The buffer data following the size prefix must then be copied to an aligned location, or alternatively be access unaligned. You can also store an 8 byte prefix to have the data at least be 8 byte aligned.

Also, regardless of this issue, you may want to tail pad the buffer up to a multiple of 8 - this together with an 8 byte size prefix will ensure that stacked buffers are at least 8 byte aligned if the first buffer is. If you need more than 8 byte alignment, you will need to copy data.

mikkelfj commented 2 years ago

That was premature, see my edited comment in #213. I think the buffer is valid and the verifier is wrong, but I'm not 100% sure.

bjornharrtell commented 2 years ago

@mikkelfj your note about tail padding the buffers is interesting and I've thought about it before. It is likely and unfortunately a design flaw in my format (FlatGeobuf) which is sad in that is does stack buffers but do not tailpad. I had made the assumption that because my buffers have scalars of 64 bit size they would not need the padding but I now understand that is not necessarily true if not only for the fact that those 64-bit sized fields are optional.

The reason this fact hasn't bitten me harder yet is that I believe all reader implementation of my format copies out the individual buffers from the stream avoiding alignment issues but that's a bit sad for performance.

mikkelfj commented 2 years ago

You can get the buffers alignment requirement in a call to the builder once it is ready to finalize or just after. The alignment can vary, for example if you mostly use 4 byte values but optionally can add a 64 bit value. We could probably add a call to automatically tailpad to alignment. There appears to be a missing end_as_root_with_size and both these issues would be worth looking into at some point. Notably the size field would / should be affected by tailpadding.

EDIT: there should probably be a flag in the builder context to tell if the finalize call should tailpad.

bjornharrtell commented 2 years ago

@mikkelfj agreed that would be nice. I believe tail padding is also neglected in other language supports including the reference C++ (but I might be wrong).

mikkelfj commented 2 years ago

I think C++ aligns tail to 8 bytes.

bjornharrtell commented 2 years ago

Hmm interesting! :) Oh well.. I still have work to do.. As I wrote I will see if I can confirm your findings about the potential verifier bug by verifying in C++ and I'm not confident we're out of the woods if I ignore the verifier because even though I can get my code to work locally and pass tests it throws up when subjected to ubsan and Valgrind. May be my own fault though. Mabye I will try to add way to run the flatcc build and test with ubsan and see what that turns up.

jobol commented 2 years ago

probably related to #127

mikkelfj commented 8 months ago

I assume we can trust that PR #213 faithfully reproduces the key concern of this issue. If so, the root cause is identified. The verifier needs to have a special with_size variant to deal with size prefixed buffers. See https://github.com/dvidelabs/flatcc/pull/213#issuecomment-1780025784 for details.

mikkelfj commented 8 months ago

To summarize: this core issue is that buffers aligned to 8 bytes or more do not work with size prefixed buffers when these are aligned correctly. The idea was to verify the buffer at the 4-byte offset after the size field, but this shifts alignment so the verifier fails. It would be possible to copy the buffer down 4 bytes and verify but that is not really acceptable.

I added new verifier versions _with_size, and _and_size that correctly deals with this case.

I have also incorporated a modified version of the double vec test case from PR #213 but instead of verifying offset by 4 bytes, which fails in this case, I use the new with_size verifier, which passes verification, and manual buffer inspection looks correct too.

The update is in https://github.com/dvidelabs/flatcc/commit/7db6cf64a7b2335b45f4c390d091262529625a36

dvidelabs / flatcc

Size prefixed buffers with large alignment (e.g. vectors of doubles) do not pass verifier #210