`force_align` on vectors missing (or: how to align vectors in flatcc?)

benvanik commented 3 years ago

[ Summary: flatcc should eventually support force_align on vectors - meanwhile use vectors of structs and force_align structs, or use low level builder calls to create aligned vectors access as discussed below - the problems investigated were eventually decided to be incorrect use of API - low level alignment doesn't work if the buffer isn't started first. ]

Hello! I have some variable-length data that must be aligned on a 16-byte boundary that I'm storing in a [uint8]:

table Buffer {
  data:[uint8];
}

At some point it looks like the google flatbuffers library started supporting force_align on these vector fields, so this works there:

table Buffer {
  data:[ubyte] (force_align: 16);
}

The current flatcc compiler doesn't support this new usage of force_align yet though:

[build] bytecode_module_def.fbs:103:17: error: 'force_align': known attribute not expected in this context

My use requires me being able to ensure alignment for compatibility with DMA operations and SIMD code, but there are also some other big flatbuffers users that are now requiring it: https://github.com/tensorflow/tensorflow/blob/1e66b5790c93a1752fd8ea92a146aff811f2ee95/tensorflow/lite/schema/schema_v3a.fbs

If force_align on vectors could be supported in the flatcc parser/builder that'd be fantastic, but I was also wondering if there was a workaround that could be used (I suspect this isn't the first time alignment has come up as a question :). I tried to directly do this by passing an alignment during construction of the vector but that doesn't seem to do what I expected (16 byte alignment of the data in the flatbuffer):

flatcc_builder_start_vector(fbb, 1, /*align=*/16, FLATBUFFERS_COUNT_MAX(1));
uint8_t* ptr = flatbuffers_uint8_vec_extend(fbb, length);
memcpy(ptr, source_data, length);
flatbuffers_uint8_vec_end(fbb);

I'd expect the flatbuffers_uint8_vec_t when loaded from the buffer to be at a 16 byte offset (or, ignoring file header weirdness, would at least be consistently stride 16) however the pointers I get back seem rather arbitrary (4-, 8-, and 12-byte alignment). I've started trying to use vectors of structs (that do support force_align in flatcc) but that also doesn't seem to do what I expect, but would love to hear if any of you have seen solutions in the past that are known to work. Also I could be doing something very wrong - I just started digging into the internals of flatcc and don't know much yet :)

benvanik commented 3 years ago

it looks like the google flatbuffers compiler turns the force_align into a ForceVectorAlignment before starting the vector: https://github.com/google/flatbuffers/blob/78f0c0d1d96a220163a03174d3240864a6a139da/include/flatbuffers/flatbuffers.h#L1667-L1675

Is there an equivalent on the flatcc builder API?

mikkelfj commented 3 years ago

Hi Ben, to your last question first. Flatcc does not have ForceVectorAlignment equivalent. Alignment is taken from the schema during construction where supported. (EDIT: not entirely true, see my follow up comment). (Though there is force assignment to table field members similar to what flatcdoes to bypass skipping default values, but that is something different.

I don't really recall exactly what is supposed to be supported. You report either suggests a regression or something that hasn't yet been implemented. If it is supported by Googles flatc compiler it certainly should be in flatcc as well.

In either case I fully support adding force_align to vectors, I'm just not sure how much time I can commit to that for the time being.

mikkelfj commented 3 years ago

If this is really important to you, you can call the low level builder interface to create the vector without generated code. This can be freely mixed with other code, you just have to be somewhat careful.

https://github.com/dvidelabs/flatcc/blob/988b149b5e29b95d8c2587fe40ed927d3418a1c7/include/flatcc/flatcc_builder.h#L1407-L1464

Also, if you can help investigate if this is a regression or a missing feature, it would be helpful.

mikkelfj commented 3 years ago

Another way to work around this is to create a struct with the necessary alignment and store that struct in a vector. It slowly comes back now - I think flatc originally skipped vector force_align because you could do this instead.

Using a struct in this way will not physically change the buffer and you can cast from a simple array if you create the vector in one operation from a C array, and likewise when reading.

EDIT: I should read more close:

I've started trying to use vectors of structs (that do support force_align in flatcc) but that also doesn't seem to do what I expect, but would love to hear if any of you have seen solutions in the past that are known to work. Also I could be doing something very wrong - I just started digging into the internals of flatcc and don't know much yet :)

Can you please elaborate on why this doesn't work?

You can also use the verifier to test buffer alignment. (For the same reason the verifier must be updated upon changes to force_align support).

benvanik commented 3 years ago

Thanks for the suggestions! I agree that the support for force_align on vectors is not a regression in flatcc but was just a late addition to flatbuffers at some point - I suspect someone came along and ran into the same thing I did and threw it in there :)

I think I noticed what was throwing me off about my direct flatcc_builder_start_vector use: it is aligning correctly, however everything is offset 8 bytes in the final buffer due to the file identifier. If I strip the first 8 bytes with the identifier then everything is correctly aligned to what I specify. I suppose the question then becomes: is there a way to tell flatcc to pad out the buffers to ensure that alignment is still respected when the file identifier is prefixed? I think even if force_align was supported this may still be an issue that prevents the intended alignment from being met.

Today I'm doing:

file_identifier "FILE";
table MyRoot {
  data:[uint8];
}
root_type MyRoot;

flatcc_builder_init(&fbb);
flatcc_builder_start_vector(&fbb, 1, 16, FLATBUFFERS_COUNT_MAX(1));
uint8_t* p = flatbuffers_uint8_vec_extend(&fbb, 1234);
memset(p, 'x', 1234);
flatbuffers_uint8_vec_ref_t data = flatcc_builder_end_vector(&fbb);
MyTable_start_as_root(&fbb);
MyTable_data_add(&fbb, data);
MyTable_end_as_root(&fbb);
// etc

The output buffer then has 08000000 46494C45 for the identifier and later on the xxxxxxx at a +8 alignment.

Thoughts? And thanks for the help! If I can get something working I'm happy to write it up in the docs so that others coming along can use it too :)

mikkelfj commented 3 years ago

I'd be happy to help you progress on this. The following is an explanation of how things work or are supposed to work, but it probably won't help much. But it is a start:

I think I noticed what was throwing me off about my direct flatcc_builder_start_vector use: it is aligning correctly, however everything is offset 8 bytes in the final buffer due to the file identifier. If I strip the first 8 bytes with the identifier then everything is correctly aligned to what I specify.

This is not correct. The header is included in the alignment. Are you sure the buffer is placed in sufficiently aligned memory? There are subtle differences in the way a buffer is created if you create it with an identifier or with a null identifier (I don't recall all the details up front). The buffer can also be created with an optional size prefix.

The buffer can have a header in a few different ways: there can optionally be a file identifer, and there can optionally be a size prefix. You need to all the verifier in the proper way, and then it will very from the buffer start.

There are some variants start_as_root_with_identifier and with _size.

The header format is : [<size:4 bytes>]<offset to root object:4 bytes>[<file identifier: 4 bytes>]

(personally I think this is a mess, but it is what we have)

It is actually rather complex, but when creating a buffer, flatcc adjusts alignment according these different kinds of headers. Technically it builds the buffer backwards, properly aligned relative to a virtual zero point, and then inserts enough prepadding between the header and data to ensure that everything keeps staying aligned from the buffer start based on the largest alignment requirement seen during construction.

Related, but not very important here: You may also notice that vectors are aligned to their elements and that flatcc reader returns a pointer to the first element, which should be aligned. But the actual vector starts 4 bytes earlier, and this vector header is not ailgned if elements are larger than 4 bytes because the vector size field is 4 bytes.

mikkelfj commented 3 years ago

Here is the padding before the header: https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/src/runtime/builder.c#L769

mikkelfj commented 3 years ago

Thoughts? And thanks for the help! If I can get something working I'm happy to write it up in the docs so that others coming along can use it too :)

Documentation is always welcome. Buf if you dig sufficiently into this, you might consider implementing support for vector force align. I think it might be relatively easy, albeit hard to grasp.

https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/src/compiler/symbols.h#L223-L228

Here struct sets the align field https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/src/compiler/semantics.c#L668-L675

In code generation, the align field might be ignored where not supported, i.e. non structs, but it could also already be supported, since e.g. integers have different alignment requirements already, so merely setting the align field could possibly work.

Complications: The generated verifier must also support this, and the JSON parser must also, but again, if the align field is already inspected, it might happen automatically.

On the other hand, if structs are not aligned correctly in vectors right now, there are bigger problems.

Also important to never reduce alignment below natural alignment. I think this has something to do with it: https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/src/compiler/semantics.c#L112

benvanik commented 3 years ago

Thank you for explaining (I really appreciate it!) - but I'm realizing that I'm still missing something :)

I'm using flatcc_builder_copy_buffer to get the final builder bytes and fwrite-ing them to a file which then verifies/loads correctly with flatcc so I'm (fairly) sure it's ok. I'm looking at the bytes in the file as written to disk in a hex editor and definitely seeing that they are not 16 byte aligned as I had specified (the 'xxxx's being the data as in the example above): There's dozens of these vectors in the file and all are exactly +8, which is great as it means they are all consistently aligned whereas before using the vector_start with align=16 they were all over the place, but also means I don't understand what you mean about the file header being included in the alignment. Is it possible there's a bug in flatcc's handling of this alignment, or something I need to set to ensure it is included?

I have been hacking around in my own code but if this seems like it should be working then I'll build out a reproducer based on samples/bugreport/ to help isolate whether it's my code producing the buffers or not.

mikkelfj commented 3 years ago

I wrote the following just before receiving your last comment:

I was wondering: there could be a bug in the builder api where it does align vectors correctly but it fails to record the alignment size in the buffer state such that the prepadding works on wrong assumptions. This would also affect 64-bit integers because their alignment are larger than what is minimally required, so it ought to have been detected by now if that were the case.

mikkelfj commented 3 years ago

I don't understand what you mean about the file header being included in the alignment

This means that if a something is aligned to, say, 16 bytes, then that something is placed at multiple of 16 bytes relative to the start of buffer where the header is placed. At least it is supposed to.

benvanik commented 3 years ago

Ok cool - sounds like a reproducer is worth it! I'll put one together now.

mikkelfj commented 3 years ago

https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/include/flatcc/flatcc_builder.h#L407-L408

This min_align should be updated whenever adding something to the builder.

You could try adding a function flatcc_builder_set_min_align(B, align); and see if that helps. It just needs to set min_align to max of existing min_align and the new argument. Call it after starting the buffer and before ending it. See if that helps.

mikkelfj commented 3 years ago

It is called for create_vector: https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/src/runtime/builder.c#L1380-L1381

But it appears to be missing from start_vector: https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/src/runtime/builder.c#L1026

Does it help to add it there?

benvanik commented 3 years ago

Reproducer here: https://github.com/benvanik/flatcc/commit/017023dfa53a377272b8ea8cf1eaa9bb6f33d0d1 If I run this I end up with an output.bin file with the vectors aligned to +8:

I tried adding the set_min_align(B, align); but it didn't change the output.

mikkelfj commented 3 years ago

I tried adding the set_min_align(B, align); but it didn't change the output.

Exit frame calls set_min_align when a stack object pops, so end_vector should pop the frame and effectively call set_min_align at that point, provided the frame context has the proper local aligment stored.

https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/src/runtime/builder.c#L623

mikkelfj commented 3 years ago

The fact the vectors are correctly aligned to something, here +8, is not surprising, but it confirms the suspicion that the padding between the header and data is incorrect sometimes. The min_align field should control this. It would be helpful with traces of the min_align value.

benvanik commented 3 years ago

I added some prints of it:

pre start min_align: 0
post start min_align: 0
pre end min_align: 0
post end min_align: 16
pre start min_align: 16
post start min_align: 16
pre end min_align: 16
post end min_align: 16
pre start min_align: 16
post start min_align: 16
pre end min_align: 16
post end min_align: 16
pre start min_align: 16
post start min_align: 16
pre end min_align: 16
post end min_align: 16
pre start min_align: 16
post start min_align: 16
pre end min_align: 16
post end min_align: 16
pre root end min_align: 1
post root end min_align: 16

It looks like it is staying as 16 after initially set in the first vector, but 1 right before I call the Eclectic_FooBar_end_as_root - perhaps that's the issue?

mikkelfj commented 3 years ago

See my comments to the repro. That example is not really working as intended, I presume.

mikkelfj commented 3 years ago

It looks like it is staying as 16 after initially set in the first vector, but 1 right before I call the Eclectic_FooBar_end_as_root - perhaps that's the issue?

Are you saying it suddently drops down to 1? That seems very odd. Can you track down where that happens?

benvanik commented 3 years ago

(sorry didn't see comment on repro)

Yes - and if I set it back to 16 before ending the root then I get the correctly aligned output:

    B->min_align = 16;
    Eclectic_FooBar_end_as_root(B);

The verifier does succeed:

running build/myissue      
pre start min_align: 0     
post start min_align: 0    
pre end min_align: 0       
post end min_align: 16     
pre start min_align: 16    
post start min_align: 16   
pre end min_align: 16      
post end min_align: 16     
pre start min_align: 16    
post start min_align: 16   
pre end min_align: 16      
post end min_align: 16     
pre start min_align: 16    
post start min_align: 16   
pre end min_align: 16      
post end min_align: 16     
pre start min_align: 16    
post start min_align: 16   
pre end min_align: 16      
post end min_align: 16     
pre root end min_align: 1  
post root end min_align: 16
success

Regarding the comment on the repo: I would have expected the vector contents to be aligned to the alignment (the xxxx's). What does the align on start_vector do if not align the contents? (according to something I read - which I'll try to find - the alignment specifies the start of the vector contents but not the size prefix, which will remain 4 byte aligned)

mikkelfj commented 3 years ago

My comment was misguided - I somehow thought you created a high level example using standard constructs, but you were using the low level features we discussed all along - brain fart. My comments are still relevant if you skip the low level details and want to test struct alignment.

I think I found the issue:

https://github.com/dvidelabs/flatcc/blob/e327f2408ebf2f2770d2a4d71512a3a1cd79c0de/src/runtime/builder.c#L823

You cannot create the vector before you start the buffer. Usually it is harmless, especially outside of nested buffers, but this is an example of where it does not work.

mikkelfj commented 3 years ago

Also note:

MyTable_start_as_root is really a combined start_buffer and start_table call. In principle you must start the buffer, then create the vector, then start the root table. That is how it works in other languages. But in flatcc you are allowed to create objects while parent objects are open. But you are not allowed to create objects before the buffer is initialized.

EDIT: fixed misleading text: changed "create the table" to "create the vector".

mikkelfj commented 3 years ago

You might want to read here:

https://github.com/dvidelabs/flatcc/blob/master/doc/builder.md#buffers

benvanik commented 3 years ago

Ahhhh that did it!

I am now calling flatcc_builder_start_buffer(B, Eclectic_Buffer_file_identifier, 16, 0); immediately after init'ing the builder, recording my vectors, and then using Eclectic_FooBar_start instead of Eclectic_FooBar_start_as_root. This produces results that both verify and have the correct alignment in the data!

I had seen the buffer calls but must have just skipped over them as the _as_root was working (well, not as I wanted, but I didn't think to look there!). Very useful to know that the implicit start_as_root behavior may mask issues like this. I'll try these changes in my application and verify it works there too.

mikkelfj commented 3 years ago

I actually wonder if the following example under Tables is correct:

Monster_start(B);
Monster_Hp_add(B, 80);
...
flatcc_builder_buffer_create(B, Monster_end(B));

mikkelfj commented 3 years ago

It shouldn't be necessary to use the alignment of 16 in your start buffer call, provided that ordering is correct. 0 should work just as well.

benvanik commented 3 years ago

You're right - I may have a use for block_align to pad the buffer out but it's not needed here. For anyone who googles this in the future and lands here this is the code to align a vector:

    flatcc_builder_init(B);
    // Start the root table before doing anything else:
    Eclectic_FooBar_start_as_root(B);

    // Specify an alignment for the vector contents:
    flatcc_builder_start_vector(B, 1, /*align=*/16, FLATBUFFERS_COUNT_MAX(1));
    uint8_t* p = flatcc_builder_extend_vector(B, 12345);
    memset(p, 'x', 12345);
    flatcc_builder_ref_t data_vec = flatcc_builder_end_vector(B);

    // Finish building the root table:
    Eclectic_FooBar_data_add(B, data_vec);
    Eclectic_FooBar_end_as_root(B);

mikkelfj commented 3 years ago

Note that you can use start_as_root and then create you vectors. It is slightly less efficient because the root table is parked on the stack while the vectors are created, but nothing significant, and it can be much more convenient since you can add the vectors to parent one at time, something you cannot do in C++ or other languages.

mikkelfj commented 3 years ago

On your example: while it probably is correct, a flatcc principle is to always pair start and end calls with _as_root or not. To be strictly correct you should call root = Eclectic_FooBar_end(B), then end_buffer(B, root), or use start_as_root before the vectors as per my previous comment.

benvanik commented 3 years ago

Totally correct - updated (and tested to verify my sanity :)

mikkelfj commented 3 years ago

Also, the example requires use of little endian because the endian conversion is not automatic here. As long as the data is 8 bit units it doesn't matter. But given the nature of the problem, there are probably higher level structures stored in that vector, that isn't platform portable anyway.

benvanik commented 3 years ago

Correct - in my particular case it's an entire file of a format that has its own specifications (WAV, PNG, etc) but you are right that if it was something like a list of floats then it would need special handling.

mikkelfj commented 3 years ago

I still wonder why it didn' work with vectors of structs?

benvanik commented 3 years ago

Now that I have this reproducer set up I'll give that a try too - it's possible you'll be able to spot what I was doing wrong :)

benvanik commented 3 years ago

I think struct alignment is working perfectly (assuming this is what you intended):

struct Aligned128 (force_align: 16) {
  a:uint64;
  b:uint64;
}
table Buffer {
  data:[Aligned128];
}

      #define iree_align(value, alignment) \
        (((value) + (alignment)-1) / (alignment)) * (alignment)
      Eclectic_Buffer_start(B);
      Eclectic_Buffer_data_start(B);
      uint8_t* p = (uint8_t*)Eclectic_Buffer_data_extend(B, iree_align(sizes[i], 16) / 16);
      memset(p, 'x', sizes[i]);
      Eclectic_Buffer_data_end(B);
      buffer_refs[i] = Eclectic_Buffer_end(B);

This produces 16-byte aligned struct vectors in the output file:

I believe I was running into the same buffer start-related issue before with this approach as this only works with the start_buffer (or start_as_root) prior to the vector starts. So that was my main mistake, thankfully!

Now that I have the flexibility of doing the alignment based on runtime parameters via the flatcc_builder_start_vector I'll avoid specifying it in the file - since these are all byte buffers there's not much value in the generated named versions for my particular use case though there may be others who would want the convenience of force_align on vectors for other primitive types.

Thanks again for all the help! I leave it in your hands as to whether you leave this issue as the feature request for force_align so others see it or close it - I can recreate one without all the noise focused only on that feature :)

mikkelfj commented 3 years ago

You are welcome, and thanks for the update. It is a feature that should be added, so I'll leave it open.

dvidelabs / flatcc

`force_align` on vectors missing (or: how to align vectors in flatcc?) #179