frc971 / 971-Robot-Code

Apache License 2.0
64 stars 8 forks source link

Better C++ flatbuffers API #8

Open bsilver8192 opened 2 years ago

bsilver8192 commented 2 years ago

The current flatbuffers C++ API requires building up messages inside-out. It should be possible to use the C++ stack for this instead, to allow the C++ code to build up messages from the outside in (which is often more natural).

More concretely, this means generating C++ classes for each flatbuffer struct which hold values for each field (and track which ones are set). Nested structs should be contained in their parent objects, because a major use case is calling functions which return a nested object, and there's no other easy place to store these objects. Then, these objects can do a depth-first traversal of the C++ object graph to actually write the flatbuffer (aka each object writes out its children, tracking the resulting offsets in local variables, then writes out itself and returns the offset to its parent).

I think writing the buffer should be done in a offset_t Write method (or similar), which is passed the FlatBufferBuilder. The top-level will normally be called via a templated wrapper type that holds onto the fbb with a destructor which calls Write on the top-level object and then Finish, to keep everything in outside-in order.

An alternative would be holding the fbb in each object, and then having their destructors write it out, but that means more stack space and makes it impossible to decide to skip writing it out later (for example, build up a sub object and then realize it's not actually needed, without taking up any space in the final buffer). Doing it in the destructor also means the parent object has to keep track of where the offsets to all its children are coming from.

Need to think through handling arrays of primitives. These can be big and variable-sized, neither of which interacts well with putting them on the stack. At the same time, allocating them immediately unlike other objects makes for a confusing API. Maybe provide APIs for both?

Handling arrays of objects is tricky. ArrayWriter<T> StartArray(int max_size) would work well for many cases, with convenience void CreateArray(span<const T>) when the temporary storage is managed externally. However, there's no place to stash C++ pointers to the intermediate objects. The flatbuffers array needs to be placed after those objects in the buffer, and C++ pointers are larger than offsets on 64-bit platforms. Forcing the user to allocate that array externally goes against making this API nice and easy to use, but that's the best I can think of right now. void CreateArray(span<const T*>), with an extra level of indirection, could be handy but also looks like a big foot-gun with dangling references.

Do we need to manage shared subobjects? Writing them out redundantly is easy, but not helpful for space efficiency. Maybe use a bit in the bitmask to track whether it's been written out, and make a union for all the variable storage which gets overwritten to the offset it was written to?

This will increase stack usage, which may be undesirable. It's probably worth using a bitmask in the generated code to track which fields are set, rather than using std::optional or a separate bool for each one.

Copying sub-objects could become expensive. It should be possible to structure this so that RVO (return value optimization) constructs the sub-objects in place for the common case of a function returning an entire sub-object.

These classes end up looking similar to the existing TableT, but without storing data in the objects. Do we want to expose reading the fields (and checking if they're set) and/or building them from a const Table*?