google / flatbuffers

FlatBuffers: Memory Efficient Serialization Library
https://flatbuffers.dev/
Apache License 2.0
22.56k stars 3.19k forks source link

Question on using Flatbuffers for high performance research application #8224

Closed drexlerd closed 4 months ago

drexlerd commented 5 months ago

Hi,

I want to build a high performance planning library for my research area. I am currently looking into a general way to optimize the memory layouts of my data types and to reduce the number of memory allocations to almost zero. Flatbuffers looks like the perfect fit for this use case. I hope that some flatbuffers experts can give me advice on whether my idea can work or not. I am in a slightly less general setting where I do not need to store flatbuffers on the disc and I also do not need to transfer the flatbuffers over the network. Hence, the assumption is that I will only use them in the current process of my machine.

In AI planning we explore huge state spaces. We start in an initial state, compute all applicable actions, generate all successor states until eventually finding a goal state. We generate millions of states and actions. A concrete example:

Generating the applicable actions usually requires quite a lot of memory allocations and deallocations because an action contains a bunch of std::vector and Bitset (a dynamic version that are wrappers around std::vector) which will be used to compute the successor state. Here an illustration:

class Action {
    Bitset applicability_positive_precondition_bitset_;
    Bitset applicability_negative_precondition_bitset_;
    Bitset unconditional_positive_effect_bitset_;
    Bitset unconditional_negative_effect_bitset_;
    std::vector<Bitset> conditional_positive_precondition_bitsets_;
    std::vector<Bitset> conditional_negative_precondition_bitsets_;
    std::vector<Bitset> conditional_positive_effect_bitsets_;
    std::vector<Bitset> conditional_negative_effect_bitsets_;
    ...
}

I was thinking to use a flatbuffers vector[uint64] + uint16 to store the number of bits in the Bitset and provide a wrapper that allows me to interpret this data as a Bitset. I would also copy flatbuffers to a segmented vector for persistent storage to ensure that data will never be invalidated.

I also would like to use pointers to other data because it is more convenient than using indices to access other data (Remember the assumption that we are on the same machine, and flatbuffers are stored in persistent memory).

My current guess is that this can lead to significant performance improvements. I already wrote my own flat memory layouts but testing each individually is too stressful in the long run. Testing my Bitset and Pointer wrappers is easy.

The concrete questions are: Do you think this idea can work? Do you see any issues with those wrappers that I am not aware of? Is this a commonly used approach to overcome the limitations of the available data types in flatbuffers.

Thanks for this amazing Library already.