rttrorg / rttr

C++ Reflection Library
https://www.rttr.org
MIT License

Fast binary serialization of objects and properties #200

Open mrduda opened 5 years ago

mrduda commented 5 years ago

Hello, I am trying to write an efficient serialization algorithm that uses the metadata and accessors provided by RTTR, so that the (de)serialization code knows nothing about the objects being serialized, can quickly iterate over an object's structure (properties of primitive types, enums, containers, nested objects), and can put all of this into a binary stream along with simple metadata such as class and property names. For the binary format and low-level I/O I'm using bitsery, but it could be pretty much any other library; the choice only affects the binary format and the overhead of data retrieval.

The writer is done in a straightforward way: a single recursive function (toBitsery) takes an rttr::variant as input and walks it, writing all properties of primitive types (PODs, strings) as well as complex types (containers, nested objects) in the order they are exposed to RTTR. To make deserialization possible afterwards, header data is written at the beginning of the same stream.
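A minimal sketch of this header-then-values layout, assuming a hypothetical `Node` tree in place of rttr::variant and a plain byte vector in place of bitsery. None of these names come from the attached code; they only illustrate the recursive walk in exposure order.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reflected value; real code would walk rttr::variant.
struct Node {
    enum class Kind : std::uint8_t { Int32, String, Object };
    Kind kind = Kind::Object;
    std::string name;            // property name (class name for the root)
    std::int32_t number = 0;     // payload when kind == Int32
    std::string text;            // payload when kind == String
    std::vector<Node> children;  // properties in exposure order when kind == Object
};

Node make_int(std::string name, std::int32_t v) {
    Node n; n.kind = Node::Kind::Int32; n.name = std::move(name); n.number = v; return n;
}
Node make_string(std::string name, std::string v) {
    Node n; n.kind = Node::Kind::String; n.name = std::move(name); n.text = std::move(v); return n;
}
Node make_object(std::string name, std::vector<Node> props) {
    Node n; n.kind = Node::Kind::Object; n.name = std::move(name); n.children = std::move(props); return n;
}

using Buffer = std::vector<std::uint8_t>;

// Append a 32-bit value in little-endian byte order.
void put_u32(Buffer& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Append a length-prefixed string.
void put_string(Buffer& out, const std::string& s) {
    assert(s.size() <= 0xFFFFFFFFu);
    put_u32(out, static_cast<std::uint32_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
}

// Header pass: kind and name of every property, depth first.
void write_header(Buffer& out, const Node& n) {
    out.push_back(static_cast<std::uint8_t>(n.kind));
    put_string(out, n.name);
    if (n.kind == Node::Kind::Object) {
        put_u32(out, static_cast<std::uint32_t>(n.children.size()));
        for (const Node& c : n.children) write_header(out, c);
    }
}

// Data pass: values only, in the same order as the header.
void write_values(Buffer& out, const Node& n) {
    switch (n.kind) {
    case Node::Kind::Int32:  put_u32(out, static_cast<std::uint32_t>(n.number)); break;
    case Node::Kind::String: put_string(out, n.text); break;
    case Node::Kind::Object: for (const Node& c : n.children) write_values(out, c); break;
    }
}

Buffer to_bytes(const Node& root) {
    Buffer out;
    write_header(out, root);
    write_values(out, root);
    return out;
}
```

Keeping names in the header and only raw values in the data section means the per-object cost of metadata is paid once per stream, not once per value.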

The reader is a bit more complex because there can be inconsistencies of different kinds, such as a missing or renamed property or a type mismatch, so fromBitsery has more work to do than the writer, but the logic is similar: read the header, then process (deserialize) the data from the stream and try to write the values into properties. The algorithm is straightforward: iterate over the named properties, analyze their types, pick the right function to read from the stream, and assign the values to the properties. Here's what I have achieved; the code has no dependencies other than the RTTR and bitsery libraries: rttr_binary_serialization.zip
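The name and type matching step might look roughly like the sketch below. `FieldDesc`, `type_tag`, and `match_fields` are hypothetical names, not taken from the attached archive. The sketch assumes the stream header and the target class each yield a list of (name, type) pairs.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical header entry: a property name plus a type tag, either read from
// the stream header or reported by the reflection layer for the target class.
struct FieldDesc {
    std::string name;
    std::uint8_t type_tag;
};

// For each field in the stream header, return the index of the matching target
// property, or -1 when the property is missing, renamed, or has another type.
// A reader would skip the stream data of every field mapped to -1.
std::vector<int> match_fields(const std::vector<FieldDesc>& stream,
                              const std::vector<FieldDesc>& target) {
    assert(target.size() <= static_cast<std::size_t>(INT_MAX));
    std::vector<int> mapping(stream.size(), -1);
    for (std::size_t i = 0; i < stream.size(); ++i) {
        for (std::size_t j = 0; j < target.size(); ++j) {
            if (stream[i].name == target[j].name &&
                stream[i].type_tag == target[j].type_tag) {
                mapping[i] = static_cast<int>(j);
                break;
            }
        }
    }
    return mapping;
}
```

Building this mapping once per class while reading the header keeps string comparisons out of the per-object loop.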

And here comes the most important part: performance. To find the possible bottlenecks, I created a very large dataset: 1M objects of between 4 and 300 bytes each, consisting of POD and dynamically allocated types (strings, vectors, maps, etc.), all exposed to RTTR by value. Some objects are put into containers of other object types, and some are wrapped in rttr::variant and stored in fields of other classes. The profiler showed the two most obvious bottlenecks: rttr::property::set_value, and rttr::variant where a variant's value is copied from one place to another. Since most of the data is exposed to RTTR as class members by value, I tried accessing the data directly via a pointer, which required a little trick in two places in the RTTR core library. In property.h/.cpp, public methods:

    int get_value_offset() const;
    void *get_object_pointer(instance &object) const;
    // returns byte offset from the object pointer to the class field, or -1 if not possible
    int property::get_value_offset() const
    {
        return m_wrapper->get_value_offset();
    }
    // returns void* pointer under rttr::instance or nullptr if not possible
    void *property::get_object_pointer(instance &object) const
    {
        return m_wrapper->get_object_pointer(object);
    }

and the following implementation in detail/property/property_wrapper_base.h and property_wrapper_member_object.h:

        virtual int get_value_offset() const { return -1; }
        virtual void *get_object_pointer(instance& object) const { return nullptr; }

    // ... property_wrapper_member_object.h

    class property_wrapper // ... in all possible specializations for property_wrapper class
    {
            int get_value_offset() const override
            {
                // create a fake object instance and get the offset to actual field
                int ret = -1;
                C *ptr = reinterpret_cast<C*>(malloc(sizeof(C)));
                ret = static_cast<int>(reinterpret_cast<uint8_t*>(&(ptr->*m_acc)) - reinterpret_cast<uint8_t*>(ptr));
                free(ptr);
                return ret;
            }

            void *get_object_pointer(instance& object) const override
            {
                // returns void* pointer under rttr::instance
                return reinterpret_cast<void*>(object.try_convert<C>());
            }

The speedup comes from the fact that the object pointer is retrieved once per object that contains properties, and an individual property is accessed with a simple pointer cast, by offsetting the object pointer by a number of bytes (calculated only once per property, while reading metadata from the binary stream). Writing a typed value to the property is as fast as it can be: bitsery reads directly into the memory where the target value is located, with no extra steps such as wrapping the value in a variant, checking whether a value conversion is possible, or locating the right accessor.
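As a rough illustration of the access pattern (not the code from the archive), the sketch below writes a value straight into a field given only the object pointer and a byte offset. `Sample`, `write_field`, and `read_field` are hypothetical names, and the sketch is limited to trivially copyable field types. Strings and containers would need a typed cast instead of `memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical standard-layout class standing in for a reflected type.
struct Sample {
    std::int32_t id;
    double weight;
};

// Copy sizeof(T) bytes from src directly into the field located `offset`
// bytes past the start of the object, bypassing any variant wrapping.
template <typename T>
void write_field(void* object, std::size_t offset, const void* src) {
    static_assert(std::is_trivially_copyable<T>::value, "sketch covers POD-like fields only");
    assert(object != nullptr);
    std::memcpy(static_cast<unsigned char*>(object) + offset, src, sizeof(T));
}

// Read a field back the same way.
template <typename T>
T read_field(const void* object, std::size_t offset) {
    static_assert(std::is_trivially_copyable<T>::value, "sketch covers POD-like fields only");
    assert(object != nullptr);
    T value;
    std::memcpy(&value, static_cast<const unsigned char*>(object) + offset, sizeof(T));
    return value;
}
```

For example, `write_field<double>(&s, offsetof(Sample, weight), &w)` stores `w` into `s.weight` without going through a setter. In the scheme described above, the offset would come from `get_value_offset()` and the object pointer from `get_object_pointer()`.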

This solution seems to work, but it looks too hacky to me.

Could you please take a look and give any advice on how to efficiently read/write property data in performance-critical applications?

Thank you in advance.

sam-apparance commented 2 years ago

I expect this is what Unreal does internally when serialising UObjects. Having accurate knowledge of the types and where the members are means it can streamline the process a lot, like you are doing, even handling versioning and upgrades gracefully. I am currently looking at member address and offset calculations, and your code above looks like just what I need, as there are times when I don't want to assume rttr::variant is not going to be copying a load of state around.

I would point out that you don't need to malloc/free an actual object, as you are never dereferencing the pointer; just use a reinterpreted number as the pointer for the purposes of offset calculation. A member offset is innate compiler knowledge and you're just 'tricking' the compiler into giving you the offset value, so get_value_offset will compile down to a constant value. Note that compilers do funny things around null pointers, so don't use 0 as your 'pointer'. I have done this using 1 in a large software system that I've been using for years where I rolled my own reflection.
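A sketch of an allocation-free offset calculation along these lines. It is a variant of the suggestion rather than a literal transcription: instead of the address 1 it uses the address of an aligned stack buffer, an assumption made here so the fake pointer stays properly aligned. `member_offset` and `Packet` are hypothetical names. Applying a member pointer to storage that holds no live object is formally outside what the C++ standard guarantees, so treat this as a compiler-dependent trick; for standard-layout types `offsetof` is the portable alternative.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical sample class used to exercise the helper.
struct Packet {
    char tag;
    int count;
    double value;
};

// Byte offset of `member` inside C, computed without heap allocation.
// The buffer is never read or written; only addresses are compared.
template <typename C, typename T>
std::ptrdiff_t member_offset(T C::*member) {
    alignas(C) unsigned char storage[sizeof(C)];
    C* fake = reinterpret_cast<C*>(storage);
    const unsigned char* field =
        reinterpret_cast<const unsigned char*>(std::addressof(fake->*member));
    const std::ptrdiff_t offset = field - storage;
    assert(offset >= 0 && static_cast<std::size_t>(offset) < sizeof(C));
    return offset;
}
```

For `Packet`, `member_offset(&Packet::value)` should agree with `offsetof(Packet, value)` on mainstream compilers.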

These methods would be a good addition to the library, maybe submit a pull request?