felixguendling / cista

Cista is a simple, high-performance, zero-copy C++ serialization & reflection library.
https://cista.rocks
MIT License

Memory usage using cista::variant #121

Open cflaviu opened 2 years ago

cflaviu commented 2 years ago

I'm trying to use cista for a protocol whose messages (structures) are between 8 and 88 bytes in size. I used cista::offset::vector<cista::variant<msg1, msg2, ...>>.

For a vector containing 10 messages of 8 bytes each, how much memory would be used? I guess the serialized buffer would have approximately the same size.

felixguendling commented 2 years ago

Generally, a type T serialized with cista always takes up sizeof(T) bytes. For a cista::variant<T1, T2, ..., TN>, sizeof(cista::variant) is max(sizeof(T1), sizeof(T2), ..., sizeof(TN)) plus one byte that records which of the types T1, T2, ... is actually stored. Due to alignment, this byte can actually occupy alignof(cista::variant<T1, T2, ..., TN>) bytes, which equals the alignment of the largest scalar type used in any of T1, T2, ..., TN. Thus, it depends. But assuming your largest scalar type is 64 bits and your largest message is 88 bytes, sizeof(cista::variant) would always be 96 bytes, regardless of the type it actually stores.
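
The size rule above can be checked with a hand-rolled tagged union. This is a minimal sketch, not cista's implementation: small_msg, large_msg, tagged_union and payload_bytes are hypothetical names, and the numbers assume a typical 64-bit platform where std::uint64_t is 8-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the protocol messages: the smallest is 8 bytes,
// the largest is 88 bytes, and both contain a 64-bit scalar.
struct small_msg { std::uint64_t value; };
struct large_msg { std::uint64_t values[11]; };

// Hand-rolled tagged union with the layout described above (not cista's code).
struct tagged_union {
  union {
    small_msg small;
    large_msg large;
  } payload;           // max(sizeof(small_msg), sizeof(large_msg)) = 88 bytes
  std::uint8_t index;  // 1 byte: which alternative is stored
};                     // followed by 7 bytes of tail padding

static_assert(sizeof(small_msg) == 8, "smallest message is 8 bytes");
static_assert(sizeof(large_msg) == 88, "largest message is 88 bytes");
static_assert(alignof(tagged_union) == 8, "alignment follows the 64-bit member");
static_assert(sizeof(tagged_union) == 96, "88 payload + 1 index + 7 padding");

// Bytes needed to store n elements by value, e.g. in a vector's buffer.
constexpr std::size_t payload_bytes(std::size_t n) {
  return n * sizeof(tagged_union);
}
```

Under these assumptions, ten 8-byte messages stored by value occupy payload_bytes(10), i.e. 960 bytes, not counting the vector's own header.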

Note that this would also be the case if you used a raw C union and memcpy'd it to the destination buffer.

I think that if you want to transmit a large number of messages in one chunk of data, you could try data-oriented programming, storing something like this:

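// "data" is assumed to be an alias such as: namespace data = cista::offset;
struct s {
  data::vector<msg1> msg1_;  // all messages of type msg1, stored contiguously
  data::vector<msg2> msg2_;  // all messages of type msg2
  // ...
};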

If you want even more performance, the only way would be a custom protocol that sends one byte indicating the message type, then the message size, followed by the message itself.
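
A custom protocol of this kind could be sketched as follows. This is only an illustration under stated assumptions: the frame layout (1 byte type, 1 byte size, then the payload) and the names append_frame, frame_view and next_frame are hypothetical, and none of this is part of cista.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical frame layout: [1 byte type][1 byte size][size bytes payload].
// A 1-byte size field is enough here because messages are at most 88 bytes.
inline void append_frame(std::vector<std::uint8_t>& out, std::uint8_t type,
                         void const* msg, std::size_t size) {
  assert(size <= 255);  // must fit into the 1-byte size field
  out.push_back(type);
  out.push_back(static_cast<std::uint8_t>(size));
  auto const* bytes = static_cast<std::uint8_t const*>(msg);
  out.insert(out.end(), bytes, bytes + size);
}

struct frame_view {
  std::uint8_t type;
  std::uint8_t size;
  std::uint8_t const* payload;  // points into the buffer, may be unaligned
};

// Reads the frame starting at `pos` and advances `pos` past it.
// Returns false if the buffer is exhausted or the frame is truncated.
inline bool next_frame(std::vector<std::uint8_t> const& in, std::size_t& pos,
                       frame_view& out) {
  if (pos > in.size() || in.size() - pos < 2) return false;
  std::uint8_t const size = in[pos + 1];
  if (in.size() - pos - 2 < size) return false;
  out.type = in[pos];
  out.size = size;
  out.payload = in.data() + pos + 2;
  pos += 2u + size;
  return true;
}
```

Because the payload inside such a stream is generally unaligned, a reader would std::memcpy it into a properly typed object before use. That copy is the deserialization step that a zero-copy format avoids.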

Another way would be to make a cista::variant<cista::unique_ptr<msg1>, cista::unique_ptr<msg2>, ...>, which would make the variant 16 bytes (8 bytes pointer size + 1 byte indicating the msg type + 7 bytes padding due to alignment). This still has a lot of overhead compared to the 8-byte msg + 1-byte type indicator, but it's much more efficient than the 96 bytes per message with variant<msg1, msg2, ...>.
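
For the original question (10 messages of 8 bytes each), the three options can be compared with a little arithmetic. The per-element sizes below are the figures quoted in this thread, not measurements. The calculation also assumes each pointed-to message costs exactly 8 bytes of extra storage, and it ignores the vector header and any allocator overhead. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kCount = 10;           // messages in the vector
constexpr std::size_t kMsgSize = 8;          // bytes per message in the example
constexpr std::size_t kVariantSize = 96;     // by-value variant (figure from the thread)
constexpr std::size_t kPtrVariantSize = 16;  // variant of unique_ptr (figure from the thread)

// variant<msg1, msg2, ...>: every element takes the full variant size.
constexpr std::size_t by_value_bytes() { return kCount * kVariantSize; }

// variant<unique_ptr<msg1>, ...>: the variant plus the pointed-to message.
constexpr std::size_t by_pointer_bytes() {
  return kCount * (kPtrVariantSize + kMsgSize);
}

// Custom stream: 1 byte type indicator followed by the message.
constexpr std::size_t tagged_stream_bytes() { return kCount * (1 + kMsgSize); }

static_assert(by_value_bytes() == 960, "10 * 96");
static_assert(by_pointer_bytes() == 240, "10 * (16 + 8)");
static_assert(tagged_stream_bytes() == 90, "10 * (1 + 8)");
```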

felixguendling commented 2 years ago

Can you propose how a cista::space_optimized_variant should work? If it's stored as a value, what should sizeof(cista::space_optimized_variant) be (this is a compile-time constant which cannot depend on the type stored in it at runtime), and how would you store the value?

cflaviu commented 2 years ago

Would a custom serializer/deserializer be a solution? About OOP remark, different messages have to be ordered.

felixguendling commented 2 years ago

Cista uses the same format for serialization as it uses for in-memory use at program runtime. All Cista data structures can be used as a replacement for their std:: counterparts (i.e. cista::variant vs. std::variant, etc.). The property that the data layout does not change between serialization/deserialization and program runtime has the advantage that you can just mmap large Cista data-sets into memory and start using them without the need for deserialization. However, this comes with the disadvantage that all data structures need to have the same size in memory at runtime and when serialized: it's always sizeof(T).

I'm not sure whether a custom serializer would work; possibly. I'm pretty sure this would lead to the data being const after deserialization, because you cannot simply store another (especially bigger) value in the variant. It would also make real deserialization necessary, which is currently not the case (in offset mode you can just reinterpret_cast<T const*>(mmap.data()) and use the data without any deserialization step).

Maybe you can experiment and post your findings here? I would be curious how this could work.

felixguendling commented 2 years ago

> About OOP remark, different messages have to be ordered.

What I mean is not Object Oriented Programming (OOP) but Data Oriented Programming.

Maybe the order could be indicated by an additional field per message.
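
A minimal sketch of this idea: each message carries a sequence number, and the original order is recovered by merging the per-type vectors. std::vector stands in for the Cista vector here, and the message layouts and the names seq, batch, ref and in_order are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical message types; `seq` is the additional per-message field.
struct msg1 { std::uint32_t seq; std::uint32_t value; };
struct msg2 { std::uint32_t seq; float value; };

// One vector per message type (std::vector stands in for a Cista vector).
struct batch {
  std::vector<msg1> msg1_;
  std::vector<msg2> msg2_;
};

// Identifies one message: which vector it lives in and at which position.
struct ref {
  std::uint8_t type;  // 1 = msg1, 2 = msg2
  std::size_t index;
};

// Merges both vectors back into send order. Assumes each vector is already
// sorted by `seq`, which holds if messages are appended in send order.
inline std::vector<ref> in_order(batch const& b) {
  std::vector<ref> out;
  out.reserve(b.msg1_.size() + b.msg2_.size());
  std::size_t i = 0, j = 0;
  while (i < b.msg1_.size() || j < b.msg2_.size()) {
    bool const take1 =
        j == b.msg2_.size() ||
        (i < b.msg1_.size() && b.msg1_[i].seq < b.msg2_[j].seq);
    if (take1) {
      out.push_back(ref{1, i++});
    } else {
      out.push_back(ref{2, j++});
    }
  }
  return out;
}
```

The sequence field costs a few bytes per message, but each vector then stores elements of exactly sizeof(msgN) rather than the size of the largest alternative.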

I think the zero-copy property of Cista makes it impossible for the serialized message size to differ from sizeof(T). And in both cases, C unions and cista::variant, sizeof(T) is rather big (88 bytes or more).

What you really need is probably a format [(type, msg), (type, msg), ...] if you want to store/transmit the messages in the most memory-efficient way. But I cannot think of a way Cista could help with this, because this would by definition not be zero-copy, and Cista is by definition zero-copy.

felixguendling commented 2 years ago

Another hint: you could maybe implement this for your variant:

https://github.com/felixguendling/cista/blob/master/include/cista/serialized_size.h

I added this at the request of another issue. So it looks like someone is doing stuff like this. I just don't know how 🤣

cflaviu commented 2 years ago

Thanks for the support. As you mention, the simplest way is to serialize the message type as a byte, then the message, and so on.

serialized_size.h looks interesting.

I was thinking about some space optimization involving variants; it's just an idea :) That cista::space_optimized_variant would work only with a specialized vector. I don't know the implementations of the std and Cista variants, but the variant could keep only a byte for the index/type and an offset referring to the value. The specialized vector would have two buffers: one keeping the variant instances, which all have the same size known at compile time, and another storing the values of the variant objects, which have different sizes. The offset in the variant would point to the value stored in the value buffer, with padding added where necessary. A kind of variant vector.
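
The two-buffer idea could look roughly like the sketch below. It is an illustration only: variant_vector, entry and their members are hypothetical names, not cista API. The type index is passed explicitly to keep the example short, and values are copied out with std::memcpy rather than accessed in place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical sketch of the two-buffer "variant vector"; not part of cista.
class variant_vector {
public:
  // Appends a value of alternative `type`, padding the value buffer so that
  // the value starts at an offset that is a multiple of alignof(T).
  template <typename T>
  void push_back(std::uint8_t type, T const& value) {
    static_assert(std::is_trivially_copyable<T>::value, "raw bytes are copied");
    std::size_t const offset =
        (values_.size() + alignof(T) - 1) / alignof(T) * alignof(T);
    values_.resize(offset + sizeof(T));  // padding bytes are zero-filled
    std::memcpy(values_.data() + offset, &value, sizeof(T));
    entries_.push_back(entry{static_cast<std::uint32_t>(offset), type});
  }

  std::size_t size() const { return entries_.size(); }
  std::uint8_t type(std::size_t i) const { return entries_[i].type; }

  // Copies element i out; the caller must pass the T that matches type(i).
  template <typename T>
  T get(std::size_t i) const {
    T out;
    std::memcpy(&out, values_.data() + entries_[i].offset, sizeof(T));
    return out;
  }

  // Bytes used by the value buffer, including alignment padding.
  std::size_t value_bytes() const { return values_.size(); }

private:
  struct entry {
    std::uint32_t offset;  // position of the value inside values_
    std::uint8_t type;     // index of the stored alternative
  };
  std::vector<entry> entries_;        // fixed-size records, one per element
  std::vector<std::uint8_t> values_;  // variable-size values, padded
};
```

Each element costs one fixed-size entry plus its own value size and padding, instead of the size of the largest alternative.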

cflaviu commented 2 years ago

BTW, I would rather use a binary API based on Cista than an API based on JSON.

felixguendling commented 2 years ago

> cista::space_optimized_variant

I think this could work. But implementing it will take some time.

However, due to alignment requirements in the data vector keeping the variants, there would still be some waste compared to what you could do manually. For example, if you had a variant_vector<uint64_t, uint8_t>, the 64-bit int always has to be aligned to 64 bits (i.e. this could happen: 1 byte uint8_t, 7 bytes padding, 8 bytes uint64_t, 1 byte uint8_t, 7 bytes padding, 8 bytes uint64_t). Unaligned access would probably still work on x86, but according to the C++ standard it is undefined behaviour, and there are some platforms (ARM?) that have stricter alignment requirements. I want Cista to be free of UB where possible.
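
The padding in this example can be worked out with a small alignment helper. This is a sketch with hypothetical names (align_up, end_of), assuming uint64_t requires 8-byte alignment and uint8_t requires none.

```cpp
#include <cassert>
#include <cstddef>

// Rounds `offset` up to the next multiple of `alignment` (a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// End offset of a value of `size` bytes placed at the first suitably aligned
// position at or after `offset`.
constexpr std::size_t end_of(std::size_t offset, std::size_t size,
                             std::size_t alignment) {
  return align_up(offset, alignment) + size;
}

// Layout of the sequence uint8_t, uint64_t, uint8_t, uint64_t.
constexpr std::size_t after_a = end_of(0, 1, 1);        // uint8_t at offset 0
constexpr std::size_t after_b = end_of(after_a, 8, 8);  // 7 bytes padding first
constexpr std::size_t after_c = end_of(after_b, 1, 1);  // uint8_t at offset 16
constexpr std::size_t after_d = end_of(after_c, 8, 8);  // 7 bytes padding first
constexpr std::size_t packed = 1 + 8 + 1 + 8;           // size without padding

static_assert(after_b == 16, "1 + 7 padding + 8");
static_assert(after_d == 32, "two rounds of 1 + 7 padding + 8");
static_assert(after_d - packed == 14, "14 bytes are padding");
```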

> BTW, I would rather use a binary API based on Cista than an API based on JSON.

I don't exactly understand what you mean by this. Can you give an example (input -> what it does -> output)?

At the moment, it's not possible in C++ to get the names of fields in structs, so there cannot be an automatic conversion from a C++ struct to JSON without additional info. With C++20, it's possible to have something like field<cista::string, "fieldname">.

cflaviu commented 2 years ago

For 1., indeed there is a trade-off between the flexibility of variants and compactness. If the variant objects have more than 8 bytes, I think the space efficiency will be decent. Maybe an option would be to group variant objects by their size at construction time when an initializer list is used in order to improve space efficiency.

For 2., I'm trying to create a flexible and minimalist binary RPC protocol, in contrast with the many JSON-based web APIs, which are IMO not time and space efficient compared to Cista. Initially I wanted to use Cista to serialize/deserialize both the RPC primitives (call, call result, notification, etc.) and the data transferred (function parameters and function results). The best way now seems to be to implement custom serialization for the RPC primitives and to use Cista only for the data transferred.

cflaviu commented 2 years ago

For performance reasons, I would not send meta information with every RPC transaction, as is done in JSON. A set of discovery functions could provide meta information about API.

felixguendling commented 2 years ago

> group variant objects by their size at construction time when an initializer list is used in order to improve space efficiency

Sounds like a great idea. I think the variant_vector could also be useful for other use cases.

> A set of discovery functions could provide meta information about API

For this, the information about the field names needs to be available, which it currently is not. So a wrapper like field<"field_name", cista::offset::string> would still be required. Otherwise, macros could be used.