felixguendling / cista

Cista is a simple, high-performance, zero-copy C++ serialization & reflection library.
https://cista.rocks
MIT License
1.78k stars 113 forks source link

`cista::to_tuple` Fails to properly decompose Object With Array Type #141

Closed TheFloatingBrain closed 1 year ago

TheFloatingBrain commented 2 years ago

cista::to_tuple is giving this error

/root/workdir/Include/ThirdParty/cista/cista.h:2005:11: error: 13 names provided for structured binding
 2005 |     auto& [p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13] = t;
      |           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/root/workdir/Include/ThirdParty/cista/cista.h:2005:11: note: while ‘const al’ decomposes into 4 elements

with this struct:

struct al {
  float a;
  alignas(8) char b;
  char bb;
  char arr[10];
};

Note: Struct was taken from here (I am trying to implement offsetof for all members in a struct).

P.s There are other cases that this does not work work with (base structs, structs with #pragma (pack, N) etc. with similar problems, do you want bug reports on those?

felixguendling commented 2 years ago

In C++ it's better to use something like std::array<T, Size> instead of C-arrays. Cista offers cista::array<T, Size> for this purpose if you want deterministic serialization behavior.

It's a known issue that cista requires custom serialization or cista_members member function for structs that contain C-arrays because the aggregate initialization trick allows you to initialize each array entry separately, but it's still only one member variable.

I think if your struct looks like this, it should have the same properties (memory layout and performance wise) but it should work with all cista functionality such as cista::to_tuple:

struct al {
  float a;
  alignas(8) char b;
  char bb;
  cista::array<char, 10> arr;
};

I never tried using alignas in combination with cista. However since cista makes use of offsetof for serialization, it should work as long as offsetof returns the correct address.

TheFloatingBrain commented 2 years ago

In C++ it's better to use something like std::array<T, Size> instead of C-arrays. Cista offers cista::array<T, Size> for this purpose if you want deterministic serialization behavior.

It's a known issue that cista requires custom serialization or cista_members member function for structs that contain C-arrays because the aggregate initialization trick allows you to initialize each array entry separately, but it's still only one member variable.

I think if your struct looks like this, it should have the same properties (memory layout and performance wise) but it should work with all cista functionality such as cista::to_tuple:

Okay, I may have to report back on this one, my use case involves computing the offset of members of a struct (POD'ish) so I can send interleaved vertex data to the GPU in OpenGL. If it does in fact have the same memory layout properties, I can certainly recommend to anyone using the library to use std::array or explicitly serialize their object type (or perhaps specialize for std::array type things, I am not sure I could go to &myStdArray and get the first element, as I cant call .data() GPU side).

I never tried using alignas in combination with cista. However since cista makes use of offsetof for serialization, it should work as long as offsetof returns the correct address.

On a personal note, as someone who spent a few days looking into this problem of how to implement a "cleaner" offsetof and looking at a lot of references, I am intrigued by cista'suse of offsetof/cista_member_offset, looking at a chunk like the one below

  auto const start = origin->empty()
                         ? NULLPTR_OFFSET
                         : c.write(static_cast<T const*>(origin->el_), size,
                                   std::alignment_of_v<T>);

  c.write(pos + cista_member_offset(Type, el_),
          convert_endian<Ctx::MODE>(
              start == NULLPTR_OFFSET
                  ? start
                  : start - cista_member_offset(Type, el_) - pos));

I did read the article on how cista basically works, but it looks like instead of trying to make a function that can take any member and compute offset of (at compile time?), you alias the members into a certain set of names while some how preserving memory layout?

Your work is very impressive by the way :v:

felixguendling commented 2 years ago

Basically, what cista does is to first memcpy the data structure 1:1 into the serialization buffer (file / memory / mmap). This is done for every "entry point" - the root data structure as well as everything that's behind a heap allocated pointer (like vector, unique_ptr). You then always have a pair of pointer and offset: the origin of your data (original data structure you had in-memory and want to serialize) and the offset to the memcpy-ed data in the buffer. After that, cista iterates recursively through each member and does what's needed:

Here, the offsetof(my_struct, member_) is required to compute the corresponding offset pos + offsetof(...) of the member variable in the buffer: pos corresponds to the start address where the struct was serialized, so pos + offsetof(my_struct, member_) yields the offset of the member in the serialization buffer which is required for the recursive serialize call or to apply endian conversion directly as shown in your example code.

&myStdArray

Maybe something like &my_array[0] could work. Since the data structures have the same memory layout, you could probably also try to use the C-array in OpenGL code. I only have experience with CUDA where you can use C++ data structures.

Thank you! :-)

felixguendling commented 1 year ago

Closing as this is expected behavior (as described in my previous comments).