felixguendling / cista

Cista is a simple, high-performance, zero-copy C++ serialization & reflection library.
https://cista.rocks
MIT License
1.78k stars 113 forks source link

Feature request: builtin-in serialize and type_hash support for std::array #156

Closed AdelKS closed 1 year ago

AdelKS commented 1 year ago

Hello!

By looking closer at the code, it appears to me that built-in support for type hashing and serializing std::array could be possible ? This has the benefit to avoid migrating downstream code bases that use std::array. Mirroring the code that already exists for cista::array in my tests seems to work

template <typename T, size_t Size>
hash_t type_hash(std::array<T, Size> const&, hash_t h,
                std::map<hash_t, unsigned>& done) noexcept
{
  h = hash_combine(h, hash("std::array"));
  h = hash_combine(h, Size);
  return type_hash(T{}, h, done);
}

template <typename Ctx, typename T, size_t Size>
void serialize(Ctx& c, std::array<T, Size> const* origin, offset_t const pos) {
  auto const size =
      static_cast<offset_t>(serialized_size<T>() * origin->size());
  auto i = 0u;
  for (auto it = pos; it != pos + size; it += serialized_size<T>()) {
    serialize(c, origin->data() + i++, it);
  }
}

Also, the code of cista::array seems to be standard enough so that std::array could replace it ?

I am probably missing some important piece of information that made it not possible to begin with.

Thanks for your time !

Adel

felixguendling commented 1 year ago

With cista, I try to achieve a serialization format that is platform/compiler/standard-library independent. This means that I cannot use std::vector, std::array, etc. because I cannot be 100% sure about their memory layout (which is important for serialization, since we write their memory layout to disk/network). If they differ between standard library implementations, this would make serialized data incompatible between platforms which is undesirable. If the standard guarantees the exact memory layout for std::array, it would be possible to make an exception for this container.

AdelKS commented 1 year ago

Thanks for your input !

Okay I think I understand better what's at stake here:

For all standard layout, non-polymorphic aggregate types you do not have to write a custom function

Even though std::array can indeed be serialized as any custom type, we are not sure it's doesn't have extra members across STL implementations. Found a discussion here and indeed, sizeof(std::array < T, N>) == sizeof(T) * N is a necessary and sufficient condition to have cista::array to be binary compatible with std::array.

felixguendling commented 1 year ago

sizeof(std::array < T, N>) == sizeof(T) * N

This condition is not sufficient. You could reorder elements arbitrarily (for reads as well as writes). So for example STD lib A might access element i at position i and B might access element i at N-i-1 (reversed order) and your condition is still true. Serializing data with std::array<T> from STD lib A and deserializing with STD lib B will yield reversed results. Of course, this is very theoretical and there's no good reason to do this. On the other side, maintaining a custom cista::array implementation has a low overhead.

AdelKS commented 1 year ago

I understand your point, strictly speaking. My sentence needs to be updated to "the only extra necessary and sufficient condition is sizeof(std::array < T, N>) == sizeof(T) * N.

Now, for the order of the elements, the standard mandates them to be stored in the intuitive way because its iterator satisfy the LegacyRandomAccessIterator (since C++17) which abides by

(a + n) is equivalent to (std::addressof(*a) + n).

felixguendling commented 1 year ago

Makes sense. Thank you! I changed cista::array to be a alias of std::array.

AdelKS commented 1 year ago

Amazing, thanks for your awesome work !