Large bit arrays can be found even in the standard data type set, so this matter may affect common applications.
In C, the memory storage format of bit arrays is the same as their wire representation, which enables serialization via memcpy. The application can manipulate the contents using nunavutCopyBits, nunavutSetBit, and nunavutGetBit.
In Python, bit arrays are stored using NumPy arrays and serialized using numpy.packbits.
In C++, the current implementation (de)serializes arrays bit by bit, which is likely to cause performance issues. Sadly, the C++ implementation cannot enforce a wire-compatible memory storage format because it has to be compatible with standard containers like std::vector<bool> and std::bitset, whose internal layout is implementation-defined. Are there any ideas on how to improve the bit array serialization without requiring the use of custom bit containers where the memory storage format is known?
In the specialization of VariableLengthArray<bool> that I implemented in #284, the memory storage format is the same as the wire format, but the serialization methods currently cannot benefit from that.
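For illustration, this is roughly what a memcpy-based fast path could look like for a container with a known, wire-compatible layout. The class PackedBitArray below is a hypothetical stand-in and is not the VariableLengthArray<bool> from #284. It assumes LSB-first bit packing and a byte-aligned destination buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical illustration only; not the container from #284.
// Assumption: storage is packed LSB first, matching the wire format.
class PackedBitArray
{
public:
    explicit PackedBitArray(std::size_t size) : size_(size), storage_((size + 7U) / 8U, 0U) {}

    std::size_t size() const { return size_; }

    bool get(std::size_t i) const { return ((storage_[i / 8U] >> (i % 8U)) & 1U) != 0U; }

    void set(std::size_t i, bool value)
    {
        const std::uint8_t mask = static_cast<std::uint8_t>(1U << (i % 8U));
        storage_[i / 8U] = static_cast<std::uint8_t>(value ? (storage_[i / 8U] | mask)
                                                           : (storage_[i / 8U] & ~mask));
    }

    // Copies the packed bytes into a byte-aligned destination in one call.
    // Returns the number of bytes written, or 0 if the buffer is too small.
    std::size_t serialize(std::uint8_t* dest, std::size_t capacity) const
    {
        if (capacity < storage_.size())
        {
            return 0U;
        }
        if (!storage_.empty())
        {
            std::memcpy(dest, storage_.data(), storage_.size());
        }
        return storage_.size();
    }

private:
    std::size_t size_;
    std::vector<std::uint8_t> storage_;
};
```

The serializer would need some way to detect that a container offers this kind of access and fall back to the bit-by-bit path otherwise.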