Closed KrisThielemans closed 8 months ago
For C++, we generate aligned structs, but the serialized format is not necessarily aligned. The serializers take care of the conversion. For Python structured arrays, we accept aligned or unaligned dtypes.
can't say this makes things a lot clearer.
For C++, we generate aligned structs,
somehow I doubt it. The compiler will do things for you on alignment of structures in a vector and fields within structures (or even in bitfields), but you (probably) have no control over what it does.
For Python structured arrays, we accept aligned or unaligned dtypes
My Python isn't good enough to understand this I'm afraid (without reading up), but that's fine.
In any case, those answers have nothing to do with the doc on how the data is encoded in the binary format 😄. I think you're saying that in the binary format, the data is stored without "gaps" (i.e. no padding inserted to align the data). That makes sense to me, as otherwise you could have a lot of space overhead. In any case, that is what needs documenting then.
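To make the "gaps" point concrete, here is a minimal sketch using only Python's standard `struct` module. The record layout (a 1-byte flag followed by a 4-byte integer) is a made-up example, not something taken from the format's spec. It only contrasts a natively aligned in-memory layout with a packed layout where fields sit back to back.

```python
import struct

# Hypothetical record for illustration: a 1-byte flag, then a 4-byte integer.
values = (1, 1000)

# "@" = native byte order AND native alignment, similar to how a C/C++
# compiler lays out a struct in memory (padding may follow the 1-byte field).
aligned = struct.pack("@bi", *values)

# "<" = little-endian, standard sizes, NO padding: fields are back to back.
packed = struct.pack("<bi", *values)

print("aligned:", len(aligned), "bytes", aligned.hex())  # typically 8 bytes
print("packed: ", len(packed), "bytes", packed.hex())    # always 5 bytes
```

On common platforms the aligned form is 8 bytes (3 of them padding), while the packed form is 5. A serializer that converts between the two is what lets the in-memory struct be aligned while the bytes on disk are not.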
FWIW I don't think we should make a habit of documenting what is not in the binary format. So saying there are no gaps seems unnecessary. I know that formats that just write structs directly to a file sometimes end up with gaps, but that is clearly not what is being done here. If we start talking about alignment in the context of the binary format, I think it will add confusion.
Added a small clarification to the docs
By using varints, strings etc. it's possible that data is not aligned to a 32-bit or whatever boundary. It doesn't seem documented if the binary format fills in the gaps or not. This certainly needs to be documented for Records and Streams.
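A small sketch of why variable-length fields break alignment. It assumes a LEB128-style unsigned varint and a length-prefixed UTF-8 string. Both are assumptions for illustration; the actual encodings used by the format may differ (for example, zigzag encoding for signed values). The helper names `encode_varint` and `encode_string` are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a LEB128-style varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def encode_string(s: str) -> bytes:
    """Assumed layout: varint byte length followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_varint(len(data)) + data


# Write three fields back to back and record where each one starts.
buf = bytearray()
offsets = []
for field in (encode_varint(300), encode_string("abc"), encode_varint(5)):
    offsets.append(len(buf))
    buf += field

print(offsets)   # byte offset at which each field starts
print(buf.hex())
```

With no padding written, the second and third fields start at byte offsets 2 and 6, neither of which is a multiple of 4. That is the situation the question is about: either the format pads to a boundary or it does not, and the spec should say which.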