HDFGroup / hsds

Cloud-native, service based access to HDF data
https://www.hdfgroup.org/solutions/hdf-kita/
Apache License 2.0

Save padding/offset of fields in compound types #273

Closed. mattjala closed this 7 months ago

mattjala commented 8 months ago

Currently, HSDS removes all padding bytes from compound types. As a result, a re-opened datatype can have a different total size and a different memory layout than the type the user originally wrote.
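To make the size difference concrete, here is a minimal sketch using Python's standard `struct` module rather than HSDS itself. It assumes a two-field record (a 1-byte integer followed by a 4-byte integer) chosen purely for illustration, and compares the natively aligned size with the packed size.

```python
import struct

# Record with a 1-byte field followed by a 4-byte int, as in `struct { char a; int b; }`.
native = struct.calcsize("@bi")  # "@": native sizes and alignment, padding inserted before b
packed = struct.calcsize("=bi")  # "=": standard sizes, no padding at all

print("native size:", native)   # typically 8 (3 padding bytes after the first field)
print("packed size:", packed)   # 5
```

Stripping the padding turns the first layout into the second, which is the mismatch described above.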

In order for HSDS to preserve the offsets of fields within compound types, it will probably need to accept an optional "offset" field for each field in a compound type, return this information as part of datatype reads, and then the client will need to use those (optional) offsets to re-assemble the datatype.
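A rough sketch of how a client might use such offsets follows. The `"offset"` key, the `layout` helper, and the `SIZES` table are all hypothetical and not part of the HSDS API; only the general `fields` / `name` / `type` shape follows the existing JSON type description.

```python
# Hypothetical sketch: the "offset" key and this helper are assumptions, not HSDS API.
SIZES = {"H5T_STD_I8LE": 1, "H5T_STD_I32LE": 4, "H5T_IEEE_F64LE": 8}

def layout(fields):
    """Return ({name: offset}, end_of_last_field) for a list of compound fields."""
    offsets = {}
    end = 0
    for field in fields:
        # Use the stored offset when present; otherwise pack right after the previous field.
        start = field.get("offset", end)
        offsets[field["name"]] = start
        end = max(end, start + SIZES[field["type"]])
    return offsets, end

# Type description without offsets: the client can only assume a packed layout.
packed = [
    {"name": "a", "type": "H5T_STD_I8LE"},
    {"name": "b", "type": "H5T_STD_I32LE"},
]
# Same type with the optional offsets preserved.
padded = [
    {"name": "a", "type": "H5T_STD_I8LE", "offset": 0},
    {"name": "b", "type": "H5T_STD_I32LE", "offset": 4},
]

print(layout(packed))  # ({'a': 0, 'b': 1}, 5)
print(layout(padded))  # ({'a': 0, 'b': 4}, 8)
```

Per-field offsets alone would not capture trailing padding after the last field, so a scheme like this would presumably also need to store the total type size.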

jreadey commented 8 months ago

In numpy, there seems to be no way to specify the exact offsets of the fields. There is an alignment option (https://numpy.org/doc/stable/reference/generated/numpy.dtype.html) that can be used to switch between packed and compiler-specified alignment. That might be the best we can do for now, though there could be issues when sharing data between machines with different architectures.
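For reference, a small sketch of the alignment option mentioned here, using the `align` argument of `numpy.dtype`. The two-field type is an illustrative assumption, not one taken from HSDS.

```python
import numpy as np

fields = [("a", "i1"), ("b", "<i4")]

packed = np.dtype(fields)               # align=False is the default: no padding
aligned = np.dtype(fields, align=True)  # pad fields the way a C compiler would

# dtype.fields maps each name to (field dtype, byte offset).
print(packed.itemsize, packed.fields["b"][1])    # 5 1
print(aligned.itemsize, aligned.fields["b"][1])  # 8 4 on typical platforms
```

Because the aligned layout follows the local platform's rules, the same call could produce a different layout on another architecture, which is the sharing concern raised above.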

jreadey commented 8 months ago

Another alternative would be for clients that care about padding (C clients specifically) to add extra fixed-length char fields as needed to adjust the alignment.
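A minimal sketch of that idea in numpy terms: the padding is written as an ordinary field, so it survives a round trip through a packed representation. The field name `pad0` is a hypothetical choice, and the two-field base type is assumed for illustration.

```python
import numpy as np

# Packed type with an explicit 3-byte fixed-length string field used only as filler.
# "pad0" is a hypothetical name; any otherwise unused field name would do.
explicit = np.dtype([("a", "i1"), ("pad0", "S3"), ("b", "<i4")])

# The layout the client actually wants, for comparison.
aligned = np.dtype([("a", "i1"), ("b", "<i4")], align=True)

print(explicit.itemsize, explicit.fields["b"][1])  # 8 4
print(aligned.itemsize, aligned.fields["b"][1])    # 8 4 on typical platforms
```

The cost is that the filler shows up as a visible field to every reader of the type.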

mattjala commented 7 months ago

This should be handled on the client's side with padding - see HDFGroup/vol-rest#91