European-XFEL / h5glance

Explore HDF5 files in terminal & HTML views
BSD 3-Clause "New" or "Revised" License

Rework displaying HDF5 datatypes #17

Closed: takluyver closed this 4 years ago

takluyver commented 4 years ago

Previously, h5glance relied on converting HDF5 datatypes to numpy dtypes (in h5py) before trying to display them. This works OK for normal datatypes (e.g. int32, float64), but HDF5 allows for much more flexibility, like non-standard float formats, or using 3 bytes per number. Converting these to numpy dtypes would either break, or 'promote' them to a standard dtype and give misleading information about what's in the file. I don't think such types are used often, but I want h5glance to do something sensible if they come up.

This PR works directly from the h5py TypeID objects instead. Standard types are still displayed as e.g. uint64, while non-standard numeric types will look like 'custom 3-byte float'. I've gone through the TypeID subclasses and tried to cover all the possible datatypes.
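To illustrate the idea, here is a minimal sketch of describing a datatype from its low-level h5py TypeID rather than from a numpy dtype. It is not the code in this PR: `describe_type` and `STANDARD_TYPES` are hypothetical names, the table covers only a small little-endian subset of standard types, and the wording used for non-standard integers is an assumption (only 'custom 3-byte float' appears in the description above).

```python
from h5py import h5t

# Illustrative subset of standard little-endian types to compare against.
# (Assumption: a fuller table would also include big-endian variants.)
STANDARD_TYPES = {
    "int8": h5t.STD_I8LE,
    "int16": h5t.STD_I16LE,
    "int32": h5t.STD_I32LE,
    "int64": h5t.STD_I64LE,
    "uint8": h5t.STD_U8LE,
    "uint16": h5t.STD_U16LE,
    "uint32": h5t.STD_U32LE,
    "uint64": h5t.STD_U64LE,
    "float32": h5t.IEEE_F32LE,
    "float64": h5t.IEEE_F64LE,
}


def describe_type(tid):
    """Return a short description of an h5py TypeID (hypothetical helper)."""
    # A type that exactly matches a standard one keeps its familiar name.
    for name, standard in STANDARD_TYPES.items():
        if tid.equal(standard):
            return name

    # Otherwise, describe it from its HDF5 class and size in bytes.
    size = tid.get_size()
    if isinstance(tid, h5t.TypeIntegerID):
        sign = "unsigned" if tid.get_sign() == h5t.SGN_NONE else "signed"
        return f"custom {size}-byte {sign} integer"
    if isinstance(tid, h5t.TypeFloatID):
        return f"custom {size}-byte float"
    return f"{size}-byte type of another class"


# Build a non-standard type in memory: an unsigned integer stored in 3 bytes.
odd = h5t.STD_U32LE.copy()
odd.set_size(3)

print(describe_type(h5t.STD_U64LE))
print(describe_type(odd))
```

For a dataset in a real file, the TypeID would come from `dataset.id.get_type()`. Comparing against known standard types first means ordinary files still show the familiar short names.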

takluyver commented 4 years ago

Heads up @vallsv - this probably breaks things for non-h5py objects. I'm open to addressing that, but the usual constraints apply: I have limited time, and my focus is on using this with h5py objects.

As I learned about HDF5, I realised that its datatypes don't necessarily map directly to numpy dtypes, so the focus of this is to describe HDF5 datatypes rather than numpy dtypes.

tmichela commented 4 years ago

I'm not an expert in h5py types, but that looks good to me.