Open axtimwalde opened 4 years ago
@axtimwalde @d-v-b Sounds promising. Do you have a link to the proposal?
In general, I'm all for adding meta-data to N5 and HDF5, maybe to the point where it is possible to recreate the XML completely, so that the XML is not needed for those. However, I think for the foreseeable future the XML will remain the authoritative source, because it provides extensibility for non-N5/HDF5 backends, such as TIFF files, CATMAID, etc.
@tpietzsch the proposal is here (the "COSEM style"): https://github.com/janelia-cosem/schemas/blob/master/multiscale.md. It's not final, but the basic principles are: a) put multiscale-specific stuff in group-level attributes, and b) keep dataset-specific stuff (resolution, offset, etc.) in dataset attributes.
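To make the group/dataset split concrete, here is a minimal sketch in Python. The key names (`pyramid_levels`, `path`) and the values are hypothetical, chosen only for illustration; they are not taken from the COSEM schema, which is not final. The only thing the sketch is meant to show is that the group lists which datasets form the pyramid, while each dataset carries its own resolution and offset.

```python
# Hypothetical layout: key names and values are illustrative, NOT the COSEM schema.

# Group-level attributes: only multiscale-specific information.
group_attrs = {"pyramid_levels": [{"path": "s0"}, {"path": "s1"}]}

# Dataset-level attributes: each scale level describes itself.
dataset_attrs = {
    "s0": {"resolution": [4.0, 4.0, 4.0], "offset": [0.0, 0.0, 0.0]},
    "s1": {"resolution": [8.0, 8.0, 8.0], "offset": [2.0, 2.0, 2.0]},
}


def scale_levels(group_attrs, dataset_attrs):
    """Join the group's list of levels with each dataset's own attributes."""
    levels = []
    for entry in group_attrs["pyramid_levels"]:
        path = entry["path"]
        attrs = dataset_attrs[path]
        levels.append((path, attrs["resolution"], attrs["offset"]))
    return levels


levels = scale_levels(group_attrs, dataset_attrs)
```

With this split, a single dataset such as `s1` stays self-describing even when it is opened without its parent group.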
The lack of structured meta-data in HDF5 made it necessary to store more complex stuff in the external BDV XML file. It also naturally limited the richness of the meta-data describing multi-scale image pyramids, as is visible in the flat specifications that we currently use. Modern multi-file backends such as Zarr and N5 support structured meta-data, typically through JSON. While HDF5 does not natively support structured meta-data, we recently added support for it through the N5-HDF5 API. The method is simple: primitive flat meta-data is stored in the corresponding native HDF5 type, and structured data is stored as JSON. The API hides this background trickery and is consistent across all backends. @d-v-b spent some time proposing an improved meta-data scheme for multi-scale image pyramids that resolves four issues with the existing format:
Time series, setups, and channels are not considered in this proposal and we welcome input.
Using the same method, however, it should be possible to store all meta-data that is currently stored in the BDV XML file as an attribute of the N5/HDF5 container. I find this very attractive.
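The storage rule mentioned in this thread (primitive values kept in a native type, structured values serialized as JSON) can be sketched roughly as follows. This is plain Python with no HDF5 involved, and it is not the N5-HDF5 implementation: the function names and the `("native" | "json", payload)` tagging are assumptions made for illustration, and the sketch treats only scalars as primitive.

```python
import json

# Assumption: only scalar values count as "primitive" in this sketch.
PRIMITIVES = (bool, int, float, str)


def encode_attribute(value):
    """Return (kind, payload): primitives stay native, anything else becomes JSON text."""
    if isinstance(value, PRIMITIVES):
        return ("native", value)
    return ("json", json.dumps(value))


def decode_attribute(kind, payload):
    """Invert encode_attribute: parse JSON payloads, pass native ones through."""
    if kind == "json":
        return json.loads(payload)
    return payload


# Hypothetical structured attribute, as it might replace part of an XML file.
view_setup = {"id": 0, "name": "channel 1", "voxelSize": [0.5, 0.5, 2.0]}
kind, payload = encode_attribute(view_setup)
restored = decode_attribute(kind, payload)
```

A caller only ever sees `encode_attribute` and `decode_attribute`, which is the sense in which the encoding can stay hidden behind one API regardless of backend.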