HDFGroup / hdf5-json

Specification and tools for representing HDF5 in JSON
https://hdf5-json.readthedocs.io

Add production rules for dataset byte streams #56

Open ghost opened 7 years ago

ghost commented 7 years ago

This is my proposal to start a discussion...

A few explanations:

  1. Because storage layout information is available elsewhere, I did not distinguish between the byte streams of a chunked dataset and the single byte stream of a contiguous dataset. The byteStreams key always holds an array of byte stream descriptions, even when there is only one.

  2. For the same reason, each byte stream description records its location in the dataset's dataspace under the dspace_anchor key. For contiguous datasets, its value will always be [0, 0, ...].

  3. Checksum information has two keys: type (MD5, SHA1, a URI, etc.) and value. The type is repeated for every byte stream, which is redundant, but I wanted to allow byte streams to have checksums of different types.

  4. The spec for the checksum value describes it simply as an ASCII string without the slash, but we may want to be more precise here.
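
To make the four points above concrete, here is a minimal Python sketch of what such a structure could look like. Only the key names byteStreams, dspace_anchor, type and value come from the proposal. The name of the key that groups type and value ("checksum"), the helper byte_stream_entry, the sample data, and the choice of a hexadecimal digest as the checksum value are assumptions made for illustration, not part of the proposed production rules.

```python
import hashlib
import json

# Hypothetical helper: builds one entry of the "byteStreams" array.
# The "checksum" grouping key and the hex digest encoding are assumptions.
def byte_stream_entry(data, anchor, checksum_type="MD5"):
    algorithms = {"MD5": hashlib.md5, "SHA1": hashlib.sha1}
    digest = algorithms[checksum_type](data).hexdigest()
    return {
        # Position of this byte stream in the dataset's dataspace.
        "dspace_anchor": list(anchor),
        # The type is stored per byte stream so entries may differ.
        "checksum": {"type": checksum_type, "value": digest},
    }

# Contiguous dataset: a single entry, anchored at the origin.
contiguous = {"byteStreams": [byte_stream_entry(b"\x00" * 16, [0, 0])]}

# Chunked dataset: one entry per chunk, each with its own anchor,
# here deliberately using two different checksum types.
chunked = {
    "byteStreams": [
        byte_stream_entry(b"\x01" * 8, [0, 0], "MD5"),
        byte_stream_entry(b"\x02" * 8, [0, 2], "SHA1"),
    ]
}

print(json.dumps(chunked, indent=2))
```

A hexadecimal digest contains only the characters 0-9 and a-f, so it trivially satisfies "ASCII string without the slash". Other encodings, such as Base64, would not necessarily do so, which is one reason a more precise rule for the checksum value may be worth discussing.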