Open krischer opened 6 years ago
Being fully self-defining is obviously not possible unless a huge amount of metadata is added to each record. No JSON!
I do agree that full self-definition without consulting the reference will be extremely difficult and does not mesh with the requirement of having a short header.
I assume the second bullet point should only be valid for the optional/additional data not part of the fixed header?
I don't like the proposed fixed header at all. If I were to vote, then #18/#20 would be a NO from me.
"Self-defining" is probably too strong a word here. Perhaps "self-contained" is better, in the sense that what is really desired is that a single NGF record, along with the specification and documents it references, contains sufficient information to parse the fixed headers and fully reconstruct the timeseries. Additional/optional data may be externally defined as long as it is parsable and "namespaced" in a way that allows systems that understand that particular additional content to take advantage of it.
We don't want to go down the rabbit hole of trying to define everything independent of outside resources; the format is for data exchange with seismologists, not with space aliens. So referring to outside definitions of URI, compression formats, byte order, time formats, etc. should be acceptable within the specification and is unavoidable.
(Please let me know if I missed a point or misunderstood something)
This requirement is somewhat underspecified and has not seen much discussion, so please vote on the following issues (2 & 3 largely mesh with #14):
1) Yes
2) Yes
3) In no particular order (and there may be others worth considering):
- JSON, although it is ASCII and verbose
- JSON5, ASCII and verbose but less so than JSON
- CBOR, binary, but has some controversy
- MessagePack, binary, some controversy
The binary vs ASCII difference is important, but what is critical is that it be a hierarchical key-value storage.
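As a rough illustration of what "hierarchical key-value storage" with namespacing could look like for the optional headers, here is a minimal sketch using JSON from the Python standard library. The key names (`FDSN`, `Time`, `Quality`, `MyNetwork`, `SensorTilt`) and the helper `get_namespace` are hypothetical and made up for this example; none of them come from a proposed specification.

```python
import json

# Hypothetical extra headers: each top-level key acts as a namespace,
# so a reader can ignore sub-trees it does not understand.
extra_headers = {
    "FDSN": {"Time": {"Quality": 80}},
    "MyNetwork": {"SensorTilt": 0.3},
}

# Compact encoding (no spaces) to limit the text overhead of JSON.
encoded = json.dumps(extra_headers, separators=(",", ":")).encode("utf-8")


def get_namespace(raw: bytes, namespace: str) -> dict:
    """Return the sub-tree for one namespace, or {} if it is absent."""
    return json.loads(raw.decode("utf-8")).get(namespace, {})
```

A reader that only knows the `FDSN` namespace would call `get_namespace(encoded, "FDSN")` and skip everything else. A binary encoding such as CBOR or MessagePack would carry the same tree; only the serialization step would change.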
1) Yes
2) No - No undefined add ons in the headers
@kaestli
- No - no undefined add ons in the headers.
The description of this item does not include "undefined add ons". I think it is about how optional headers are structured.
- Should the new data format be parse-able without resorting to heuristic checks? E.g. a mandatory byte order field. (Yes/No)
Yes
- Should information not in the fixed header (assuming we have one) be in some form of standardized format (e.g. no binary blockette that is meaningless without the spec)? (Yes [some standardized format] / No [keep binary blockette layout or something similar])
Yes. This will provide a flexible format, with still the possibility of having specific header add ons for a certain community (e.g., FDSN extensions)
- Assuming 2. results in a "yes": What should this format be? Please propose one or more.
Yes, as per @crotwell. We prefer a binary format.
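To make the first question concrete, the following is a minimal sketch of how a mandatory byte order field removes the need for heuristics when reading a record. The layout is an assumption invented for this example (one flag byte, `<` or `>`, then an unsigned 16-bit sample count, then 32-bit integer samples), and `pack_record` and `unpack_record` are hypothetical helpers; it is not the proposed NGF header.

```python
import struct


def pack_record(samples, byte_order="<"):
    """Build a toy record: flag byte, uint16 count, int32 samples."""
    header = byte_order.encode("ascii") + struct.pack(byte_order + "H", len(samples))
    return header + struct.pack(f"{byte_order}{len(samples)}i", *samples)


def unpack_record(raw):
    """Read the toy record using only the explicit byte order flag."""
    flag = raw[:1]
    if flag not in (b"<", b">"):
        raise ValueError(f"unknown byte order flag: {flag!r}")
    order = flag.decode("ascii")
    # The flag tells us how to read every later multi-byte field.
    (count,) = struct.unpack_from(order + "H", raw, 1)
    return list(struct.unpack_from(f"{order}{count}i", raw, 3))
```

Without the flag, a reader would have to guess the byte order, for example by checking whether a decoded year or sample count looks plausible. With the flag, an unrecognized value is an immediate error rather than a silent misread.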
The NGF format should be self-defining in all key aspects