FDSN / miniSEED3-TechnicalEvaluation

Discussion and evaluation of miniSEED 3

Requirement: The new format must be fully self-defining #5

Open krischer opened 6 years ago

krischer commented 6 years ago

The NGF format should be self-defining in all key aspects

andres-h commented 6 years ago

Being fully self-defining is obviously not possible unless a huge amount of metadata is added to each record. No JSON!

krischer commented 6 years ago

I do agree that full self-definition without consulting the reference will be extremely difficult and does not mesh with the requirement of having a short header.

I assume the second bullet point should only be valid for the optional/additional data not part of the fixed header?

andres-h commented 6 years ago

I don't like the proposed fixed header at all. If I were to vote, then #18/#20 would be a NO from me.

crotwell commented 6 years ago

"Self-defining" is probably too strong a word here. Perhaps "self-contained" is better, in the sense that what is really desired is that a single NGF record, along with the specification and documents it references, contains sufficient information to parse the fixed headers and fully reconstruct the timeseries. Additional/optional data may be externally defined as long as it is parsable and "namespaced" in a way that allows systems that understand that particular additional content to take advantage of it.

We don't want to go down the rabbit hole of trying to define everything independent of outside resources; the format is for data exchange with seismologists, not with space aliens. So referring to outside definitions of URIs, compression formats, byte order, time formats, etc. should be acceptable within the specification and is unavoidable.

krischer commented 6 years ago

Summary

(Please let me know if I missed a point or misunderstood something)

This requirement is a bit underspecified and not much discussion has happened. Thus, please vote on the following issues (2 & 3 largely overlap with #14):

  1. Should the new data format be parse-able without resorting to heuristic checks? E.g. a mandatory byte order field. (Yes/No)
  2. Should information not in the fixed header (assuming we have one) be in some form of standardized format (e.g. no binary blockette that is meaningless without the spec)? (Yes [some standardized format] / No [keep binary blockette layout or something similar])
  3. Assuming 2. results in a "yes": What should this format be? Please propose one or more.
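
To make question 1 concrete, here is a minimal Python sketch contrasting a heuristic byte-order guess with an explicit flag. The 5-byte header layout, the flag values, and the function names are all hypothetical and chosen only for illustration; they are not a proposed NGF layout.

```python
import struct

# Hypothetical toy header, NOT a proposed NGF layout:
#   byte 0    : byte-order flag (0 = little-endian, 1 = big-endian)
#   bytes 1-2 : year (uint16)
#   bytes 3-4 : day of year (uint16)

def pack_header(year, doy, big_endian):
    """Build the toy header in the requested byte order."""
    prefix = ">" if big_endian else "<"
    flag = struct.pack("B", 1 if big_endian else 0)
    return flag + struct.pack(prefix + "HH", year, doy)

def parse_with_flag(buf):
    """Parse using the explicit flag; no guessing involved."""
    flag = buf[0]
    if flag not in (0, 1):
        raise ValueError("invalid byte-order flag")
    prefix = ">" if flag == 1 else "<"
    return struct.unpack_from(prefix + "HH", buf, 1)

def guess_byte_order(buf):
    """Heuristic: try both orders, keep those giving a plausible date."""
    candidates = []
    for prefix in ("<", ">"):
        year, doy = struct.unpack_from(prefix + "HH", buf, 1)
        if 1900 <= year <= 2100 and 1 <= doy <= 366:
            candidates.append(prefix)
    return candidates

# Year 2056, day 256, big-endian: both byte orders yield a plausible date
# (little-endian reads it as year 2056, day 1), so the heuristic cannot decide.
buf = pack_header(2056, 256, big_endian=True)
ambiguous = guess_byte_order(buf)
year, doy = parse_with_flag(buf)
```

In this toy case the plausibility check accepts both byte orders with different decoded dates, while the flag-based parser has exactly one answer. Fixing the byte order in the specification would remove the ambiguity just as well as a flag does.
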
crotwell commented 6 years ago

  1. Yes
  2. Yes
  3. In no particular order (and there may be others worth considering):
     - JSON: ASCII and verbose
     - JSON5: ASCII and verbose, but less so than JSON
     - CBOR: binary, but has some controversy
     - MessagePack: binary, some controversy

The binary vs ASCII difference is important, but what is critical is that it be a hierarchical key-value storage.
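
As a rough sketch of what "hierarchical key-value storage" with namespacing could look like, the Python snippet below uses JSON only because it ships with the standard library; a binary encoding could carry the same structure. The namespace names (`FDSN`, `MyNetwork`), the keys, and the helper `read_namespace` are hypothetical.

```python
import json

# Hypothetical extra headers: one top-level key per namespace.
# The namespace and key names are illustrative only.
extra = {
    "FDSN": {"Time": {"Quality": 100}},
    "MyNetwork": {"Logger": {"Serial": "A123"}},
}

# Compact encoding, as it might be stored after a fixed header.
encoded = json.dumps(extra, separators=(",", ":")).encode("utf-8")

def read_namespace(raw, namespace):
    """Return the subtree for one namespace, or {} if the record lacks it."""
    doc = json.loads(raw.decode("utf-8"))
    return doc.get(namespace, {})

# A reader picks out the namespace it understands and ignores the rest.
fdsn_headers = read_namespace(encoded, "FDSN")
unknown_headers = read_namespace(encoded, "SomeOtherGroup")
```

A reader that knows nothing about `MyNetwork` can still parse the record and skip that subtree, which is the property that matters here regardless of whether the encoding is ASCII or binary.
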

ketchum-usgs commented 6 years ago

  1. Yes
  2. No - no undefined add-ons in the headers

chad-earthscope commented 6 years ago
  1. Yes.
  2. Yes, with FDSN defining "reserved" headers.
  3. what @crotwell said.
ozym commented 6 years ago
  1. Yes
  2. Yes, assuming some form of namespacing.
  3. As per @crotwell also, plus perhaps add BSON and UBJSON to the list (not an endorsement but for extra comparison)
kaestli commented 6 years ago
  1. Yes (however, byte order can be defined as part of the standard, not as a flag in the header)
  2. No - no undefined add ons in the headers.
chad-earthscope commented 6 years ago

@kaestli

  2. No - no undefined add ons in the headers.

The description of this item does not include "undefined add ons". I think it is about how optional headers are structured.

claudiodsf commented 6 years ago
  1. Should the new data format be parse-able without resorting to heuristic checks? E.g. a mandatory byte order field. (Yes/No)

Yes

  2. Should information not in the fixed header (assuming we have one) be in some form of standardized format (e.g. no binary blockette that is meaningless without the spec)? (Yes [some standardized format] / No [keep binary blockette layout or something similar])

Yes. This will provide a flexible format, while still allowing specific header add-ons for a certain community (e.g., FDSN extensions).

  3. Assuming 2. results in a "yes": What should this format be? Please propose one or more.

Yes, as per @crotwell. We prefer a binary format.

ihenson-bsl commented 6 years ago
  1. Yes
  2. Yes
  3. Yes, as per @crotwell, prefer binary format.
ValleeMartin commented 6 years ago
  1. Yes
  2. Yes
  3. Yes
JoseAntonioJara commented 6 years ago
  1. Yes
  2. Yes
  3. We prefer a binary format.