FDSN / miniSEED3

https://docs.fdsn.org/projects/miniseed3/

miniseed version number in the header (header field 2) #5

Closed kaestli closed 1 year ago

kaestli commented 2 years ago

We'd recommend allowing the version to indicate both major and minor versions. A minor version is one where a file valid under the previous minor's spec is still valid under the new minor; a major version is not backward-compatible.

Reasoning: A typical case for a new minor version would be the extension of the list of valid payload encodings (header field 5), a process anticipated in the spec and information that software needs in order to check compatibility with a seedlink file.

Potential implementations:
a) Add a new byte for the minor version (notably, together with proposal/issue #6, the record header would remain unchanged).
b) Interpret the byte as version x10, allowing 25 majors and up to 10 minors per major.
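To make option b) concrete, here is a minimal sketch of how a single unsigned byte could carry both numbers under the "version x10" reading. The function names are hypothetical and nothing here is part of the miniSEED3 specification; the compatibility rule simply follows the definition of minor versions given above.

```python
def split_version_byte(value: int) -> tuple[int, int]:
    """Split a version byte into (major, minor) under the 'version x10' scheme."""
    if not 0 <= value <= 255:
        raise ValueError("version must fit in one unsigned byte")
    return divmod(value, 10)


def pack_version_byte(major: int, minor: int) -> int:
    """Combine major and minor into one byte; minor is limited to 0..9."""
    if not 0 <= minor <= 9:
        raise ValueError("minor must be in 0..9")
    value = major * 10 + minor
    if not 0 <= value <= 255:
        raise ValueError("major is too large for one unsigned byte")
    return value


def can_read(file_version: int, reader_version: int) -> bool:
    """A reader handles a file if the majors match and the file's minor
    is not newer than the minor the reader was written for."""
    file_major, file_minor = split_version_byte(file_version)
    reader_major, reader_minor = split_version_byte(reader_version)
    return file_major == reader_major and file_minor <= reader_minor
```

For example, `can_read(30, 31)` is true (a 3.1 reader accepts a 3.0 file), while `can_read(32, 31)` and `can_read(40, 31)` are both false.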

crotwell commented 2 years ago

The docs say:

> New data encodings may be added to the format in the future without incrementing the format version.

so adding a new encoding type does not require a change in the version number.

My feeling is that the structure proposed is flexible enough, given the JSON extra headers, that anything that would be accomplished with a "backward-compatible" change can be done without a new version. I would prefer that any change to the fixed header always be considered "breaking" and simply increment the major version number. The extra headers and payload are effectively designed to allow backwards-compatible changes without changing the spec version.

That said, it should be made clear that the specification document itself can be revised in a backwards compatible way, and adding an encoding is a good example of that, as well as fixing typos and adding clarification. In this case, it would be useful to reference both the old and new version of the documentation. Perhaps using a "revision" number or date would be sufficient.

kaestli commented 2 years ago

This statement in the docs nicely displays the problem: if business rules allow adding encodings without changing the version number, then the version number cannot tell software whether it can read the file. It must check the encoding field in addition.

crotwell commented 2 years ago

Yes, so one way to think of this is that the version tells you whether you can read the basic structure of the file and the fixed header, and the encoding tells you whether you can then read the actual timeseries from the payload. Sometimes you need to do both, but sometimes only the first. And of course, even if the list of encodings were fixed, you would still have to check the encoding to know how to parse the timeseries, so incrementing the version actually doesn't help the reader.

I see this as a feature, not a problem. :)
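A rough sketch of the two-step check described above, from a reader's point of view. The byte offsets (format version at byte 2, payload encoding at byte 15, after the "MS" record indicator) are my understanding of the fixed header layout and should be verified against the specification. The set of supported encodings is a hypothetical example of what one particular reader might implement.

```python
# Assumed offsets within the fixed header; verify against the spec.
FORMAT_VERSION_OFFSET = 2
ENCODING_OFFSET = 15

SUPPORTED_FORMAT_VERSION = 3
# Hypothetical subset of encoding codes this particular reader can decode.
SUPPORTED_ENCODINGS = {3, 4, 11}


def check_record(header: bytes) -> tuple[bool, bool]:
    """Return (structure_readable, payload_decodable) for one record header."""
    if len(header) <= ENCODING_OFFSET or header[0:2] != b"MS":
        return (False, False)
    # Step 1: the version says whether the fixed header layout is understood.
    if header[FORMAT_VERSION_OFFSET] != SUPPORTED_FORMAT_VERSION:
        return (False, False)
    # Step 2: the encoding says whether the timeseries payload can be decoded.
    return (True, header[ENCODING_OFFSET] in SUPPORTED_ENCODINGS)
```

A tool that only inspects metadata can stop after the first element of the result; a tool that needs the samples has to look at the second one regardless of how the version number is managed.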

djeastonca commented 1 year ago

> Perhaps using a "revision" number or date would be sufficient.

To close the loop on what appears to be the outcome of this issue, I propose that the documentation be versioned with a two-component revision number that reflects the Record Version. For example, the initially ratified mseed3 specification would start with revision 3.0. It would be updated to revision 3.1 if, for example, a new payload encoding is added. If a change to the header structure later arises and is ratified, then a revision 4.0 of the specification would document it.

kaestli commented 1 year ago

So you would have different 3.x specification versions, as well as a version flag in the file, which however does not tell you which specification version actually applies? I have a hard time seeing this as a feature...

crotwell commented 1 year ago

I think I agree with @kaestli in that version 3.1 kind of implies that the file format has changed instead of just the documentation. Perhaps separating them makes it clearer, so the documentation might be "version 3 revision 1" or "revision 2024-05-14"?

Given this issue has been a source of confusion, a section in the spec is probably needed to explain it. See PR #26

Don't care too much about a number vs a date, I used a date in the PR but am open to different ideas.
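To illustrate the separation being suggested, here is a small sketch in which the format version (the number stored in the record) and the documentation revision (which is not stored in the record) are kept as distinct fields. The class and method names are hypothetical, and the date is just the example mentioned above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SpecDocument:
    """Hypothetical identifier for one published edition of the spec."""
    format_version: int  # matches the version number written in each record
    revision: date       # identifies the documentation edition only

    def label(self) -> str:
        return f"version {self.format_version} revision {self.revision.isoformat()}"

    def describes(self, record_format_version: int) -> bool:
        # Only the format version is compared; the revision never appears
        # in a record, so every revision of version 3 describes a version 3 record.
        return record_format_version == self.format_version
```

With this split, `SpecDocument(3, date(2024, 5, 14)).label()` gives `"version 3 revision 2024-05-14"`, and any later revision of version 3 still describes the same records.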

djeastonca commented 1 year ago

Yes, I agree that the two-level versioning approach can introduce additional confusion. I've provided some minor feedback on the PR.

djeastonca commented 1 year ago

It appears, per Roman's update yesterday, that ETH's feedback on their remaining open issues has been recorded and that they are satisfied with the current draft of the data format. This issue can be closed once https://github.com/iris-edu/miniSEED3/pull/26 is resolved and merged.