Open krischer opened 6 years ago
The previous discussion on this topic resulted in a proposal by D. Neuhauser to simply expand the existing (SEED 2.x) binary time structure to include microseconds. Subsequent discussion during the previous evaluation concluded that there is justification for nanosecond resolution.
Combining all feedback and discussion, the outcome was to keep the SEED 2.x binary time structure fields and extend them with a nanosecond field. This is a direct representation of UTC time and allows leap seconds to be represented as second=60. For illustration, the fields would look something like this:
Year (0 - 65535) UINT16
Day-of-year (1 - 366) UINT16
Hour (0 - 23) UINT8
Minute (0 - 59) UINT8
Second (0 - 60) UINT8
Nanosecond (0 - 999999999) UINT32
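As a rough sketch of how the 11-byte layout above could be packed and unpacked, here is a Python example using the standard struct module. The little-endian byte order, the absence of padding, and the helper names pack_time / unpack_time are assumptions made for illustration; nothing in the discussion above fixes them.

```python
import struct

# Assumed layout: little-endian, no padding.
# H = UINT16, B = UINT8, I = UINT32  ->  2 + 2 + 1 + 1 + 1 + 4 = 11 bytes
TIME_STRUCT = struct.Struct("<HHBBBI")


def pack_time(year, doy, hour, minute, second, nanosecond):
    """Pack the time fields into 11 bytes (hypothetical helper)."""
    if not 1 <= doy <= 366:
        raise ValueError("day-of-year must be in 1..366")
    if not 0 <= second <= 60:
        raise ValueError("second must be in 0..60 (60 marks a leap second)")
    if not 0 <= nanosecond <= 999_999_999:
        raise ValueError("nanosecond must be in 0..999999999")
    return TIME_STRUCT.pack(year, doy, hour, minute, second, nanosecond)


def unpack_time(buf):
    """Return (year, doy, hour, minute, second, nanosecond)."""
    return TIME_STRUCT.unpack(buf)


# The leap second at the end of 2016: day 366, 23:59:60 UTC.
raw = pack_time(2016, 366, 23, 59, 60, 0)
print(len(raw), unpack_time(raw))
```

Because second=60 is stored as-is, the leap second survives a round trip without any lookup table.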
A variation in D. Neuhauser's original "MSEED3 Time Structure" proposal was to change the year value to a signed type, making the year range -32768 to 32767. On the one hand this would allow representation of data for years before 0. On the other hand this would likely be a common "gotcha" for folks porting miniSEED 2.x reading code, where the value was unsigned. When writing the last IRIS format proposal I deemed avoiding the porting gotcha a higher priority than representing years earlier than 0. I do not have strong opinions on this though, happy to go either way.
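To make the porting gotcha concrete, the following minimal sketch writes a negative year as a signed 16-bit value and then reads the same two bytes back with an unsigned format, as ported miniSEED 2.x code might. The little-endian byte order and the example year are assumptions chosen for illustration.

```python
import struct

# Year -500 written by a writer that uses a signed INT16 (byte order assumed).
raw_year = struct.pack("<h", -500)

# A reader aware of the signed type recovers the intended value.
as_signed = struct.unpack("<h", raw_year)[0]

# A reader ported from unsigned 2.x code silently gets a different year.
as_unsigned = struct.unpack("<H", raw_year)[0]

print(as_signed, as_unsigned)  # -500 65036
```

No error is raised in either case, which is what makes this kind of mismatch easy to miss.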
Which "existing standard" are we talking about?
I don't know everything about it, but could we simply use the ISO 8601 standard?
The only things we need to check are:
It can - at least Wikipedia (I didn't check the actual spec) states: "There is no limit on the number of decimal places for the decimal fraction." So a nanosecond-resolution timestamp for the current time would look like this (lengths are without and with the trailing Z):
2018-01-11T11:26:52.829308846Z (29 or 30 bytes)
20180111T112652.829308846Z (25 or 26 bytes)
This is probably too big (and it would involve parsing strings, which is pretty expensive). Now we could play some tricks: remove all punctuation and use only 4 bits to represent each remaining character (only the digits 0-9 are left, so 4 bits each are enough), and this would get the size down to about 11 bytes.
But at that point we could just use the struct proposed by @chad-iris which should be easier to parse and also takes 11 bytes.
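The size estimates above can be checked with a short sketch. The pack_bcd helper is hypothetical and only illustrates the "4 bits per digit" trick; the 0xF padding nibble is an arbitrary choice made here, not something proposed in the thread.

```python
extended = "2018-01-11T11:26:52.829308846Z"
basic = "20180111T112652.829308846Z"


def pack_bcd(text):
    """Drop all non-digits and pack the digits two per byte (hypothetical)."""
    digits = [int(c) for c in text if c.isdigit()]
    if len(digits) % 2:
        digits.append(0xF)  # pad the last half-filled byte
    return bytes((hi << 4) | lo for hi, lo in zip(digits[0::2], digits[1::2]))


n_digits = sum(c.isdigit() for c in basic)
packed = pack_bcd(basic)

print(len(extended), len(basic))  # 30 26
print(n_digits, len(packed))      # 23 12
```

The 23 digits need 92 bits, i.e. 11.5 bytes, so the packed form occupies 12 whole bytes, which is in the same range as the 11-byte binary struct but still needs digit-by-digit decoding.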
What would be useful independent of the chosen representation: to be able to not specify an absolute start time - synthetic data, for example, does not necessarily have one.
With reference to the Year range above, perhaps the range 1 - 65535 may be better. This avoids the "what is year zero?" question and (my bias showing here) the golang time definition has:
A Time represents an instant in time with nanosecond precision. .... The zero value of type Time is January 1, year 1, 00:00:00.000000000 UTC.
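For comparison, Python's standard datetime module also starts its range at year 1 rather than year 0. The short sketch below only illustrates that a minimum year of 1 is a common convention; it is not part of the proposal.

```python
import datetime

print(datetime.MINYEAR)       # 1
print(datetime.datetime.min)  # 0001-01-01 00:00:00

# Year 0 is outside the supported range and is rejected.
try:
    datetime.datetime(0, 1, 1)
    year_zero_accepted = True
except ValueError:
    year_zero_accepted = False

print(year_zero_accepted)  # False
```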
The string parsing is likely too expensive for perhaps the most commonly used header field. Keeping the time binary I feel is important.
(Please let me know if I missed a point or misunderstood something)
Seems to be very clear cut - a format that can represent leap seconds is mandatory in any case, so we are voting here on the time representation. There are two choices (both with nanosecond precision):
Year (0 - 65535) UINT16
Day-of-year (1 - 366) UINT16
Hour (0 - 23) UINT8
Minute (0 - 59) UINT8
Second (0 - 60) UINT8
Nanosecond (0 - 999999999) UINT32
or an ISO 8601 string such as 2018-01-11T11:26:52.829308846Z.
Please vote on which you would prefer. (Binary struct / ISO 8601 string)
Binary struct
Binary struct.
@kaestli
Binary struct, however with year=int32 (for planetary modelling)
Why does "planetary modeling" require a year type of int32 and how is it relevant for the FDSN's time series data format?
The method of specifying time should use an existing standard that can represent leap seconds.