CWFMF / wx_specification

Specification for weather data format to be used within system API

Summarize time specifications used in existing service implementations #9

Open jordan-evens opened 12 months ago

jordan-evens commented 12 months ago

Please provide a run-down on what you were talking about as far as options for how to represent timestamps for data.

Links to relevant standards would be appreciated if available.

erick-ouellette commented 12 months ago

Summary

The OGC WMS/WFS standards, the OGC API Features core spec, and even the XML/GML schemas set a precedent for using the RFC 3339 / ISO 8601 time format in the following ways.

Time can be specified in the following combinations:

OGC WMS/WFS

https://www.ogc.org/standard/wms/

The WMS interface standard uses this time pattern. In a WMS GetCapabilities XML response, a layer may declare time dimensions, which the calling WMS client can then request through the &TIME= URL argument.

The XML capabilities look like the following example. The time dimension declares the range or list of times available for the layer product. The reference_time dimension is a list of one or more times that those times are based on, which is often the product's model issue/run time.

<Dimension name="time" units="ISO8601" default="2023-10-11T18:00:00Z" nearestValue="0">2023-10-11T12:00:00Z/2023-10-13T12:00:00Z/PT1H</Dimension>
<Dimension name="reference_time" units="ISO8601" default="2023-10-11T12:00:00Z" multipleValues="1" nearestValue="0">2023-10-11T06:00:00Z,2023-10-11T12:00:00Z</Dimension>
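
As a rough illustration of how a client might turn the start/end/period extent above into individual time steps, here is a minimal sketch. The function names periodToMs and expandTimeExtent are hypothetical, and the sketch assumes the period only uses day, hour, minute and second components.

```javascript
// Sketch: expand a WMS time extent "start/end/period" into epoch milliseconds.
// Assumption: the period only uses D, H, M, S components (no years or months).
function periodToMs(period) {
  var m = /^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$/.exec(period);
  if (!m) {
    throw new Error('unsupported period: ' + period);
  }
  var days = Number(m[1] || 0);
  var hours = Number(m[2] || 0);
  var minutes = Number(m[3] || 0);
  var seconds = Number(m[4] || 0);
  return (((days * 24 + hours) * 60 + minutes) * 60 + seconds) * 1000;
}

function expandTimeExtent(extent) {
  var parts = extent.split('/');
  if (parts.length !== 3) {
    throw new Error('expected start/end/period: ' + extent);
  }
  var start = Date.parse(parts[0]);
  var end = Date.parse(parts[1]);
  var step = periodToMs(parts[2]);
  if (isNaN(start) || isNaN(end) || step <= 0) {
    throw new Error('invalid extent: ' + extent);
  }
  var times = [];
  for (var t = start; t <= end; t += step) {
    times.push(t); // both endpoints are included
  }
  return times;
}

var times = expandTimeExtent('2023-10-11T12:00:00Z/2023-10-13T12:00:00Z/PT1H');
console.log(times.length); // 49
console.log(new Date(times[6]).toISOString()); // 2023-10-11T18:00:00.000Z
```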

In the layer request URL the user would specify something like &TIME=2023-10-25T12:00:00Z. If they do not specify &DIM_REFERENCE_TIME= (which, yes, is named differently from what is advertised in the capabilities, I know), then the default reference time is used. If it is specified, the same &TIME can return two very different-looking products for different &DIM_REFERENCE_TIME values.

For the &TIME= URL argument, the standard does allow the use of lists and ranges as shown in the capabilities, but last I checked there were few to no real-world examples of that. I suspect it is more applicable to WCS, which is a way to get multiple grids of cell data.

OGC OpenAPI

https://docs.ogc.org/is/17-069r3/17-069r3.html#_parameter_datetime

https://datatracker.ietf.org/doc/html/rfc3339#section-5.6

The OGC API Features core standard also allows the range and list formats in its GeoJSON responses. The parameter name used here seems to be datetime, but otherwise it's the same. I'm also tickled to see that there's an open-ended qualifier, .., which would be nice if carried over into the WMS world and would help when layers like observations don't have clearly defined end times. It would save some client implementations from having to refresh the capabilities.
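
To make the open-ended qualifier concrete, here is a minimal sketch of parsing a datetime parameter value into a start and end, where .. (or an empty string) marks an unbounded end. The names parseDatetimeParam and inRange are hypothetical, and an instant is represented as a range whose start and end are equal.

```javascript
// Sketch: parse an OGC API "datetime" value into {start, end} epoch milliseconds.
// A null bound means that side of the interval is open ("..").
function parseDatetimeParam(value) {
  function bound(s) {
    if (s === '..' || s === '') {
      return null; // open end
    }
    var ms = Date.parse(s);
    if (isNaN(ms)) {
      throw new Error('invalid date-time: ' + s);
    }
    return ms;
  }
  var parts = value.split('/');
  if (parts.length === 1) {
    var instant = bound(parts[0]);
    if (instant === null) {
      throw new Error('a single value must be a date-time');
    }
    return {start: instant, end: instant};
  }
  if (parts.length !== 2) {
    throw new Error('expected an instant or start/end: ' + value);
  }
  var start = bound(parts[0]);
  var end = bound(parts[1]);
  if (start === null && end === null) {
    throw new Error('at least one end of the interval must be bounded');
  }
  return {start: start, end: end};
}

// True when the timestamp falls inside the (possibly half-open) range.
function inRange(range, epochMs) {
  return (range.start === null || epochMs >= range.start) &&
         (range.end === null || epochMs <= range.end);
}

var observed = parseDatetimeParam('2023-10-11T12:00:00Z/..');
console.log(observed.end); // null
console.log(inRange(observed, Date.parse('2024-01-01T00:00:00Z'))); // true
```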

XML/GML

There are also XML schema definitions for xs:dateTime and xs:duration (aka interval), which follow the same basic RFC 3339 / ISO 8601 time formatting.

Programming Notes

Most languages (except maybe javascript) should have libraries to conveniently convert a time string to epoch, str2time(), and back again, time2str(). You will likely have to write a short function to parse the interval into an epoch offset for your iterators. Take note that you cannot iterate month and year intervals using a fixed epoch offset, because months and years vary in length. Those need special per-day iterators that test whether the month changes. This is rare.
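
As an alternative to a per-day iterator, month steps can also be computed from calendar fields. The sketch below is one possible approach, not something from an existing service: addMonthsUTC is a hypothetical helper that clamps the day to the last day of the target month, and each step is computed from the original start so the clamping does not accumulate.

```javascript
// Sketch: add calendar months to an epoch (milliseconds) in UTC.
// Date.UTC normalizes month overflow, so month 12 becomes January of the next year.
function addMonthsUTC(epochMs, months) {
  var d = new Date(epochMs);
  var year = d.getUTCFullYear();
  var month = d.getUTCMonth() + months;
  // Day 0 of the following month is the last day of the target month.
  var lastDay = new Date(Date.UTC(year, month + 1, 0)).getUTCDate();
  var day = Math.min(d.getUTCDate(), lastDay);
  return Date.UTC(year, month, day,
    d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds(), d.getUTCMilliseconds());
}

// Iterate P1M from a start time, always stepping from the start itself.
var start = Date.parse('2023-01-31T00:00:00Z');
var steps = [];
for (var i = 0; i < 4; i++) {
  steps.push(new Date(addMonthsUTC(start, i)).toISOString().slice(0, 10));
}
console.log(steps.join(',')); // 2023-01-31,2023-02-28,2023-03-31,2023-04-30
```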

Time formatting from epoch to string is most conveniently done with some sort of strftime/sprintf-style formatting. The most concise format is "%FT%TZ", which is equivalent to the longer version people tend to write, "%Y-%m-%dT%H:%M:%SZ"; the longer version is more prone to typos. Javascript has no native strftime, unlike most other languages.

I know someone in this work-group uses javascript. So that javascript can be used like the other languages, here is an implementation for converting and manipulating time strings and epoch values.
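
For the specific "%FT%TZ" case, the built-in Date.prototype.toISOString() gets most of the way there: it always returns UTC with a three-digit milliseconds field, which can be stripped. A minimal sketch, with formatIsoSeconds as a hypothetical name and assuming four-digit years:

```javascript
// Sketch: format epoch milliseconds as YYYY-MM-DDTHH:MM:SSZ using the built-in
// ISO formatter, then drop the ".sss" milliseconds field.
function formatIsoSeconds(epochMs) {
  return new Date(epochMs).toISOString().replace(/\.\d{3}Z$/, 'Z');
}

// Months are zero-based in Date.UTC, so 9 is October.
console.log(formatIsoSeconds(Date.UTC(2023, 9, 11, 18, 0, 0))); // 2023-10-11T18:00:00Z
```

This does not cover the other formats a general strftime would handle.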

/* DATE TIME */
/* NOTE: Javascript epoch is in milliseconds, rather than seconds */
time=function() {
  var now=0;
  var date = new Date();
  now = date.getTime();
  return now;
};

epoch2Date=function(epoch) {
  var date = new Date();
  date.setTime(epoch);
  return date;
};

Date2struct=function(date) {
  var yr = date.getUTCFullYear()+'';
  var mo = (date.getUTCMonth()+1)+'';
  if (mo.length == 1) {
    mo = "0"+mo;
  }
  var da = date.getUTCDate()+'';
  if (da.length == 1) {
    da = "0"+da;
  }
  var hr = date.getUTCHours()+'';
  if (hr.length == 1) {
    hr = "0"+hr;
  }
  var mi = date.getUTCMinutes()+'';
  if (mi.length == 1) {
    mi = "0"+mi;
  }
  var se = date.getUTCSeconds()+'';
  if (se.length == 1) {
    se = "0"+se; 
  }
  var ms = date.getUTCMilliseconds()+'';
  if (ms.length == 1) {
    ms = "00"+ms;
  } else if (ms.length == 2) {
    ms = "0"+ms;
  }

  return {'yr':yr, 'mo':mo, 'da':da, 'hr':hr, 'mi':mi, 'se':se, 'ms':ms};
};

struct2Date=function(struct) {
  // Build the date in one step. Setting fields one at a time on new Date()
  // can roll over, e.g. setting month 02 while today's day of month is 31.
  return new Date(Date.UTC(
    parseInt(struct.yr,10),
    parseInt(struct.mo,10)-1,
    parseInt(struct.da,10),
    parseInt(struct.hr,10),
    parseInt(struct.mi,10),
    parseInt(struct.se,10),
    parseInt(struct.ms,10)
  ));
};

Date2epoch=function(date) {
  var e = parseInt(date.getTime(),10);
  return e||0;
};

iso2epoch=function(iso){
  if (!iso) { 
    return 0; 
  }
  // first, let the javascript Date library have a crack at it
  // Known: Date doesn't handle YYYY. 
  // Otherwise, most requests will fall into here. And Date does a very good consistent job
  var epoch;
  if (iso.length > 4) {
    var date = new Date(iso);
    epoch = date.getTime();
    // an unparseable string gives an Invalid Date, whose getTime() is NaN
    if (!isNaN(epoch)) {
      return epoch;
    }
  } else if (iso.length == 4) {
    var s = {
      'yr':iso,
      'mo':'01',
      'da':'01',
      'hr':'00',
      'mi':'00',
      'se':'00',
      'ms':'000',
    };
    epoch = Date2epoch(struct2Date(s));
    return epoch;
  } 
  return 0;
}

// figure out the time string format from an iso time string
iso2fmt=function(iso){
  if (!iso) { 
    return ''; 
  }
  var fmt = iso;
  fmt = fmt.replace(/^\d{4}-\d{2}-\d{2}/,'%F');
  fmt = fmt.replace(/^\d{4}-\d{2}/,'%Y-%m');
  fmt = fmt.replace(/^\d{4}/,'%Y');
  fmt = fmt.replace(/\d{2}:\d{2}:\d{2}/,'%T');
  fmt = fmt.replace(/\d{2}:\d{2}/,'%H:%M');
  fmt = fmt.replace(/\.\d{3}/,'.%f');
  return fmt;
}

// js doesn't have posix strftime. This is an 'our' strftime, but in UTC(Z) only.
strftime=function(fmt,e){
  var s = Date2struct(epoch2Date(e));
  var tstr = fmt;
  tstr = tstr.replace('%F',[s.yr,s.mo,s.da].join('-'));
  tstr = tstr.replace('%Y',s.yr);
  tstr = tstr.replace('%m',s.mo);
  tstr = tstr.replace('%d',s.da);
  tstr = tstr.replace('%T',[s.hr,s.mi,s.se].join(':'));
  tstr = tstr.replace('%H',s.hr);
  tstr = tstr.replace('%M',s.mi);
  tstr = tstr.replace('%S',s.se);
  tstr = tstr.replace('%f',s.ms);
  return tstr;
}

format_iso8601_time=function(epoch){
  return strftime("%FT%TZ",epoch);
}

function parseISO8601interval(string){
  // Expects only the duration part of a range, e.g. "PT1H" or "P1DT12H".
  // Capture the whole run of components: a repeated capture group such as
  // (\d*[A-SU-Z])* keeps only its last repetition.
  var P = string.match(/P((?:\d+[YMD])*)/);
  var T = string.match(/T((?:\d+[HMS])*)/);

  var time = 0;       // milliseconds, excluding any years and months
  var exception = ''; // 'Y' or 'M' when a calendar unit was skipped

  if( P && P[1] ){
    var dateParts = P[1].match(/\d+[YMD]/g);
    dateParts.forEach( function(a){
      var d = a.match(/\d+/);
      var w = a.match(/\D/);
      var t = 0;
      switch(w[0]){
        case 'Y':
          t = 0; // can't effectively figure epoch per year
          exception = 'Y';
          break;
        case 'M':
          t = 0; // can't effectively figure epoch per month
          exception = 'M';
          break;
        case 'D':
          t = 24*3600;
          break;
      }
      time += d[0]*t*1000;
    });
  }

  if( T && T[1] ){
    var timeParts = T[1].match(/\d+[HMS]/g);
    timeParts.forEach( function(a){
      var d = a.match(/\d+/);
      var w = a.match(/[HMS]/);
      var t = 0;
      switch(w[0]){
        case 'H':
          t = 3600;
          break;
        case 'M':
          t = 60;
          break;
        case 'S':
          t = 1;
          break;
      }
      time += d[0]*t*1000;
    });
  }
  return {time: time, exception: exception};
}

jordan-evens commented 10 months ago

I got partway through implementing this before I realized that just because other services let you specify independent slices of time with commas doesn't mean it makes any sense for us to do it. Fire weather relies on a continuous stream of weather and the startup indices - it doesn't make any sense to have gaps in the middle. We also still need to know the reference time for the model, so we'd need a reference-time and a datetime field, versus the 3 fields in the other format.

        "idps": {
            "reference-time": "2007-06-30T12:00:00Z",
            "datetime": "2007-07-01T00:00:00Z/2007-07-02T09:00:00Z/PT3H",
            "time": {
                "units": "PT1H",
                "since": "2007-06-30T12:00:00Z",
                "values": [12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45]
            }

The other thing I couldn't figure out is how to limit daily weather format/interval if we're using the other format. With the old structure it would be:

            "time": {
                "units": "P1D",
                "since": "2007-06-30",
                "values": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
            }

and you can enforce that the units field is P1D and the since field is just a date and not a time using the json schema. For the other format, the best I can see is something like:

            "reference-time": "2007-06-30T12:00:00Z",
            "datetime": "2007-06-30T12:00:00Z/2007-07-11T12:00:00Z/P1D",

But that doesn't really make it obvious that it's daily weather, and I don't see a way to enforce that the interval is P1D, because the datetime field is just a string whereas the units field is a duration.
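
One possible workaround, offered only as a sketch and not something settled in this thread: since the datetime field is a string, a regular expression could require that it ends in /P1D. JSON Schema's pattern keyword takes regular expressions in the same dialect as javascript, so a pattern like the one below could in principle be reused there. The name DAILY_DATETIME and the exact pattern are assumptions, and this still does not make the daily format any more obvious to a reader.

```javascript
// Sketch: check that a datetime range string has the form
// <UTC date-time>/<UTC date-time>/P1D. This checks shape only, not calendar validity.
var DAILY_DATETIME =
  /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\/P1D$/;

function isDailyDatetime(value) {
  return DAILY_DATETIME.test(value);
}

console.log(isDailyDatetime('2007-06-30T12:00:00Z/2007-07-11T12:00:00Z/P1D'));  // true
console.log(isDailyDatetime('2007-07-01T00:00:00Z/2007-07-02T09:00:00Z/PT3H')); // false
```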

I pushed a change that has the reference-time and datetime in there and compares them to the other fields in the example, but I'm seeing more drawbacks than benefits to this format at this point.
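
For illustration, a comparison between the two representations could look roughly like the sketch below. This is not the code from the pushed change; the helper names are hypothetical and durations are assumed to use only day, hour, minute and second components.

```javascript
// Sketch: check that a "datetime" range and a since/units/values block
// describe the same timestamps.
function durationToMs(duration) {
  var m = /^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$/.exec(duration);
  if (!m) {
    throw new Error('unsupported duration: ' + duration);
  }
  var hours = Number(m[1] || 0) * 24 + Number(m[2] || 0);
  return ((hours * 60 + Number(m[3] || 0)) * 60 + Number(m[4] || 0)) * 1000;
}

// Offsets are multiples of "units" counted from "since".
function offsetsToEpochs(time) {
  var since = Date.parse(time.since);
  var unit = durationToMs(time.units);
  return time.values.map(function (v) { return since + v * unit; });
}

// Expand "start/end/interval" into every step, endpoints included.
function rangeToEpochs(datetime) {
  var parts = datetime.split('/');
  if (parts.length !== 3) {
    throw new Error('expected start/end/interval: ' + datetime);
  }
  var start = Date.parse(parts[0]);
  var end = Date.parse(parts[1]);
  var step = durationToMs(parts[2]);
  if (isNaN(start) || isNaN(end) || step <= 0) {
    throw new Error('invalid range: ' + datetime);
  }
  var out = [];
  for (var t = start; t <= end; t += step) {
    out.push(t);
  }
  return out;
}

function sameTimes(a, b) {
  return a.length === b.length && a.every(function (t, i) { return t === b[i]; });
}

var idps = {
  'datetime': '2007-07-01T00:00:00Z/2007-07-02T09:00:00Z/PT3H',
  'time': {
    'units': 'PT1H',
    'since': '2007-06-30T12:00:00Z',
    'values': [12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45]
  }
};
var consistent = sameTimes(offsetsToEpochs(idps.time), rangeToEpochs(idps.datetime));
console.log(consistent); // true
```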