NOAA-OWP / wres

Code and scripts for the Water Resources Evaluation Service

As a user, I want to evaluate streamflow data from the NWIS daily values service #258

Open epag opened 3 weeks ago

epag commented 3 weeks ago

Author Name: James (James) Original Redmine Issue: 91342, https://vlab.noaa.gov/redmine/issues/91342 Original Date: 2021-04-27


Given an evaluation that requires streamflow data from the NWIS Daily Values web service
When I declare that evaluation in WRES
Then I should be able to choose the NWIS Daily Values web service and have that evaluation succeed

( So that I can complete HEFS baseline validations without downloading and reformatting the data myself, off to the side. )


Redmine related issue(s): 96660, 99415, 101224


epag commented 3 weeks ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-04-27T19:59:52Z


Priority TBD. The HEFS folks need this. Their current workaround is to read and reformat the data on the side and supply the result to WRES, which is error-prone.

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T20:12:14Z


Take your run-of-the-mill instantaneous value url such as @https://nwis.waterservices.usgs.gov/nwis/iv?endDT=2021-04-06T12%3A00%3A00Z&format=json&parameterCd=00060&sites=01379500&startDT=2021-01-01T00%3A00%3A01Z@ and try to run it against the daily value service with a one letter change, like @https://nwis.waterservices.usgs.gov/nwis/dv?endDT=2021-04-06T12%3A00%3A00Z&format=json&parameterCd=00060&sites=01379500&startDT=2021-01-01T00%3A00%3A01Z@:

Response: HTTP Status 400 - illegal date-time input for dv-service startDT - time zone not allowed, server=[nadww01]

(And the status is 400, apart from the text shown in the body, so at least that's consistent)

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T20:13:18Z


Remove the zone, try this instead: @https://nwis.waterservices.usgs.gov/nwis/dv?endDT=2021-04-06T12%3A00%3A00&format=json&parameterCd=00060&sites=01379500&startDT=2021-01-01T00%3A00%3A01@

HTTP Status 400 - illegal date-time input for dv-service startDT - hours or minutes not allowed, server=[nadww01]

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T20:14:19Z


Ok, remove the times, @https://nwis.waterservices.usgs.gov/nwis/dv?endDT=2021-04-06&format=json&parameterCd=00060&sites=01379500&startDT=2021-01-01@ now we get a response.
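
The three attempts above suggest that an iv request can be turned into a dv request by switching the endpoint and trimming `startDT`/`endDT` to calendar dates. A minimal sketch of that rewrite, assuming only what the responses above showed (the function name `iv_url_to_dv_url` is hypothetical and is not part of WRES):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def iv_url_to_dv_url(iv_url: str) -> str:
    """Rewrite an NWIS iv request URL as a dv request URL with date-only bounds."""
    parts = urlsplit(iv_url)

    # Switch the endpoint from .../nwis/iv to .../nwis/dv.
    path = parts.path.rstrip("/")
    if not path.endswith("/iv"):
        raise ValueError(f"Not an instantaneous values URL: {iv_url}")
    dv_path = path[: -len("iv")] + "dv"

    params = []
    for key, value in parse_qsl(parts.query):
        if key in ("startDT", "endDT"):
            # Keep only the calendar date: the dv service rejected both
            # the time zone and the hours/minutes in the requests above.
            value = value.split("T", 1)[0]
        params.append((key, value))

    return urlunsplit(
        (parts.scheme, parts.netloc, dv_path, urlencode(params), parts.fragment)
    )
```

Trimming the time of day changes what the bounds mean, so a caller would still need to decide whether whole days are an acceptable reading of the original request.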

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T20:17:06Z


Does that url work with the instantaneous value API? Kind of, but look what happens to the dates. @https://nwis.waterservices.usgs.gov/nwis/iv?endDT=2021-04-06&format=json&parameterCd=00060&sites=01379500&startDT=2021-01-01@

    "queryInfo": {
      "queryURL": "http://nwis.waterservices.usgs.gov/nwis/ivendDT=2021-04-06&format=json&parameterCd=00060&sites=01379500&startDT=2021-01-01",
      "criteria": {
        "locationParam": "[ALL:01379500]",
        "variableParam": "[00060]",
        "timeParam": {
          "beginDateTime": "2021-01-01T00:00:00.000",
          "endDateTime": "2021-04-06T23:59:59.000"
        },
        "parameter": []
      },
      "note": [
        {
          "value": "[ALL:01379500]",
          "title": "filter:sites"
        },
        {
          "value": "[mode=RANGE, modifiedSince=null] interval={INTERVAL[2021-01-01T00:00:00.000-05:00/2021-04-06T23:59:59.000Z]}",
          "title": "filter:timeRange"
        },

It went to @INTERVAL[2021-01-01T00:00:00.000-05:00/2021-04-06T23:59:59.000Z]@

I don't see how that date range can possibly make sense, other than that it expands out to ensure whatever I meant by "2021-01-01" and "2021-04-06" it will capture it.
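
To make the mixed offsets in that interval easier to see, both endpoints can be normalised to UTC. This is only an illustrative sketch: the helper name `parse_interval` is made up, and the format string assumes the exact layout shown in the `filter:timeRange` note.

```python
from datetime import datetime, timezone


def parse_interval(interval: str):
    """Split a 'start/end' interval and return both instants in UTC."""
    start_text, end_text = interval.split("/")
    # %z accepts both a numeric offset such as -05:00 and a literal Z.
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    start = datetime.strptime(start_text, fmt).astimezone(timezone.utc)
    end = datetime.strptime(end_text, fmt).astimezone(timezone.utc)
    return start, end


start, end = parse_interval(
    "2021-01-01T00:00:00.000-05:00/2021-04-06T23:59:59.000Z"
)
print(start.isoformat())  # 2021-01-01T05:00:00+00:00
print(end.isoformat())    # 2021-04-06T23:59:59+00:00
```

The start is midnight at a -05:00 offset while the end is the last second of the day in UTC, so the two bounds are not expressed in the same zone.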

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T20:36:33Z


I don't think it will be a huge amount of work.

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T20:37:36Z


I haven't checked whether the response body is the same, though. My estimate could go down if it's the same format, or up if it's a different format.

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T20:42:02Z


While waiting for an opportunity to deploy, may as well work on this.

epag commented 3 weeks ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-04-27T21:27:45Z


Will have to look back through some of the history - perhaps #62751 was hyperbole, I don't know, but I do recall some significant smells with the daily values service. Otoh, I'm not sure they are smells that we can mitigate, so I guess we offer access and, from there, users will need to do their own due diligence.

epag commented 3 weeks ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-04-27T21:32:32Z


One of the main smells that I recall is that the service seemed to offer all sorts of stuff other than "daily values", including instantaneous values.

epag commented 3 weeks ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-04-27T21:34:25Z


Jesse wrote:

look what happens to the dates.

See #60265-18.

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T21:57:33Z


Looks like the response body is different:

Caused by: wres.io.reading.PreIngestException: Failed to parse the response body from USGS url https://nwis.waterservices.usgs.gov/nwis/dv?endDT=2012-01-01&format=json&parameterCd=00060&sites=01013500&startDT=2011-01-01
        at wres.io.reading.waterml.WaterMLBasicSource.deserializeInput(WaterMLBasicSource.java:222)
        at wres.io.reading.waterml.WaterMLBasicSource.ingest(WaterMLBasicSource.java:238)
        at wres.io.reading.waterml.WaterMLBasicSource.saveObservation(WaterMLBasicSource.java:133)
        at wres.io.reading.BasicSource.save(BasicSource.java:39)
        at wres.io.concurrency.IngestSaver.execute(IngestSaver.java:502)
        at wres.io.concurrency.IngestSaver.execute(IngestSaver.java:30)
        at wres.io.concurrency.WRESCallable.call(WRESCallable.java:18)
        ... 4 common frames omitted
Caused by: com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "value" (class wres.io.reading.waterml.variable.Option), not marked as ignorable (2 known properties: "optionCode", "name"])
 at [Source: (byte[])"{"name":"ns1:timeSeriesResponseType","declaredType":"org.cuahsi.waterml.TimeSeriesResponseType","scope":"javax.xml.bind.JAXBElement$GlobalScope","value":{"queryInfo":{"queryURL":"http://nwis.waterservices.usgs.gov/nwis/dvendDT=2012-01-01&format=json&parameterCd=00060&sites=01013500&startDT=2011-01-01","criteria":{"locationParam":"[ALL:01013500]","variableParam":"[00060]","timeParam":{"beginDateTime":"2011-01-01T00:00:00.000","endDateTime":"2012-01-01T00:00:00.000"},"parameter":[]},"note":[{"valu"[truncated 29162 bytes]; line: 1, column: 2018] (through reference chain: wres.io.reading.waterml.Response["value"]->wres.io.reading.waterml.ResponseValue["timeSeries"]->java.lang.Object[][0]->wres.io.reading.waterml.timeseries.TimeSeries["variable"]->wres.io.reading.waterml.variable.Variable["options"]->wres.io.reading.waterml.variable.VariableOptions["option"]->java.lang.Object[][0]->wres.io.reading.waterml.variable.Option["value"])
        at com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:61)
        at com.fasterxml.jackson.databind.DeserializationContext.handleUnknownProperty(DeserializationContext.java:987)
        at com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:1974)
        at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1701)
        at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownVanilla(BeanDeserializerBase.java:1679)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:330)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
        at com.fasterxml.jackson.databind.deser.std.ObjectArrayDeserializer.deserialize(ObjectArrayDeserializer.java:214)
        at com.fasterxml.jackson.databind.deser.std.ObjectArrayDeserializer.deserialize(ObjectArrayDeserializer.java:24)
        at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
        at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
        at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
        at com.fasterxml.jackson.databind.deser.std.ObjectArrayDeserializer.deserialize(ObjectArrayDeserializer.java:214)
        at com.fasterxml.jackson.databind.deser.std.ObjectArrayDeserializer.deserialize(ObjectArrayDeserializer.java:24)
        at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
        at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324)
        at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
        at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322)
        at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4593)
        at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3609)
        at wres.io.reading.waterml.WaterMLBasicSource.deserializeInput(WaterMLBasicSource.java:216)
        ... 10 common frames omitted
epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T22:00:14Z


This also could be due to rigid pojo settings on our side.
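
The difference between a rigid and a lenient mapping can be sketched outside Jackson. The Python below is an analogy only, not WRES code: the `Option` dataclass is a hypothetical stand-in for the two-property class named in the trace, and the sample payload is made up from the error message. In Jackson terms, the lenient branch corresponds roughly to `@JsonIgnoreProperties(ignoreUnknown = true)` or to disabling `DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES`.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Option:
    """Hypothetical stand-in for the two-property Option class in the trace."""
    name: Optional[str] = None
    optionCode: Optional[str] = None


def option_from_dict(raw: dict, strict: bool) -> Option:
    """Map a raw JSON object onto Option, strictly or leniently."""
    known = {f.name for f in fields(Option)}
    unknown = set(raw) - known
    if strict and unknown:
        # Mirrors the failure above: an unexpected key aborts the parse.
        raise ValueError(f"Unrecognized field(s): {sorted(unknown)}")
    # Lenient mode: drop keys that the class does not declare.
    return Option(**{k: v for k, v in raw.items() if k in known})


# Made-up payload with the extra "value" key reported in the error message.
raw = json.loads('{"name": "Statistic", "optionCode": "00003", "value": "Mean"}')
lenient = option_from_dict(raw, strict=False)
print(lenient)
```

Ignoring the field would avoid the exception but would also discard whatever `value` carries, so adding the property to the class may be the better fix.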

epag commented 3 weeks ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-04-27T22:01:53Z


Not as straightforward as hoped, but I think still within the 16 hour estimate.

epag commented 3 weeks ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-12-08T23:00:58Z


This comes up periodically. The latest example is #99415. It sounds like we need either a more flexible reader that handles iv and dv together or a new reader for the dv service.

Bumping priority, fwiw.

epag commented 3 weeks ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-12-08T23:10:09Z


Oh, and one thing to bear in mind with the dv service, which may or may not affect the estimate: the dv service is more complicated with respect to time scale, but it sounds like it does explicitly qualify it.

See #60265-12.

For streamflow, a mean daily flow is the most likely (perhaps the only) scenario, but that is not necessarily true for other variables.

edit: for the same reason, the time scale is something that can be requested (@statCd@), so that will need to be reconciled with any @existingTimeScale@ (although such declaration is rare).

epag commented 3 weeks ago

Original Redmine Comment Author Name: Chris (Chris) Original Date: 2021-12-09T13:59:52Z


I haven't tried this on every location, but, for one reason or another, I found out that you could calculate all of the daily values by indeed just coming up with the mean for all values within a day for a location's flow. I think you'll have a tough time finding stage - I've never seen it available for daily.

I played around with the dashboard to see some daily values, and a random selection of about 10 locations only ever showed flow and temperature (not common) for daily values. Within each result set, there's a field at @value/timeSeries/#/variable/options/option/#@. In @iv@, it has a name of @Statistic@, but on @dv@ it'll have a @value@ that'll be @Minimum@, @Mean@, @Maximum@, or @Median@ (which is rare). I haven't seen anything declaring what span of time the statistic was over, probably because it's tribal knowledge: the user explicitly navigated to the daily value service. The @statCd@ is @all@ by default, so you'll get every stat available for the location. This hasn't really been an issue for flow because there's only ever @Mean@. I imagine it may be a different case if temperature (@00010@) were tried. Here's an example url: https://waterservices.usgs.gov/nwis/dv/?format=json&indent=on&sites=14362000&startDT=2021-12-01&endDT=2021-12-07&siteStatus=all
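
A reader could use that option field to pick out the series it wants when several statistics come back. The sketch below walks the path described above; the function names and the sample response are made up for illustration, and the sample only imitates the shape described in this comment rather than a real USGS response.

```python
def statistics_by_series(response: dict) -> list:
    """Return, per time series, the statistic names found in its options."""
    result = []
    for series in response.get("value", {}).get("timeSeries", []):
        options = series.get("variable", {}).get("options", {}).get("option", [])
        # Collect the "value" of each option, skipping options without one.
        result.append([o["value"] for o in options if o.get("value")])
    return result


def select_series(response: dict, statistic: str = "Mean") -> list:
    """Keep only the time series whose options name the wanted statistic."""
    series_list = response.get("value", {}).get("timeSeries", [])
    stats = statistics_by_series(response)
    return [s for s, names in zip(series_list, stats) if statistic in names]


# Made-up response with two series, shaped like the path described above.
sample = {
    "value": {
        "timeSeries": [
            {"name": "series-a",
             "variable": {"options": {"option": [{"value": "Maximum"}]}}},
            {"name": "series-b",
             "variable": {"options": {"option": [{"value": "Mean"}]}}},
        ]
    }
}
print(statistics_by_series(sample))
print([s["name"] for s in select_series(sample)])
```

Filtering after the fact is one option; asking the service for a single statistic up front is another.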

epag commented 3 weeks ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-12-09T14:15:43Z


Yup. There are also some complete examples here (under examples):

https://waterservices.usgs.gov/rest/DV-Service.html

Here is one that explicitly asks for the mean (but you need to know the @statCd@ value, here @00003@):

https://waterservices.usgs.gov/nwis/dv/?format=waterml&stateCd=ri&parameterCd=00060&siteType=ST&statCd=00003
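
Requesting a single statistic by its code can be sketched as a small URL builder. Everything here is illustrative: `dv_url`, `STAT_CODES`, and `DV_ENDPOINT` are hypothetical names, `00003` for the mean comes from the URL above, and the other two codes are assumptions that should be checked against the USGS documentation before use.

```python
from datetime import date
from urllib.parse import urlencode

# 00003 (mean) is taken from the URL above; 00001 and 00002 are assumptions.
STAT_CODES = {"maximum": "00001", "minimum": "00002", "mean": "00003"}
DV_ENDPOINT = "https://waterservices.usgs.gov/nwis/dv/"


def dv_url(site: str, start: date, end: date,
           parameter_cd: str = "00060", statistic: str = "mean") -> str:
    """Build a dv request for one site, one variable and one statistic."""
    query = {
        "format": "json",
        "sites": site,
        # date.isoformat() yields YYYY-MM-DD, the date-only form dv accepts.
        "startDT": start.isoformat(),
        "endDT": end.isoformat(),
        "parameterCd": parameter_cd,
        "statCd": STAT_CODES[statistic],
    }
    return DV_ENDPOINT + "?" + urlencode(query)


print(dv_url("14362000", date(2021, 12, 1), date(2021, 12, 7)))
```

Taking `date` objects rather than strings keeps hours, minutes, and time zones out of the request by construction.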

Right, the time scale duration does seem like tribal knowledge. However, as indicated by the USGS person in #60265-12, it is not totally coherent with the service endpoint because the values could be instantaneous values for some variables, like groundwater. In other words, "daily" can EITHER mean a time scale duration of one day OR it can mean one value per day OR it can mean both!

To be fair, nwis at least makes an effort to qualify the time scale of their data. It still astonishes me how often this information is completely ignored by many time-series data services, else confused/conflated with the timestep of the data. Hey ho.

epag commented 3 weeks ago

Original Redmine Comment Author Name: James (James) Original Date: 2023-01-26T14:59:04Z


Bumping this because it was mentioned again as a priority for HEFS.