ioos / sos-injector-db

Inject stations/observations from an arbitrary database into an IOOS 52 North SOS
The Unlicense
3 stars 5 forks source link

Clarification about get_observations.sql start_date #7

Closed emiliom closed 10 years ago

emiliom commented 10 years ago

The documentation for get_observations.sql say just this regarding start_date: "queried from target SOS". In the latest edit of the GCOOS sample SQL statements, the corresponding WHERE statement looks like a.observationDate > strftime('%Y-%m-%dT%H:%M:%SZ', ?). This is effectively the implementation of start_date.

If sos-injector-db is set up to run, say, hourly on the hour, in principle I would want the start_date criteria to be something like observationDate > NOW - 1 hour. But is the start_date query parameter being passed ("?") essentially the run time (NOW)? If it is, I can set a time-difference interval in that WHERE statement. But if it already incorporates a lag interval, where is that interval specified?

So: what timestamp is being passed to the get_observations.sql query start_date parameter?

Thanks. I hope this is clear!

srstsavage commented 10 years ago

sos-injector-db determines the start_date to pass to the observation harvesting query by examining the target SOS for the most recent observation for that sensor. The start_date is passed to the query as an ISO8601 time string (e.g. 2014-03-25T21:57:00.000Z). So, your query should return all observations greater than the start_date (I should make it clear that observations exactly at the start date shouldn't be included, but the post-processing logic will remove them anyway).

The GCOOS queries have the strange strftime functions because sqlite doesn't support storing true datetimes and instead stores them as strings. We found it was much faster to convert the incoming start_date parameter to a string format matching the sqlite datetime string format (no millis) and just use lexical comparison than to construct datetime objects for all the observations. You shouldn't have this problem in MySQL, and should be able to just construct a datetime value from the incoming start_date parameter and use that to determine the correct observations.

Does this clear things up? The short version is that you should return all observations greater than the start_date parameter without building in your own offset.

emiliom commented 10 years ago

Thanks. I think that clears it up. I'm almost there!