Add new endpoint for timeseries

mrkhan commented 1 year ago

Requirement

- Granularity: day, month, year.
  - Intervals are given as ISO 8601 durations, e.g. P1M.
- Use the same response format and parameters as /stats.
- Multiple hashtags are passed as a comma-separated list.
- Values are not cumulative.

Sample response format :

[{
          "changesets": 65009011,
          "users": 3003842,
          "roads": 45964973.0494135,
          "buildings": 242,
          "edits": 1095091515,
          "latest": "2023-03-20T10:55:38.000Z",
          "hashtag": "*",
          "enddate": "2017-10-01T00:00:00.000Z",
          "startdate": "2017-09-01T00:00:00.000Z"
},{
          "changesets": 65009011,
          "users": 3003842,
          "roads": 45964973.0494135,
          "buildings": 110,
          "edits": 1095091515,
          "latest": "2023-03-20T10:55:38.000Z",
          "hashtag": "*",
          "enddate": "2017-11-01T00:00:00.000Z",
          "startdate": "2017-10-01T00:00:00.000Z"
},…,{
          "changesets": 65009011,
          "users": 3003842,
          "roads": 45964973.0494135,
          "buildings": 844294167,
          "edits": 1095091515,
          "latest": "2023-03-20T10:55:38.000Z",
          "hashtag": "*",
          "enddate": "2018-11-23T12:23:00.000Z",
          "startdate": "2018-11-01T00:00:00.000Z"
}]

TODO:

[ ] decide endpoint name
[ ] decide response format, e.g. map or list

mmerdes commented 1 year ago

@mrkhan and @tyrasd Thanks for drafting the sample response. So this implies a mandatory parameter named granularity?

Hagellach37 commented 1 year ago

The parameter might be better called duration, according to what is defined here: https://en.wikipedia.org/wiki/ISO_8601#Durations

In ISO notation this would be something like P1M for a monthly duration or P1Y for a yearly duration.

Hagellach37 commented 1 year ago

To be precise: the values for roads in the response should have at most three digits after the decimal point.

This assumes that the value in the response is in kilometers and that we get an integer in meters from our database.
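For illustration, a minimal Python sketch of that conversion, under the same assumption that the database returns the road length as an integer number of meters. The function name `meters_to_km` is hypothetical and not part of the service:

```python
def meters_to_km(length_m: int) -> float:
    """Convert an integer length in meters to kilometers.

    An integer number of meters divided by 1000 has at most three
    decimal places; round() guards against floating-point noise.
    """
    return round(length_m / 1000, 3)


if __name__ == "__main__":
    # Hypothetical input: a road length of 1500 m.
    print(meters_to_km(1500))
```

Because the input is a whole number of meters, no precision below one meter is ever reported.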

mrkhan commented 1 year ago

Yes, granularity/duration is mandatory. However, I am a bit unsure how a request like this should be handled: startDate=2010-01-15&endDate=2010-03-25&duration=P1M

In this example, the start and end dates do not fall on exact month boundaries.

koebi commented 1 year ago

Here are my thoughts on a granularity parameter:

The name should explain what it does. I think granularity is quite clear.
Using ISO durations as a format gives the impression of free choice. In fact, there are only three options.
In the backend, this should probably be an enum, so the API parameter should be a string enum as well?
It should either be mandatory or have a default.

As for the handling of what @mrkhan described, I'd suggest respecting the start and end dates and binning data on "natural" granularity breaks, i.e. midnight for daily, the 1st of the month for monthly, and January 1st for yearly.

In the case of startDate=2010-01-15&endDate=2010-03-25&duration=P1M that'd lead to three bins:

[{
    "changesets": …,
    "startdate": "2010-01-15T00:00:00Z",
    "enddate": "2010-02-01T00:00:00Z"
},
{
    "changesets": …,
    "startdate": "2010-02-01T00:00:00Z",
    "enddate": "2010-03-01T00:00:00Z"
},
{
    "changesets": …,
    "startdate": "2010-03-01T00:00:00Z",
    "enddate": "2010-03-25T00:00:00Z"
}]

EDIT: Having an ISO duration would (from my POV) mean having to adhere to ISO-specs regarding duration, and I'm not sure whether that's what we'd want?
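The binning on natural breaks described above can be sketched in Python as follows. This is only an illustration of the idea, not the service's implementation; the function names and the granularity labels `daily`, `monthly`, `yearly` are assumptions:

```python
from datetime import datetime, timedelta, timezone


def next_break(ts: datetime, granularity: str) -> datetime:
    """Return the first natural break strictly after ts."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if granularity == "daily":
        return midnight + timedelta(days=1)
    if granularity == "monthly":
        if ts.month == 12:
            return midnight.replace(year=ts.year + 1, month=1, day=1)
        return midnight.replace(month=ts.month + 1, day=1)
    if granularity == "yearly":
        return midnight.replace(year=ts.year + 1, month=1, day=1)
    raise ValueError(f"unknown granularity: {granularity!r}")


def bins(start: datetime, end: datetime, granularity: str):
    """Split [start, end) into bins that end on natural breaks.

    The first and last bins are cut at start and end, so they may be
    shorter than a full day, month, or year.
    """
    result = []
    current = start
    while current < end:
        upper = min(next_break(current, granularity), end)
        result.append((current, upper))
        current = upper
    return result


if __name__ == "__main__":
    start = datetime(2010, 1, 15, tzinfo=timezone.utc)
    end = datetime(2010, 3, 25, tzinfo=timezone.utc)
    for lower, upper in bins(start, end, "monthly"):
        print(lower.isoformat(), upper.isoformat())
```

For the request from the example, this yields the same three bins as listed above: two partial months at the edges and one full month in between.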

mrkhan commented 1 year ago

I like the natural granularity breaks: simple, and they make sense. The ohsome Dashboard (frontend) has 6 granularities; shall we support them too?

    {label: 'hourly', value: 'PT1H'},
    {label: 'daily', value: 'P1D'},
    {label: 'weekly', value: 'P1W'},
    {label: 'monthly', value: 'P1M'},
    {label: 'quarterly', value: 'P3M'},
    {label: 'yearly', value: 'P1Y'}

I'd suggest we stick to the standard durations rather than defining our own.
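One way to combine both points (standard ISO duration values, but a closed set of options) is a string enum whose values are the six durations listed above. A hedged Python sketch; the names `Granularity` and `parse_granularity` are hypothetical:

```python
from enum import Enum


class Granularity(Enum):
    """The six dashboard granularities, keyed by their ISO 8601 duration."""
    HOURLY = "PT1H"
    DAILY = "P1D"
    WEEKLY = "P1W"
    MONTHLY = "P1M"
    QUARTERLY = "P3M"
    YEARLY = "P1Y"


def parse_granularity(value: str) -> Granularity:
    """Map a request parameter to a Granularity, rejecting anything else."""
    try:
        return Granularity(value)
    except ValueError:
        allowed = ", ".join(g.value for g in Granularity)
        raise ValueError(
            f"unsupported interval {value!r}; expected one of: {allowed}"
        ) from None


if __name__ == "__main__":
    print(parse_granularity("P1M"))
```

With this approach a value such as P2M is valid ISO 8601 but is still rejected, which makes the limited choice explicit to API users.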

rtroilo commented 1 year ago

What about returning a bit more structure and metadata instead of "just" an array of data? Something similar to the ohsome-api, maybe including some metadata about the request itself, e.g.:

{
  "attribution" : {
    "url" : "https://ohsome.org/copyrights",
    "text" : "© OpenStreetMap contributors"
  },
  "apiVersion" : "1.9.0", -- some details about the api
  "metadata": {
    "executionTime": 1943,
    "requestUrl": "https://stats.ohsome.org/..."
  },
  "query": {
     "timespan": {
         "startDate": "2010-01-15",
         "endDate": "2010-03-25",
         "interval": "P1M" -- granularity/duration or whatever fits better
      },
     "hashtag": "*"
  },
  "latest": "2023-03-20T10:55:38.000Z", -- what does this latest mean?
  "result" : [{
          "changesets": 65009011,
          "users": 3003842,
          "roads": 45964973.0494135,
          "buildings": 844294167,
          "edits": 1095091515,
          "enddate": "2018-11-23T12:23:00.000Z",
          "startdate": "2018-11-01T00:00:00.000Z"
  }, ...],
  "error": { -- in case of an invalid query or similar
     ...
  }
}
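As a rough sketch of how such an envelope could be assembled around the existing result array, here is a small Python example. The function `build_response` and its parameters are hypothetical; the field names follow the proposal above, and the `latest` and `error` fields are left out because their semantics are still open:

```python
import json
from typing import Any


def build_response(
    result: list[dict[str, Any]],
    start_date: str,
    end_date: str,
    interval: str,
    hashtag: str,
    request_url: str,
    execution_time: int,
    api_version: str,
) -> dict[str, Any]:
    """Wrap the per-interval results in the proposed metadata envelope."""
    return {
        "attribution": {
            "url": "https://ohsome.org/copyrights",
            "text": "© OpenStreetMap contributors",
        },
        "apiVersion": api_version,
        "metadata": {
            "executionTime": execution_time,
            "requestUrl": request_url,
        },
        "query": {
            "timespan": {
                "startDate": start_date,
                "endDate": end_date,
                "interval": interval,
            },
            "hashtag": hashtag,
        },
        "result": result,
    }


if __name__ == "__main__":
    # All argument values below are placeholders for illustration.
    response = build_response(
        result=[],
        start_date="2010-01-15",
        end_date="2010-03-25",
        interval="P1M",
        hashtag="*",
        request_url="https://stats.ohsome.org/...",
        execution_time=0,
        api_version="1.9.0",
    )
    print(json.dumps(response, indent=2, ensure_ascii=False))
```

Keeping the data under a single `result` key means the array format discussed earlier in this thread can stay unchanged inside the envelope.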

mrkhan commented 1 year ago

If this meta info is agreed on, we should add it to the other endpoints too, e.g. /stats, /stats_static, etc.

rtroilo commented 1 year ago

In general yes, but as I understood it, for the moment the /stats endpoint has to return the same structure as HOT's does, so that they don't have to change their code base too much (except for switching to a new endpoint). But I agree: if we settle on some sort of meta info, in the long run all endpoints should return a similar/unified structure.

GIScience / ohsome-now-stats-service

Add new endpoint for timeseries #7