RADAR-base / RADAR-RestApi

RESTful interface to access near real-time data
Apache License 2.0

Volume API crashes with 500 error #113

Closed: mpgxvii closed this issue 6 years ago

mpgxvii commented 6 years ago

The Volume API crashes with a 500 error when the request omits &startTime= and &endTime=, as in this request:

POST {{baseUrl}}/api/aggregate/BioIT-Demo/{{getDemoSubjectDetail.response.body.$.subjectId}}/distinct
  ?timeWindow=TEN_MIN
Authorization: Bearer {{token.response.body.access_token}}
Content-Type: application/json

{
  "sources":[
     {
        "sourceId":"b6f1a0bb-b663-4aa2-a776-3a51d4dc5f86",
        "sourceData":[
          {"name": "EMPATICA_E4_v1_ACCELEROMETER"}
        ]
     }
  ]
}

The front end does not know the effectiveTimeFrame of every source data type, so it cannot always supply startTime and endTime.

nivemaham commented 6 years ago

@mpgxvii and @herkulano The current API changes allow you to query with any of these options. If endTime is not provided, it defaults to the current timestamp, since future data cannot be requested. If startTime is not provided, it is calculated from the default number of windows and the given timeWindow. If timeWindow is not provided, a best-fitting timeWindow is calculated for the provided startTime and endTime. If none of the parameters are provided, the API returns data for a period of 1 year up to the current timestamp with a timeWindow of ONE_WEEK (~52 records). If all parameters are provided, they are respected, but the request is checked against the maximum number of records allowed in a query.
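The defaulting rules described above could be sketched roughly as follows. This is not the RADAR-RestApi implementation: `resolve_query`, `best_fitting_window`, `DEFAULT_WINDOWS` and `MAX_RECORDS` are hypothetical names and values, the "best fitting" criterion is an assumption, and only the timeWindow names mentioned in this thread are included. The case where only endTime is given is not covered in the comment and is treated here like the no-parameter case.

```python
from datetime import datetime, timedelta, timezone
from math import ceil

# Window lengths in seconds. Only the names come from the thread.
WINDOW_SECONDS = {"TEN_SEC": 10, "TEN_MIN": 600, "ONE_DAY": 86_400, "ONE_WEEK": 604_800}
DEFAULT_WINDOWS = 100  # hypothetical default number of windows
MAX_RECORDS = 1000     # hypothetical cap on records per query


def best_fitting_window(start, end):
    """Finest window that keeps the span within MAX_RECORDS (assumed criterion)."""
    span = (end - start).total_seconds()
    for name, seconds in sorted(WINDOW_SECONDS.items(), key=lambda kv: kv[1]):
        if span / seconds <= MAX_RECORDS:
            return name
    return "ONE_WEEK"


def resolve_query(start=None, end=None, window=None, now=None):
    """Fill in missing parameters, then enforce the record limit."""
    now = now or datetime.now(timezone.utc)
    if end is None:
        end = now  # future data cannot be requested
    if start is None and window is None:
        # No start and no window: one year of ONE_WEEK windows (~52 records).
        return end - timedelta(days=365), end, "ONE_WEEK"
    if start is None:
        start = end - timedelta(seconds=WINDOW_SECONDS[window] * DEFAULT_WINDOWS)
    if window is None:
        window = best_fitting_window(start, end)
    records = ceil((end - start).total_seconds() / WINDOW_SECONDS[window])
    if records > MAX_RECORDS:
        raise ValueError(f"{records} records exceed the limit of {MAX_RECORDS}")
    return start, end, window
```

With these assumed values, a request that only sets timeWindow=TEN_MIN (as in the report above) would resolve to the last 100 ten-minute windows instead of failing.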

herkulano commented 6 years ago

We need a call without any startTime or endTime so that we can "frame" the data to its effectiveTimeFrame.

For example, suppose the data covers only 2 weeks and is from 4 years ago. We should show that time frame to the user, not the last year, month, etc. Otherwise users might be misled into thinking there is no data.

The purpose of the volume component is to help users navigate all of the data through time. For that, we need to know the effectiveTimeFrame of all the requested sources.
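As a rough illustration of what "the effectiveTimeFrame of all the requested sources" could mean, the sketch below takes one (start, end) pair per source and returns the span that covers them all. The function name and the tuple representation are assumptions made for this example, not part of the API.

```python
from datetime import datetime


def overall_time_frame(frames):
    """Return (earliest start, latest end) over per-source (start, end) pairs.

    Returns None when no source has any data.
    """
    frames = list(frames)
    if not frames:
        return None
    return min(start for start, _ in frames), max(end for _, end in frames)
```

For the 2-weeks-of-data case, every source would report a frame from 4 years ago, and the volume component could zoom to that span directly.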

nivemaham commented 6 years ago

I understand your point, but this feature conflicts with issue #107. However, I think it could be feasible for aggregated data, since we query for distinct values. We would still need some limit on how many records we allow in a query.

herkulano commented 6 years ago

Could you run a cron job that pre-calculates the count for each day and keeps these values in a separate table?

The function could slowly queue the calculation of the day count and the Volume API would work with this table instead, so then it would get a maximum of 366 days per year.
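A minimal sketch of the per-day pre-aggregation idea, assuming record timestamps are Unix epoch seconds and that days are bucketed in UTC. The function name is hypothetical, and the storage side (the separate table, the cron scheduling) is left out.

```python
from collections import Counter
from datetime import datetime, timezone


def daily_counts(timestamps):
    """Count records per UTC calendar day from epoch-second timestamps."""
    counts = Counter()
    for ts in timestamps:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        counts[day] += 1
    return dict(counts)
```

A scheduled job could run this over each day's new records and upsert the result, so the Volume API reads at most 366 rows per year per series.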

herkulano commented 6 years ago

For the Volume API, I don't think we need a higher resolution than a day.

nivemaham commented 6 years ago

If we decide that the timeWindow of the query is ONE_DAY, then what we have now should work. It would not work for a finer window such as TEN_SEC, though.

herkulano commented 6 years ago

Cool! I think we shouldn't need more than ONE_DAY for this purpose.

/cc @mpgxvii @afolarin

nivemaham commented 6 years ago

Yes, exactly. Do you still want to query the whole table at ONE_DAY resolution? I thought the UI would allow some navigation, which would produce the respective queries for older or later data. Keep in mind that a complete database can give you 366 * numberOfSensors * numberOfSources records in the current response. If we remove this limitation, the response will be huge. So I recommend keeping one year at one-day resolution as the default. We can change the default resolution to ONE_DAY as you suggested.
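To make the size estimate concrete, here is the arithmetic as a small function, reading the expression above as the product of 366 days, the number of sensors and the number of sources. The function name and the example figures are illustrative only.

```python
def max_records_per_year(number_of_sensors, number_of_sources, days=366):
    """Upper bound on ONE_DAY records returned for one year of data."""
    return days * number_of_sensors * number_of_sources
```

For instance, 5 sensors on each of 10 sources gives 366 * 5 * 10 = 18300 records for a single year, and the figure grows linearly with every additional year queried.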

herkulano commented 6 years ago

> Do you still want to query the whole table under ONE_DAY resolution?

Yes, as I mentioned above there could be a case where there are only 2 weeks of data 4 years ago, so it makes sense to get all the existing data for each sensor, but at a maximum resolution of ONE_DAY. There can even be a restriction on the API not to allow queries with a resolution higher than ONE_DAY.

nivemaham commented 6 years ago

It would be better if you could define a timeWindow and a time frame to use when the effectiveTimeFrame is unknown, or as default parameter values (e.g. ONE_WEEK for the last 5 years or ONE_DAY for the last 2 years), and send that request to the rest-api. Alternatively, this can be enforced at the rest-api level as the default request for the volume API. If we removed the restriction on the number of records per query, we would have the same problems again in a few years. Without restrictions, some requests may overload the server.
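The two example defaults can be compared by counting how many windows each would return per series. This is a back-of-the-envelope sketch that assumes 365-day years; `windows_in` is a hypothetical helper.

```python
from math import ceil

# Window lengths in days for the two defaults mentioned above.
WINDOW_DAYS = {"ONE_DAY": 1, "ONE_WEEK": 7}


def windows_in(days, window):
    """Number of windows needed to cover a span of the given length in days."""
    return ceil(days / WINDOW_DAYS[window])


five_years_weekly = windows_in(5 * 365, "ONE_WEEK")  # 261 windows
two_years_daily = windows_in(2 * 365, "ONE_DAY")     # 730 windows
```

Under these assumptions the weekly default covers a longer span with fewer records per series, which is the trade-off a server-side limit would have to weigh.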