CrunchyData / pg_featureserv

Lightweight RESTful Geospatial Feature Server for PostGIS in Go
Apache License 2.0

numberMatched response? #139

Open KoalaGeo opened 1 year ago

KoalaGeo commented 1 year ago

Hi,

We're testing accessing some sensor data via pg_featureserv and users have requested a numberMatched property - Is this something on your roadmap?

Example use: All results between two dates - https://ogcapi.bgs.ac.uk/demo/collections/published.ql_sen_water_level_data_sta/items.json?filter=resulttime%20BETWEEN%202022-09-17%20AND%202022-10-17

numberReturned = 10

If we increase the limit to 1000 (https://ogcapi.bgs.ac.uk/demo/collections/published.ql_sen_water_level_data_sta/items.json?limit=1000&filter=resulttime%20BETWEEN%202022-09-17%20AND%202022-10-17), there are at least that many results.

As you can see via our pygeoapi server, there are 5622767 items: https://ogcapi.bgs.ac.uk/collections/sensor-water-level-sta/items?limit=1&f=json. However, pg_featureserv is proving much more performant:

pg_featureserv: https://ogcapi.bgs.ac.uk/demo/collections/published.ql_sen_water_level_data_sta/items.json?limit=1000 = 308 ms

pygeoapi: https://ogcapi.bgs.ac.uk/collections/sensor-water-level-sta/items?limit=1000&f=json = 2.81 s

dr-jts commented 1 year ago

This is a tricky requirement to meet efficiently. As you will understand, providing a count of the total number of records returned by a given query requires running that query and retrieving all results from the database. This is obviously very inefficient.

We have thought that an additional parameter could be supplied to optionally force computing the total number of result records. Or it could be provided as a configuration option (although that is pretty crude).

This is a common problem when querying databases. Often the answer is "don't do it - find some other way to answer the question". Is that an option in your use case?

KoalaGeo commented 1 year ago

Yeah, chatting round the office here, the consensus seems to be that numberMatched is better as a separate query from the one that pulls the data.

As you suggest, maybe a ?countonly=true query parameter would be the way?
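
To illustrate the "separate query" idea, here is a minimal sketch in Python. It is not pg_featureserv code: `sqlite3` stands in for PostGIS, the `readings` table and both function names are hypothetical, and `countonly` is only the parameter proposed in this thread, not an existing option. The point is that the page query and the count query share the same filter but are issued independently, so the count cost is paid only when a client asks for it.

```python
import sqlite3


def query_page(conn, where_sql, params, limit):
    """Return one page of rows: this is what numberReturned describes."""
    # where_sql is assumed to be a trusted, already-validated filter expression.
    sql = f"SELECT id, resulttime FROM readings WHERE {where_sql} ORDER BY id LIMIT ?"
    return conn.execute(sql, (*params, limit)).fetchall()


def query_count(conn, where_sql, params):
    """Return the total number of rows matching the filter (numberMatched)."""
    # Same filter, but no LIMIT and no row data transferred to the client.
    sql = f"SELECT count(*) FROM readings WHERE {where_sql}"
    (total,) = conn.execute(sql, params).fetchone()
    return total


def build_demo_db():
    """Create an in-memory table with one reading per day of September 2022."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, resulttime TEXT)")
    conn.executemany(
        "INSERT INTO readings (resulttime) VALUES (?)",
        [(f"2022-09-{day:02d}",) for day in range(1, 31)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    where = "resulttime BETWEEN ? AND ?"
    params = ("2022-09-17", "2022-09-30")
    page = query_page(conn, where, params, limit=10)
    print("numberReturned:", len(page))
    print("numberMatched:", query_count(conn, where, params))
```

The count still has to visit every matching row, so on a table with millions of rows it can remain slow; keeping it behind an explicit request just means ordinary paged requests do not pay for it.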

KoalaGeo commented 1 year ago

Just noticed pg_featureserv doesn't include a next link when the server has more results available than it returns, e.g.:

https://ogcapi.bgs.ac.uk/demo/collections/published.ql_sen_water_level_data_sta/items.json?limit=10&filter=resulttime%20BETWEEN%202022-09-17%20AND%202022-10-17

This query has at least 1000 results.

https://docs.ogc.org/is/17-069r4/17-069r4.html#fc-response - I realise it's a SHOULD and not a SHALL, but it's useful.
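
One common way to emit a next link without computing a total count is to fetch one row more than the requested limit and check whether it exists. The sketch below shows that technique in Python under stated assumptions: `page_with_next` and the `fetch_rows(limit, offset)` callback are hypothetical names, paging is assumed to be offset-based via `limit` and `offset` query parameters, and nothing here reflects how pg_featureserv is actually implemented.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_with_next(fetch_rows, request_url, limit, offset=0):
    """Return (items, links), adding a rel="next" link only if more rows exist.

    fetch_rows(limit, offset) is a hypothetical callback that runs the query.
    """
    # Ask for one extra row; it is never returned, only used as a probe.
    rows = fetch_rows(limit + 1, offset)
    has_more = len(rows) > limit

    links = []
    if has_more:
        parts = urlsplit(request_url)
        # Keep existing parameters (e.g. filter) and advance the offset.
        query = dict(parse_qsl(parts.query))
        query["limit"] = str(limit)
        query["offset"] = str(offset + limit)
        href = urlunsplit(parts._replace(query=urlencode(query)))
        links.append({"rel": "next", "href": href})

    return rows[:limit], links


if __name__ == "__main__":
    data = list(range(25))  # stand-in for rows in the database

    def fetch(n, off):
        return data[off:off + n]

    url = "https://example.org/collections/demo/items.json?limit=10"
    items, links = page_with_next(fetch, url, limit=10, offset=0)
    print(len(items), links)
```

The extra row costs almost nothing compared with a full count, and a client can keep following next links until none is returned.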