hotosm / osm-analytics-cruncher

Backend code for osm-analytics
https://github.com/hotosm/osm-analytics

API Definition #11

Closed lukasmartinelli closed 6 years ago

lukasmartinelli commented 7 years ago

Let's discuss the API endpoint here.

tyrasd commented 7 years ago

Eventually it would make sense to set up a separate repository for the api, but for now I think it's ok to discuss it here. :)

smit1678 commented 7 years ago

Taking a first stab at some notes based on the current frontend to keep this conversation going.

When a HOT project or AOI is selected the frontend shows (within that specific AOI):

An API essentially already exists in the form of vector tiles, so a first take would look to extend this into a more RESTful API with endpoints that match the views you get within the OSMA frontend.

GET /stats - query by simple bounding box, return statistics
GET /stats/country/portugal - within Portugal admin boundaries, returns stats
GET /stats/project/2050 - within Tasking Manager project, return stats

To get temporal functionality:

GET /stats/year/day
GET /stats/country/portugal/year/day
GET /stats/project/2050/year/day
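To make the proposed URL layout concrete, here is a minimal sketch of how a backend might split such paths into a scope, a target, and the trailing temporal segments. The names `StatsQuery` and `parse_stats_path` are hypothetical and not part of any existing code. The trailing segments are kept as opaque strings, because the proposal does not yet say how `year` and `day` should be interpreted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StatsQuery:
    """Hypothetical parsed form of a /stats request path."""
    scope: str                     # "bbox", "country", or "project"
    target: Optional[str] = None   # e.g. "portugal" or "2050"; None for bbox queries
    temporal: List[str] = field(default_factory=list)  # trailing segments, uninterpreted


def parse_stats_path(path: str) -> StatsQuery:
    """Split a proposed /stats path into scope, target, and trailing segments."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "stats":
        raise ValueError(f"not a stats path: {path!r}")
    rest = parts[1:]
    # Country and project scopes carry an identifier right after the keyword.
    if rest and rest[0] in ("country", "project"):
        if len(rest) < 2:
            raise ValueError(f"missing identifier after {rest[0]!r}")
        return StatsQuery(scope=rest[0], target=rest[1], temporal=rest[2:])
    # Everything else is treated as a bounding-box query (the bbox itself
    # would presumably arrive as query parameters, which are not parsed here).
    return StatsQuery(scope="bbox", temporal=rest)
```

For example, `parse_stats_path("/stats/project/2050/year/day")` yields a query with scope `project`, target `2050`, and the two trailing segments preserved in order.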

Is there a better way to do this? Anything missing here?

Are there needs for any CRUD operations yet? I don't think so but maybe.

lukasmartinelli commented 7 years ago

Implementation-wise, this could work as follows: we store all stats binned by tile, in the vector tiles that are served by OSM Analytics.

[Image: osm_analytics_tool]

Then, to aggregate stats, we determine all vector tiles that are covered by the shape (you can use https://github.com/mapbox/tile-cover), fetch those tiles, aggregate their stats, and return the result.

This can be very fast because the backend has low latency when looking up these PBFs, even if it has to look up a few dozen.
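The cover-then-aggregate idea can be sketched as follows. This is a simplified illustration, not the tile-cover library: it only handles a plain bounding box, using the standard Web Mercator (slippy map) tile formula, whereas tile-cover works on arbitrary GeoJSON geometries. The function names and the in-memory `tile_stats` dictionary are assumptions standing in for decoded PBF tiles.

```python
import math
from typing import Dict, Iterable, List, Tuple

Tile = Tuple[int, int, int]  # (x, y, zoom)


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Standard slippy-map formula: tile column/row containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern/southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(west: float, south: float, east: float, north: float,
                   zoom: int) -> List[Tile]:
    """All tiles at `zoom` that intersect the bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # tile rows grow southwards
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return [(x, y, zoom)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


def aggregate(tiles: Iterable[Tile],
              tile_stats: Dict[Tile, Dict[str, int]]) -> Dict[str, int]:
    """Sum per-tile counters; `tile_stats` is a hypothetical stand-in for tile lookups."""
    totals: Dict[str, int] = {}
    for tile in tiles:
        for key, value in tile_stats.get(tile, {}).items():
            totals[key] = totals.get(key, 0) + value
    return totals
```

Note that summing whole tiles overcounts wherever a tile is only partially inside the shape, so a real implementation would likely need to clip features at the boundary or use a sufficiently high zoom level.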

smit1678 commented 7 years ago

@lukasmartinelli 💯 agreed. @tyrasd in line with your thinking and new improvements?

tyrasd commented 7 years ago

(sorry for the late reply, I was at a conference last week)

Yes, the general approach looks fine.

Some implementation comments:

lukasmartinelli commented 7 years ago

So who is gonna move on implementing this?

We can assist along the way.

tyrasd commented 7 years ago

//cc @cgiovando

cgiovando commented 7 years ago

Our team at the World Bank is going through the hiring process as we speak and we should be able to have a developer/firm selected by mid-April who will be working on this.

mikelmaron commented 7 years ago

per chat w/ @smit1678, it may make sense to use this abstraction, but implement processing entirely browser side, in js library. cost of implementing and maintaining API on server may cause more problems than it solves.

tyrasd commented 7 years ago

One argument for having a server-side REST-API is that our data model might not yet be considered fully stable. E.g. if we want to include full-history data in the future, the data model needs to be changed and it will be necessary to change the algorithms calculating the stats from it. If we use a server side implementation, that's not a problem. But if we provide a library it will be harder to push necessary changes downstream to the data consumers.

mikelmaron commented 7 years ago

for that case -- just bump the library version @tyrasd?

tyrasd commented 7 years ago

Yeah, but until all data consumers have migrated to such a version 2, we'd need to continue to offer data-tiles compatible with v1 (which means a ~double consumption of resources for processing and hosting during that period).

Other plus-points for a server side API:

If we start by building it as a server-side nodejs service, we'd still be able to eventually (once we're happy with the functionality and data scheme) release the respective code as a library for everyone who's interested in integrating it that way (potentially enabling some more elaborate statistics that would be too resource-hungry for the REST API server).

esasisa commented 7 years ago

A server-side API is the more feasible solution for this situation. It provides control over data access and security for the back-end data. REST services will provide an interoperable integration solution on top of the data tiles.

jenningsanderson commented 7 years ago

Another use case I just came across: the ability to identify / visualize / export individual geometries and edits aggregated by changeset comment. Since changeset comments are used by OSM communities (like the tasking manager), this would be very helpful.

...Maybe a good way to go about this would be embedding the changeset comments into additional metadata at the tile level?

tyrasd commented 6 years ago

Closing, since the osma-api (https://github.com/GFDRR/osm-analytics-api) has been created in the meantime.