worldbank / HNP

World Bank's Geospatial Team (GOST) support to the Global Practice for Health, Nutrition, and Population.
MIT License

Approach and API spec #2

Open geohacker opened 4 years ago

geohacker commented 4 years ago

We'll build the database and API using an array of tools like ogr2ogr for ETL and FastAPI / pygeoapi for querying. After diving into the use cases a bit more and comparing various approaches, we think Hecate is not a good fit at the moment.
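As a rough illustration of the ogr2ogr ETL step, here is a minimal sketch that builds a load command for one vector layer. It assumes the target database is PostGIS, which this thread does not confirm; the file name, table name, connection string, and the `geo_id` column name are all hypothetical placeholders.

```python
import shlex


def build_ogr2ogr_command(source_path, table_name, pg_dsn, id_column="geo_id"):
    """Build an ogr2ogr command that loads a vector file into PostGIS.

    Assumptions (not from the issue): a PostGIS target, EPSG:4326 output,
    and a primary key column called `geo_id`.
    """
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",            # output driver
        f"PG:{pg_dsn}",                # destination connection string
        source_path,                   # source file (GeoJSON, Shapefile, ...)
        "-nln", table_name,            # name of the destination table
        "-nlt", "PROMOTE_TO_MULTI",    # store polygons as multipolygons
        "-lco", "GEOMETRY_NAME=geom",  # name of the geometry column
        "-lco", f"FID={id_column}",    # name of the primary key column
        "-t_srs", "EPSG:4326",         # reproject to WGS 84
        "-overwrite",                  # replace the table if it exists
    ]


if __name__ == "__main__":
    cmd = build_ogr2ogr_command(
        "admin1.geojson", "admin1", "dbname=hnp host=localhost"
    )
    # Print the command for review; run it with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting, and the same function could be reused for the admin and fishnet layers.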

Goals

  1. Geo data: import admin and fishnet layers with unique ids that can be matched to different indicator layers.
  2. Layers: introduce a custom schema for each layer of data, and account for adding new data layers later.
  3. Catalogue: let people discover indicators and thematic groups of datasets.
  4. Flexible schema, indexes and caching: there are a lot of indicators and datasets that we will load initially, so we need a flexible schema that we can tune for query performance. We will also need to think about caching as the number of requests increases.
  5. Support for time series data: even though time series isn’t a priority right now, many of the indicators are time series by nature, and the dashboard will want to query them.
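To make the goals above a bit more concrete, here is a small in-memory sketch of how geo ids, a catalogue of indicators grouped by theme, and time series values could relate to each other. It is only an illustration of the relationships, not the planned database schema; every class, field, id, and value in it is a made-up placeholder.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Indicator:
    """One catalogue entry (hypothetical fields)."""
    indicator_id: str
    name: str
    theme: str


@dataclass
class IndicatorStore:
    """Toy store: catalogue plus per-geo-unit time series."""
    catalogue: dict = field(default_factory=dict)
    # (geo_id, indicator_id) -> {year: value}
    values: dict = field(default_factory=lambda: defaultdict(dict))

    def register(self, indicator):
        """Add an indicator to the catalogue."""
        self.catalogue[indicator.indicator_id] = indicator

    def add_value(self, geo_id, indicator_id, year, value):
        """Attach one observation to a geo unit, keyed by its unique id."""
        if indicator_id not in self.catalogue:
            raise KeyError(f"unknown indicator: {indicator_id}")
        self.values[(geo_id, indicator_id)][year] = value

    def by_theme(self, theme):
        """Catalogue lookup: all indicators in a thematic group."""
        return [i for i in self.catalogue.values() if i.theme == theme]

    def series(self, geo_id, indicator_id):
        """Time series for one geo unit, sorted by year."""
        return sorted(self.values.get((geo_id, indicator_id), {}).items())


if __name__ == "__main__":
    store = IndicatorStore()
    store.register(Indicator("pop_total", "Total population", "population"))
    # Placeholder numbers, not real data.
    store.add_value("ADM1-001", "pop_total", 2020, 1200)
    store.add_value("ADM1-001", "pop_total", 2019, 1100)
    print(store.series("ADM1-001", "pop_total"))
```

Keying values by `(geo_id, indicator_id)` mirrors the idea of matching admin or fishnet units to indicator layers by unique id, and keeping the year inside the value map leaves room for time series queries later.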

Data

Data ingestion

API

This is the first draft of the API spec. We'll finalize this as the data is imported.

Catalogue

Geo

Layers

@guidorice @bitner @pieschker @KPGeo

KPGeo commented 4 years ago

@geohacker will ask Dany about the OneDrive / S3 access. I think S3 has some institutional lockdown issues, but it can't hurt to ask. We've been using direct access to OneDrive through File Explorer and updating the registry to enable larger file transfers. More to come.

geohacker commented 4 years ago

Some updates here: instead of writing custom endpoints, we are using a combination of FastAPI and pygeoapi. Both projects have great communities behind them. We have used FastAPI for a lot of other projects at Devseed and think it'll be a great choice.

To see how this will work, we have a staging stack running here http://covid-publi-1v66das8fk57r-771481456.us-east-1.elb.amazonaws.com/docs. This is a FastAPI stack that wraps the pygeoapi routes.

We are also working on implementing API key authentication using the OpenAPI API key and HTTP basic auth schemes. More on this next week!
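For anyone unfamiliar with the two schemes, here is a framework-independent sketch of what each check amounts to at the HTTP level. It is not the implementation planned for this stack; in a FastAPI app the same schemes would more likely go through its `fastapi.security` helpers (`APIKeyHeader`, `HTTPBasic`). The `x-api-key` header name and the sample credentials are assumptions made for the example.

```python
import base64
import binascii
import secrets


def check_api_key(headers, valid_keys, header_name="x-api-key"):
    """Return True if the request carries one of the valid API keys.

    `headers` is assumed to be a dict with lowercase header names.
    """
    supplied = headers.get(header_name, "").encode("utf-8")
    # compare_digest does a constant-time comparison to limit timing leaks.
    return any(
        secrets.compare_digest(supplied, key.encode("utf-8"))
        for key in valid_keys
    )


def parse_basic_auth(header_value):
    """Decode an `Authorization: Basic ...` value into (username, password).

    Returns None when the header is not well-formed basic auth.
    """
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password


if __name__ == "__main__":
    token = base64.b64encode(b"alice:s3cret").decode("ascii")
    print(parse_basic_auth(f"Basic {token}"))
    print(check_api_key({"x-api-key": "demo-key"}, ["demo-key"]))
```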