simonw / museums

A website recommending niche museums to visit
https://www.niche-museums.com/

Don't reverse geocode every museum on every commit #1

Closed: simonw closed this issue 4 years ago

simonw commented 4 years ago

The reverse geocoding annotation script currently runs against every single museum on every commit, which is inefficient.

Instead, it should download the previous version of the database from https://www.niche-museums.com/browse.db and only run against the records that have not yet had their various osm_ columns populated.
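A minimal sketch of the "only run against unpopulated records" idea, using the standard-library `sqlite3` module. The table name `museums` and the columns `id`, `latitude`, `longitude` and `osm_country` are assumptions for illustration; the real schema is not shown in this issue. Downloading the previous `browse.db` is left out so the example runs offline.

```python
import sqlite3


def museums_needing_geocode(conn):
    """Return (id, latitude, longitude) for rows not yet reverse geocoded.

    Assumes a hypothetical `museums` table where a NULL `osm_country`
    means the osm_ columns have not been populated.
    """
    sql = (
        "SELECT id, latitude, longitude FROM museums "
        "WHERE osm_country IS NULL ORDER BY id"
    )
    return conn.execute(sql).fetchall()


# Demo against an in-memory database standing in for the previous browse.db.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE museums ("
    "id INTEGER PRIMARY KEY, name TEXT, "
    "latitude REAL, longitude REAL, osm_country TEXT)"
)
conn.executemany(
    "INSERT INTO museums VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Already annotated", 37.77, -122.42, "United States"),
        (2, "Not yet annotated", 51.5, -0.12, None),
    ],
)
pending = museums_needing_geocode(conn)
print(pending)
```

Only the second row comes back, so the geocoder would be called once instead of once per museum.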

This depends on https://github.com/simonw/sqlite-utils/issues/66 so I can use a corrected version of upsert.

simonw commented 4 years ago

This is called out in the Nominatim usage policy, which recommends caching to avoid sending the same request multiple times: https://operations.osmfoundation.org/policies/nominatim/

The easiest thing here would be to add a caching table, rather than messing around with upsert.

I also need to send a custom user-agent string.
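A rough sketch of what a caching table plus a custom user-agent could look like, using only the standard library. The table name `geocode_cache`, the key format, the `USER_AGENT` value and all function names are hypothetical and are not taken from the linked commits. The `fetch` parameter lets the cache be exercised without touching the network.

```python
import json
import sqlite3
import urllib.parse
import urllib.request

# Hypothetical identifier; a real one should name the app and a contact.
USER_AGENT = "example-museums-geocoder/0.1 (contact@example.com)"


def build_request(lat, lon):
    """Build a Nominatim reverse-geocoding request with a custom User-Agent."""
    query = urllib.parse.urlencode({"format": "json", "lat": lat, "lon": lon})
    return urllib.request.Request(
        "https://nominatim.openstreetmap.org/reverse?" + query,
        headers={"User-Agent": USER_AGENT},
    )


def fetch_from_nominatim(lat, lon):
    """Perform the HTTP request and decode the JSON body."""
    with urllib.request.urlopen(build_request(lat, lon)) as response:
        return json.load(response)


def cached_reverse_geocode(conn, lat, lon, fetch=fetch_from_nominatim):
    """Return the cached response for (lat, lon), fetching only on a miss."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geocode_cache "
        "(key TEXT PRIMARY KEY, response TEXT)"
    )
    key = f"{lat:.5f},{lon:.5f}"  # assumed key format, rounded to 5 decimals
    row = conn.execute(
        "SELECT response FROM geocode_cache WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])
    result = fetch(lat, lon)
    conn.execute(
        "INSERT INTO geocode_cache (key, response) VALUES (?, ?)",
        (key, json.dumps(result)),
    )
    conn.commit()
    return result
```

Because the cache lives in its own table, the museum records themselves never need to be upserted; a second run with the same coordinates is answered from SQLite.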

simonw commented 4 years ago

I can cache based on a geohash.
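For illustration, here is a small pure-Python geohash encoder of the kind that could produce such a cache key. It follows the standard geohash scheme (interleaved longitude and latitude bits, base-32 alphabet); the function name and the default precision are assumptions, and the actual commits may well use a library instead.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # geohash bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        bits <<= 1
        if value >= mid:
            bits |= 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=5))  # u4pru
```

A longer geohash describes a smaller cell, so the chosen precision controls how close two museums must be before they share a cached lookup.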

simonw commented 4 years ago

Implemented in d8e6f55fa5ac4c80d3d53d664a8755b763430020 and 3aae24b73df7fadcc26f5891bba1c28de5b3be84