protomaps / basemaps

Basemap PMTiles generation and cartographic styles for OpenStreetMap data and more
https://maps.protomaps.com/

Load places from Natural Earth at project init #108

Closed nvkelso closed 1 year ago

nvkelso commented 1 year ago

To get consistent min_zoom values between NE and OSM data, we do a live data join. But that join re-reads the NE data every time instead of loading it just once.

The current config doesn't seem to noticeably affect build time so it's probably OK as is?

But it's better architecture to only load it once on project init. We can file a followup issue for that?

_Originally posted by @nvkelso in https://github.com/protomaps/basemaps/pull/93#discussion_r1297786030_

bdon commented 1 year ago

Can we come up with an allowlist of the small set of tables to load into memory?

ne_10m_populated_places, ne_10m_admin_0_countries, ne_10m_admin_1_states_provinces
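One way such an allowlist could gate what gets read into memory, as a rough sketch: `load_allowlisted` and the `read_table` callback are hypothetical names, and how a table is actually read is left to the caller.

```python
# The three tables proposed in the comment above.
ALLOWLIST = frozenset({
    "ne_10m_populated_places",
    "ne_10m_admin_0_countries",
    "ne_10m_admin_1_states_provinces",
})


def load_allowlisted(available_tables, read_table):
    """Read only allowlisted tables; all other tables are never touched.

    `read_table` is a caller-supplied function (an assumption of this
    sketch) that takes a table name and returns its rows.
    """
    return {
        name: read_table(name)
        for name in available_tables
        if name in ALLOWLIST
    }
```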

nvkelso commented 1 year ago

Tilezen lists out an allowlist of Natural Earth assets here:

Which yields a maximal set:

ne_10m_admin_0_boundary_lines_disputed_areas
ne_10m_admin_0_boundary_lines_land
ne_10m_admin_0_boundary_lines_map_units
ne_10m_admin_0_boundary_lines_maritime_indicator_chn
ne_10m_admin_0_countries
ne_10m_admin_0_countries_iso
ne_10m_admin_0_countries_tlc
ne_10m_admin_0_map_units
ne_10m_admin_1_states_provinces
ne_10m_admin_1_states_provinces_lines
ne_10m_coastline
ne_10m_lakes
ne_10m_land
ne_10m_ocean
ne_10m_playas
ne_10m_populated_places
ne_10m_roads
ne_10m_urban_areas
ne_110m_admin_0_boundary_lines_land
ne_110m_coastline
ne_110m_lakes
ne_110m_land
ne_110m_ocean
ne_50m_admin_0_boundary_lines_disputed_areas
ne_50m_admin_0_boundary_lines_land
ne_50m_admin_0_boundary_lines_maritime_indicator_chn
ne_50m_admin_1_states_provinces_lines
ne_50m_coastline
ne_50m_lakes
ne_50m_land
ne_50m_ocean
ne_50m_playas
ne_50m_urban_areas

Of which only these need to be loaded into a project database for later data joins (because they have names, min_zoom, and wikidataid properties):

ne_10m_admin_0_countries
ne_10m_admin_0_countries_iso
ne_10m_admin_0_countries_tlc
ne_10m_admin_0_map_units
ne_10m_admin_1_states_provinces
ne_10m_lakes
ne_10m_playas
ne_10m_populated_places

There are 3 versions of countries: the basic one is Natural Earth's de facto polygons; *_iso is only those countries listed by the ISO (for a data join); and *_tlc is a hybrid set of all the random "countries" that various countries recognize pair-wise, which is larger than ISO but smaller than NE's map sub units (also for a data join). If you were to load just one "country" theme, it'd be the TLC (and ignore the map_units).

You only need the lakes and playas if we extend the places NE <> OSM data join to the physical_points label (a good idea but new work).
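The narrowing described above can be sketched as plain set operations. The table names come from this thread; the constant and function names (`JOINABLE`, `minimal_set`, and so on) are made up for illustration.

```python
# Tables listed above as having names, min_zoom, and wikidataid properties.
JOINABLE = {
    "ne_10m_admin_0_countries",
    "ne_10m_admin_0_countries_iso",
    "ne_10m_admin_0_countries_tlc",
    "ne_10m_admin_0_map_units",
    "ne_10m_admin_1_states_provinces",
    "ne_10m_lakes",
    "ne_10m_playas",
    "ne_10m_populated_places",
}

# Country themes dropped when only the TLC variant is kept.
OTHER_COUNTRY_THEMES = {
    "ne_10m_admin_0_countries",
    "ne_10m_admin_0_countries_iso",
    "ne_10m_admin_0_map_units",
}

# Only needed if the join is extended to physical_points labels.
PHYSICAL = {"ne_10m_lakes", "ne_10m_playas"}


def minimal_set(include_physical=True):
    """Return the sorted table names to load for data joins."""
    tables = JOINABLE - OTHER_COUNTRY_THEMES
    if not include_physical:
        tables -= PHYSICAL
    return sorted(tables)


if __name__ == "__main__":
    for name in minimal_set():
        print(name)
```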

So finally the minimal set:

ne_10m_admin_0_countries_tlc
ne_10m_admin_1_states_provinces
ne_10m_lakes
ne_10m_playas
ne_10m_populated_places

bdon commented 1 year ago

This has been merged.