Open wipfli opened 6 months ago
I wrote a bit about how Google Maps handles localization here: https://oliverwipfli.ch/localization-in-google-maps-2024-05-19/
They support 81 languages and 249 regions. I wonder what the best way would be to encode different geopolitical perspectives in vector tiles.
A trivial solution would be to make one build per language/region pair but then you get 81*249=20169 builds which is a bit too much. On the other hand, a user might just get the combinations they need...
Another option would be to group countries that share the same perspective, since many countries should have identical views on most disputes.
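To make the grouping idea concrete, here is a minimal Python sketch. The `PERSPECTIVES` table, the feature id `northern_cyprus`, and the kind values are made up for illustration; only the 81 and 249 counts come from the numbers above. Regions whose views on every disputed feature are identical collapse into one group, so the number of builds scales with the number of groups rather than the number of regions.

```python
from collections import defaultdict

LANGUAGES = 81
REGIONS = 249

# Hypothetical input: how each region views each disputed feature.
# Feature ids and kind values are invented for this sketch.
PERSPECTIVES = {
    "TR": {"northern_cyprus": "country"},
    "GR": {"northern_cyprus": "unrecognized"},
    "CH": {"northern_cyprus": "unrecognized"},
    "US": {"northern_cyprus": "unrecognized"},
}


def group_by_perspective(perspectives):
    """Group regions whose views on all disputed features are identical."""
    groups = defaultdict(list)
    for region, views in perspectives.items():
        # A hashable, order-independent signature of the region's views.
        signature = tuple(sorted(views.items()))
        groups[signature].append(region)
    return sorted(sorted(regions) for regions in groups.values())


groups = group_by_perspective(PERSPECTIVES)
print(LANGUAGES * REGIONS)      # 20169: one build per language/region pair
print(groups)                   # [['CH', 'GR', 'US'], ['TR']]
print(LANGUAGES * len(groups))  # 162: one build per language/group pair
```

With real data the number of groups would depend on how many distinct combinations of views actually exist, which I have not measured.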
@wipfli what a great write-up, thanks for sharing!
For the languages and a global basemap I see a few solutions:
name and localized name like name_en) to some small number of localized names are exported during tile cut. Then you don't need a middleware layer or edge cache segmenting.
In my experience, the file size impact from names is extreme for low zooms (0-5) and moderate for mid zooms (6-9), because the number of localized names drops off significantly the farther you zoom in. Fewer of the place=* features will have name translations, and few of the other features, like highway=* features, have name translations. The number of named features in low-zoom and mid-zoom tiles should also be limited. Introducing automated translations (especially between Latin and non-Latin scripts) would balloon that back up regionally.
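As a rough illustration of exporting only a small number of localized names at tile cut, the Python sketch below filters a feature's tags against an allow-list of languages. The function name `trim_names` and the `name_{lang}` key pattern are assumptions based on the name_en example above; this is not the actual Protomaps or Tilezen export code.

```python
def trim_names(tags, keep_langs):
    """Return a copy of tags keeping `name` and only allow-listed `name_{lang}` keys."""
    allowed = {f"name_{lang}" for lang in keep_langs}
    return {
        key: value
        for key, value in tags.items()
        # Non-name tags pass through; localized names must be on the allow-list.
        if not key.startswith("name_") or key in allowed
    }


tags = {
    "name": "Deutschland",
    "name_en": "Germany",
    "name_fr": "Allemagne",
    "name_ja": "ドイツ",
    "kind": "country",
}
trimmed = trim_names(tags, ["en", "fr"])
print(trimmed)
```

Since low-zoom tiles are where names dominate file size, the allow-list could be made stricter there than at high zooms, but the right cutoff is a judgment call.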
For the disputed boundaries (and map labels in the places layer):
The approach taken with Tilezen is to mark the point of view on the line and point features so the style can turn them on or off. This means there is a default kind key-value pair and an optional series of kind:{country_code} key-value pairs. These are most visible at low zooms (where one can see multiple conflict areas in the same map view), but are also important at high zooms for consistency, so that the point of view is carried through from the Natural Earth zooms to the OpenStreetMap zooms.
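The default-plus-override pattern described above can be sketched in a few lines of Python. The helper name `effective_kind`, the lowercase country codes, and the kind values in the sample feature are assumptions for illustration, not a copy of the Tilezen schema.

```python
def effective_kind(properties, viewpoint=None):
    """Resolve the kind of a feature as seen from an optional country viewpoint."""
    if viewpoint:
        # A viewpoint-specific override wins over the default kind when present.
        override = properties.get(f"kind:{viewpoint.lower()}")
        if override is not None:
            return override
    return properties["kind"]


# Hypothetical place label with a default view and one per-country override.
feature = {"kind": "unrecognized_country", "kind:tr": "country"}

print(effective_kind(feature))        # default view
print(effective_kind(feature, "TR"))  # view with an override
print(effective_kind(feature, "GR"))  # no override, falls back to the default
```

A style (or a filtering step in front of it) can then show or hide the feature based on the resolved value.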
For both name localization (based on 2-char language codes, though some require 3-char or 6-char codes) and region localization (based on 2-char country codes), the style template should allow for many variants of the lang and region code combination.
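One way such a style template could work is to generate the layer's expressions from the lang and region codes. The sketch below is Python that emits a MapLibre-style layer as plain dicts and lists, using the `coalesce`, `get`, and `==` expression operators. The function name `localize_layer`, the sample layer, and the `name_{lang}` and `kind:{region}` key patterns are assumptions, and the language code is assumed to be already normalized to match the tile's key suffixes.

```python
import copy


def localize_layer(layer, lang, region, show_kind):
    """Return a copy of a symbol layer localized for a lang/region pair."""
    # Prefer the localized name, then fall back to the default name.
    name_expr = ["coalesce", ["get", f"name_{lang}"], ["get", "name"]]
    # Prefer the region's point of view, then fall back to the default kind.
    kind_expr = ["coalesce", ["get", f"kind:{region.lower()}"], ["get", "kind"]]

    localized = copy.deepcopy(layer)
    localized.setdefault("layout", {})["text-field"] = name_expr
    localized["filter"] = ["==", kind_expr, show_kind]
    return localized


# Hypothetical template layer; the id and source-layer names are illustrative.
template = {"id": "places_country", "type": "symbol", "source-layer": "places"}
localized = localize_layer(template, "de", "TR", "country")
print(localized["layout"]["text-field"])
print(localized["filter"])
```

Because the variants are generated on demand from one template, the number of lang and region combinations does not multiply the number of style files that have to be maintained.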
See also:
It's fairly easy to implement the low-zoom solution in Protomaps basemaps from Natural Earth, as it's a straight mapping of input to output. The high zooms from OSM are trickier because you have to walk all the relations, and there is more business logic during the ETL.
The protomaps basemap should have a mechanism for disputed map labels.
For example, Turkey is the only country that recognizes TRNC as a country, see Wikipedia: https://en.m.wikipedia.org/wiki/Northern_Cyprus. So a map localized to Turkey should show TRNC as a country label.
Related: divisions and perspectives in the new Overture schema, https://docs.overturemaps.org/schema/reference/divisions/division/