protomaps / basemaps

Basemap PMTiles generation and cartographic styles for OpenStreetMap data and more
https://maps.protomaps.com/

Add allowlist of languages for name translations #112

Closed nvkelso closed 3 months ago

nvkelso commented 1 year ago

Currently we pass through all name translations from Natural Earth (limited to around 25 names) and OpenStreetMap (which can have hundreds of names).

While this is great for users in any locale (they can get a map in their language, whatever their locale is), it's not great for tile size, especially at low zooms.

Let's add a runtime-modifiable config with a small set of default languages that can optionally be overridden with a list of locales.

Initial set of locales could match what's in Natural Earth now.
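
To make the proposal concrete, here is a minimal sketch of such an allowlist, not the project's actual implementation. The class `NameAllowlist` and its methods `parseLanguages` and `filterNames` are hypothetical names invented for illustration. The default set copies the Natural Earth reference list posted in this thread, and the override is assumed to arrive as a comma-separated string. A code such as `zht` may need mapping to whatever tag suffix the data source actually uses; that mapping is left out here.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
public class NameAllowlist {
  // Default locales, copied from the Natural Earth reference list in this thread.
  public static final Set<String> DEFAULT_LANGUAGES = Set.of(
    "ar", "bn", "de", "en", "es", "fa", "fr", "el", "he", "hi", "hu", "id", "it",
    "ja", "ko", "nl", "pl", "pt", "ru", "sv", "tr", "uk", "ur", "vi", "zh", "zht");

  // Parse an override such as "en,de,fr"; null or blank input falls back to the defaults.
  public static Set<String> parseLanguages(String csv) {
    if (csv == null || csv.isBlank()) {
      return DEFAULT_LANGUAGES;
    }
    Set<String> languages = new LinkedHashSet<>();
    for (String part : csv.split(",")) {
      String code = part.trim();
      if (!code.isEmpty()) {
        languages.add(code);
      }
    }
    return languages;
  }

  // Keep the plain "name" tag plus any "name:xx" tag whose suffix is allowlisted.
  // All other tags are dropped from the returned map.
  public static Map<String, String> filterNames(Map<String, String> tags, Set<String> allowed) {
    Map<String, String> kept = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : tags.entrySet()) {
      String key = entry.getKey();
      if (key.equals("name")) {
        kept.put(key, entry.getValue());
      } else if (key.startsWith("name:") && allowed.contains(key.substring("name:".length()))) {
        kept.put(key, entry.getValue());
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Map<String, String> tags = new LinkedHashMap<>();
    tags.put("name", "Wien");
    tags.put("name:en", "Vienna");
    tags.put("name:la", "Vindobona"); // "la" is not in the default list
    tags.put("population", "1900000");

    // With the defaults, only "name" and "name:en" survive.
    System.out.println(filterNames(tags, parseLanguages(null)));
    // With an explicit override, only "name" and "name:la" survive.
    System.out.println(filterNames(tags, parseLanguages("la")));
  }
}
```

Filtering at tile-generation time keeps the per-feature attribute count bounded regardless of how many translations the source data carries, which is where the tile-size saving at low zooms would come from.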

bdon commented 1 year ago

For reference:

ar
bn
de
en
es
fa
fr
el
he
hi
hu
id
it
ja
ko
nl
pl
pt
ru
sv
tr
uk
ur
vi
zh
zht

wipfli commented 1 year ago

Is there a dataset somewhere which lets you know what the most likely local languages are?

OSM seems not to have solved this problem entirely, because the name tag is not used consistently for multilingual labels. In particular, in India, OSM does not provide local names in the name tag.

wipfli commented 1 year ago

For India, I once found some state polygons and a list of official languages. See https://github.com/wipfli/swiss-map/tree/main/planetiler/india which helped me make this map: https://wipfli.github.io/index-by-grapheme/#map=4.25/19.72/78.8

But it would be much better if we had something like this globally.

Defining the default local language might be a controversial political statement, similar to disputed borders, but it would be a valuable addition that no open source map has at the moment, as far as I know.

bdon commented 1 year ago

@wipfli we will be able to do spatial logic soon via https://github.com/protomaps/basemaps/pull/114/files#diff-e8b747c6c34350863423324d80c854ca79d2a04ef037f811f896dcabeb2a5742

nvkelso commented 1 year ago

Who's On First records this at the country, dependency and region levels for both official and spoken languages.

Examples:

bdon commented 3 months ago

Allowed languages list is here: https://github.com/protomaps/basemaps/blob/main/tiles/src/main/java/com/protomaps/basemap/names/OsmNames.java#L18