Open 1ec5 opened 6 years ago
fyi @cammace ☝️
I'm toying with the idea of implementing this. I think the design makes sense, although the symbol-layer-verbosity problem is annoying.
As I mentioned in https://github.com/mapbox/mapbox-gl-js/pull/6270#issuecomment-375817855, I wonder if BCP 47 gives us more information than we want/need. If we restrict locale specifications to ISO 639-1 codes, we probably don't even need locale-utils (saving code size, but more importantly semi-hidden complexity), and we have a simpler input to platform-specific APIs that may not speak BCP 47. On the other hand, we'd give up being able to choose number formatters based on country...
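To make the tradeoff concrete, here is a small sketch using the standard `Intl.NumberFormat` API. It only illustrates that a formatter built from a bare language code cannot take a region into account; the sample value and variable names are made up for illustration, and the exact output for a given region depends on the locale data shipped with the runtime.

```typescript
// Illustrative only: compare formatters built from full tags and from a bare language code.
const sampleValue = 1234567.891;

const withRegionUS = new Intl.NumberFormat("en-US").format(sampleValue);
const withRegionIN = new Intl.NumberFormat("en-IN").format(sampleValue);
const languageOnly = new Intl.NumberFormat("en").format(sampleValue);

console.log(withRegionUS); // "1,234,567.891"
console.log(withRegionIN); // digit grouping may differ from en-US, depending on locale data
console.log(languageOnly); // same as en-US: no region information is available to change it
```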
There’s already a need for more than ISO 639-1: many languages only have ISO 639-2 codes, not ISO 639-1 codes, and a few major languages like Chinese often need to be qualified by an ISO 15924 script code or ISO 3166 country code, such as for label localization. For example, the Mapbox Streets source distinguishes between `zh` and `zh-Hans`, leaving open the possibility of distinguishing `zh-Hant` in the future.
@1ec5 🤔 How about two arguments, language + (optional) region:
This would not support script customization (e.g. `Hans` vs `Hant`, and wow I just realized the s in hans was for "simplified"), or the `variant` options in BCP 47. Again, the motivation is maximum cross-platform compatibility:
Separating the language and region into two arguments gives us less flexibility to support more locale information (such as script codes) in the future. I think it would be more forward-compatible if each locale-aware operator accepts a single locale code argument; each operator would decide for itself how specific a code it would honor. For example, locale matching needs to respect script differences, but perhaps string comparison does not.
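As a rough sketch of what a single-argument matcher could look like, the function below takes whole locale codes and drops trailing subtags until one of the available codes matches, in the spirit of the lookup scheme in RFC 4647. `matchLocales` and the language list are hypothetical names for illustration, not an existing GL JS or locale-utils API, and the sketch ignores refinements such as likely-subtag expansion.

```typescript
// Hypothetical helper, not an existing API: returns the entry of `available`
// that matches the first entry of `preferred` for which any match exists.
function matchLocales(preferred: string[], available: string[]): string | undefined {
    for (const tag of preferred) {
        const subtags = tag.toLowerCase().split("-");
        // Try zh-hans-tw, then zh-hans, then zh.
        while (subtags.length > 0) {
            const candidate = subtags.join("-");
            for (const option of available) {
                // Compare case-insensitively but return the caller's spelling.
                if (option.toLowerCase() === candidate) return option;
            }
            subtags.pop();
        }
    }
    return undefined;
}

// Illustrative list; the real set of Streets source languages may differ.
const streetsLanguages = ["en", "es", "fr", "zh", "zh-Hans"];

console.log(matchLocales(["en-US"], streetsLanguages));      // "en"
console.log(matchLocales(["zh-TW"], streetsLanguages));      // "zh"
console.log(matchLocales(["zh-Hans-TW"], streetsLanguages)); // "zh-Hans"
```

Because the script subtag is only dropped when no more specific code is available, an operator that does not care about script could simply pass a shorter list of available codes.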
> It’s unfortunate that `streets-languages` would have to be hard-coded and duplicated on every symbol layer. However, I don’t see a good way around that unless the vector tile source formally declares its language-specific `name` fields (perhaps via mapbox/tilejson-spec#14) or we encapsulate that array in a third expression operator, `mapbox-streets-languages`.
As of mapbox/tilejson-spec#42, TileJSON 3.0 will formally declare a `vector_layers` property that enumerates the layers and their fields. While the specification doesn’t provide a way to explicitly state the language of each field, I think it would be fine to assume `name_*` fields are of the form `name_{ISO 639}`, which would be no less robust than hard-coding language fields in the style or SDK.
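A minimal sketch of that assumption: given a TileJSON `vector_layers` array, collect the suffixes of all `name_*` fields per layer. The `VectorLayer` shape mirrors the `id` and `fields` members of TileJSON 3.0; `languagesByLayer` and the sample data are hypothetical and only for illustration.

```typescript
// Subset of a TileJSON 3.0 vector_layers entry: field names map to descriptions.
interface VectorLayer {
    id: string;
    fields: { [fieldName: string]: string };
}

// Assumption: a field named name_<code> holds labels in language <code>.
function languagesByLayer(vectorLayers: VectorLayer[]): { [layerId: string]: string[] } {
    const prefix = "name_";
    const result: { [layerId: string]: string[] } = {};
    for (const layer of vectorLayers) {
        const codes = Object.keys(layer.fields)
            .filter((field) => field.indexOf(prefix) === 0 && field.length > prefix.length)
            .map((field) => field.slice(prefix.length));
        // Layers without any name_* field are omitted from the result.
        if (codes.length > 0) result[layer.id] = codes;
    }
    return result;
}

// Made-up sample data for illustration.
const sampleLayers: VectorLayer[] = [
    { id: "place_label", fields: { name: "String", name_en: "String", "name_zh-Hans": "String" } },
    { id: "water", fields: {} },
];

console.log(languagesByLayer(sampleLayers)); // { place_label: ["en", "zh-Hans"] }
```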
While I'm overall very positive about this change, it should still support user overrides to the locale. For example, my browser might be set to English, but I want to build in a button on my site that will swap the map to German, regardless of my browser setting.
> My browser might be set to English, but I want to build in a button on my site that will swap the map to German, regardless of my browser setting.
That could be implemented via an API such as `setLabelLanguage()`.
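One way such an override could interact with the proposed `user-locales` operator is sketched below. Only the name `setLabelLanguage` comes from this thread; the `LabelLocaleSettings` class, its constructor, and the choice to keep the system preferences as a fallback after the override are assumptions made for illustration.

```typescript
// Hypothetical holder for the value that a `user-locales` expression would evaluate to.
class LabelLocaleSettings {
    private systemLocales: string[];
    private languageOverride: string | undefined = undefined;

    constructor(systemLocales: string[]) {
        // For example, a copy of navigator.languages in a browser.
        this.systemLocales = systemLocales.slice();
    }

    // Pass undefined to return to the system preferences.
    setLabelLanguage(language: string | undefined): void {
        this.languageOverride = language;
    }

    // The override comes first; system preferences remain as a fallback
    // in case the tile source has no labels in the override language.
    userLocales(): string[] {
        if (this.languageOverride === undefined) return this.systemLocales.slice();
        return [this.languageOverride].concat(this.systemLocales);
    }
}

const settings = new LabelLocaleSettings(["en-US", "fr"]);
console.log(settings.userLocales()); // ["en-US", "fr"]
settings.setLabelLanguage("de");
console.log(settings.userLocales()); // ["de", "en-US", "fr"]
```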
> The style’s author has no opportunity to react to changes that could radically alter the style’s appearance, for instance by increasing the font size when the system language is Chinese.
> It’s unfortunate that `streets-languages` would have to be hard-coded and duplicated on every symbol layer. However, I don’t see a good way around that unless the vector tile source formally declares its language-specific `name` fields (perhaps via mapbox/tilejson-spec#14) or we encapsulate that array in a third expression operator, `mapbox-streets-languages`.
Per mapbox/mapbox-gl-native#15659 and https://github.com/mapbox/mapbox-gl-native/issues/14470#issuecomment-489216407, knowing the language contained in each layer of the Streets source would allow GL to choose the appropriate font for a given character without forcing the developer to specify font overrides. The locale matching proposed here would help to associate that information with the fonts specified in the stylesheet.
There should be a simple way for the style author to specify that a `text-field` should be set to the `name_*` feature property that best fits the system’s preferred languages. Secondarily, it would be great if the most appropriate locale could be used on its own in expressions.

## Motivation
Localizing a style’s labels currently entails iterating over all the layers, manually replacing references to `name_*` feature properties within each `text-field` value. If these values are expressions, replacing the references can be an involved, recursive step. The iOS and macOS map SDKs have a built-in option, `MGLStyle.localizesLabels`, that applies these changes automatically based on the system language and region preferences. There’s a plugin for GL JS and a forthcoming plugin for the Android map SDK (mapbox/mapbox-plugins-android#74) that do likewise.

While this approach is effective, it operates at such a high level that the localizing code doesn’t have a good way to reason about the style author’s intentions. Should `{name} ({name_en})` be replaced by `{name_es} ({name_en})` or just `{name}`? The style’s author has no opportunity to react to changes that could radically alter the style’s appearance, for instance by increasing the font size when the system language is Chinese. Moreover, the localization feature implicitly opts the map into runtime styling–specific behaviors like disabling automatic style refreshes.

## Design
The style specification would be extended with two expression operators:

- `user-locales` takes no arguments and evaluates to an array of locale identifiers corresponding to the user’s preferences.
- `match-locales` has the signature `["match-locales", inputLocales, availableLocales]` and evaluates to the item in `availableLocales` (an unordered array of locale identifiers) that corresponds to the first item in `inputLocales` (e.g., `user-locales`) that matches one of `availableLocales`.

For the purposes of these operators, a locale identifier could include a language code, script code, or region code, or some combination thereof. I would be in favor of specifying BCP 47 as the locale identifier standard to follow.
In typical usage, a style author would opt into localization by setting `text-field` to a value such as:

Meanwhile, `["at", 0, ["user-locales"]]` could be used on its own as part of a number formatting operator (#4119) and a case- and diacritic-folding string comparison operator (#4136).

## Design alternatives
It’s unfortunate that `streets-languages` would have to be hard-coded and duplicated on every symbol layer. However, I don’t see a good way around that unless the vector tile source formally declares its language-specific `name` fields (perhaps via mapbox/tilejson-spec#14) or we encapsulate that array in a third expression operator, `mapbox-streets-languages`.

It might be tempting to rely on `match` as an alternative to `match-locales`; however, locale identifier matching rules are rather complicated. For example, for the set of languages supported by the Streets source, `en-US` should resolve to `en`, `zh-TW` should resolve to `zh`, and `zh-Hans-TW` should resolve to `zh-Hans`.

## Implementation
- In GL JS, `user-locales` would be implemented by returning `navigator.languages`. iOS/macOS would use `+[NSLocale preferredLanguages]`.
- For `match-locales`, GL JS could use locale-utils for this purpose. iOS/macOS would use `+[NSBundle preferredLocalizationsFromArray:forPreferences:]`.

/ref https://github.com/mapbox/mapbox-gl-native/issues/10713#issuecomment-366382036 /cc @mapbox/gl-core @fabian-guerra @tobrun @langsmith @nickidlugash @bsudekum