Open zbraniecki opened 4 years ago
In MessageFormat, "other" means default.
The trouble is that this is not limited to MessageFormat; it goes all the way down to plurals and CLDR:
https://github.com/unicode-org/cldr/blob/master/common/supplemental/plurals.xml
You can see that the "other" entries only have examples, no rules.
That's because "if none of the rules apply, then return other".
So if we think of this as a switch:
switch (getPlural(locale, count)) {
case one: ...
case few: ...
...
default: ... // this is the same as case "other"
}
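The "return other when no rule applies" behaviour can be sketched in a few lines. This is only an illustration, not CLDR's actual implementation: `get_plural` and the `RULES` table are hypothetical names, and the rules are simplified versions covering just English and Polish.

```python
# Hypothetical, simplified plural rules. Real CLDR rules also look at
# operands such as visible fraction digits, which this sketch ignores.

def _is_int(n):
    """True when n has no fractional part (works for int and float)."""
    return float(n).is_integer()

# Each locale lists (category, predicate) pairs. Note there is no entry for
# "other": it has no rule of its own, it is what remains.
RULES = {
    "en": [
        ("one", lambda n: n == 1),
    ],
    "pl": [
        ("one", lambda n: n == 1),
        ("few", lambda n: _is_int(n)
            and n % 10 in (2, 3, 4)
            and n % 100 not in (12, 13, 14)),
        ("many", lambda n: _is_int(n) and n != 1),
    ],
}

def get_plural(locale, count):
    """Return the first category whose rule matches, else "other"."""
    for category, rule in RULES.get(locale, []):
        if rule(count):
            return category
    return "other"  # same role as `default:` in the switch above
```

With these simplified rules, a fractional count in Polish (for example 1.5) matches nothing and falls through to "other", as does any count in a locale the table does not know.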
I think it is a good thing to have one and only one value as the fallback, and it should be as generic as possible (covering all options).
Using * would mean that translators are the ones moving the default around, depending on what their language prefers (some languages default to neuter, some to masculine, etc.).
That can confuse localization tools and leveraging, and it adds extra complexity for translators.
It also does not match the "mental model" a programmer has about "the world":
switch (PLURAL($userNames)) {
[one] ...
[few] ...
*[many] ... // WAT? https://www.destroyallsoftware.com/talks/wat :-)
[other] ...
}
Which one is the default now? "many", because the * says so, or "other", because CLDR (which is Unicode) says so?
Remember: "other" in CLDR plurals means the same thing as "default" in a programming language's switch.
TLDR: I'm trying to make a case for:
key = { PLURAL($userNames), GENDER($users) ->
[one, masculine] ...
[one, feminine] ...
[one, other] ...
[other, masculine] ...
[other, other] ... // This is the default, and the only default
}
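One way to read this proposal is as a first-match lookup in which "other" acts as the wildcard in every selector position. The sketch below assumes that resolution order (variants are tried top to bottom); the proposal itself does not spell one out, and `select_variant` and `VARIANTS` are hypothetical names.

```python
# Variants mirror the example above; the values are placeholders.
VARIANTS = [
    (("one", "masculine"), "one/masculine"),
    (("one", "feminine"), "one/feminine"),
    (("one", "other"), "one/other"),
    (("other", "masculine"), "other/masculine"),
    (("other", "other"), "other/other"),  # the default, and the only default
]

def select_variant(variants, selectors):
    """Return the first variant whose keys all match.

    A key matches when it equals the resolved selector value or when it is
    "other", which plays the role of `default` in each position.
    """
    for keys, value in variants:
        if all(k == s or k == "other" for k, s in zip(keys, selectors)):
            return value
    raise LookupError("no all-'other' variant defined")
```

Under this assumed ordering, ("one", "neuter") resolves to the [one, other] variant and ("few", "feminine") resolves to [other, other], so no extra default marker is needed.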
As per @stasm's request in https://github.com/zbraniecki/message-format-2.0-rs/issues/6, I encoded the AST in my proposal to use Option 2.
It handles multi-variant messages like "Anne published 2 pictures.", where in Polish we'll need both a gender and a plural selector.
The issue I see with Option 2 is that I'm not sure how to resolve uneven selectors. For example, if we'd like to extend the example to handle both "Anne and John published 2 pictures" and "Anne published 2 pictures", in Polish we'll have to handle the fact that Polish has different genders depending on the plural form of the subject. In the singular it distinguishes masculine, feminine and neuter; in the plural, masculine-personal and non-masculine-personal.
In Fluent's proposal we would handle that via nesting.
Nesting makes it fairly easy to encode the idea of "default" variants and uneven branches.
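The nesting idea can be approximated with plain dictionaries. This is a rough sketch rather than Fluent syntax: `NESTED` and `resolve` are hypothetical names, the values are only the Polish verb forms, and which form serves as the "other" branch is an arbitrary choice made for illustration.

```python
# Outer level: plural category of the number of subjects.
# Inner level: gender, with a different set of keys per branch.
NESTED = {
    "one": {
        "masculine": "opublikował",
        "feminine": "opublikowała",
        "other": "opublikowało",  # default for singular (neuter form, assumed)
    },
    "other": {
        "masculine-personal": "opublikowali",
        "other": "opublikowały",  # default for plural (non-masculine-personal)
    },
}

def resolve(tree, *selectors):
    """Walk the tree one selector at a time, falling back to "other"
    at the level where the lookup misses."""
    node = tree
    for value in selectors:
        if not isinstance(node, dict):
            break
        node = node[value] if value in node else node["other"]
    return node
```

Because each level carries its own "other", a plural category such as "few" falls back to the plural branch while the gender is still resolved exactly inside it.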
With Option 2, it becomes more tricky. We can encode it via a single "default", which is limiting because we may resolve the plural perfectly and only struggle with gender.
Alternatively, we may have a default per selector, but that looks clunky.
There may be some other way to encode the defaults, such as denoting them separately, but those options seem increasingly clunky to encode in a human-readable and consistent way.
I'm opening this issue with three thoughts:
default
values