Closed vanitabarrett closed 2 years ago
- As noted in feedback, ensure there is only one key for 'zero’ translations
A quick thought as I read the feedback: isn't this point more linked to the character count component itself (where the meaning of under_limit_zero
and over_limit_zero
are the same) that the pluralisation of translations itself? It feels more up to the component to check "oh, I've got zero, so I use that special key". If that's the case, it's maybe more part of #2805/#2701.
So, I tried to wrap my head around what would need to happen based on the spike and it seems like it's a two part update:
A. I18n
constructor function to receive both locale
and translations
, so it can decide how to pick pluralised form:
locale
option does not appear in the final comment for the Javascript and Nunjucks APIlang
in the spike. Should that be kept? This may cause confusion if the locale in the doc doesn't match the language of the translation strings and it feels safer to be explicit by passing a locale alongside the strings. I may have missed some comments with a decision on this, thoughFeels like this can be implemented on its own ahead ot the changes to the t
method (see below).
B. Then there's the meaty bit of updating I18n.t(key, {count: 123, otherPart:...})
and adding relevant tests:
Compute the suffix to append to the requested key for lookin up the translated string. That suffix depends on count
and the locale set in the I18n object.
In the spike, it uses Intl.PluralRules
and falls back to two maps for mapping:
count
to a suffix that'll be appended to the key
.
Two questions sprung to mind:Lookup the corresponding translation based on that key, showing an explicit error if the key is missing
Finally, interpolate the translation with whatever data is provided (count
included). The issue mentions " Internationalisation logic expanded to introduce support for number formatting, where relevant + tests", but:
a. I couldn't find any discussion on that topic specifically, nor example in the spike (is it a different spike maybe?)
b. This feels orthogonal to pluralising so maybe worth its own issue: people may want to format numbers in regular translations (eg. to show a price with 2 digit decimals).
So, I tried to wrap my head around what would need to happen based on the spike and it seems like it's a two part update:
A.
I18n
constructor function to receive bothlocale
andtranslations
, so it can decide how to pick pluralised form:
- The shape of the options passed to the constructor in the spike allowed for both to be passed, but no longer in the current implementation. Was that to provide the minimal API that'd make the tests pass or because a different direction is taken? (the
locale
option does not appear in the final comment for the Javascript and Nunjucks API
The latter, I believe – there's no need for a locale
option for the functionality implemented so far, so we haven't added it.
I think the intent was to include locale
in the options object, rather than add a new argument?
- The default locale was pulled from the document's
lang
in the spike. Should that be kept? This may cause confusion if the locale in the doc doesn't match the language of the translation strings and it feels safer to be explicit by passing a locale alongside the strings. I may have missed some comments with a decision on this, though
I think there are a few routes we could go down:
$module
by recursively looking for a lang attribute, starting with the $module
itself and up to and including the root element$module
by recursively looking for a lang attribute, starting with the $module
itself and up to and including the root element, but allow this to be overridden by configWith the first two options I think this might mean the component class working out the locale and passing it to the i18n object? Unless we pass the $module
itself so that the i18n class can recurse through the parent elements?
I'm not 100% sure about the second option as it'd mean allowing for both a lang
and a data-locale
attribute on the $module
? 🤔
- Do we need other options? I was thinking specifically about something to let people set the function for picking plural rules, in case we missed a locale. Again, this may be something that was already discussed and I missed
I'd suggest we can add this in the future if we find there's a need for it.
Feels like this can be implemented on its own ahead ot the changes to the
t
method (see below).B. Then there's the meaty bit of updating
I18n.t(key, {count: 123, otherPart:...})
and adding relevant tests:
- Compute the suffix to append to the requested key for lookin up the translated string. That suffix depends on
count
and the locale set in the I18n object. In the spike, it usesIntl.PluralRules
and falls back to two maps for mapping:
- BCP 47/RFC 5646 locale codes to groups of shared ways to handle plural rules,
- then said groups to functions converting the
count
to a suffix that'll be appended to thekey
. Two questions sprung to mind: a. What's the reason for rolling our own, vs. polyfilling? Was it to avoid the weight of another polyfill (15+KB for this one, with a few dependencies for it as well, judging by polyfill-library, not sure if included or not)? b. Kind of tied to A.3. : The maps load all rules for all languages. Were there discussions for loading only what's needed that I may have missed?
Suspect @querkmachine is the best person to answer these ones!
- Lookup the corresponding translation based on that key, showing an explicit error if the key is missing
- Finally, interpolate the translation with whatever data is provided (
count
included). The issue mentions " Internationalisation logic expanded to introduce support for number formatting, where relevant + tests", but: a. I couldn't find any discussion on that topic specifically, nor example in the spike (is it a different spike maybe?) b. This feels orthogonal to pluralising so maybe worth its own issue: people may want to format numbers in regular translations (eg. to show a price with 2 digit decimals).
I think number formatting should be tackled separately in https://github.com/alphagov/govuk-frontend/issues/2568
I think the intent was to include
locale
in the options object, rather than add a new argument?
I should have exemplified there for clarity, sorry. The spike was indeed receiving options as two keys in an object:
{
locale: "en",
translations: {
key1: 'Translated string'
}
}
In its current state, the constructor expects the translations straight away:
{
key1: 'Translated string'
}
I was wondering if this was because of putting in just the code to make tests pass or another decision (as having locale
+ translations
next to each other in an object looks a pretty good option)
I think there are a few routes we could go down:
- calculate the lang for the
$module
by recursively looking for a lang attribute, starting with the$module
itself and up to and including the root element- calculate the lang for the
$module
by recursively looking for a lang attribute, starting with the$module
itself and up to and including the root element, but allow this to be overridden by config- explicitly require it to be set in config
- explicitly require it to be set in config (but we have to 'fall back' to something, either the document or just hardcoded 'en'?)
With the first two options I think this might mean the component class working out the locale and passing it to the i18n object? Unless we pass the
$module
itself so that the i18n class can recurse through the parent elements?I'm not 100% sure about the second option as it'd mean allowing for both a
lang
and adata-locale
attribute on the$module
? 🤔
At the I18n
class level, it feels a bit weird to have to wrangle with the concept of a $module
that's in the DOM and that you can use to look further up for a lang
information. It feels more like the responsibility of the components themselves to gather the right arguments to pass to their I18n
instance.
I like the idea of looking up the DOM for the closest [lang]
and picking that as a fallback. It's a double edged sword, though, as it'd allow to set the value for multiple components at once, but at the same time, risk having components with translation strings in one language and their pluralisation rules in different locale by mistake. Is it complexity we need from the start though, or something that may be easier provided through a base class in v5.0?
Unrelated: All that talk about locale made me think we should likely consider it for the data attributes expected by the character count (added a comment to that effect on that issue).
I'd suggest we can add this in the future if we find there's a need for it.
Suspect @querkmachine is the best person to answer these ones!
I think number formatting should be tackled separately in https://github.com/alphagov/govuk-frontend/issues/2568
👍🏻 x3 😄
I was wondering if this was because of putting in just the code to make tests pass or another decision (as having
locale
+translations
next to each other in an object looks a pretty good option)
Unsure! Possibly another one for @querkmachine when it's back tomorrow…
I like the idea of looking up the DOM for the closest
[lang]
and picking that as a fallback. It's a double edged sword, though, as it'd allow to set the value for multiple components at once, but at the same time, risk having components with translation strings in one language and their pluralisation rules in different locale by mistake. Is it complexity we need from the start though, or something that may be easier provided through a base class in v5.0?
All good points. As you suggest, maybe we should keep it simple and expect an explicit data-i18n.locale
for now, and we can look at doing something 'cleverer' in 5.0?
If we were going to go down that route I think I'd prefer to hardcode 'en' as the fallback rather than looking at the document lang, purely to avoid encouraging users adding a lang
attribute on the document if the whole document isn't translated (e.g. the header and footer are in English, but the contents of the main
are translated). I might be overthinking it though…
For some background context: Part of my reasoning for including locale
as an explicit option in the initial spike was because of the need to hardcode pluralisation rules for legacy IE.
I was envisaging a situation where a user might need to translate a component into a language which the hardcoded rules don't support; for example, Mongolian. However, Mongolian has the same plural forms as English, so a user could choose to use Mongolian translation strings but set the locale
option to 'en' to use English language pluralisation logic.
Since then I've come to think that this functionality is probably extraneous and won't be needed by 98+% of users. It will also not be necessary when we remove JS support from IE, as the appropriate plural form information will already be available in evergreen browsers as Intl.PluralRules
.
If we opt to keep it as a configuration option, I worry that it's more likely to cause confusion unless we make the intended use more explicit: such as splitting the option in two and naming them something like overridePluralRulesLocale
and overrideNumberFormatLocale
.
So, my thinking here was that, rather than allowing the user to provide a locale
property as we did in the spikes, we would instead find and use the lang
attribute of a parent element or from the document level.
My reasoning for this is that the lang
attribute needs to be set anyway for accessibility compliance, both on the page-level and for any sections of the page that are in different languages to the rest of the page (WCAG criterion 3.1.1 (Level A) and 3.1.2 (Level AA)).
By making lang
be the only way of doing this, we would be enforcing (or, at least, very heavily encouraging) users to comply with accessibility guidelines, and prevent issues with one locale being given in data attributes/JS configuration that isn't aligned with the actual language in use. Keeping it simple now, and only adding that complexity if users feel they need it, seems the better route to take.
My hypothesis is that doing it this way shouldn't add unnecessary code weight as we already include an Element.prototype.closest
polyfill in Frontend, though that could always be wrong!
For the other questions I just noticed (cc: @romaricpascal)
a. What's the reason for rolling our own, vs. polyfilling? Was it to avoid the weight of another polyfill (15+KB for this one, with a few dependencies for it as well, judging by polyfill-library, not sure if included or not)?
Avoiding polyfills and their dependencies and simplicity of implementation. We don't actually need a lot of the PluralRules functionality at the moment, we only need to support positive integers (not negatives, decimals, and ranges as the spec supports).
b. Kind of tied to A.3. : The maps load all rules for all languages. Were there discussions for loading only what's needed that I may have missed?
I included a whole bunch of languages in the spike mainly for completion's sake (i.e. the work is already done and there if we need it). In practice, services are only provided in a few different languages—English, Welsh, and very occassionally Arabic—although GOV.UK content pages use many more. I imagined we would be able to cut down on the size of these if and when we had a better idea of what languages to prioritise.
We don't have any existing architecture for Frontend to conditionally load certain files, or having users load specific configuration files, which is why they're all bundled together for now. We've discussed using dynamic import
s in the past, but this would be something we would introduce post-IE support, at which point we'd be removing the hardcoded plural rule maps anyway.
What
Expand our internationalisation logic to handle pluralisation. This will allow users to be able to translate hardcoded strings in the character count component.
For example: “You have 1 character remaining” vs “You have 20 characters remaining”
Needs to handle languages other than English, that may have different plural forms, see https://cldr.unicode.org/index/cldr-spec/plural-rules
Why
Some components will need to be able to switch between different strings based on the value of something, e.g: the character count component will need to switch the count message based on the count (see example above)
Who needs to work on this
Developers
Who needs to review this
Developers
Done when