mdn / content

The content behind MDN Web Docs
https://developer.mozilla.org

Region is not mandatory part of locale string #33687

Closed hamishwillee closed 3 days ago

hamishwillee commented 1 week ago

Fixes #33685

As noted in the linked issue, the region part of the locale identifier string is not mandatory. IMO this is just a leftover from when the page was created.

FWIW I think this page could do with more work, @Josh-Cena. In particular, the description is over the top and enthusiastic in an unhelpful way. Specifically:

> The region is an essential part of the locale identifier, as it places the locale in a specific area of the world. Knowing the locale's region is vital to identifying differences between locales.

"Essential" implies mandatory, which the region is not. "Vital" implies both mandatory and extremely important. IMO the region is just another important part of a locale identifier string, one that you can choose to use to refine your locale specificity.

We might perhaps do something like:

> Using a region indicates a region-specific preference for a locale variant, allowing selection for differences between the same language in, say, different countries.
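To make the "not mandatory" point concrete, here is a minimal sketch using the standard `Intl.Locale` constructor. The variable names are illustrative only and do not come from the page under review.

```javascript
// Sketch: the region subtag is optional in a locale identifier.
const languageOnly = new Intl.Locale("en");
const withRegion = new Intl.Locale("en-US");

// With no region subtag, the `region` accessor returns undefined.
const missingRegion = languageOnly.region; // undefined
const presentRegion = withRegion.region;   // "US"

console.log(languageOnly.baseName, missingRegion); // en undefined
console.log(withRegion.baseName, presentRegion);   // en-US US
```

Both identifiers are accepted by the constructor; the region only narrows the locale further.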

github-actions[bot] commented 1 week ago

Preview URLs

External URLs (1) URL: [`/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/baseName`](https://pr33687.content.dev.mdn.mozit.cloud/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/baseName) Title: `Intl.Locale.prototype.baseName` - (1 time) (Note! This may be a new URL 👀)

(comment last updated: 2024-05-31 10:23:35)

jackdeguest commented 1 week ago

> Using a region indicates a region-specific preference for a locale variant, allowing selection for differences between the same language in, say, different countries.

Be careful: a region is not just a country code. It can also be a world region, such as the code 150 for Europe. Under LDML, a region is therefore either a two-letter country code (including some obsolete ones, kept for consistency) or a three-digit world region code.
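As a rough illustration of the point about world regions, the sketch below passes three-digit region codes to `Intl.Locale`. The identifiers `en-150` and `es-419` are examples chosen here, not taken from the page. The display names printed at the end depend on the locale data shipped with the runtime, so no exact output is claimed for them.

```javascript
// Sketch: a region subtag can be a three-digit world region code.
const europeanEnglish = new Intl.Locale("en-150");
const latinAmericanSpanish = new Intl.Locale("es-419");

const europeRegion = europeanEnglish.region;            // "150"
const latinAmericaRegion = latinAmericanSpanish.region; // "419"
console.log(europeRegion, latinAmericaRegion);          // 150 419

// Human-readable names come from the runtime's locale data.
const regionNames = new Intl.DisplayNames(["en"], { type: "region" });
console.log(regionNames.of(europeRegion), regionNames.of("GB"));
```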

hamishwillee commented 1 week ago

> Be careful: a region is not just a country code. It can also be a world region, such as the code 150 for Europe. Under LDML, a region is therefore either a two-letter country code (including some obsolete ones, kept for consistency) or a three-digit world region code.

Yes. Would you be happier with the following? (if not, propose your own words):

> Using a region indicates a region- or country-specific preference for a locale variant, allowing selection for language differences between the same language in, say, different countries.

jackdeguest commented 1 week ago

> Using a region indicates a region- or country-specific preference for a locale variant, allowing selection for language differences between the same language in, say, different countries.

May I suggest using "locale identifier" rather than "locale variant", because a variant is also a component of a locale under LDML? Also, let us not confuse language with locale.

A locale could be something such as `ja-t-de-AT-t0-und-x0-medical-u-ca-japanese-tz-jptyo-nu-jpanfin-x-private-subtag`, whereas its language part would only be `ja`. So maybe this instead. What do you think?

> Using a region indicates a region- or country-specific preference for a locale identifier, allowing selection for differences between the same locale in, say, different countries.
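The language/locale distinction can be sketched with a shorter identifier than the one above. The tag `ja-JP-u-ca-japanese-nu-jpanfin` is an assumption made for this example; it is not the identifier quoted in the comment.

```javascript
// Sketch: the language is only one part of a full locale identifier.
const locale = new Intl.Locale("ja-JP-u-ca-japanese-nu-jpanfin");

const languagePart = locale.language;    // "ja"
const fullIdentifier = locale.toString(); // "ja-JP-u-ca-japanese-nu-jpanfin"

console.log(languagePart);
console.log(fullIdentifier);
```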

hamishwillee commented 1 week ago

@jackdeguest I think you know a lot more about locales than I do. I've added this into the PR. Let's see what the reviewer says.

jackdeguest commented 1 week ago

> @jackdeguest I think you know a lot more about locales than I do. I've added this into the PR. Let's see what the reviewer says.

That's because I recently did quite a bit of research on it and read the Unicode specifications, although not thoroughly, in order to create the Perl module Locale::Unicode.

I find LDML fascinating, and the amount of hard work and thinking they put into it is quite amazing.

Josh-Cena commented 1 week ago

Thanks, I'll hopefully take a look tonight or tomorrow (UTC+8).

Josh-Cena commented 3 days ago

@jackdeguest I've made some changes, mostly to make the pages consistent with `collation`, `numberingSystem`, etc. While I didn't change much of the meaning, I am particularly confused about this:

> Using a region indicates a region- or country-specific preference for a locale identifier, allowing selection for differences between the same locale in, say, different countries.

We are selecting the region within the same language, not the same locale, right? We are refining the language en to either en-US or en-GB, which are two different locales, so it sounds more accurate to me to say "differences between the same language in, say, different countries".
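A small sketch of that refinement: the same language `en` is narrowed to two different locales through the `region` option of the `Intl.Locale` constructor. The date and the `timeZone` option are arbitrary choices for the example. The formatted strings depend on the runtime's locale data, so they are printed without a claimed output.

```javascript
// Sketch: refine the language "en" into two distinct locales.
const american = new Intl.Locale("en", { region: "US" });
const british = new Intl.Locale("en", { region: "GB" });

console.log(american.baseName, british.baseName); // en-US en-GB
console.log(american.language === british.language); // true

// Same language, different regional conventions (for example, date order).
const date = new Date(Date.UTC(2024, 0, 2));
const options = { timeZone: "UTC" };
const usDate = new Intl.DateTimeFormat(american.baseName, options).format(date);
const gbDate = new Intl.DateTimeFormat(british.baseName, options).format(date);
console.log(usDate, gbDate);
```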

jackdeguest commented 3 days ago

> We are selecting the region within the same language, not the same locale, right? We are refining the language en to either en-US or en-GB, which are two different locales, so it sounds more accurate to me to say "differences between the same language in, say, different countries".

Yes, I concur. It is better this way.

This is because a language is a two- or three-character identifier, while a locale is a combination of a language, a script, a region (a.k.a. territory, such as GB, or 150 for Europe), a variant, and additional Unicode extensions, including BCP 47 ones.

See also the Unicode specifications about what a locale is.
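For illustration, this is roughly how those components surface on an `Intl.Locale` object. The tag `zh-Hant-TW-u-nu-hanidec` is an example picked here; it has a script, a region and one Unicode extension, but no variant.

```javascript
// Sketch: read the individual components of a locale identifier.
const parsed = new Intl.Locale("zh-Hant-TW-u-nu-hanidec");

const components = {
  language: parsed.language,               // "zh"
  script: parsed.script,                   // "Hant"
  region: parsed.region,                   // "TW"
  baseName: parsed.baseName,               // "zh-Hant-TW" (no extensions)
  numberingSystem: parsed.numberingSystem, // "hanidec" (from the -u-nu- extension)
};

console.log(components);
```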