Closed rec closed 4 years ago
@rec in the ISO 639-1 standard all language ALPHA-2 codes are lowercase. And whether or not Bahasa is used in front of Indonesia, and whether it's lowercase or not, seems to depend on the language being spoken, at least according to references such as https://www.loc.gov/standards/iso639-2/php/code_list.php, which is news to me. I've done localization coding for a few sites as a web developer and hadn't noticed that in the past. Do you have other references that suggest otherwise for a generic en page? If so I'd love to see them. Localization is a very interesting topic for me.
Yes, I'm also really interested in localization! :-)
Let's start with the text. I speak Indonesian, though it has probably regressed to C2 by this point, but it is certain to me that "Indonesia" is always the country, in the same way that "France" is always France (in French).
Native speakers just say bahasa or, in writing, sometimes BI.
But "Indonesia (id)" looks like a country name, like "de (Deutschland)" would.
Regarding (independently) capitalization
ISO 639-1 only standardizes the 2-letter codes (the 3-letter ones come from ISO 639-2 and 639-3), not the full name of the language! From reading your page, it struck me that you intended to have a series of pairs looking like this:
name of the language as it appears in that language (language code)
i.e.
Türkçe (tr)
(which example from the page was definitely motivating for me!)
By that standard, these ones just jump out as "wrong" to me as a speaker of these languages
deutsch (de)
english (en)
indonesia (id)
because you just never ever see "deutsch" or "english" or "indonesia" in any printed material anywhere, but always Deutsch, English, or Indonesia.
(If you don't believe me, search for deutsch, and then try to find a lowercase version by clicking to the next page! I gave up. :-D)
So it just "reads as wrong".
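To make the complaint concrete, here is a minimal Python sketch of the check being done by eye above. It assumes the labels follow the "name (code)" pattern quoted in this comment; the `CAPITALIZED` set and the function name `miscapitalized` are hypothetical, and the set only encodes the claims made in this thread, not any standard.

```python
import re

# Labels quoted in this thread, in the form "name (code)".
labels = ["deutsch (de)", "english (en)", "indonesia (id)", "Türkçe (tr)"]

# Hypothetical set: codes whose languages capitalize their own name,
# based only on the claims made in this discussion.
CAPITALIZED = {"de", "en", "id", "tr"}

# A name, a space, then a lowercase two-letter code in parentheses.
LABEL_RE = re.compile(r"^(?P<name>.+) \((?P<code>[a-z]{2})\)$")


def miscapitalized(labels):
    """Return labels whose name starts lowercase although its code is in CAPITALIZED."""
    flagged = []
    for label in labels:
        match = LABEL_RE.match(label)
        if match is None:
            raise ValueError(f"not in 'name (code)' form: {label!r}")
        if match["code"] in CAPITALIZED and match["name"][0].islower():
            flagged.append(label)
    return flagged


print(miscapitalized(labels))
```

With the four labels above, this flags the three lowercase entries and leaves Türkçe alone.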
I did a quick check of the other languages, only a few of which I have any knowledge of, and they seemed to be correct. Germanic languages capitalize language names in general, and Indonesian/Malay were copying English when they formalized their spelling and capitalization rules.
I agree with all those points. I guess I just meant to say that the language names all being lowercase could be more the result of normalization efforts. Though as it is a page specifically marked as en in the HTML (<html lang="en" dir="ltr">), all the languages should be capitalized, with the possible exception of languages that don't have upper and lower case variants but are still referenced using Romanized lettering. In that instance I believe lower case is appropriate as that is the primary case for the majority of letters in any sentence. Though that is just my musings on the subject. I don't know of a specific standard. I would also at the very least expect indonesian instead of indonesia as you note. 👼
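As a rough illustration of the "scripts without case" point, here is a small Python sketch. The helper names `has_case` and `display_name` are hypothetical, the sample name 日本語 is only an illustration and not taken from the page, and Python's `str.upper()` is not locale-aware, so this is a sketch rather than a real localization rule.

```python
def has_case(name: str) -> bool:
    """True if any character in name has distinct upper and lower forms."""
    return any(ch.lower() != ch.upper() for ch in name)


def display_name(name: str) -> str:
    """Uppercase the first character only when the name's script has case."""
    if not has_case(name):
        return name  # nothing to capitalize in a caseless script
    return name[:1].upper() + name[1:]


for sample in ["deutsch", "english", "türkçe", "日本語"]:
    print(sample, "->", display_name(sample))
```

Names in a caseless script pass through unchanged, so a blanket "capitalize everything" rule would only affect the cased ones.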
From the first page of a google search for "deutsch" :) deutsch - Wiktionary https://en.wiktionary.org/wiki/deutsch
That's an adjective, not a noun! :-D
Anyway, we are spending too much time on this little minute thingie. Sorry! :-)
All lower case would also make some sense, though offend my delicate aesthetics :-D but at this point it isn't that either because Turkish is capitalized.
So given changes have to be made.... I would have just emitted a code review but I dunno where this text lives.
Anyway, the semver page rocks, and I hope you have a great week!
The website is hosted in another repository and the languages seem to be defined and configured here: https://github.com/semver/semver.org/blob/388ffb9bd81fe70f1945c38b38e8f6f74aa04432/_config.yml#L6-L32 A PR to that file should be able to fix all these issues.
As an Indonesian, "Indonesia" is OK; we can easily recognize that it will switch the page to the Indonesian language.
Websites by Indonesian companies usually use 'Indonesia' and 'English' as the dropdown/button labels to change locale.
Can we please close this thread? It is linked to from the above referenced PR on the semver/semver.org site (where this discussion belongs), will still be read/writable, and closing it in no way indicates whether the changes in the PR will be accepted.
Typically issues with associated PRs on github remain open until the PR is resolved.
Investigated a bit and didn't find any ISO spec covering the native language names (endonyms) that we are talking about. Instead, I found this doc from the Unicode CLDR project: http://cldr.unicode.org/translation/translation-guide-general/capitalization
Beginning with CLDR 22, the guidance is that names of items such as languages, regions, calendar and collation types ... Regarding the capitalization of months and weekdays, please apply middle-of-sentence capitalization rules even on stand-alone items. In your language, if month and day names are generally lower case in the middle of the sentence, then please apply this same rule (lower case) to both formatting and standalone values. ... However, it is also important to ensure that there is consistent casing for all of the items in a section, so before making any changes, be sure to get agreement among all the translators for your language — otherwise the capitalization of items in a section may appear random.
Read more about why it's not standardized here: https://en.wikipedia.org/wiki/Exonym_and_endonym
In addition, a few unofficial resources with autonyms that we can use:
Reading these documents now... gee, someone did all the work already, how nice. (Also just discovered Blissymbols from the second document.)
Very instructive. Thanks!
A quibble at the top of https://semver.org/: "Indonesia" is the country, "bahasa Indonesia" the language.
While I'm here, "English" and "Deutsch" should be capitalized.
I quickly checked most of the other language names written in Roman or Cyrillic letters and I'm fairly sure the capitalization is correct for those (ie, capitalized for Türkçe and lower case for the rest).
I would have made a pull request for these, except that I cannot find the language names in any of the four files in this repository...
Thanks for a really useful page!