Journaly / journaly

A foreign language journaling application.
https://journaly.com

Language names in settings inconsistent and in some cases plain wrong #521

Open l8arrival opened 3 years ago

l8arrival commented 3 years ago

Right now the language names in the settings (as shown to you to select your native (source) and learning (target) languages) seem to be an inconsistent mix: some languages are shown in their English spelling (and Latin script), and some are shown in their native spelling (and sometimes native script). There is also at least one, Romanian, that is just flat out wrong.

I would first of all argue that if a person is using Journaly with a certain display language (e.g. English), then language names should match those used by speakers of that display language. E.g. "French", not "Français".

Here are some errors I spotted, in a cursory examination. There are more:

Romanian: This is shown as "Român". This would be referring to a Romanian person (male, or gender unknown), not the language. In Romanian, we call the language "româneşte" or "limba română". Here's a link to a good explanation: https://forum.wordreference.com/threads/rom%C3%A2n-rom%C3%A2nesc-rom%C3%A2ne%C5%9Fte.1093712/#post-5679901

Spanish: This is shown as "Español". In Latin America, it is indeed "español" (lowercase though). But in Spain it would be wrong and probably offensive to use "español" for the name of the language; there it is actually "castellano", which is just one of the languages of Spain.

Basque: This is shown as "Basque". But if other languages are shown with local names/spelling, why is it not "euskara", which is what everybody in Spain, including the Basques themselves, calls it?

Note also, all the languages above are written natively without a leading capital letter. But since this is a dropdown list in Journaly, I still think a leading capital letter makes sense.

There are a number of other languages I can see that use the English language name, e.g. "Bulgarian". So why do some use the native language name?

Lanny commented 3 years ago

Hi @l8arrival, thanks for opening an issue. The short answer to "why do some use the native language name (and others don't)" is that we're a small team of all native English speakers and don't always know what the endonyms for languages are.

We use endonyms rather than English exonyms because it seems more accommodating to those who don't speak English at a fairly high level. E.g. if you are an Italian native speaker learning German, you're probably a lot more likely to recognize "Deutsch" than the English name "German". Ideally we'd show Italian users whatever the Italian name for Deutsch is, but this creates a combinatorial problem: with N languages, every language needs a name in every other language. As you can tell from the Romanian case, it can be subtle and error prone just to figure out what a language's own name for itself is, so building an N by N translation matrix of languages is out of scope for the moment.
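As a rough illustration of that combinatorial problem, here is a minimal sketch comparing how many names need to be maintained under each approach. The sample data and function names are made up for illustration and are not Journaly's actual schema.

```python
# Hypothetical sample data; NOT Journaly's actual language table.
ENDONYMS = {
    "de": "Deutsch",
    "es": "Español",
    "eu": "Euskara",
    "fr": "Français",
    "ro": "Română",
}


def endonym_entries(n_languages: int) -> int:
    """One name per language: each language is named in itself."""
    return n_languages


def full_matrix_entries(n_languages: int) -> int:
    """Every language named in every language: N * N entries."""
    return n_languages * n_languages


n = len(ENDONYMS)
print(f"{n} languages: {endonym_entries(n)} endonyms "
      f"vs {full_matrix_entries(n)} matrix entries")
```

The endonym list grows linearly with the number of supported languages, while the full matrix grows quadratically, and each entry carries the same risk of subtle mistakes as the Romanian one.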

Given that, we appreciate tickets like this so we can get the endonyms right. I've updated the following languages in the DB:

I'm on the fence about capitalization. I'd actually lean toward using whatever the native convention is (e.g. "español" over "Español"), but maybe it would look odd to have different capitalizations in a list.

Also unsure about how to handle "español" vs. "castellano". While this is sample size one, as a non-native speaker I think "español" is a lot more recognizable than "castellano", and I assume Spaniards would understand why "español" is used, even if they might bristle at it a bit. We could split the language and consider the European and American dialects distinct (I believe we do this with Portuguese), but that isolates the two populations from each other in terms of feedback/searchability, and my understanding is the dialects are pretty highly mutually intelligible, so I'm not exactly crazy about that idea.

@robin-macpherson and @candacepowell do either of you have strong feelings on the last two points?

l8arrival commented 3 years ago

In a list of languages in a software UI, I would just use "română" instead of "limba română". It's understood we are talking about languages so the "limba" is extraneous, and it would be weird to have it in front of every language name, e.g. the same list would show "franceză" for French, not "limba franceză".

On "español vs castellano", it is the same language, and it would be a shame to split them up. I don't think it's comparable to Brazilian vs European Portuguese, as there are many more regional differences in Spanish all over the place, e.g. the Spanish spoken in Andalusia is definitely closer to Caribbean Spanish than it is to the Spanish spoken in Castilla and Leon. But you've got 250-300 million or so people calling it by one name, and 50 million people using another. I'm a C1 in Spanish, and have spoken hundreds of hours with people in Spain, and listened to hundreds of hours of podcasts from there, and they pretty much always say "castellano".

Anyway, I think people are used to seeing "español" on web sites. I just asked a few friends and they agree. My friend in Colombia said, "El uso de "castellano" en lugar de español, es una pendejada de la gente de España.", i.e. roughly translated, "the use of "castellano" instead of "español" is bullshit from the people of Spain" :-) My friend in Spain said, "Well, in Spain we use both words, "español" and "castellano", to talk about the same thing because they're synonyms. On the other hand, when we talk about the Spanish language in an international context like a website, we always use the word "Español". In short, "castellano" is a word that we only say or use inside Spain and "español" inside and outside. Look at the website of the Lugo city hall or an official government site: http://administracion.gob.es/. Those sites always use the word "Español". So, my recommendation is to use "Español"."

So probably best to leave it as "Español", even at the risk of annoying a few people.

Finally, I still think it's a lot more natural not to use endonyms but rather to use the proper name in the display language. This is what people are used to in software UIs. For example, I can open up Netflix, where I have a profile in English and a profile in Spanish. When I go to the former to select a subtitle language, I see "Spanish" and "European Spanish" as choices, and when I go to the latter, I instead see "español" and "español de españa".

Lanny commented 3 years ago

> In a list of languages in a software UI, I would just use "română" instead of "limba română".

Awesome, made this change.

> So probably best to leave it as "Español", even at the risk of annoying a few people.

That's good with me, and great to get input from native speakers in Spain!

> Finally, I still think it's a lot more natural not to use endonyms but rather to use the proper name in the display language

I don't disagree; as a monolingual English speaker I'd never have guessed that "Euskara" refers to what I know as "Basque" (although I probably would if I were learning that language). That said, we have the UI translated into two languages right now, so if we switched to UI-language exonyms, anyone who didn't speak English or German well enough to identify their target language's exonym in one of those languages would be in a tough spot.

I think eventually we do want to translate language names into a user's UI language, but I'd want us to get a better spread of fully translated UI languages, and see the list of target languages stabilize (I think the rate of adding new languages has slowed down a bit, so we're probably getting close there).
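One way that eventual approach could look, sketched with made-up tables: show the name in the user's UI language when a translation exists, and fall back to the endonym otherwise. The dictionaries and the `language_label` function are hypothetical and only illustrate the idea.

```python
# Hypothetical data for illustration; names and structure are assumptions,
# not Journaly's actual tables.
ENDONYMS = {"de": "Deutsch", "es": "Español", "eu": "Euskara", "ro": "Română"}

# Exonyms keyed by (ui_language, target_language); only partially filled in.
EXONYMS = {
    ("en", "de"): "German",
    ("en", "es"): "Spanish",
    ("en", "eu"): "Basque",
    ("en", "ro"): "Romanian",
    ("de", "es"): "Spanisch",
    ("de", "eu"): "Baskisch",
}


def language_label(ui_language: str, target_language: str) -> str:
    """Return the name in the UI language if known, else the endonym."""
    return EXONYMS.get((ui_language, target_language), ENDONYMS[target_language])


print(language_label("en", "eu"))  # translated name exists for an English UI
print(language_label("it", "de"))  # no Italian table yet, so the endonym is used
```

With a fallback like this, the translated names can be filled in gradually as UI languages are added, and users of an untranslated UI language still see the endonyms they see today.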

La-Catalaneta commented 3 years ago

Hi, Alex from Catalonia here.

"Español" seems totally fine to me! There is always an argument between "castellano" and "español", but I think it's safe to leave it as "Español". People here call Basque both "euskera" and "vasco", so "Euskara" feels nicer to me as its native name. I think using the native name of the language is the better option, especially for users not fluent in English, as you said.

Keep up the good work! :)