osmandapp / OsmAnd

OsmAnd
https://osmand.net

Bad transliteration from Serbian #13900

Open DujaOSM opened 2 years ago

DujaOSM commented 2 years ago

Description

OsmAnd transliterates the Serbian Cyrillic characters ⟨љ⟩, ⟨њ⟩, ⟨џ⟩ and ⟨ј⟩ as ⟨ĺ⟩, ⟨ň⟩, ⟨ď⟩ and ⟨ǰ⟩ respectively (or thereabouts; the result resembles Slovak orthography). The standard, straightforward transliterations are ⟨lj⟩, ⟨nj⟩, ⟨dž⟩ and ⟨j⟩, with uppercase ⟨Lj⟩, ⟨Nj⟩, ⟨Dž⟩ and ⟨J⟩. Even Slovak Wikipedia has articles on Kraljevo, Vranje and Kragujevac.
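The mapping described above can be sketched as a plain character table. This is only an illustration, not OsmAnd's actual code: the names `SR_CYR_TO_LAT` and `transliterate_sr` are hypothetical, and the sketch assumes an uppercase Cyrillic letter always becomes a title-case digraph (⟨Lj⟩, not ⟨LJ⟩), which is not ideal for names written in all capitals.

```python
# Hypothetical sketch of Serbian Cyrillic -> Latin transliteration.
# Not taken from OsmAnd; the table follows the standard Serbian Latin alphabet.
SR_CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def transliterate_sr(text: str) -> str:
    """Transliterate Serbian Cyrillic to Latin, leaving other characters as-is."""
    out = []
    for ch in text:
        lower = ch.lower()
        latin = SR_CYR_TO_LAT.get(lower)
        if latin is None:
            # Not a Serbian Cyrillic letter (Latin text, digits, punctuation).
            out.append(ch)
        elif ch != lower:
            # Uppercase input: capitalize only the first letter of a digraph
            # (assumption: title case is wanted, e.g. "Џ" -> "Dž").
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


for name in ("Врање", "Краљево", "Крагујевац", "Џеп"):
    print(name, "->", transliterate_sr(name))
```

Because the table is one-to-one per letter, no language-specific context or name:sr-Latn tag is needed for these names.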

The automatic transliteration I describe is simple and universal. There is no need to resort to name:sr-Latn as requested in #9393. Most of the problems with transliteration in that issue have been resolved, so I'm opening a new issue instead (I guess that one can be closed).

How to reproduce?

Find Vranje, Kraljevo, Kragujevac or Džep on OSM. While the search succeeds, the names are rendered as Vraňe, Kraĺevo, Kraguǰevac and Ďep instead.

Your Environment

OsmAnd Version: 4.1
Android/iOS version: iOS 15.3.1
Device model:

Maps used (online or offline):
Map of Serbia

Flexmaen commented 1 year ago

I think the transliteration is not the same for all languages. For example Ћуприја is transliterated to Tshuprija in German and shown as Cuprija in English.

Transliteration might also remove diacritics, see https://github.com/osmandapp/OsmAnd/issues/9413 - and, interestingly, I got different results for the Serbian map on different devices, see: https://github.com/osmandapp/OsmAnd/issues/5954#issuecomment-1592884425

DujaOSM commented 1 year ago

Well no, foreign toponyms in (local) Latin-based alphabets are simply rendered as-is in modern German practice. So the correct transliteration for most languages is "Ćuprija" (btw, that's how it's rendered for me in OsmAnd 4.4.9 for iOS, with the English locale). In no modern German variant should it be spelled "Tshuprija".

Stripping diacritics from displayed names is also not proper practice. It is certainly good, though, that the search function can find a normalized string ("Cuprija" for Ćuprija, "Koeln" for "Köln", "Djokovic" for "Đoković").
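To illustrate the distinction, here is a minimal sketch of a search key that is normalized while the displayed name stays untouched. It is not OsmAnd's search code: `search_key` and the `EXPANSIONS` table are hypothetical, and the table only covers the conventions from the examples above (⟨đ⟩ as "dj", German umlauts as "oe"/"ae"/"ue"). A real index would probably also need the plain variant ("Koln") as an extra key.

```python
import unicodedata

# Hypothetical expansion table: letters whose conventional ASCII spelling
# is not just "base letter without the accent". "đ" has no Unicode
# decomposition at all, so NFD stripping alone would leave it unchanged.
EXPANSIONS = {"đ": "dj", "ö": "oe", "ä": "ae", "ü": "ue", "ß": "ss"}


def search_key(name: str) -> str:
    """Return a lowercase ASCII-ish key for matching; never use it for display."""
    # Compose first so "o" + combining diaeresis is seen as a single "ö".
    lowered = unicodedata.normalize("NFC", name).lower()
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in lowered)
    # Decompose the rest and drop combining marks (ć -> c, č -> c, š -> s).
    decomposed = unicodedata.normalize("NFD", expanded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, name: str) -> bool:
    """True if the query and the stored name share the same search key."""
    return search_key(query) == search_key(name)


print(matches("Cuprija", "Ćuprija"))
print(matches("Koeln", "Köln"))
print(matches("Djokovic", "Đoković"))
```

The key is computed for both the query and the stored name, so the map label can keep its diacritics.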

Speaking of this bug, the situation has been much improved and many places are now spelled properly. However, I also see varying results for transliterations on this very device:

As far as I can tell, it tries to render name:en tag and if it is not present, it falls back to the "Slovak" algorithm I originally complained about.
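The behaviour guessed at above could look roughly like the following. This is purely a sketch of that guess, not OsmAnd's rendering code: `display_name` is a hypothetical function, `tags` stands for an OSM object's tag dictionary, and the `transliterate` argument is whatever fallback algorithm the app uses.

```python
from typing import Callable, Mapping


def display_name(
    tags: Mapping[str, str],
    lang: str,
    transliterate: Callable[[str], str],
) -> str:
    """Pick the label for a map object (hypothetical fallback order)."""
    # 1. Prefer an explicit localized tag such as name:en.
    localized = tags.get(f"name:{lang}")
    if localized:
        return localized
    # 2. Otherwise transliterate the native name; this is the step where
    #    the choice of algorithm determines what the user sees.
    return transliterate(tags.get("name", ""))


# Stand-in fallback that just marks its input, to show which path was taken.
mark = lambda s: f"<{s}>"

print(display_name({"name": "Врање", "name:en": "Vranje"}, "en", mark))
print(display_name({"name": "Врање"}, "en", mark))
```

Under this reading, places with a name:en tag look correct while untagged ones expose the fallback, which would explain mixed results on one device.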

Flexmaen commented 1 year ago

I also think that diacritics shouldn't be stripped down.

Not sure what you mean by "modern German", but I also wonder whether Tshuprija makes sense. On Wikipedia, Serbian names don't seem to differ from the English transliterations. For Russian there is often a difference in the names of people (e.g. Gorbatschow vs Gorbachev), but this doesn't seem to apply to Serbian. This might be a different discussion anyway.