hyperknot / openfreemap

Free and open-source map hosting solution with custom styles for websites and apps, using OpenStreetMap data
https://openfreemap.org/

Some Japanese place names are transliterated to their alphabetical form using Chinese phonetic readings. #24

Open stuartcw opened 2 months ago

stuartcw commented 2 months ago

I'm scrolling around the town where I live in Japan on https://openfreemap.org and I notice that many Japanese place names have been transliterated into alphabetical strings in the way that they would be read in Chinese.

It appears that the phonetic reading generation is being done by some automatic process which is selecting the Chinese way of transliterating them instead of the Japanese.

This is very tricky to get right, as there is no foolproof way of obtaining the phonetic reading from Japanese kanji characters. (Hiragana and katakana phonetic scripts are trivial.)

The same written place name can have one reading in one part of Japan and a different phonetic reading in another. Even Japanese readers have to check to be sure how to read a place name that they are unfamiliar with.

Difficult place names are quite rare, but their existence means that you first have to look up whether a place name is an exception and use the correct reading if it is, and only then fall back to a dictionary of the common transliterations.

The current Chinese transliteration of Japanese place names is not useful at all and is actually harmful, so I would suggest turning it off until a better solution or fix is found.
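The lookup order described above (exceptions first, then the common readings) could be sketched roughly as follows. This is only a minimal illustration: `EXCEPTION_READINGS`, `COMMON_READINGS` and `romanize_place` are hypothetical names, and the tables hold a couple of illustrative entries rather than a real gazetteer.

```python
# Hypothetical lookup tables; the entries are illustrative only.
# Exceptions are keyed by (name, region) because the same kanji can be
# read differently in different parts of Japan.
EXCEPTION_READINGS = {
    ("日本橋", "Tokyo"): "Nihonbashi",
    ("日本橋", "Osaka"): "Nipponbashi",
}
COMMON_READINGS = {
    "久里浜": "Kurihama",
}


def romanize_place(name, region):
    """Return a romanized reading, or None when no reading is known."""
    # 1. Region-specific exceptions take priority.
    reading = EXCEPTION_READINGS.get((name, region))
    if reading is not None:
        return reading
    # 2. Fall back to the dictionary of common readings.
    #    A None result lets the caller show the original name
    #    instead of guessing a reading.
    return COMMON_READINGS.get(name)


print(romanize_place("日本橋", "Osaka"))
print(romanize_place("久里浜", "Kanagawa"))
```

Returning `None` for unknown names, rather than generating a reading automatically, matches the suggestion to show nothing instead of a wrong transliteration.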

hyperknot commented 2 months ago

Hi @stuartcw and thanks for your report.

Your insight is valuable because I have no idea about Japanese place names, nor, probably, do most OSM developers.

As a first step, can you find out if there is any language in the dataset with correct names?

You can do this by going to Maputnik / View / Inspect here: https://maputnik.github.io/editor?style=https://tiles.openfreemap.org/styles/bright

Then when you click on one of the little red dots you'll see the names in all possible languages. Do any of them look good to you?

hyperknot commented 2 months ago

Like this:

image

stuartcw commented 2 months ago

I have had a quick look and it seems that there is a mix of correct and incorrect data. For example, for subclass:bus_stop the name:latin is the Chinese reading, whereas for subclass:station the name:latin is correctly transliterated.

For the incorrect subclasses it seems like the int_name (internationally recognized name) is the Chinese transliteration and this is mirrored to the name:latin. In the more prominent features, like the example above (prefecture name), all languages seem OK. I think the problem is limited to the more local labels such as bus_stop, pub etc.

I can look through the other subclasses later and check them if you like.

hyperknot commented 2 months ago

I think it probably changes on an item-by-item basis. The important ones are city names, etc. So the big question is whether there is a data transformation issue between OpenStreetMap -> OpenMapTiles or whether the problem is in OpenStreetMap itself.

For example for Iwate Prefecture, I found this in OSM: https://www.openstreetmap.org/relation/3792412 and also this: https://nominatim.openstreetmap.org/ui/details.html?osmtype=R&osmid=3792412&class=boundary
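One rough way to separate the two cases is to compare the label seen in the tiles with the raw OSM tags: if no tag carries that exact string, it was most likely generated somewhere downstream of OSM. The sketch below assumes the tags have already been copied into a dict by hand. `find_source_tags` is a hypothetical helper and the tag values are illustrative, not the actual tags of the relation above.

```python
def find_source_tags(osm_tags, tile_label):
    """Return the OSM tag keys whose value equals the label shown in the tiles."""
    return sorted(key for key, value in osm_tags.items() if value == tile_label)


# Illustrative tags only (assumed, not fetched from OSM).
osm_tags = {
    "name": "岩手県",
    "name:en": "Iwate Prefecture",
}

# A label that exists verbatim in OSM points at the OSM data itself.
print(find_source_tags(osm_tags, "Iwate Prefecture"))

# An empty result suggests the label was produced by a later processing step.
print(find_source_tags(osm_tags, "a label that is not in any tag"))
```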

hyperknot commented 2 months ago

Wrote a document for debugging international names: https://github.com/hyperknot/openfreemap/blob/main/docs/debugging_names.md

stuartcw commented 2 months ago

Here’s one example; all the minor features in the area behave the same way. The problem is that the displayed name is "jiǔ lǐ bāngF・marinosu tōngri", which is the Chinese rendering of the name. This has no meaning in Japanese or English. When I use OpenStreetMap I see "Kurihama F. Marinos Dori" as expected. I only see the Chinese name "jiǔ lǐ bāngF・marinosu tōngri" when inspecting in Maputnik or viewing openfreemap.org.

久里浜F・マリノス通り

https://www.openstreetmap.org/way/159187467#map=18/35.232067/139.701945

https://nominatim.openstreetmap.org/ui/details.html?osmtype=W&osmid=159187467&class=highway

https://maputnik.github.io/editor/#17.25/35.232219/139.702426

1591874672

name_int: jiǔ lǐ bāngF・marinosu tōngri
name_latin: jiǔ lǐ bāngF・marinosu tōngri
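Values like these could in principle be flagged automatically. The sketch below is a heuristic, not a fix: it assumes that a name containing kana is Japanese, and that acute, grave or caron tone marks in the Latin string indicate pinyin. Macrons are deliberately ignored because Hepburn romanization uses them too (e.g. "Tōkyō"). The function names are hypothetical.

```python
import unicodedata

# Combining grave, acute and caron: pinyin tones 4, 2 and 3.
# The macron (tone 1) is skipped because Hepburn romanization also uses it.
PINYIN_ONLY_MARKS = {"\u0300", "\u0301", "\u030c"}


def has_kana(text):
    """True if text contains hiragana or katakana letters."""
    return any(
        "\u3041" <= ch <= "\u3096" or "\u30a1" <= ch <= "\u30fa" for ch in text
    )


def looks_like_pinyin(text):
    """True if text carries tone marks that pinyin uses but Hepburn does not."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(ch in PINYIN_ONLY_MARKS for ch in decomposed)


def is_suspect(name, name_latin):
    """Flag a Japanese-looking name whose Latin form looks like pinyin."""
    return has_kana(name) and looks_like_pinyin(name_latin)


print(is_suspect("久里浜F・マリノス通り", "jiǔ lǐ bāngF・marinosu tōngri"))
print(is_suspect("久里浜F・マリノス通り", "Kurihama F. Marinos Dori"))
```

Names written only in kanji would not be caught by this check, so it can only find a subset of the affected labels.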

IMG_2921

hyperknot commented 2 months ago

Thanks, I'm trying to understand the issue step-by-step.

  1. So we are talking about this label here:

    image
  2. In the default style, here is how it's constructed:

    image

    So first it's name_latin and then name_nonlatin.

  3. What you are saying is that name_latin is wrong and shouldn't be displayed, right?

Now, what I cannot judge myself: are name:nonlatin, name_de, name and name_en all correct? The name:nonlatin value has one more character on the right side; is that good or bad?
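The label construction described above (name_latin first, then name_nonlatin) can be mimicked in a few lines to see what dropping name_latin would do. This is only a sketch: the newline separator and the name_nonlatin value are assumptions, and `compose_label` is a hypothetical helper, not the style's actual expression.

```python
def compose_label(props):
    """Join name_latin and name_nonlatin, in that order, skipping empty or repeated parts."""
    parts = []
    for key in ("name_latin", "name_nonlatin"):
        value = props.get(key)
        if value and value not in parts:
            parts.append(value)
    return "\n".join(parts)


props = {
    "name_latin": "jiǔ lǐ bāngF・marinosu tōngri",
    "name_nonlatin": "久里浜F・マリノス通り",  # assumed value
}

# Current behaviour: the pinyin-style string is shown above the Japanese name.
print(compose_label(props))

# With name_latin removed, only the Japanese name remains.
without_latin = {k: v for k, v in props.items() if k != "name_latin"}
print(compose_label(without_latin))
```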

1ec5 commented 3 weeks ago

As reported in https://github.com/onthegomap/planetiler/issues/86#issuecomment-1049998723, name:latin is all but useless for romanized map labels.