plausible / analytics

Simple, open source, lightweight (< 1 KB) and privacy-friendly web analytics alternative to Google Analytics.
https://plausible.io
GNU Affero General Public License v3.0

Region names are in Swedish language for Finland #3260

Closed. jmp closed this issue 1 month ago

jmp commented 11 months ago

Past Issues Searched

Issue is a Bug Report

Using official Plausible Cloud hosting or self-hosting?

Plausible Cloud from plausible.io

Describe the bug

Region names for Finland are for some reason displayed in Swedish. For example:

Region names for other countries seem to correctly use the English names when available.

See the attached screenshot and the related ticket #1740, which describes a similar issue.

Expected behavior

English names for the regions of Finland should be used. For example:

Screenshots

image

Environment

- OS: All
- Browser: All
- Browser Version: All

Kylmakalle commented 5 months ago

I've found the root cause. Three years ago, the geonames.csv scraper was changed to use Swedish translations for Swedish regions. The parser treats the Swedish (sv) column as if it were English.

Since Finland also has region names in Swedish in ISO, this affects scraping of the Finnish regions.

Perhaps we need to skip the Swedish parser for English (are Swedish names better than English ones, as with Estonian?)

  @translations_dest Application.app_dir(:location, "/priv/iso_3166-2.en-translations.json")
  @countries_to_skip [
    "EE",  # For Estonia the local names are better than English ones

Adding more country-specific logic (e.g. using the (fi) column for FI) feels unnecessary.

@ukutaht what do you think? I may try to make a PR.

ukutaht commented 5 months ago

Thanks @Kylmakalle! I agree with your analysis and suggestions (mostly).

> Perhaps we need to skip the Swedish parser for English (are Swedish names better than English ones, as with Estonian?)

Yeah, I wasn't sure about this - I am Estonian and I know that in Estonia everyone says 'Harjumaa', not 'Harju', to refer to the region. This includes people who don't speak Estonian. It would be confusing to say 'Harju' or 'Viru' without the '-maa' suffix to refer to the region. I assumed that it's similar in Sweden - that the -län suffix is always used.

Finland seems different because the regions have completely different names in different languages. IMO we should prefer English in this case (for anyone outside of Finland, Central Finland is much better than Keski-Suomi).

I think the best course of action is to make a surgical change to the location scraper: make sure it only uses the sv translation for Swedish regions. This would keep the Swedish names the same but fix the Finnish region names. What do you think?
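
A minimal sketch of that rule, written in Python rather than the scraper's Elixir. The function name, the `LOCAL_NAME_LANGUAGE` table, and the per-region dictionary of names are all assumptions made for illustration; none of them come from the actual plausible/location code.

```python
# Hypothetical sketch: only countries listed here use a local-language
# column instead of the English name. Everything else falls back to "en".
LOCAL_NAME_LANGUAGE = {
    "SE": "sv",  # Swedish names are used for Swedish regions only
}


def pick_region_name(country_code, names):
    """Choose the display name for one region.

    `names` maps language codes ("en", "sv", "fi", ...) to that region's
    name in the given language (an assumed data layout).
    """
    preferred = LOCAL_NAME_LANGUAGE.get(country_code)
    if preferred is not None and preferred in names:
        return names[preferred]
    # Default: English name, or None if no English name is available.
    return names.get("en")


# Illustrative data, not taken from the real data set.
central_finland = {"en": "Central Finland", "fi": "Keski-Suomi", "sv": "Mellersta Finland"}
stockholm = {"en": "Stockholm", "sv": "Stockholms län"}

print(pick_region_name("FI", central_finland))  # Central Finland
print(pick_region_name("SE", stockholm))        # Stockholms län
```

Keying the lookup on the country code, rather than on whether an sv name exists, is what stops Finnish regions from picking up their Swedish names.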

Kylmakalle commented 4 months ago

I've updated the parser to use Swedish names only for Sweden. Finland will have English names. @ukutaht please see https://github.com/plausible/location/pull/10

P.S. It does not make much sense to me to have Swedish translations for Sweden, given that the names are extremely similar to their English representations. I don't know Swedish, so I can't judge the need for the -län suffix. Considering the historical usage of Swedish names for Sweden and the lack of negative feedback about it, I'm leaving it as is.