elixir-cldr / cldr

Elixir implementation of CLDR/ICU
Other
455 stars 34 forks source link

Hard to track down error when locale is not recognized #138

Closed mikl closed 4 years ago

mikl commented 4 years ago

I’ve managed to trace a rather curious (FunctionClauseError) no function clause matching in String.contains?/2 that happens if you have a locale ex_cldr does not recognize.

I have the full traceback here, and the explanation for the error is below:

Full traceback == Compilation error in file lib/heiweg/cldr.ex == ** (FunctionClauseError) no function clause matching in String.contains?/2 The following arguments were given to String.contains?/2: # 1 nil # 2 ["*", "+", ".", "["] Attempted function clauses (showing 3 out of 3): def contains?(string, []) when is_binary(string) def contains?(string, contents) when is_binary(string) and is_list(contents) def contains?(string, contents) when is_binary(string) (elixir 1.10.4) lib/string.ex:2216: String.contains?/2 lib/cldr/config/config.ex:1184: anonymous fn/1 in Cldr.Config.expand_locale_names/1 (elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2 (elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2 lib/cldr/config/config.ex:1183: Cldr.Config.expand_locale_names/1 lib/cldr/config/config.ex:502: Cldr.Config.configured_locale_names/1 lib/cldr/config/config.ex:586: Cldr.Config.requested_locale_names/1 lib/cldr/config/config.ex:522: Cldr.Config.known_locale_names/1 lib/cldr/install.ex:27: Cldr.Install.install_known_locale_names/1 lib/cldr.ex:82: Cldr.install_locales/1 expanding macro: Cldr.Backend.Compiler.__before_compile__/1 lib/heiweg/cldr.ex:1: Heiweg.Cldr (module) (elixir 1.10.4) lib/kernel/parallel_compiler.ex:304: anonymous fn/4 in Kernel.ParallelCompiler.spawn_workers/7

In short, as Cldr reads its configuration, Cldr.Config.merge_locales_with_default/1 from lib/cldr/config/config.ex is called. Locale names are gathered from various sources and fed through this pipeline:

        (locales ++ gettext ++ [default, @default_locale_name, @root_locale])
        |> Enum.reject(&is_nil/1)
        |> Enum.uniq()
        |> Enum.map(&canonical_name/1)

In my case, before the final Enum.map(&canonical_name/1), my locale list looks like this: ["de_CH", "gsw_CH", "fr_CH", "de-CH", "fr-CH", "gsw-CH", "en-001", "root"].

Afterwards, because Cldr does not know of gsw_CH, it ends up looking like this: ["de-CH", nil, "fr-CH", "de-CH", "fr-CH", nil, "en-001", "root"], and since this is after Enum.reject(&is_nil/1) has run, those nils stay in the list, and later Cldr.Config.expand_locale_names/1 is called on said list, which calls String.contains?/2 on each item, which blows up with the aforementioned FunctionClauseError if given a nil.

The easiest way to avoid the error would be to have another Enum.reject(&is_nil/1) after the Enum.map(&canonical_name/1), but that would mean silently ignoring some locales the implementer might expect to be present.

The next step up would be to provide an error message if canonical_name/1 does not recognise a locale.

That is still a bit of a pain, because even if you do not even mention that locale in Cldr’s config, just the mere presence of a priv/gettext/gsw_CH folder causes Gettext to consider that locale as existing, and since Cldr fetches Gettext’s known locales in Cldr.Config.known_gettext_locale_names/1 and joins them to the locale list, causes this error to be unavoidable.

In other words, as far as I can see, the only way for me to currently use Cldr would be to completely drop the gsw_CH locale, or use a fake name for it, that Cldr does recognise.

Hope that is helpful.

kipcole9 commented 4 years ago

@mikl, thanks for the report. This has been on my to-do list recently and it is indeed an insidious error. Thank you very much indeed for getting to the underlying issue - thank you.

I'll get this fixed right away - including make sure that Gettext locales are only included if there is also a corresponding CLDR locale.

I'm especially sorry you ended up diving through Cldr.Config. Its not the module I'm most proud of (its one of the few modules that goes back the whole 4 year history of this project and it shows) - but I am cleaning it up as I make changes and fix bugs so here goes another dive down the rabbit hole.

Should have this fixed within 24 hours.

kipcole9 commented 4 years ago

One more question on this topic. As I suspect you know, the locale gsw does exist. I have deliberately not attempted to do locale resolution in the configuration for now for two reasons:

  1. It might produce unexpected results because the configured locale is not the locale that was requested.
  2. Cldr.Config is a bit of a mess because its at the bottom of the module dependency tree - it makes the assumption that everything depends on it, and it depends on no other module. It would be difficult to do locale resolution in Cldr.Config.

Therefore my preference is to just raise an exception at compile time if a requested locale is not known. And not automatically resolve gsw-CH to be gsw.

What do you think?

BTW, I had a similar issue with pt-BR. In conversation with the CLDR maintainers the indicated that the policy is that the primary language locale (in this case pt) is crafted based upon the country with the highest usage of the language. So pt is the same as pt-BR. And therefore gsw is the same as gsw-CH.

You've probably noticed that in ex_cldr, gsw-CH will resolve to the CLDR locale gsw:

iex> TestBackend.Cldr.validate_locale "gsw-CH"
{:ok,
 %Cldr.LanguageTag{
   backend: TestBackend.Cldr,
   canonical_locale_name: "gsw-Latn-CH",
   cldr_locale_name: "gsw",
   extensions: %{},
   gettext_locale_name: nil,
   language: "gsw",
   language_subtags: [],
   language_variant: nil,
   locale: %{},
   private_use: [],
   rbnf_locale_name: nil,
   requested_locale_name: "gsw-CH",
   script: "Latn",
   territory: :CH,
   transform: %{}
 }}
mikl commented 4 years ago

Well, first of all, thanks for looking at it so quickly.

I wasn’t planning on actually using gsw-CH (or gsw) with Cldr, since for this application it’s only partially supported, so the solution you first proposed (“make sure that Gettext locales are only included if there is also a corresponding CLDR locale”) would be sufficient for my purposes.

Therefore my preference is to just raise an exception at compile time if a requested locale is not known. And not automatically resolve gsw-CH to be gsw.

Yeah, given that this is a very edgy edge case, just having a better error message would probably be good enough™️. I just tried renaming my locale folder to gsw, and that works perfectly, so if the error just points the implementer in that direction, I don’t think anyone has reason to complain. Maybe the Portuguese, but they can take that up with the ICU ;)

kipcole9 commented 4 years ago

As of this commit I believe we now have a more robust handling of configured Gettext locales that do not exist in CLDR. I have published a new release as ex_cldr version 2.17.2.

Changelog

New Behaviour

In the case where a Gettext locale is configured but does not exist in CLDR a warning is emitted and the locale is ignored. For example (assuming the gettext locale has a definition for gsw-ch):

iex> defmodule UnknownGettext do
...>   use Cldr, locales: ["en"], gettext: TestGettext.GettextUnknown
...> end

warning: The locale "gsw-CH" is configured in the Elixir.TestGettext.GettextUnknown gettext backend but is unknown to CLDR. It will be ignored by CLDR.

Updating

mix deps.update ex_cldr