cadmiumcr / language_detector

Detects the language of a text sample
MIT License

Get ceb missing key on Emoji letters #9

Open confact opened 3 years ago

confact commented 3 years ago

I took this array of emoji and words and converted it to a string:

 "[\"💜\", \"\", \"snart helg 💜\", \"🤩\", \"🤍\", \"Söndagar\", \"☀️🥱\", \"🌌\", \"lördag ☔️💜\", \"Kvälls☀️\"]"

Then I ran this code:

      code = Cadmium::LanguageDetector.new.detect(sentence)

      Cadmium::Language::IsoCode3To1.new.codes[code] if code

Then I got this error:

Unhandled exception: Missing hash key: "ceb" (KeyError)

watzon commented 3 years ago

I'm guessing the key error is coming from IsoCode3To1 and not LanguageDetector? Not all languages have a two-character code, especially lesser-used languages like Cebuano ("ceb"), which is apparently what the language detector guessed. In this case we should probably raise our own error, but either way you'd want to catch it.
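
Until the library raises its own error, one way to avoid the crash is a nil-safe lookup. The following is a minimal sketch, not the library's code: `CODES` is a hypothetical stand-in for whatever `Cadmium::Language::IsoCode3To1.new.codes` returns (assumed here to be a `Hash(String, String)`), and `iso1_for` is an illustrative helper name.

```crystal
# Hypothetical stand-in for the table returned by
# Cadmium::Language::IsoCode3To1.new.codes (assumed Hash(String, String)).
CODES = {"swe" => "sv", "eng" => "en"}

# Returns the two-character code, or nil when the detector returned nothing
# or the three-character code has no two-character equivalent (e.g. "ceb").
def iso1_for(code : String?) : String?
  return nil unless code
  CODES[code]? # Hash#[]? returns nil where Hash#[] would raise KeyError
end

puts iso1_for("swe").inspect # code present in the table
puts iso1_for("ceb").inspect # code missing from the table, no exception
```

If you would rather keep the raising lookup, wrapping the original `codes[code]` call in `begin ... rescue KeyError ... end` should have the same effect.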