Open einrogerst opened 4 years ago
I'm not a fasttext developer, but I came across this. You can look up language codes with the `langcodes` Python library or on a web page like this: https://r12a.github.io/app-subtags/

`sh` is the code for Serbo-Croatian. It's vaguely deprecated and is considered equivalent to Serbian (`sr`), Croatian (`hr`), or Bosnian (`bs`), three highly related but politically distinct languages that are mostly indistinguishable in text.
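
If you want to act on that equivalence when checking predictions, here is a minimal sketch. It assumes you are happy to treat `sh`, `sr`, `hr`, and `bs` as one group, which is my choice for illustration and not something fastText does for you. The helper names are hypothetical; only the `__label__` prefix comes from fastText's output format.

```python
# Hypothetical helpers; only the "__label__" prefix is fastText's own convention.
LABEL_PREFIX = "__label__"

# Assumption: treat Serbo-Croatian (sh) and its three successor codes as one group.
SERBO_CROATIAN_GROUP = frozenset({"sh", "sr", "hr", "bs"})


def strip_label(label: str) -> str:
    """Turn a fastText label such as '__label__sh' into the bare code 'sh'."""
    if label.startswith(LABEL_PREFIX):
        return label[len(LABEL_PREFIX):]
    return label


def same_language(predicted_label: str, expected_code: str) -> bool:
    """Compare a predicted label with an expected code, merging sh/sr/hr/bs."""
    predicted = strip_label(predicted_label)
    if predicted in SERBO_CROATIAN_GROUP and expected_code in SERBO_CROATIAN_GROUP:
        return True
    return predicted == expected_code
```

With this, `same_language("__label__sh", "hr")` counts as a match, while a prediction of `sh` against Shona (`sn`) does not.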
The fastText language identification models support the language code `sh` (https://fasttext.cc/docs/en/language-identification.html). However, this code is not listed in the ISO 639-2 code list (https://www.loc.gov/standards/iso639-2/php/code_list.php). It is unclear whether it refers to the Shan language (`shn`), the Shona language (`sna`), or some other language.