Robin-St closed this issue 1 year ago
@Robin-St thank you for asking this question. Unfortunately, I can't support case insensitivity because the names of some languages would match the codes of other languages:
from iso639 import iter_langs, Lang
names = set()
codes = set()
for lang in iter_langs():
    lang_dict = lang.asdict()
    for k, v in lang_dict.items():
        if not v:
            continue
        if k == "name":
            names.add(v.lower())
        else:
            codes.add(v)
matching_names = [s.capitalize() for s in names & codes if Lang(s) != Lang(s.capitalize())]
print(", ".join(sorted(matching_names)))
Abu, Aer, Agi, Ak, Aka, Ake, Ali, Ami, Are, Ari, As, Ata, Ati, Awu, Bau, Bih, Bit, Bua, Bum, Bun, Bwa, Che, Col, Con, Dai, Dan, Dao, Day, Deg, Dii, Dza, En, Ese, Fas, Fur, Ga, Gaa, Gal, Gao, Goo, Gor, Gua, Gun, Gwa, Ha, Han, Ho, Hu, Igo, Ik, Iko, Ila, Ito, Iyo, Jad, Kam, Kao, Kap, Khe, Kim, Ko, Koi, Kou, Kua, Kuk, Kuo, Kur, Kuy, Kwa, Lau, Laz, Loo, Lou, Maa, Mae, Mal, Mba, Mbe, Mer, Miu, Mok, Mon, Moo, Mpi, Mru, Mum, Na, Nai, Ndo, Nek, Nen, Neo, Nuk, Olo, Ong, Ono, Oro, Pal, Pam, Pei, Piu, Pom, Ron, Rwa, Sa, Sam, Sar, Sha, She, Shi, Sie, Sio, Soi, Som, Soo, Sop, Sou, Sui, Sur, Swo, Tai, Tee, Tem, Tha, Tii, To, Tol, Tso, Uma, Una, Uri, Uru, Uya, Wa, Woi, Yei, Yil, Yom, Zay
@LBeaudoux Fascinating, I really did not know that there are so many languages with two- or three-letter names. Then the case sensitivity makes sense. Thanks!
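For readers who still need to accept user input in arbitrary case, here is a minimal sketch of one possible workaround, not something the package provides. It tries a few common casings and returns every distinct match, so the caller can detect collisions like the ones listed above instead of silently picking one. The helper name `resolve_candidates`, its parameters, and the `TOY_TABLE` data are all hypothetical; the table only stands in for a real lookup so the example runs without extra dependencies.

```python
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def resolve_candidates(
    value: str,
    lookup: Callable[[str], T],
    errors: Tuple[Type[BaseException], ...] = (KeyError,),
) -> List[T]:
    """Return every distinct result found for `value` under common casings.

    `lookup` is any callable that raises one of `errors` on a miss.
    (Hypothetical helper; with the real package one could presumably pass
    `Lang` and the exception type it raises for invalid values.)
    """
    # dict.fromkeys removes duplicate casings while keeping their order.
    casings = dict.fromkeys(
        [value, value.lower(), value.capitalize(), value.title()]
    )
    matches: List[T] = []
    for candidate in casings:
        try:
            found = lookup(candidate)
        except errors:
            continue
        if found not in matches:
            matches.append(found)
    return matches


# Toy stand-in for a language table (illustrative data only).
TOY_TABLE = {
    "fr": "French",
    "French": "French",
    "to": "language whose code is 'to'",
    "To": "language whose name is 'To'",
}

unambiguous = resolve_candidates("FRENCH", TOY_TABLE.__getitem__)
ambiguous = resolve_candidates("TO", TOY_TABLE.__getitem__)
print(unambiguous)     # one match: safe to use
print(len(ambiguous))  # more than one match: ask the user to disambiguate
```

Returning a list rather than a single result is a deliberate choice here: when the list has more than one entry, the input is one of the ambiguous cases, and the application can ask the user whether they meant a code or a name.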
@Robin-St could you add a short side note about this to the README? I was about to open an issue for case insensitivity as well, before I saw this one. Since this is pretty surprising behaviour when one first stumbles over it, I think it would be beneficial to warn about it more prominently :)
(side note: I love this package, it made my life so much easier when working with data sources and libraries that use different language code variants, sometimes even multiple ISO639-* variants in the same library)
@AdrianSchneble Thank you for your suggestion and support. I've edited the readme to warn of Lang's case-sensitivity.
Is there any specific reason why the input is not case-insensitive? My use case is to let the user supply the language either as a code or as a name, but the handling of case makes this tricky. See example below