Open jordimas opened 6 days ago
Hi Jordi,
Thanks for reporting this issue. According to the ISO 639-2/RA Change Notice, the 'jw' identifier was indeed published in error and then deprecated in August 2001.
iso639-lang already detects deprecated ISO 639-3 identifiers. After the next update it will also detect deprecated ISO 639-3 reference names. Following your report, I will try to make it detect deprecated values from ISO 639-1, ISO 639-2 and ISO 639-5 as well.
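In the meantime, a caller could normalize deprecated two-letter codes before doing a lookup. The snippet below is only a rough sketch of that idea in plain Python: the `DEPRECATED_ISO639_1` table and the `normalize_code` helper are hypothetical names made up for illustration and are not part of the iso639-lang API.

```python
# Hypothetical alias table (not part of iso639-lang): maps deprecated
# ISO 639-1 codes to their current replacements.
DEPRECATED_ISO639_1 = {
    "jw": "jv",  # Javanese ("jw" was published in error)
    "iw": "he",  # Hebrew
    "in": "id",  # Indonesian
    "ji": "yi",  # Yiddish
}


def normalize_code(code: str) -> str:
    """Return the current code for a deprecated one, else the cleaned input."""
    key = code.strip().lower()
    return DEPRECATED_ISO639_1.get(key, key)


print(normalize_code("jw"))
print(normalize_code("jv"))
```

Codes that are not in the table pass through unchanged (lowercased and stripped), so the helper is safe to call on every input before handing the result to whatever lookup is in use.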
Thanks, great library BTW!
Hello.
It may be worth considering adding "jw" as an alias for Javanese.
I found this in the OpenAI Whisper code:
https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L108
I would expect "jv".
I found this documented here: https://xml.coverpages.org/iso639a.html
Javanese is rendered as "jw" in table 1, while it is correctly given as "jv" in the other tables.
It seems this may be an error that propagated. I have not done extensive research; I am just sharing what I found.
Thanks