Open CJRzzZ opened 2 weeks ago
Sorry, can you clearly describe (maybe with sample code) what you are doing and what error you get?
> "EN-US" as the source language

This sounds like the issue: the source language would have to be "EN". Regional variants are only supported for target languages. The error you get seems to be wrong, though; I can follow up on this.
You can read more on this differentiation in the documentation here
Sure, here is the sample code,
g = translator.create_glossary("GITCG_en_to_jp", "EN-US", "JA", dict_en_to_jp)
result = translator.translate_text(clean_text, source_lang=source_lang, target_lang=target_lang, glossary=g).text
In the first line, I tried to create the glossary with "EN-US" as the source language. The create_glossary function automatically converts the source language to "EN". That causes a problem in the second line: when I used "EN-US" as the source_lang, it raised the "source_lang and target_lang must match glossary" error; when I used "EN" as the source_lang, it raised the "target_lang="EN" is deprecated, please use "EN-GB" or "EN-US" instead" error. Those are the errors I ran into, and I hope this makes the problem clear.
Yes, like I said, we differentiate between source and target languages.
So in your code, the following should work:
source_lang = "EN"
target_lang = "JA"
g = translator.create_glossary("GITCG_en_to_jp", source_lang, target_lang, dict_en_to_jp)
result = translator.translate_text(clean_text, source_lang=source_lang, target_lang=target_lang, glossary=g).text
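If the language code comes from elsewhere (for example user input such as "EN-US"), one option is to reduce it to the base code before it is used as a source language. The following is only a minimal sketch: to_source_lang is a hypothetical helper, not part of the deepl package, and it assumes that dropping everything after the first hyphen yields a valid source language code. The translator calls are left as comments because they need a configured Translator and a valid auth key.

```python
def to_source_lang(code: str) -> str:
    """Return the base language code, e.g. "EN-US" -> "EN".

    Hypothetical helper for illustration; not part of the deepl package.
    """
    # Keep only the part before the first hyphen and normalize the case.
    return code.strip().split("-")[0].upper()


source_lang = to_source_lang("EN-US")  # base code, suitable as a source language
target_lang = "JA"

# With a configured deepl.Translator (not created here), the same
# source_lang would then be passed to both calls from this thread:
# g = translator.create_glossary("GITCG_en_to_jp", source_lang, target_lang, dict_en_to_jp)
# result = translator.translate_text(
#     clean_text, source_lang=source_lang, target_lang=target_lang, glossary=g
# ).text
```

Using one normalized value in both places keeps the glossary and the translate_text call in agreement, which is what the "must match glossary" check appears to require.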
I've encountered a problem with the translator.create_glossary() function, where it sets the source language of a glossary object to "EN" despite the argument specifying "EN-US". This behavior seems to stem from the code in "translator.py" at line 302, which attempts to strip regional variants and retain only the base language code.
This leads to an issue because "EN" is deprecated in the DeepL API, which then throws a deepl.exceptions.DeepLException stating "target_lang="EN" is deprecated, please use "EN-GB" or "EN-US" instead." Furthermore, if the glossary is set with "EN" and translator.translate_text() is called with "EN-US" as the source language, a ValueError is raised, stating "source_lang and target_lang must match glossary". This inconsistency makes it impossible to use a matching value for the source language.
Could you please look into this? Thank you for your attention to this matter.