nidhaloff / deep-translator

A flexible free and unlimited python tool to translate between different languages in a simple way using multiple translators.
https://deep-translator.readthedocs.io/en/latest/?badge=latest
Apache License 2.0
1.53k stars, 178 forks

Doesn't work with Chinese as target. #105

Closed Don-Yin closed 2 years ago

Don-Yin commented 2 years ago

Description

What happened:

GoogleTranslator does not work when the target is set to Chinese.

I am trying to translate English paragraphs to Chinese. Here is what I did:

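from deep_translator import GoogleTranslator
# Assumption: SentenceSplitter comes from the sentence-splitter package.
from sentence_splitter import SentenceSplitter


class Translator:
    def translate(self, textInput: str) -> str:
        # Split the paragraph into sentences, translate them as a batch,
        # then join the translated sentences back into one string.
        textSliced = SentenceSplitter(language="en").split(textInput)
        textTranslated = GoogleTranslator(source="auto", target="chinese").translate_batch(textSliced)
        return " ".join(textTranslated)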

where SentenceSplitter returns a list of strings, each a sentence from the original paragraph, and:

if __name__ == "__main__":
    to_translate = "Currently, the application of deep learning in crop disease classification is one of the active areas of research for which an image dataset is required. Eggplant (Solanum melongena) is one of the important crops, but it is susceptible to serious diseases which hinder its production."

    print(Translator().translate(to_translate))

where to_translate is a piece of arbitrary text. It returns the original text:

Please wait.. This may take a couple of seconds because deep_translator sleeps for two seconds after each request in order to not spam the google server.
sentence number  1  has been translated successfully
sentence number  2  has been translated successfully

Currently, the application of deep learning in crop disease classification is one of the active areas of research for which an image dataset is required. Eggplant (Solanum melongena) is one of the important crops, but it is susceptible to serious diseases which hinder its production.

when it is expected to return the Chinese translation.

This method works perfectly well when GoogleTranslator is replaced with MyMemoryTranslator. MyMemoryTranslator, however, raises TooManyRequests:

Server Error: You made too many requests to the server. According to google, you are allowed to make 5 requests per second and up to 200k requests per day. You can wait and try again later or you can try the translate_batch function

It looks like its time.sleep is not functioning properly.
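As a rough illustration of the kind of spacing between requests that is meant here, the following is a minimal sketch of a client-side throttle. The name throttled_map and its min_interval parameter are hypothetical and not part of deep-translator, and the right interval for any particular service is an assumption the caller has to supply.

```python
import time
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled_map(
    func: Callable[[T], R],
    items: Iterable[T],
    min_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[R]:
    """Call func on each item so that calls start at least min_interval seconds apart.

    Hypothetical helper: sleep and clock are injectable so the
    behaviour can be checked without actually waiting.
    """
    results: List[R] = []
    last_start: Optional[float] = None
    for item in items:
        if last_start is not None:
            # Wait only for the part of the interval that has not yet elapsed.
            remaining = min_interval - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(func(item))
    return results
```

In principle the translate method of a translator instance could be passed as func, with the list of sentences as items. Whether that is enough to avoid TooManyRequests depends on the service's actual limits, which this sketch does not know.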

To summarise:

  1. GoogleTranslator doesn't work at all when the target is Chinese.
  2. MyMemoryTranslator needs a proper time.sleep mechanism.

davidsands commented 2 years ago

The problem is that Google Translate mobile requires the target language to be either "zh-CN" or "zh-TW" (case sensitive). "zh" alone sometimes works (I haven't figured out just when or why), but Google Translate changes "zh" to "zh-CN" internally.

To track down the problem and allow Google Chinese translation, I added the _zh_map code to google_trans.py (around line 92):

if self.payload_key:
    self._url_params[self.payload_key] = text
# ---- fix: restore the case-sensitive Chinese codes Google expects ----
_zh_map = {'zh': 'zh-CN', 'zh-cn': 'zh-CN', 'zh-tw': 'zh-TW'}
if self._url_params['tl'] in _zh_map:
    self._url_params['tl'] = _zh_map[self._url_params['tl']]
# ---------
response = requests.get(self.__base_url,
                        params=self._url_params,
                        proxies=self.proxies)

I'm not sure where the language codes are getting lowercased, so I kept this proof-of-concept hack localized to the HTTP request.
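The same mapping can be tried out on its own, outside the library. This is only a sketch: normalize_target is a hypothetical helper name, not a deep-translator function, and the table simply reuses the three codes from the hack above.

```python
# Hypothetical standalone helper; not part of deep-translator.
# Maps lowercased Chinese codes to the case-sensitive forms named above.
_ZH_CODES = {"zh": "zh-CN", "zh-cn": "zh-CN", "zh-tw": "zh-TW"}


def normalize_target(code: str) -> str:
    """Return the case-sensitive form of a Chinese code; leave other codes unchanged."""
    return _ZH_CODES.get(code.strip().lower(), code)


print(normalize_target("zh"))     # zh-CN
print(normalize_target("zh-tw"))  # zh-TW
print(normalize_target("fr"))     # fr
```

Because the codes appear to be lowercased somewhere inside the library, normalizing the value before constructing the translator may not be enough on the affected version. That is why the hack above is applied right before the HTTP request.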

Don-Yin commented 2 years ago

Thanks for your reply! It explains a lot!

travisrecupero commented 2 years ago

I've tried

translated = GoogleTranslator('en', language).translate(sub_list[i])

where language has been

and it still won't translate to Chinese.

My code works for English, Spanish, French, German, and Japanese. Does this code https://github.com/nidhaloff/deep-translator/issues/105#issuecomment-926983184 work for the current version of deep-translator? @davidsands

morpheus65535 commented 2 years ago

Still not working for me either. It returns English (identical to what was sent).

nidhaloff commented 2 years ago

@morpheus65535 Can you show me your example, or open a new issue with the example that is not working?

Here is an example that works in my simple test:

trans = GoogleTranslator(source='auto', target='zh-CN')
res = trans.translate("good")
print("translation: ", res)

morpheus65535 commented 2 years ago

@nidhaloff that seems to be working now. You can forget my November comment. Thanks for the followup. :-)