Animenosekai / translate

A module grouping multiple translation APIs
GNU Affero General Public License v3.0
525 stars 60 forks

ReversoTranslator BUG #96

Open reddere opened 1 year ago

reddere commented 1 year ago

ReversoTranslator is bugged: when words like única are passed in the source text, it throws an error: requests.exceptions.ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))

Here is a piece of code to replicate:

from translatepy.translators.reverso import ReversoTranslate

translator = ReversoTranslate()

text = 'Una manera única de conseguir la victoria\n\nJonesyTheGoat'
# No source language is passed, so Reverso has to auto-detect it
t = translator.translate(text, 'Italian')
print(t)

Note that if you remove única, it works fine. Strangely, it still throws the error even if you replace the ú with a plain u. No idea why. @Animenosekai, thanks in advance <3

ZhymabekRoman commented 1 year ago

requests doesn't support HTTP/2, which is probably why it throws ChunkedEncodingError. Or there may be a problem on the server side. Either way, if we send exactly the same request with curl, we get useful information about the error itself:

{"error":"direction_invalid","errorType":"validation"}
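
The JSON body that curl surfaces can be turned into a readable message before deciding how to react. What follows is only a minimal sketch using the standard library: the function name `explain_reverso_error` is hypothetical and not part of translatepy, and reading `direction_invalid` as "unsupported language pair" is an assumption that happens to fit the rest of this thread.

```python
import json
from typing import Optional


def explain_reverso_error(body: str) -> Optional[str]:
    """Return a readable message if `body` is a JSON error payload, else None."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all (e.g. an empty or truncated body).
        return None
    if not isinstance(payload, dict) or "error" not in payload:
        return None
    error = payload["error"]
    error_type = payload.get("errorType", "unknown")
    message = f"Reverso rejected the request: {error} ({error_type})"
    if error == "direction_invalid":
        # Assumption: "direction" refers to the source/target language pair.
        message += "; the language pair may be unsupported"
    return message


# The exact body shown in the curl output above.
raw = '{"error":"direction_invalid","errorType":"validation"}'
print(explain_reverso_error(raw))
```

A helper like this only interprets a body that has already been received; it does not explain why requests fails with ChunkedEncodingError instead of returning that body.
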
ZhymabekRoman commented 1 year ago

@reddere, as a workaround, please specify the source language of the text. Reverso detects this text as Catalan, and the translate endpoint doesn't support Catalan:

from translatepy.translators.reverso import ReversoTranslate

translator = ReversoTranslate()

text = 'Una manera única de conseguir la victoria\n\nJonesyTheGoat'
# The third argument is the source language, which skips auto-detection
t = translator.translate(text, 'Italian', 'Spanish')
print(t)

ZhymabekRoman commented 1 year ago

I did some investigating and came to the conclusion that when the request is sent from Chromium, even in Incognito mode, Reverso correctly detects the language as Spanish. If we export the same request in curl format from the Network tab of DevTools and run it in a terminal, we get Catalan, the same as with requests. Reverso probably uses some fingerprinting algorithm to detect the source of incoming requests; I have no better explanation.

reddere commented 1 year ago

I deeply appreciate the insightful replies, @ZhymabekRoman. Unfortunately I cannot set the source language, as I'm using the translator to translate from multiple languages (mainly English, Spanish and German) to Italian. If Reverso detects the source of incoming requests, will passing a Chromium User-Agent in headers work?

ZhymabekRoman commented 1 year ago

will passing a Chromium User-Agent in headers work?

Nope

ZhymabekRoman commented 1 year ago

As a workaround, I think you can use Google to detect the language and pass it to Reverso for translation.
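
A rough sketch of that idea: detect the language with one service, then hand it to the other as an explicit source language. The helper names `translate_with_detection` and `build_translatepy_callables` are hypothetical. The translatepy wiring assumes that `GoogleTranslate().language(text).result.name` yields a language name such as 'Spanish' and that `.translate(...).result` is the translated string; check both against the installed version.

```python
from typing import Callable, Tuple


def translate_with_detection(
    text: str,
    destination: str,
    detect: Callable[[str], str],
    translate: Callable[[str, str, str], str],
) -> str:
    """Detect the source language first, then translate with it set explicitly."""
    source = detect(text)
    return translate(text, destination, source)


def build_translatepy_callables() -> Tuple[Callable, Callable]:
    """Wire the helper to translatepy (imported lazily; needs network access).

    Assumption: the attribute names used below match the installed
    translatepy version; they are not verified in this thread.
    """
    from translatepy.translators.google import GoogleTranslate
    from translatepy.translators.reverso import ReversoTranslate

    google = GoogleTranslate()
    reverso = ReversoTranslate()

    def detect(text: str) -> str:
        # Google is only used to name the source language.
        return google.language(text).result.name

    def translate(text: str, destination: str, source: str) -> str:
        # Reverso gets an explicit source, so its own detection is skipped.
        return reverso.translate(text, destination, source).result

    return detect, translate


# Usage (performs real network requests):
#   detect, translate = build_translatepy_callables()
#   print(translate_with_detection(text, 'Italian', detect, translate))
```

Passing the detector and translator in as callables keeps the flow easy to try with stand-ins, and a different detector can be swapped in if Google misidentifies a text.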

Animenosekai commented 1 year ago

Hmm, requests.exceptions.ChunkedEncodingError is not really common; might need to investigate a bit