Closed pmi123 closed 3 years ago
Hello, and thank you for taking the time to report a bug in translatepy.
translatepy is currently not using Yandex's official public API but rather the API used by Yandex.Translate.
But now that you are reporting it, I think it would be nice to have a way of using the public APIs and falling back to the other APIs (maybe with an authentication method added so you can enter your credentials).
Are you using this API: https://cloud.yandex.com/docs/translate/operations/translate
If so, could you explain to me how to use the API keys (how to pass them in the request, as I don't really understand the folder system)?
Or are you using another API?
Thank you for your report
Animenosekai
I am using your yandex.py code and your API key (self._id) - I misspoke in my earlier email. I have been testing so many translation apis that I am having difficulty keeping them straight. :(
I could not sign up with Yandex.Translate because you first have to have a Yandex.Cloud account, and the cloud account requires a cell phone number to activate the account. I did set up a yandex mail account, but now I have to add a cell phone before I can log in, so that is not working, either. I hope that helps!
I am happy to help you debug or extend this code.
Mark
Sorry, I don't really know about the Yandex.Translate API ID system as the ID I'm using is from https://github.com/ssut/py-googletrans/issues/268#issuecomment-770628519
I guess that if your ID doesn't work (403 means Forbidden), you might have mis-pasted it or the ID isn't valid.
Let me know if you find any solution!
My apologies for not being clear in my earlier posts. I am writing to you regarding the yandex.py module in this github account - https://github.com/Animenosekai/translate/blob/main/translatepy/translators/yandex.py. I assumed, perhaps incorrectly, that this line in that code represents some sort of API key.
self._id = "1308a84a.6016deed.0c4881a2.74722d74657874-3-0"
This module, yandex.py, does not seem to be working.
Python 3.6.9 (default, Oct 8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import translatepy
>>> print(translatepy.__version__)
translatepy v1.1.3
>>> import safeIO
>>> from translatepy.translators import yandex
>>> y = yandex.YandexTranslate()
>>> y.translate("Er ist ein kluger Kerl", 'en', source_language='de')
request.status_code=403, request.json()['code']=406
(None, None)
>>> y.translate("Bonjour, je m'appelle Mark", 'en', source_language='fr')
request.status_code=403, request.json()['code']=406
(None, None)
>>> y.translate("È un ragazzo intelligente.", 'en', source_language='it')
request.status_code=403, request.json()['code']=406
(None, None)
>>>
I added a print statement in your yandex.py code to see the status_code and json code.
def translate(self, text, destination_language, source_language="auto"):
    """
    Translates the given text to the given language
    """
    try:
        if source_language is None:
            source_language = "auto"
        if isinstance(source_language, Language):
            source_language = source_language.yandex_translate
        url = self._base_url + "translate?id=" + self._id + "&srv=tr-text&lang=" + str(destination_language) + "&reason=" + str(source_language) + "&format=text"
        request = get(url, headers=self._headers, data={'text': str(text), 'options': '4'})
        # debugging line added to show the HTTP status code and the JSON "code" field
        print("request.status_code=%s, request.json()['code']=%s" % (request.status_code, request.json()['code']))
        if request.status_code < 400 and request.json()["code"] == 200:
            data = loads(request.text)
            return str(data["lang"]).split("-")[0], data["text"][0]
        else:
            return None, None
    except:
        return None, None
Are you seeing the same thing? Do you have a solution to make the yandex.py module in this github account working again?
Thanks,
Mark
Yandex.Translate's API is currently used as a fallback when nothing else worked, as there is a low chance of it working:
When the YandexTranslate class is initialised, it calls the refreshSID method, which parses the website to find a Session ID (SID, referenced as _sid). That's why it is recommended to use the default translate method from the high-level Translator class and not use the translators' classes directly.
Example:
Use
>>> from translatepy import Translator
>>> t = Translator()
>>> t.translate("Hello", "Japanese")
Source (English): Hello
Result (Japanese): こんにちは # worked
Instead of
>>> from translatepy.translators import yandex
>>> y = yandex.YandexTranslate()
>>> y.translate("Hello", "jp")
(None, None) # didn't work
If refreshSID doesn't succeed in getting a Session ID, the translation won't work and None will be returned.
For example, it now works fine for me, but Yandex applies very strict rate-limiting and bot-detection rules, and I'm pretty sure that refreshing it more than 10 times won't work.
(By the way, you don't need to import safeIO when using translatepy. It should be imported automatically.)
Animenosekai
Your example works because you are getting the translation from Google or Bing, and not Yandex.
When you call y = yandex.YandexTranslate() as shown above, refreshSID is being called, so it should work. I believe there is nothing different in the way I am calling the yandex translator than how your Translator class calls the yandex translator. If I am wrong, please let me know.
Can you show us how you were able to show that the yandex.py module is working?
The whole point of my post was to inform you that your yandex code may not be working. Perhaps Yandex no longer honors requests using v1.0 of the API because I believe the current version is 1.5, and you are using version 1.0. There may be other issues as well (e.g. GET versus POST requests).
The translator reverso.py also appears to not be working. It returns code 400. I have never used the reverso.py module before today and I only made one request to the server for my example below. I don't think it is an issue of accessing the server multiple times that is causing this error.
Only Bing and Google translators work in your code, as far as I can tell. And, as shown below, they can be called individually without using your Translator class. Since they are the first translator modules called in your Translator class, you always get a translation by using that class.
>>> from translatepy.translators import reverso
>>> r = reverso.ReversoTranslate()
>>> r.translate("È un ragazzo intelligente.", 'en', source_language='it')
request.status_code=400
(None, None)
>>> from translatepy.translators import bing
>>> b = bing.BingTranslate()
>>> b.translate("È un ragazzo intelligente.", 'en', source_language='it')
('it', "He's a smart guy.")
>>> from translatepy.translators import google
>>> g = google.GoogleTranslate()
>>> g.translate("È un ragazzo intelligente.", 'en', source_language='it')
('it', 'He is a smart guy.')
>>>
It would help if you included some unit tests for each translation module to show that each translation service you offer in your code actually works.
Also, if you know that a service does not work, I suggest removing that service from your code until you update your code and can show it is working. Or, at a minimum, modify the README.md file to say which services work and which ones don't work.
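A per-service check along the lines suggested above could start as a small smoke test that calls each translator once and records which ones return a result. The sketch below is hypothetical: `smoke_check` and the two stub classes are not part of translatepy. It only assumes, as in the sessions above, that a translator's `translate()` returns a `(language, text)` tuple and `(None, None)` on failure.

```python
def smoke_check(translators, text, destination, source):
    """Call each translator once and report whether it produced a translation.

    `translators` maps a display name to an object exposing
    translate(text, destination, source_language=...), the call shape
    used in the sessions above.
    """
    report = {}
    for name, translator in translators.items():
        try:
            _, result = translator.translate(text, destination, source_language=source)
        except Exception:
            # A crash counts as "not working" for the purpose of the report.
            result = None
        report[name] = result is not None
    return report


# Stand-ins for real translators so the sketch runs without network access.
class WorkingStub:
    def translate(self, text, destination_language, source_language="auto"):
        return source_language, text.upper()


class BrokenStub:
    def translate(self, text, destination_language, source_language="auto"):
        return None, None


report = smoke_check({"working": WorkingStub(), "broken": BrokenStub()},
                     "ciao", "en", "it")
print(report)  # {'working': True, 'broken': False}
```

With the real classes in place of the stubs, the resulting dictionary could also feed the "which services work" note for the README.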
Mark
I've just checked Reverso and it seems to work fine for me:
>>> reverso.ReversoTranslate().translate("Hello", "fra")
('eng', 'Bonjour')
Check that the source_language and the destination_language are in the correct format: you seem to use the ISO 639-1 alpha-2 code format, while Reverso uses ISO 639-2/3 alpha-3 codes.
This is why using the Translator class is better: it falls back to different translators in case something goes wrong, provides a caching system, and converts the languages to the correct format automatically, whether you gave the language name in any language, the alpha-2 code or the alpha-3 code.
Also, Reverso uses POST requests on their site, so I'm also using it.
As for Yandex, I just checked, and it does seem to be an issue stemming from a misunderstanding of how their API works.
I just committed a fix, but make sure that _sid is set to something other than an empty string '', as that also seems to be part of the problem (verify that you are not rate-limited by visiting their site in private browsing).
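To make the alpha-2 versus alpha-3 point concrete, here is a minimal sketch of the kind of conversion described above. The table and the `to_alpha3` helper are hypothetical and cover only the languages used in this thread; which codes Reverso accepts beyond `eng` and `fra` is not shown here, so the other entries are assumptions based on ISO 639-3.

```python
# Hypothetical lookup table covering only the languages used in this thread.
ALPHA2_TO_ALPHA3 = {
    "en": "eng",
    "fr": "fra",
    "de": "deu",
    "it": "ita",
}


def to_alpha3(code):
    """Return the alpha-3 code for an alpha-2 code; pass alpha-3 codes through."""
    code = code.strip().lower()
    if len(code) == 3:
        return code
    try:
        return ALPHA2_TO_ALPHA3[code]
    except KeyError:
        raise ValueError("unknown language code: %r" % code)


print(to_alpha3("it"))   # ita
print(to_alpha3("fra"))  # fra
```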
Thanks for looking into the yandex translator! I will check out the changes over the weekend.
I will check for the _sid value. Do you have any idea when the rate limiting times out?
As a suggestion, could you add an optional argument to Translator.translate() that allows the user to specify the translation service (Bing, Google, Yandex, etc.)? Not all translation services are equal for every language. Some users may want the option to bypass Google or Bing for one of the other translation services. Or compare translations from each service. One cannot do that with your code as it always starts with Google, and the user has no option to specify which translation service should be used.
I agree your current implementation of always going through your translators one at a time in a fixed order will work for many users. I am just making a suggestion to give your code more flexibility for different users.
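A rough sketch of what such an option could look like follows. `FallbackTranslator`, its `service` argument and the `Stub` class are hypothetical names used only for illustration; this is not how translatepy's Translator class is implemented.

```python
class FallbackTranslator:
    """Tries services in a fixed order unless the caller names one."""

    def __init__(self, services):
        # `services` is an ordered mapping of name -> translator object.
        self.services = dict(services)

    def translate(self, text, destination, source="auto", service=None):
        if service is not None:
            if service not in self.services:
                raise ValueError("unknown service: %r" % service)
            names = [service]          # bypass the default order
        else:
            names = list(self.services)
        for name in names:
            language, result = self.services[name].translate(
                text, destination, source_language=source)
            if result is not None:
                return name, language, result
        return None, None, None


class Stub:
    """Stand-in translator returning a canned answer, or failure if None."""

    def __init__(self, answer):
        self.answer = answer

    def translate(self, text, destination_language, source_language="auto"):
        if self.answer is None:
            return None, None
        return source_language, self.answer


t = FallbackTranslator({
    "yandex": Stub(None),
    "google": Stub("He is a smart guy."),
    "bing": Stub("He's a smart guy."),
})
default_result = t.translate("È un ragazzo intelligente.", "en", "it")
bing_result = t.translate("È un ragazzo intelligente.", "en", "it", service="bing")
```

Returning the service name alongside the translation also makes it easy to compare the output of several services for the same input.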
Thanks!
Mark
Yeah, that seems like a nice idea!
I'm going to look into it when I finish some of my stuff on my other projects (and school yikes, holidays are coming soon though).
Hmm it does seem to trigger the captcha system on yandex translate if it's called too many times for me. Has anyone got this yet https://gyazo.com/c0e06ba89c8fea6047744ab74e548ad7 ?
Yes it actually is what happens when you are "rate-limited", when they detect that a bot is using it
Animenosekai,
I have found a possible issue with your code that you may want to look at. I was trying to figure out why Yandex would not do any translations for me while I was testing other services (Google, Bing, Reverso). In looking at the output from the debug runs, I saw I was also hitting the Yandex server every time I made a translation, even though I was not using Yandex for the translation, but Bing or Google or Reverso. I looked at your code and found the following.
When you instantiate the Translator class, in the __init__ method you have:
def __init__(self) -> None:
    self.google_translate = GoogleTranslate()
    self.yandex_translate = YandexTranslate()
    self.bing_translate = BingTranslate()
    self.reverso_translate = ReversoTranslate()
This works well because Google, Bing, and Reverso either have pass in their __init__ methods, or only some local assignments. However, in the YandexTranslate class, you make a call to the Yandex server to get the _sid value. This happens every time a Translator object is created. In your code as it is written now, it does not matter because 99.999% of the time your code will use Google or Bing or Reverso for the translation. However, in the event those services decide to not process your request, Yandex will also most likely not process the request because your code has been hitting their server for the _sid each time a new Translator object is created.
In looking at the output, it takes only ~3 Translator object instantiations to trigger the Yandex server to stop listening to requests from your code. If you implement my suggestion above to allow a user to decide which translation service to use, then you will have to make sure that using other services does not "turn off" the Yandex service as a side effect.
Instead, you might want to hold off getting the _sid value from the Yandex server until you really plan to use the Yandex service for a translation. Perhaps make it the first step in your translate function for the YandexTranslate class instead of in the __init__ method.
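The deferral described above can be sketched as follows. `LazySession` and the injected `fetch_sid` callable are hypothetical; in the real module the fetch would be the page download done by refreshSID. The point is only that constructing the object makes no request.

```python
class LazySession:
    """Defers fetching the session ID until it is first needed."""

    def __init__(self, fetch_sid):
        # `fetch_sid` is a callable returning a SID string, or "" on failure.
        self._fetch_sid = fetch_sid
        self._sid = ""  # nothing is fetched at construction time

    def sid(self):
        if not self._sid:
            self._sid = self._fetch_sid()
        return self._sid


calls = []


def fake_fetch():
    # Stand-in for the network request; records that it was called.
    calls.append(1)
    return "abc123"


sessions = [LazySession(fake_fetch) for _ in range(3)]
calls_after_construction = len(calls)   # still 0: no server contact yet
first = sessions[0].sid()               # first use triggers the fetch
second = sessions[0].sid()              # reuses the stored value
```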
A further enhancement would be to add an exponential backoff when using Yandex for translations. In the translate function for the YandexTranslate class, you could call the self.refreshSID method, check the return value, and start an exponential backoff timer to see if waiting will open the connection. It would take some testing to see if this is feasible, as the Yandex server might block for an hour or more, so the backoff might not be feasible. Does the documentation for Yandex say anything about this?
What do you think?
Mark
Hmm it does seem to trigger the captcha system on yandex translate if its called to many times for me. Has anyone got this yet https://gyazo.com/c0e06ba89c8fea6047744ab74e548ad7 ?
Yes it actually is what happens when you are "rate-limited", when they detect that a bot is using it
Yes true, have you got that? Might have to make it store the _sid in a txt file and use it that way, and make it update it if it gives an error.
Waiting for CodeQL but 1ff4acb seems to work.
I'll upload it soon!
@pmi123 What do you mean "exponential backoff"?
@Animenosekai "In a variety of computer networks, binary exponential backoff or truncated binary exponential backoff refers to an algorithm used to space out repeated retransmissions of the same block of data, often to avoid network congestion." source: https://en.wikipedia.org/wiki/Exponential_backoff
In your particular use case, you want to retry getting the _sid after waiting a certain amount of time. While waiting in a retry loop, instead of calling time.sleep(x) where x = const (i.e. for a fixed amount of time), use time.sleep(y), where y = an exponentially growing variable based on how many times one has waited for a response.
For example, y = (2 ** n) + (random.randint(0, 1000) / 1000), where n = the number of retries. The random part is just to keep the timing of the retries, well, random to a degree. If your retry code is in a method that accesses an api, and there are many users using that method at the same time, then all the retries could end up in lockstep, and no user will get a response. The random bit prevents this lockstep. It is not really needed in your use case, but it doesn't hurt. It would also defeat a Yandex server from "thinking", "Well, this request from url=z is coming in every x=2 seconds, so let's assume it is a bot and block it."
Be sure to include a MAX_RETRIES value in the retry loop to break out of the loop and admit the retries failed, or your retry method may run forever.
Finally, there are many many python implementations of this type of algorithm, usually as a decorator to a method, that performs the retry loop on either values or exceptions. Just google "python exponential backoff retry example" for lots of examples. It is also not hard to roll your own, as your use case is not complicated.
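As an illustration of rolling your own, here is a minimal sketch of a retry helper using the wait formula given above. The name `retry_with_backoff` and the injectable `sleep` parameter are assumptions made for this example (the injection only lets the demo run without real delays).

```python
import random
import time


def retry_with_backoff(operation, max_retries=3, sleep=time.sleep):
    """Call `operation` until it returns a truthy value or retries run out.

    Waits (2 ** n) seconds plus up to 1 second of jitter before retry n + 1.
    Returns the value, or None after giving up.
    """
    for n in range(max_retries + 1):
        value = operation()
        if value:
            return value
        if n < max_retries:
            sleep((2 ** n) + (random.randint(0, 1000) / 1000))
    return None


# Demo: the operation fails twice, then succeeds. Waits are recorded
# in a list instead of actually sleeping.
waits = []
attempts = iter(["", "", "sid-value"])
result = retry_with_backoff(lambda: next(attempts), max_retries=3, sleep=waits.append)
```

Here `waits` ends up holding two values, the first between 1 and 2 seconds and the second between 2 and 3 seconds.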
I hope that helps!
Mark
@Animenosekai regarding https://github.com/Animenosekai/translate/commit/1ff4acbf3d72ac39e2419bb09f1655cc77db1841, it might be cleaner to update the refreshSID() method to check the value of _sid, refresh it, and also retry refreshing until you get a value, or return None for the translation. You only have to make a few simple changes to translate, language, etc. instead of sprinkling if self._sid == "" throughout your code. Just an idea.
def __init__(self):
    self._base_url = "https://translate.yandex.net/api/v1/tr.json/"
    self._sid = ""
    self._headers = self._header()

def refreshSID(self):
    while self._sid in ["", " "]:
        data = get("https://translate.yandex.com/", headers=self._headers).text
        sid_position = data.find("Ya.reqid = '")
        if sid_position != -1:
            data = data[sid_position + 12:]
            self._sid = data[:data.find("';")]
            break
        else:
            # sleep some amount of time
            # keep track of retries and stop after MAX_RETRIES
    return self._sid

def translate(self, text, destination_language, source_language="auto"):
    if self.refreshSID():
        ....continue with the existing code.....
    else:
        return None, None
This seems like a great idea, but wouldn't it mean making a function where the user waits for the SID, and therefore a very long blocking operation?
Wouldn't it be better, for example, to add a lastTried variable and refresh only if (2 ** n) + (random.randint(0, 1000) / 1000) seconds have passed since the last attempt to retrieve the _sid?
I am not sure I understand your question, but let me try to answer what I think you are asking.
The blocking stops after (1) a valid code is retrieved from Yandex, or (2) MAX_RETRIES has been reached. Blocking is roughly 1 sec, 2 sec, 4 sec (each plus up to a second of random jitter). The value of MAX_RETRIES is a trade-off between the value of the translation, how long a user will wait, how long Yandex will block, and how long the sid is valid.
I am assuming that you don't know (1) how long Yandex blocks a bot, and (2) how long the self._sid is valid. Some testing might shed some light on the values of these two quantities, and help determine if it is feasible to wait or the code should pick another translation service to use. If the results of some testing show that Yandex blocks a bot for too long (based on the user's expectations), then the strategy should be to use another translation service instead of waiting for Yandex.
I am not sure what information the lastTried variable holds. The self._sid has enough information to determine whether the _sid is valid (except as noted above).
If you mean lastTried is the number of retries used the last time refreshSID was run, I think it would be better to bake that value into the initial value of n. In other words, if some testing shows that it takes ~2 seconds to get a valid _sid (e.g. Yandex blocks bots for roughly 2 seconds), then the code should start with n=3, and avoid the first two retries, as they will only aggravate the Yandex servers.
If you mean lastTried is the "age" of the _sid, then one could use that value to determine if a new value for _sid is needed. However, if you don't know the lifespan of the _sid, I am not sure how to use this value. The code above will eventually get a valid _sid if the current one has expired.
You could also reset _sid after each operation is concluded (i.e. translate, language, etc.), if the documentation or some testing says that the _sid is just a one-shot value and has to be determined for every access to the Yandex servers.
An untested attempt at fleshing out the code above:
import random
import time

def refreshSID(self):
    n = 0
    while self._sid in ["", " "]:
        data = get("https://translate.yandex.com/", headers=self._headers).text
        sid_position = data.find("Ya.reqid = '")
        if sid_position != -1:
            data = data[sid_position + 12:]
            self._sid = data[:data.find("';")]
            break
        else:
            # keep track of retries and stop after MAX_RETRIES
            if n < self.MAX_RETRIES:
                # sleep for an exponentially growing, jittered amount of time
                time.sleep((2 ** n) + (random.randint(0, 1000) / 1000))
                n += 1
            else:
                break
    return self._sid
The function either returns the _sid value, or "".
In the translate function, one can test the self._sid value to see if it is valid.
Let me know if I missed the idea behind your questions.
Mark
Basically the problem is that this would block the user for too long:
Assuming that the user wants to translate with Yandex (Translator(use_google=False, use_bing=False, use_reverso=False, use_yandex=True)):
# the user requests a translation
t.translate("Hello", "Japanese")
# then the refreshSID method will be called
# It will try to download the webpage (which already takes quite long)
data = get("https://translate.yandex.com/", headers=self._headers).text
# it will do its lightweight operations
sid_position = data.find("Ya.reqid = '")
if sid_position != -1:
    data = data[sid_position + 12:]
    self._sid = data[:data.find("';")]
    break  # great, the user had to wait only ~ the time to download the webpage
else:  # but here comes what I consider too long for the user to wait
    time.sleep((2 ** n) + (random.randint(0, 1000) / 1000))  # the user needs to wait ~ 2 seconds first
    # then it will redownload the webpage
    # then if the sid is not found the user will need to wait ~ 4 seconds and the download time
    # then ~ 8 seconds
    # And imagining that self.MAX_RETRIES == 2, the user will need to wait at worst more than ~ 15 seconds without any return
What I meant with lastTried is this:
# the user requests a translation
t.translate("Hello", "Japanese")
# if the SID is blank or is not working, the refreshSID method is called
if time() - self.lastTried > 600:  # if the duration between the last time we tried to get the SID and now is greater than 10 minutes
    # it will do the stuff to get the sid
    data = get("https://translate.yandex.com/", headers=self._headers).text
    sid_position = data.find("Ya.reqid = '")
    if sid_position != -1:  # yay it got it!
        data = data[sid_position + 12:]
        self._sid = data[:data.find("';")]
        self.lastTried = time()  # maybe keep that in a file
        # we will retry the translation and it will return it
    else:
        self.lastTried = time()  # maybe keep that in a file
        # the translation will return None
else:
    pass  # do nothing as we know that yandex will rate-limit us if we ping their website too much
    # the translation will basically return None
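The same lastTried idea can be written as a small runnable class. This is only a sketch: `CooldownRefresher`, the injected `fetch_sid` callable and the injected `clock` are hypothetical, chosen so the behaviour can be shown without contacting Yandex or waiting ten minutes.

```python
import time


class CooldownRefresher:
    """Attempts a SID refresh at most once per `cooldown` seconds."""

    def __init__(self, fetch_sid, cooldown=600, clock=time.time):
        self._fetch_sid = fetch_sid   # callable returning a SID or ""
        self._cooldown = cooldown
        self._clock = clock
        self._last_tried = None       # no attempt made yet
        self.sid = ""

    def refresh(self):
        """Return True only if a new SID was fetched by this call."""
        now = self._clock()
        if self._last_tried is not None and now - self._last_tried <= self._cooldown:
            # Tried too recently: do not contact the server again.
            return False
        self._last_tried = now
        new_sid = self._fetch_sid()
        if new_sid:
            self.sid = new_sid
            return True
        return False


# Demo with a fake clock and a fetch that always fails.
now = [0.0]
attempts = []


def failing_fetch():
    attempts.append(now[0])
    return ""


refresher = CooldownRefresher(failing_fetch, cooldown=600, clock=lambda: now[0])
refresher.refresh()   # first attempt goes out at t = 0
now[0] = 300.0
refresher.refresh()   # skipped: only 5 minutes have passed
now[0] = 601.0
refresher.refresh()   # allowed again
```

When `refresh()` returns False, the caller would return None for the translation instead of blocking.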
I started the code for the retry with n = 0, so the first wait time is around 1 second. The wait time on the 2nd iteration (MAX_RETRIES = 2) is ~3 seconds.
A sample set of wait times (seconds) for 5 iterations of the wait cycles:
>>> for n in range(0,5):
... print("n=%s, wait=%s sec" % (n, (2**n) + (random.randint(0,1000)/1000)))
...
n=0, wait=1.003 sec
n=1, wait=2.9699999999999998 sec
n=2, wait=4.244 sec
n=3, wait=8.597 sec
n=4, wait=16.457 sec
The only reason for the wait is that the _sid is not valid. You don't know the reason why a valid _sid was not returned. Perhaps a Yandex server is going through a reboot or db cleanup, or there is too much traffic on the network, or Yandex thinks you are a bot, or... One way to automatically fix the problem is to wait and try again. However, if your wait time is too short, then each time you hit the Yandex server, you will reset the "bot wait time" on the server and never get through.
The reason for the growing wait time is to allow the "bot wait time" on the Yandex server to time out, so your code can get a valid _sid. You can change the parameters of the wait time to make them smaller. But you run the risk of always resetting the Yandex "bot wait time" because your intervals are too short and you will never get a response. Bottom line, we don't know the Yandex server's "bot wait time", nor do we know when it might change. It could be 3 seconds today, 10 seconds tomorrow, or based on some algorithm that throttles responses to these requests based on current network traffic. Hence the growing wait time. The goal is to get a Yandex translation in the least amount of time, up to a certain limit, then quit trying.
Your second solution is also valid, but you are only allowing the user to access the Yandex server once every 10 minutes. This is equivalent to setting n=10 in the first example, and you found n=4 to be too long to wait. This solution would be better if you knew the life of an _sid, and how many times I can hit the server with the same _sid and get a valid response. However, _sid could easily be a one-time value that has to be refreshed on each request to the server. Maybe I can hit the Yandex server every 3 seconds to get a new code, so in your 10 minute wait I could have made 200 translations. Is there any documentation on this?
Some testing would tell us a little bit about the Yandex server's "bot wait time", but as I said earlier that number can change at any time.
I don't think there is a need to store the self.lastTried in a text file. When the program starts, just go get the _sid. If it fails, then set your self.lastTried. In your code, you aren't reusing the _sid, so if it is initialized at program start, you are free to go get it.
@pmi123
This "bot wait time" is a captcha that prevents requests from getting the desired webpage (with the SID in it) and instead displays a challenge which requires the action of a human (typing the word shown).
This "rate-limit" occurs quite frequently and lasts quite long (more than 10 minutes).
The snippet I wrote isn't waiting 10 minutes for a good answer but rather is saying "Well, I've already tried not so long ago, so I should wait a little before retrying to get an SID, and I should just say that I couldn't proceed with the request".
Also, the SID does not seem to be a one-time token, as I have already used it for multiple requests. It could maybe have a timeout of a day, an hour, etc.
MAX_RETRIES=4 (32 seconds, the trying time adds up at each iteration) is wayyy too long for the user: imagine writing a program which needs translations and having your program stopped just because another module is trying to refresh its ID while it might not succeed.
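The way the waits add up can be checked with a few lines. The helper name `total_wait_bounds` is made up for this sketch; it assumes the wait formula (2 ** n) plus up to one second of jitter, and counts sleep time only, so page download time comes on top.

```python
def total_wait_bounds(sleeps, max_jitter=1.0):
    """Return (minimum, maximum) total sleep time for waits n = 0 .. sleeps - 1."""
    base = sum(2 ** n for n in range(sleeps))  # equals 2 ** sleeps - 1
    return base, base + sleeps * max_jitter


print(total_wait_bounds(5))  # (31, 36.0)
```

Five waits (n = 0 to 4) therefore total between 31 and 36 seconds, in line with the roughly 32 seconds mentioned above.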
Is there any documentation on this?
Nope sorry, I searched for it but couldn't find anything (quite normal since it's not a public API).
but you are only allowing the user to access the Yandex server once every 10 minutes
In fact, not really. I'm allowing the module to go try to fetch a new SID every 10 minutes. If it couldn't find any, the translate method will just return None.
Sorry for not being very clear...😔
Another method would be to implement a sort of mix between my snippet of code and yours, running it endlessly (until it finds a working SID) and resuming it when the SID stops working again.
The refreshing function would be running in another thread, in the background so that it is seamless to the user + it would mean quicker results for the user
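A background refresher of this kind could look roughly like the sketch below. Everything here is hypothetical (`BackgroundRefresher`, the injected `fetch_sid`, the `interval` default); it ignores error handling inside the thread and is only meant to show the shape: callers read `sid` without blocking, and call `invalidate()` when the server rejects it.

```python
import threading


class BackgroundRefresher:
    """Keeps trying to obtain a SID in a background thread."""

    def __init__(self, fetch_sid, interval=600):
        self._fetch_sid = fetch_sid      # callable returning a SID or ""
        self._interval = interval        # seconds between attempts
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._ready = threading.Event()
        self._sid = ""
        self._thread = threading.Thread(target=self._run, daemon=True)

    @property
    def sid(self):
        with self._lock:
            return self._sid

    def invalidate(self):
        """Mark the SID as no longer working so the thread fetches a new one."""
        with self._lock:
            self._sid = ""
        self._ready.clear()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def wait_until_ready(self, timeout=None):
        return self._ready.wait(timeout)

    def _run(self):
        while not self._stop.is_set():
            if not self.sid:
                new_sid = self._fetch_sid()
                if new_sid:
                    with self._lock:
                        self._sid = new_sid
                    self._ready.set()
            # Sleep until the next attempt, waking early if stop() is called.
            self._stop.wait(self._interval)


# Demo with a fake fetch and a short interval so it finishes quickly.
refresher = BackgroundRefresher(lambda: "sid-from-thread", interval=0.01)
refresher.start()
got_sid = refresher.wait_until_ready(timeout=2)
current = refresher.sid
refresher.stop()
```

A translate method built on this would return None whenever `sid` is empty, rather than waiting for the thread.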
The _sid seems to last like a few days or so before it expires, when it does it throws this error: {u'message': u'Session has expired', u'code': 406}.
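Detecting that error in code could be as simple as the check below. The function name `is_session_expired` is made up for this sketch, and it assumes the error body always has the shape quoted above; since code 406 also appeared earlier in this thread alongside HTTP 403, the message is checked as well.

```python
def is_session_expired(payload):
    """True if a decoded JSON error body matches the expiry error quoted above."""
    message = str(payload.get("message", "")).lower()
    return payload.get("code") == 406 and "expired" in message


expired = is_session_expired({"message": "Session has expired", "code": 406})
print(expired)  # True
```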
@Animenosekai Apologies for not fully understanding how Yandex treats the sID. Given the 10 min delay before Yandex will return a new sID, I agree your approach in this instance is much better than the exponential backoff method I proposed.
Based on your post, my "bot_wait_time" is the same as your "rate limit" - the time the Yandex server refuses to service a request (get the sID). I should have defined my terms better...it would have created less confusion.
I like your idea of the background thread. Programming threads is fun and challenging to take care of all the timing issues and edge cases. Or, just save lastTried to disk and check it as needed.
If you save the value to disk, you will have to figure out what to do in the use case that the lastTried value is deleted from the file system by some program other than yours. At that point, the options are (1) sID="", or (2) sID=some value. How do you decide whether to refresh the sID or not in the second case? If it has been less than 10 minutes since the last refresh, will you get a new valid sID, or will Yandex freeze you out for 10 minutes? If it has been more than 10 minutes, then it should be OK to refresh the sID.
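One possible way to handle the saved state, including the case where the file has been deleted, is sketched below. The file name, the JSON layout and the helper names are all assumptions. Treating a missing record as "never tried" is just one answer to the question above, and it carries the risk of hitting the rate limit.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the saved state, or an empty default if the file is missing or corrupt."""
    default = {"sid": "", "last_tried": None}
    try:
        with open(path, "r", encoding="utf-8") as handle:
            state = json.load(handle)
    except (OSError, ValueError):
        return default
    if not isinstance(state, dict):
        return default
    return {"sid": state.get("sid", ""), "last_tried": state.get("last_tried")}


def save_state(path, sid, last_tried):
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"sid": sid, "last_tried": last_tried}, handle)


def may_refresh(state, now, cooldown=600):
    """Allow a refresh if there is no record of an attempt or the cooldown has passed."""
    return state["last_tried"] is None or now - state["last_tried"] > cooldown


# Demo in a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "yandex_state.json")
    missing = load_state(path)          # file does not exist yet
    save_state(path, "abc", 1000.0)
    restored = load_state(path)
```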
Mark
Nice! I'm planning on implementing it in the next week as I was working on another project ~
If it has been less than 10 minutes since the last refresh, will you get a new valid sID, or will Yandex freeze you out for 10 minutes?
Maybe return None until I can get a new SID?
Wow the _sid I had stored into a txt file lasted 8 days before it expired, my bot only updated it once, by removing the old _sid and replacing it with the new _sid.
@pmi123 What do you think of the fe22afe ?
Wow, seems like yandex.com banned my ip, it won't let me connect to it, watch out lol
Lmao, yea that's why it's better using the base Translator() class as it will change to other services when one is returning errors.
Also, what do you mean by "banned my ip": Is it just that you need to solve captchas when you go to their site or you are even forbidden to go to the website and verify that you are a human?
Yeah they cut off the connection to my ip completely! Like "this site can't be reached" was what I was getting lol for the whole yandex.com.
Can you access the website with a VPN though?
Yes, thats how I noticed I got ip banned aha
lmaoooo well I guess people shouldn't try to use Yandex Translate too much
Yeah, I couldn't even imagine how hard it would be to fix the Yandex Translate module, and I still haven't been able to fix it. The only thing I have implemented so far is refactoring the code.
Yes they have a pretty strict rate limiting/bot detecting system which triggers captcha and even bans your IP if you use it too much (that's why I'm calling it last in the Translator class)
Yes they have a pretty strict rate limiting/bot detecting system which triggers captcha
Yes, I noticed that Yandex very greedily does not want to give out SID (session ID). I did not know that the Russians are so greedy (although I myself am Russian)
even bans your IP if you use it too much
Wow! It's overkill
After 5 hours of experimentation, today I managed to get Yandex Translator - I found a bug (or feature) in the REST API method tr.json, which allows you not to use (and not parse) SID. In a few hours I think I will finally write it all into Python code.
Hey everyone, is the yandex translate still working for you?
Yea we fixed it and now that you are reminding me let me just publish the new version ~~
I just published v1.7 on PyPI.
You should now be able to update translatepy with the usual command:
pip install --upgrade translatepy
Yeah I thought there was an update on yandex's end but it was just my file system messing up lol. But I think there was an update on the Bing translate now.
But I think there was an update on the Bing translate now.
The Bing translator has very strict limits on the number of requests per minute/hour/day (the exact limits are not known), and there are no methods to bypass the Bing API restriction yet (for more information, see the following message). The options are to use a proxy, or to use other, more stable services such as Yandex: it does not seem to care about the number of requests, and its translation quality seems to me better than Google Translate's (at least for the languages of the post-Soviet states). But Bing's request restrictions are nowhere near those of the DeepL translator, which is just some kind of hell.
Let's go back to Bing Translator. In principle, there is one loophole that I think would allow us to bypass all the request restrictions: using Microsoft Translate. As far as I know, they both use the same engine. The only difference is that Microsoft Translate requires an API key that is linked to an account and charged for use, and as I understand it, it is intended for the corporate segment. But if we look at the Microsoft Translate mobile application, we can see that the application generates the x-mt-signature and x-clienttraceid headers based on some data, and the server performs the translation for free. x-clienttraceid is just a regular v4 UUID, but x-mt-signature looks like an HMAC-SHA256 value combined with a timestamp and some other unknown data. If we can solve this riddle, we will have a stable Bing translator.
fromLang: auto-detect
text: kanker
to: en
token: 1D03dhmjKLPeQvzr4OpEdrGFhDA-hPC9
key: 1622167441181
bing translate added in a token and key, they seem to expire pretty fast, maybe every 5 minutes.
bing translate added in a token and key
Yes, that's exactly right. See https://github.com/Animenosekai/translate/issues/13
maybe every 5 minutes.
I ran some tests - the token and the key are valid for at least 10 minutes
Hi guys, after looking into the bing translate to find where the token and key is stored I have found it!
params_RichTranslateHelper = [1622472153451,"VvgFaimiFuqUEoaS5Z8r9IyKcNoGVkPO",900000];
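Pulling those values out of the page source could be sketched as below. The function name and the regular expression are assumptions, and so is the mapping of positions to names: the first value is taken to be the key and the second the token, based on the request fields quoted earlier in this thread, while the meaning of the third value is not confirmed here.

```python
import json
import re

# Matches the JavaScript array assigned to params_RichTranslateHelper.
PARAMS_PATTERN = re.compile(r"params_RichTranslateHelper\s*=\s*(\[.*?\]);")


def extract_bing_params(page_source):
    """Return the key, token and third value from the page, or None if absent."""
    match = PARAMS_PATTERN.search(page_source)
    if match is None:
        return None
    key, token, third = json.loads(match.group(1))
    return {"key": key, "token": token, "third_value": third}


sample = 'params_RichTranslateHelper = [1622472153451,"VvgFaimiFuqUEoaS5Z8r9IyKcNoGVkPO",900000];'
params = extract_bing_params(sample)
```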
after looking into the bing translate to find where the token and key is stored I have found it!
Yes, thanks. I have already implemented this in the upcoming alpha version 2.0 https://github.com/ZhymabekRoman/translate/blob/80ce159757f1b6a5ac20c2559474f66cae8488b8/translatepy/translators/bing.py#L69-L73
Nice!!
I'm very sorry for not helping much...
I'll have holidays in July and August, so I'll be fully able to work on translate then
wow such a great thread sad to be needing to close this as v2 will come in no time solving the issue ~
I acquired a key from Yandex per the documentation, and added it to the self._id in yandex.py. I then tried the following test:
YandexTranslate().translate("Er ist klug.", 'en', source_language='de')
And received this response:
When I googled for the error code from Yandex, I found this page: https://yandex.com/dev/translate/doc/dg/concepts/api-keys.html, which implies the free API was discontinued in May 2020 for non-residents of the Russian Federation.
What am I doing wrong, or has the api been discontinued?
Thanks!
Mark