Thanks, I'll take a look.
Original comment by rich.mid...@gmail.com
on 21 Nov 2007 at 3:06
I get the same problem. A workaround would be to add Thread.sleep(x), so that at most y requests are sent in a minute. The question is: how many requests per minute are allowed? How does the service decide it is too much?
Original comment by thomas.t...@gmail.com
on 24 Nov 2007 at 12:45
I guess we could let users do that, or perhaps add a translate(List&lt;String&gt;, String, String) method which would do the same thing. Also, it looks to me like a request every second triggers it, but a request every 2 doesn't. Two seconds is an awfully long time to wait though if anyone wants a lot of different bits of text translated.

We could also throw a particular error containing the captcha image and add a method to take the captcha string, submit it and keep the cookie. That might be a bit awkward for others to catch and handle though, but it does give people an option.

What does anyone else think?
Original comment by rich.mid...@gmail.com
on 24 Nov 2007 at 4:50
> perhaps add a translate(List&lt;String&gt;, String, String)

I like this. Internally it could convert the list into one (large) request by adding 'untranslatable' separators (for example large numbers), so that the result can be split there as well.

> a request every 2 doesn't

For me it's OK if the translate API can only process one request every two seconds - the API could automatically sleep as much as required, as long as there is a way to 'bulk translate'.
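A minimal sketch of how that batched method could work, assuming a hypothetical single-string translateOne standing in for the existing call (all names here are illustrative, not the library's actual API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchSketch {
    // A large number chosen as an 'untranslatable' separator, as suggested above.
    static final String SEP = "\n937452\n";

    // Joins the inputs, calls the (stubbed) single-string translator once,
    // and splits the result back into one entry per input.
    static List<String> translate(List<String> texts, String from, String to) {
        String joined = String.join(SEP, texts);
        String translated = translateOne(joined, from, to); // one HTTP request
        return new ArrayList<>(Arrays.asList(translated.split(SEP)));
    }

    // Stub standing in for the real single-request translator.
    static String translateOne(String text, String from, String to) {
        return text; // identity, for demonstration only
    }

    public static void main(String[] args) {
        List<String> out = translate(Arrays.asList("Hello", "World"), "en", "ru");
        System.out.println(out); // [Hello, World]
    }
}
```

The separator only has to survive translation unchanged; a run of digits on its own line is a reasonable bet, though nothing guarantees the service preserves it.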
Original comment by thomas.t...@gmail.com
on 25 Nov 2007 at 12:12
That'll work, but I expect other people will encounter the same problem by invoking translate(String, String, String) in a loop too many times, and then they'll have to wait for the web site to "cool down" before it allows their requests again.

Another thought is that the API could keep track of the number of translation requests per second, and when the user's program starts doing "too many" translations, the API would throttle the requests back automatically. That would let some programs experience no slow-down when performing small quantities of translations, and would allow bulk-translation programs to run without requiring any additional coding.
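A rough sketch of that adaptive throttle (a hypothetical class, not the library's code): it records the time of the last request and sleeps only when calls arrive faster than a minimum interval, so occasional callers pay no penalty:

```java
public class Throttle {
    private final long minIntervalMs;
    private long lastRequestMs = 0;

    Throttle(long minIntervalMs) { this.minIntervalMs = minIntervalMs; }

    // Blocks just long enough to keep at least minIntervalMs between calls;
    // a first or infrequent call returns immediately.
    synchronized void acquire() {
        long now = System.currentTimeMillis();
        long wait = lastRequestMs + minIntervalMs - now;
        if (wait > 0) {
            try {
                Thread.sleep(wait);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        lastRequestMs = System.currentTimeMillis();
    }

    public static void main(String[] args) {
        Throttle t = new Throttle(200);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 3; i++) t.acquire();
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(elapsed); // at least ~400 ms: two enforced gaps
    }
}
```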
Original comment by panu...@gmail.com
on 25 Nov 2007 at 6:38
I've added throttling to subversion for this, which should appear in the next build. Throwing captcha errors looks more difficult to handle - for some reason the captcha image locations don't appear in the downloaded HTML.
Original comment by rich.mid...@gmail.com
on 1 Dec 2007 at 1:01
The throttle looks good, just one question: in 'retrieveTranslation', it opens the HttpURLConnection and retrieves the InputStream (which establishes the network connection to google.com), then it sleeps (in the 'toString' method), then it reads the response from the InputStream. It seems like the network IO would be a tad more efficient if it slept before opening the HttpURLConnection. But that's pretty minor.

Nice work Rich - thanks for taking the time to add in the throttle!
Original comment by panu...@gmail.com
on 3 Jan 2008 at 1:19
Then again, the web site freaked out and blocked me. I just ran a sample program that translated "Hello" from English to Russian 1000 times (using the throttle), and after 242 translations the site blocked my IP address.

Another thought would be to go back to the idea of translate(List&lt;String&gt;, String, String) and send up a batch of Strings separated by new-lines, then parse the returned contents of "result_box" and split it on "&lt;br&gt;". I just tried it manually and was able to translate a batch of 25 words with just one HTTP request.
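A sketch of the response-splitting half of that idea, assuming the caller has already extracted the inner HTML of "result_box" (illustrative code; the regex allows for both the &lt;br&gt; and &lt;br /&gt; variants the page might emit):

```java
import java.util.Arrays;
import java.util.List;

public class SplitSketch {
    // Splits the contents of "result_box" back into one translation per
    // input line, tolerating <br>, <br/> and <br /> markup.
    static List<String> splitResult(String resultBoxHtml) {
        return Arrays.asList(resultBoxHtml.split("<br\\s*/?>"));
    }

    public static void main(String[] args) {
        List<String> parts = splitResult("привет<br>мир<br />друг");
        System.out.println(parts.size()); // 3
    }
}
```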
Original comment by panu...@gmail.com
on 3 Jan 2008 at 1:37
Solved Problem:
Hey, I ran into that problem too. The reason you run into it is that you are overwhelming the server by doing it like this. I think Google implements a back-off algorithm to handle such things.

Here is how I solved the problem: I did a loop like you did, except that I make the program sleep 5 seconds after each translation, 10 seconds after every 10 translations and 5 minutes after every 200 translations, which works fine for me. You should do something similar. It does take a little more time, but who cares, ha.
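The delay schedule described above can be sketched as a small helper - the numbers come straight from this comment, with no claim that they are optimal:

```java
public class BackoffSchedule {
    // Returns the delay in milliseconds to sleep after request number n
    // (1-based): 5 min every 200th request, 10 s every 10th, 5 s otherwise.
    static long delayAfter(int n) {
        if (n % 200 == 0) return 5 * 60 * 1000;
        if (n % 10 == 0)  return 10 * 1000;
        return 5 * 1000;
    }

    public static void main(String[] args) {
        System.out.println(delayAfter(7));   // 5000
        System.out.println(delayAfter(10));  // 10000
        System.out.println(delayAfter(200)); // 300000
    }
}
```

In a translation loop you would call Thread.sleep(delayAfter(n)) after the n-th request.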
Original comment by almojor@gmail.com
on 10 Mar 2008 at 5:28
I ran into this problem as well :(. I need to translate a bunch of XLS files from Japanese to English. I tried doing it cell by cell (and it worked beautifully for the first document) but Google soon blocked me. I'm now thinking the solution is to compile the cells into one large string to be translated and then parse the translation back into the correct parts of the document. The thing I want to know is: what is the largest string you can send to be translated? Is there a known limit?

Thanks in advance.
Original comment by DragonWi...@gmail.com
on 25 Mar 2008 at 10:04
I don't know what the limit is, but I believe it's quite long (> 2k characters?).
Original comment by rich.mid...@gmail.com
on 26 Mar 2008 at 8:16
Also, if you do find the limit please let me know! Thanks.
Original comment by rich.mid...@gmail.com
on 26 Mar 2008 at 8:16
Is the solution almojor provides working, or is google still blocking their service after some time?
Original comment by daniel.j...@gmail.com
on 11 Apr 2008 at 11:19
I discovered that sending the entire document as one long string doesn't seem to work, but segmenting it down to about 500 characters *mostly* works. For some reason, about 5% of my concatenated strings return an error (always the same error for the same string), but if I send each cell individually (and use a conservative delay of 30 sec) it works just fine. I'm trying to isolate the exact sub-string that causes the crash, but so far it has eluded me.
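A sketch of that ~500-character segmenting approach: group cells into batches that stay under a limit, breaking only on cell boundaries (illustrative code, not part of the library):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Chunker {
    // Groups cells into newline-joined batches whose total length stays
    // at or under maxChars; a single oversized cell becomes its own batch.
    static List<String> chunk(List<String> cells, int maxChars) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String cell : cells) {
            if (current.length() > 0
                    && current.length() + 1 + cell.length() > maxChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) current.append('\n');
            current.append(cell);
        }
        if (current.length() > 0) chunks.add(current.toString());
        return chunks;
    }

    public static void main(String[] args) {
        List<String> chunks = chunk(Arrays.asList("aaaa", "bbbb", "cccc"), 9);
        System.out.println(chunks.size()); // 2: "aaaa\nbbbb" and "cccc"
    }
}
```

Each batch would then go through one translate call, with the result split back on the newlines.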
Original comment by DragonWi...@gmail.com
on 11 Apr 2008 at 4:21
Oh yeah, the almojor solution works for bulk translations: just a 5 second delay between every request, 10 seconds at the 10th request and 5 minutes at the 200th request. I don't know if this is the most optimal, but it works.

Btw, my articles are on average about 4000 characters long and that works fine. I am not using the Java solution but my own implementation in PHP - yes, in PHP, because I want to run it on a cheap hosting server with a cron tab.
Original comment by daniel.j...@gmail.com
on 11 Apr 2008 at 8:26
I believe this issue with repeated translations should be fixed as of version 0.4.
Original comment by rich.mid...@gmail.com
on 11 May 2008 at 12:27
Original issue reported on code.google.com by panu...@gmail.com on 20 Nov 2007 at 8:59