Open rupeshkumaar opened 1 year ago
Can you post the output?
@nidhaloff Yeah, sure. Here are the scenarios I tried. I cannot share the document, so I am using a single Chinese character and repeating it n times for the sake of argument.
from deep_translator import GoogleTranslator
import sys
# keeping the char size limited to 1800
content="""学"""
content *= 1800
len(content)
>> 1800
sys.getsizeof(content)
>> 3674
# Using GoogleTranslator
translated = GoogleTranslator(source='auto', target='en').translate(content)
translated
'study study study study study studying studying studying studying studying studying studying studying studying studying studying studying studying scholastic scholastic scholastic scholastic scholastic scholastic scholastic scholastic scholastic scholastic'
# keeping the char size limited to 1900
content="""学"""
content *= 1900
len(content)
>> 1900
sys.getsizeof(content)
>> 3874
# Using GoogleTranslator
translated = GoogleTranslator(source='auto', target='en').translate(content)
# keeping the char size limited to 4999
content="""学"""
content *= 4999
len(content)
>> 4999
sys.getsizeof(content)
>> 10072
# Using GoogleTranslator
translated = GoogleTranslator(source='auto', target='en').translate(content)
# I got the error below
deep_translator.exceptions.RequestError: Request exception can happen due to an api connection error. Please check your connection and try again
# keeping the char size limited to 5000
content="""学"""
content *= 5000
len(content)
>> 5000
sys.getsizeof(content)
>> 10074
# Using GoogleTranslator
translated = GoogleTranslator(source='auto', target='en').translate(content)
For 5000 characters I got the deep_translator.exceptions.NotValidLength error, which was expected. However, I think it should be raised at the 5001st character, not the 5000th. Please correct me if I am wrong.
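To make the 5000 vs. 5001 question concrete, here is a minimal sketch of how an exclusive and an inclusive upper bound differ. Both functions are hypothetical illustrations and are not taken from deep-translator's source; the 5000 limit is the one discussed in this thread.

```python
MAX_CHARS = 5000  # limit discussed in this thread


def is_valid_exclusive(text: str, max_chars: int = MAX_CHARS) -> bool:
    """Hypothetical check that rejects text of exactly max_chars characters."""
    return 0 < len(text) < max_chars


def is_valid_inclusive(text: str, max_chars: int = MAX_CHARS) -> bool:
    """Hypothetical check that accepts text of exactly max_chars characters."""
    return 0 < len(text) <= max_chars


content = "学" * 5000
print(is_valid_exclusive(content))  # False: 5000 is already rejected
print(is_valid_inclusive(content))  # True: only 5001 and above are rejected
```

If the library behaves like the first variant, the effective maximum is 4999 characters, which would match the behaviour described above.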
As for the issue above, I went through various articles, posts, and Stack Overflow questions. What I found is that the request is sent with the GET method, which reportedly limits it to about 2k characters (although I could not get even 2k characters through), whereas the POST method allows 5k. So that may be why requests are effectively capped at around 2k, but I am not sure. Please correct me if I am wrong. (source:[https://stackoverflow.com/questions/18754905/google-translate-api-cannot-send-more-than-2000-characters-per-request])
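One possible reason Chinese text fails below 2k characters is URL encoding: in a GET request the text travels in the query string, where each non-ASCII character is percent-encoded. The sketch below only measures the encoded size using the standard library; `encoded_length` is a helper name made up for this example, and the numbers do not establish where the service's actual limit lies.

```python
from urllib.parse import quote


def encoded_length(text: str) -> int:
    """Length of the text after percent-encoding it as UTF-8 for a URL."""
    return len(quote(text))


print(quote("学"))  # %E5%AD%A6 (3 UTF-8 bytes become 9 URL characters)

for n in (1800, 1900, 4999):
    print(n, encoded_length("学" * n))
# 1800 16200
# 1900 17100
# 4999 44991
```

So 1800 Chinese characters already occupy about 16k characters of URL, far more than the same count of ASCII letters would.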
@rupeshkumaar Hm, I didn't know about the GET request limitation. Can you hack something together and test it using POST?
@nidhaloff I tried using POST. At first I got a 411 because the request needed a Content-Length header, and after adding it I started getting a 405, so I could not get it to work. I also tried another module, translators, which uses the 5k character limit under the hood. Its only drawback is that once the limit is exhausted, or you somehow get a 429, you are done for the day, so it did not meet my requirement. deep-translator does, so I am currently using deep-translator with an 1800-character limit. It would be great if you could look into this and work it out somehow. Hope this helps.
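The 1800-character workaround can be sketched roughly as follows. `split_into_chunks` and `translate_in_chunks` are hypothetical helpers, not part of deep-translator; the `translate` argument is any callable that takes a string and returns a string, for example `GoogleTranslator(source='auto', target='en').translate`. Joining the pieces with a space is an assumption that suits an English target.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 1800) -> List[str]:
    """Split text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def translate_in_chunks(
    text: str,
    translate: Callable[[str], str],
    max_chars: int = 1800,
) -> str:
    """Translate each chunk separately and join the results with spaces."""
    chunks = split_into_chunks(text, max_chars)
    return " ".join(translate(chunk) for chunk in chunks)


# Chunk sizes for a 4000-character input with the 1800-character limit.
print([len(c) for c in split_into_chunks("学" * 4000)])  # [1800, 1800, 400]
```

Note that this splits at fixed offsets, so a sentence can be cut in the middle and translated poorly; splitting at sentence or paragraph boundaries would likely give better results.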
I was trying to translate the text of a document in Chinese, but I was not able to send a request with more than 1800 characters, even though it says it has a 5k character limit. I am getting a RequestError with a 400 status code. I have the latest version of deep-translator.