pndurette / gTTS

Python library and CLI tool to interface with Google Translate's text-to-speech API
http://gtts.readthedocs.org/
MIT License

ValueError: Unable to find token seed! Did https://translate.google.com change? #232

Closed Mohammed-Shoaib closed 3 years ago

Mohammed-Shoaib commented 3 years ago

After I send 10-15 requests, I get the following error:

Traceback (most recent call last):
  File "text_to_speech.py", line 25, in <module>
    text_to_speech(data)
  File "text_to_speech.py", line 17, in text_to_speech
    tts.save(f'audio/{key}/def-{i + 1}.mp3')
  File "/home/shoaib/anaconda3/lib/python3.7/site-packages/gtts/tts.py", line 111, in save
    self.write_to_fp(f)
  File "/home/shoaib/anaconda3/lib/python3.7/site-packages/gtts/tts.py", line 124, in write_to_fp
    'tk' : self.token.calculate_token(part)}
  File "/home/shoaib/anaconda3/lib/python3.7/site-packages/gtts_token/gtts_token.py", line 28, in calculate_token
    seed = self._get_token_key()
  File "/home/shoaib/anaconda3/lib/python3.7/site-packages/gtts_token/gtts_token.py", line 59, in _get_token_key
    "Unable to find token seed! Did https://translate.google.com change?"
ValueError: Unable to find token seed! Did https://translate.google.com change?

I thought this might be a bug in the latest release, i.e., 2.1.1 so I tried 1.2.0 and I still get the same error.

I also thought it could be that my IP address is being blocked. I read up on the Quotas & limits and I don't think that should be an issue as well.

System information:

Thank you for your timely help. :relaxed: :sparkles:

thedaynos commented 3 years ago

I started getting this last week periodically. It happened about 10 times to me today. I have no clue.

Mohammed-Shoaib commented 3 years ago

The error is not continuous, it happens randomly from what I can tell. For me it's definitely within 10-15 requests.

Since I observed that it doesn't happen continuously, I just wrote a while loop in python to try again if the error occurred:

import os
from gtts import gTTS

text = 'Your sentence requiring text to speech'
file_path = 'text.mp3'

while not os.path.exists(file_path) or os.path.getsize(file_path) == 0:
    try:
        tts = gTTS(text=text, lang='en', slow=False)
        tts.save(file_path)
    except Exception as e:
        # Python 3 exceptions have no .message attribute, so print the exception itself
        print(e)

Obviously, this is a very bad solution and I would never recommend it. But it does the job for now as a temporary fix until this issue gets permanently resolved.

thedaynos commented 3 years ago

That's interesting. Mine was breaking on the tts.save line, so this wasn't working for me. This is how I got it working...

from gtts import gTTS

count = 1  # number of failed attempts so far (kept only out of curiosity)
text = 'Your sentence requiring text to speech'
file_path = 'text.mp3'
tts = gTTS(text=text, lang='en', slow=False)
while True:
    try:
        tts.save(file_path)
        break  # saved successfully, stop retrying
    except Exception:
        print('got the issue ' + str(count))
        count += 1

I have the count in there just out of curiosity, obviously you don't need that. I just tried this for about 20 minutes and got the error maybe 20% of the time. Only once did the count get to 2.

Seems to be working without crashing now.
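Both loops above retry forever, so a failure that never clears (for example, a permanent change on Google's side) would hang the script. Below is a minimal sketch of a bounded variant. The helper name `save_with_retry` and the default attempt count and delay are illustrative assumptions, not part of gTTS. It accepts any zero-argument callable, so something like `lambda: tts.save(file_path)` could be passed in.

```python
import time


def save_with_retry(save, attempts=5, delay=1.0):
    """Call save() until it succeeds or the attempts run out.

    `save` is any zero-argument callable, e.g. lambda: tts.save(path).
    The helper name and defaults are assumptions made for this example.
    Returns the number of tries it took.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            save()
            return attempt
        except ValueError as err:  # the token-seed error in this thread is a ValueError
            last_error = err
            if attempt < attempts:
                time.sleep(delay * attempt)  # wait a little longer after each failure
    raise RuntimeError(f"giving up after {attempts} attempts") from last_error
```

Catching only `ValueError` keeps unrelated problems (a bad file path, a keyboard interrupt) from being silently retried.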

afonsosantos commented 3 years ago

This is happening to me as well, using Python 3.9 on Windows 10. I have used this library before and had no issues. The code is here: https://github.com/afonsosantos/alexis/blob/master/main.py

The error:

Traceback (most recent call last):
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\main.py", line 25, in <module>
    respond(voice_data)
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\main.py", line 9, in respond
    date()
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\commands\date_time.py", line 9, in date
    speak('A data de hoje é ' + str(today_date))
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\utils\audio.py", line 51, in speak
    tts.save(audio_file)
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\venv\lib\site-packages\gtts\tts.py", line 295, in save
    self.write_to_fp(f)
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\venv\lib\site-packages\gtts\tts.py", line 251, in write_to_fp
    prepared_requests = self._prepare_requests()
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\venv\lib\site-packages\gtts\tts.py", line 194, in _prepare_requests
    part_tk = self.token.calculate_token(part)
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\venv\lib\site-packages\gtts_token\gtts_token.py", line 28, in calculate_token
    seed = self._get_token_key()
  File "C:\Users\afons\Desktop\pyvoice\pyVoice\venv\lib\site-packages\gtts_token\gtts_token.py", line 58, in _get_token_key
    raise ValueError(
ValueError: Unable to find token seed! Did https://translate.google.com change?

EDIT: after some analysis, it seems to be crashing on the tts.save line.

thedaynos commented 3 years ago

Yes it's crashing at the save. Take a look at the fix I presented above. It will keep the code working. Just use a while loop and put the "try" call on the save with a break after. For the except, you can put anything in there. I just liked to see an error message but it's completely optional.

This is obviously not ideal but the code is working fine on my end after implementing the while loop.

afonsosantos commented 3 years ago

Hi @thedaynos, thanks for the help. I will apply that "fix". Maybe this is an issue related to Python 3.9? Which version are you using?

thedaynos commented 3 years ago

no problem @afonsosantos ... I'm using 3.6 on ubuntu. I updated all of the libraries/packages that I'm using as well.

Mohammed-Shoaib commented 3 years ago

@afonsosantos I am using Python 3.8.5, I doubt it could be an issue related to Python.

@thedaynos You are right, the error doesn't occur that often. For me, the maximum number of consecutive failed attempts was 3, and it occurs on the tts.save function call.

badjano commented 3 years ago

Found someone saying to add client=tw-ob to the URL (can't be sure why, or whether it works 100% of the time)

https://translate.google.com/translate_tts?ie=UTF-8&q=Annoying%20bug&tl=en-US&client=tw-ob

EDIT: BTW, you can use this URL to download directly, so you don't even need this repo

afonsosantos commented 3 years ago

I can make a PR with that change and see if that improves anything. Opened PR #234

Mohammed-Shoaib commented 3 years ago

I found the post,

Add the qualifier '&client=tw-ob' to the end of your query. https://translate.google.com/translate_tts?ie=UTF-8&q=test&tl=zh-TW&client=tw-ob

This answer no longer works consistently. Your ip address will be blocked by google temporarily if you abuse this too much.

Seems like adding client=tw-ob doesn't work consistently though.
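For reference, here is a minimal sketch of how the URLs quoted above are put together. The endpoint and the query parameters (`ie`, `q`, `tl`, `client`) are copied from those URLs; the helper name `build_tts_url` is hypothetical. The endpoint is undocumented, and as noted above it does not work consistently, so this only shows the URL structure.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URLs quoted in this thread (undocumented, may change).
TRANSLATE_TTS_URL = "https://translate.google.com/translate_tts"


def build_tts_url(text, lang):
    """Return a translate_tts URL for `text` in language `lang`.

    Hypothetical helper for illustration; it only builds the URL and
    does not perform any request.
    """
    params = {"ie": "UTF-8", "q": text, "tl": lang, "client": "tw-ob"}
    # quote_via=quote encodes spaces as %20, matching the URLs above
    return f"{TRANSLATE_TTS_URL}?{urlencode(params, quote_via=quote)}"


print(build_tts_url("Annoying bug", "en-US"))
```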

XuanKien-Nguyen commented 3 years ago

I'm also stuck on this bug. Can anyone fix this?

afonsosantos commented 3 years ago

@badjano seems like using that link too much gets you locked.

I will look for alternative services until this issue is fixed.

Fabian42 commented 3 years ago

@afonsosantos What is "too much"? Google Translate TTS in general has a limit of 2M characters per day.

afonsosantos commented 3 years ago

@Fabian42 as @Mohammed-Shoaib stated:

Your ip address will be blocked by google temporarily if you abuse this too much.

I don't want to risk being blocked by using that parameter, and if that does not work either, I will find alternatives.

Mohammed-Shoaib commented 3 years ago

@afonsosantos You don't really need to find or use alternatives though.

As mentioned by @thedaynos in his comment, you could use the same trick for now until this gets fixed. I am sure the developers will fix this soon; let's be patient.

afonsosantos commented 3 years ago

@Mohammed-Shoaib I will, but I need alternatives in case the issue does not get fixed soon. I don't want to put pressure on the project's devs. Thanks for all the solutions provided.

fidelhuang1 commented 3 years ago

I have the same problem.

silvertree-arc commented 3 years ago

I have the same problem; at first it was intermittent, now it happens all the time. @alexrink96 has posted a good workaround using the requests library (https://github.com/pndurette/gTTS/issues/226#issuecomment-719461765) that accesses the API in a different way, and it is working for me at the moment.

afonsosantos commented 3 years ago

That looks like a good alternative, even to this library itself. Will test that and report the results. Thanks @silvertree-arc

marvin-w commented 3 years ago

Multiple users in the mycroft chat including me also reported this error. I'll check if I can test the workaround.

marvin-w commented 3 years ago

The underlying gtts-token library was just updated with this change: https://github.com/Boudewijn26/gTTS-token/commit/8d681855214b093c78d5a5da4fa19f92c4e233ef and it has already been released.

However, this still doesn't fix the actual problem but instead just hides it in the background. (Retry 5 times or throw error)

pndurette commented 3 years ago

Yeah, this is starting to get tough to maintain/counter. It's a pretty unorthodox usage of this translate endpoint after all..

But will pull in the updated gtts-token and other fixes and do a release later today.

pndurette commented 3 years ago

Found someone saying to add client=tw-ob to the URL (can't be sure why, or whether it works 100% of the time)

https://translate.google.com/translate_tts?ie=UTF-8&q=Annoying%20bug&tl=en-US&client=tw-ob

EDIT: BTW, you can use this URL to download directly, so you don't even need this repo

This is the URL this library uses. gTTS just helps out with everything that goes around it (maximum length, languages, tokenizing long sentences, etc.), basically what you'd have to do to make just requesting this URL work well for you.
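To give a feel for the "maximum length" part, here is a deliberately simplified sketch of splitting text into chunks that each fit under a length limit. It is not gTTS's actual tokenizer: the function name `split_text` is hypothetical, the 100-character default is an assumption for the example, and it splits on whitespace only instead of on punctuation.

```python
def split_text(text, max_chars=100):
    """Split text into whitespace-delimited chunks of at most max_chars.

    Simplified illustration only, not the tokenizer gTTS uses.
    The default limit is an assumption for this example.
    """
    chunks, current = [], ""
    for word in text.split():
        # Hard-split any single word that is longer than the limit.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent as its own request and the resulting audio concatenated, which is the part a bare URL download leaves to the caller.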

Over the years, I've had to change the tw URL param a few times. Hadn't had to in a while.

However, looking at network traffic when using the newer https://translate.google.com, it's pretty obvious that Google has switched to another way of generating speech. It doesn't hit /translate_tts anymore. So it probably will disappear eventually. I haven't been able to understand their new method yet.

Considering how many libraries use gTTS (or derivatives of it, like Home Assistant, see: https://github.com/home-assistant/core/issues/42911, e.g. https://github.com/home-assistant/core/blob/dev/homeassistant/components/google_translate/tts.py), for sure their SREs are aware of this by now and possibly very annoyed by the usage this generates.

I enjoy a good challenge of trying to find a way to make this work reliably again, but as I was saying, this will become more and more difficult, perhaps impossible.

pndurette commented 3 years ago

gTTS has been updated to 2.1.2!

It updates gTTS-token to 1.1.4 and includes a few other fixes; see the changelog.

I've also reworked the GitHub Action workflows and "marked" with PyTest the tests that try to access the /translate_tts URL (most of them) and allowed them to fail for now. Like I stated above, I will try to look into a better way to work around this.

Boudewijn26 commented 3 years ago

To add to this discussion: I, as maintainer of gTTS-token, also intend to look into this further in order to, if possible, ensure compatibility with the new changes Google is rolling out. This has been complicated by the simple fact that I haven't received the new Google Translate yet. It's not at all uncommon for larger corporations to roll out changes gradually, and that indeed appears to be happening here. We don't have any insight into which regions get this new "feature" first, so we'll just have to wait for now.

That being said, this is an undocumented API, so we really can't blame Google for changing it without notice. From what I gather, Google has turned a blind eye towards this whole thing. It wouldn't be too difficult for them to obfuscate the API to such an extent that it'd be tremendously difficult to reverse engineer, which is something they haven't done (yet?) in the many years these projects have existed.

As stated, should this problem persist, everyone is free to comment on https://github.com/Boudewijn26/gTTS-token/issues/20 or raise a new issue there (or continue the discussion here, I suppose).

devmaster-terian commented 3 years ago

Ok! We'll stick around here so we can hear news about this issue.

kshitij98-Cpp commented 3 years ago

pip install gTTS-token --upgrade

Upgrading the token library fixed the issue, at least for me.
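When comparing notes in a thread like this, it helps to confirm which versions are actually installed in the environment that runs the script. A small sketch using the standard library's `importlib.metadata` (Python 3.8+) follows; the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names as published on PyPI
for name in ("gTTS", "gTTS-token"):
    print(name, installed_version(name) or "not installed")
```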

Boudewijn26 commented 3 years ago

Prompted by https://github.com/home-assistant/core/issues/42911#issuecomment-724712969, I've decided to further my investigation. You can follow along at https://github.com/Boudewijn26/gTTS-token/pull/23; coincidentally, the CI for that PR hit the same issue. Initial findings do point to big changes being needed to maintain compatibility. I'll update the PR regularly, so everyone can follow along.

afonsosantos commented 3 years ago

New update from gTTS-token: https://github.com/Boudewijn26/gTTS-token/blob/fix/november-changes/docs/november-2020-translate-changes.md#ladies-and-gentlemen-weve-got-him

Boudewijn26 commented 3 years ago

Yes, that pull request has now been merged, so the url is https://github.com/Boudewijn26/gTTS-token/blob/master/docs/november-2020-translate-changes.md#ladies-and-gentlemen-weve-got-him. What's perhaps more interesting is what's going on in https://github.com/Boudewijn26/gTTS. Apart from the lookup of locales, I've pretty much gotten it working. I think you can expect a pull request by tomorrow. All in all this wasn't as bad as I was expecting, so I'm pretty relieved.

pndurette commented 3 years ago

@Boudewijn26 Wow, you're amazing. This is quite a feat of reverse engineering! I wonder if jQ1olc is a value that's generated by their build system (during obfuscation or minification). But it shouldn't be too hard to update (since you made it a constant) if that's the case.

pndurette commented 3 years ago

Adding to this (I will put in a more official document or place), I want to:

Boudewijn26 commented 3 years ago

Thank you @pndurette, you're too kind. I was wondering the same thing about jQ1olc, perhaps it could be a future task for gTTS-token to determine the name of the RPC.

On the language retrieval: I haven't had a look at the changes there, so it might perhaps be something we can do reliably. I do see a good case for checking them into the repo to prevent flakiness.

deltaflyer4747 commented 3 years ago

Again (occasionally) not working with gtts 2.1.2 and gtts-token 1.1.4

gets stuck on this:

Traceback (most recent call last):
  File "/usr/local/sbin/talk", line 37, in <module>
    tokenizer_cases.period_comma,
  File "/usr/local/lib/python3.5/dist-packages/gtts/tts.py", line 129, in __init__
    langs = tts_langs(self.tld)
  File "/usr/local/lib/python3.5/dist-packages/gtts/lang.py", line 40, in tts_langs
    langs.update(_fetch_langs(tld))
  File "/usr/local/lib/python3.5/dist-packages/gtts/lang.py", line 76, in _fetch_langs
    soup = BeautifulSoup(page.content, 'html.parser')
  File "/usr/local/lib/python3.5/dist-packages/bs4/__init__.py", line 279, in __init__
    markup, from_encoding, exclude_encodings=exclude_encodings)):
  File "/usr/local/lib/python3.5/dist-packages/bs4/builder/_htmlparser.py", line 237, in prepare_markup
    exclude_encodings=exclude_encodings)
  File "/usr/local/lib/python3.5/dist-packages/bs4/dammit.py", line 366, in __init__
    for encoding in self.detector.encodings:
  File "/usr/local/lib/python3.5/dist-packages/bs4/dammit.py", line 264, in encodings
    self.chardet_encoding = chardet_dammit(self.markup)
  File "/usr/local/lib/python3.5/dist-packages/bs4/dammit.py", line 34, in chardet_dammit
    return chardet.detect(s)['encoding']
  File "/usr/local/lib/python3.5/dist-packages/chardet/__init__.py", line 38, in detect
    detector.feed(byte_str)
  File "/usr/local/lib/python3.5/dist-packages/chardet/universaldetector.py", line 211, in feed
    if prober.feed(byte_str) == ProbingState.FOUND_IT:
  File "/usr/local/lib/python3.5/dist-packages/chardet/charsetgroupprober.py", line 71, in feed
    state = prober.feed(byte_str)
  File "/usr/local/lib/python3.5/dist-packages/chardet/sjisprober.py", line 75, in feed
    self.context_analyzer.feed(byte_str[i + 1 - char_len:i + 3

pndurette commented 3 years ago

Hi all,

I've merged in #244 with @Boudewijn26's new audio download code (thanks again! 🙏)

Languages

Still working on the languages, which I'm making good progress on. I've identified where the language codes that provide TTS are (vs. all language codes—not all languages in Google Translate provide text-to-speech), I just need to fetch them, with some RegEx magic. Still aiming to push gTTS 2.2.0 Nov 14 (GMT-5).

marvin-w commented 3 years ago

Thank you for your work on this!

Feel free to tag me when the new release is out so I can open the necessary PRs in the upstream repositories (Home Assistant, Mycroft).

pndurette commented 3 years ago

Hi all, @marvin-w & @Boudewijn26

gTTS 2.2.0 has been published! 🎉 Big changes are #244 & #245

(FYI, gTTS 2.2.1 is out, bug fix)

On languages

Optimize the language retrieval bit since it's not reliable enough anymore. Perhaps switching back to a more static language list (obtained via the same means, but checked in into the repo).

This is exactly what I did. I figured out a way to programmatically extract it using the new Google Translate (code not yet "GitHub-ready") and I will add this to the repo, but out of the main code. After that, my idea is to have a GitHub Action run periodically, which could create a PR if it detects a change. A new patch version of gTTS could then be released.
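The periodic check described above boils down to comparing the checked-in language list with a freshly extracted one and acting only when they differ. A rough sketch of that comparison follows. The function name `diff_langs`, the `{code: name}` dictionary shape, and the sample data are assumptions for illustration; the extraction itself is not shown.

```python
def diff_langs(checked_in, fetched):
    """Compare two {language_code: name} mappings and report the changes.

    Hypothetical helper; the mapping shape is an assumption.
    """
    added = {c: fetched[c] for c in fetched.keys() - checked_in.keys()}
    removed = {c: checked_in[c] for c in checked_in.keys() - fetched.keys()}
    renamed = {
        c: (checked_in[c], fetched[c])
        for c in checked_in.keys() & fetched.keys()
        if checked_in[c] != fetched[c]
    }
    return {"added": added, "removed": removed, "renamed": renamed}


# Made-up sample data, not real extraction results
old = {"en": "English", "fr": "French"}
new = {"en": "English", "fr": "French", "pt": "Portuguese"}
changes = diff_langs(old, new)
if any(changes.values()):
    print("language list changed:", changes)
```

A scheduled job could run this and open a PR only when any of the three dictionaries is non-empty.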

What's next—

@marvin-w:

Feel free to tag me when the new release is out so I can open the necessary PRs in the upstream repositories (Home Assistant, Mycroft).

Here you go! 😄 As I wrote earlier, I know that HA uses its own version of an old gTTS. I'm wondering if 2.2.0 can be integrated as-is, or if it needs to be enhanced (i.e. Python 3, async/await, etc.). I would love to see HA use gTTS directly, and I'm really open to making any changes in this regard.

marvin-w commented 3 years ago

Thank you! I'll have a look!

marvin-w commented 3 years ago

@pndurette Your lib uses the requests package. I think one reason for them to not go with it might be the fact that you aren't using the aiohttp package instead. Also, one should probably be able to inject the websession directly into the library.

I'm not quite sure how I'd tackle the update on the HA side. I'll give them a ping in their Discord and let them decide. I hope one of them will reply here :).

pndurette commented 3 years ago

aiohttp client is pretty nice and maybe what I should head towards—gTTS has been using requests since 2014! But the work for async requests is all over the place, a lot of it isn't maintained and/or is for older Python 3.
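One general way to use a blocking, requests-based library from async code without rewriting it is to run the blocking call in a thread pool. The sketch below uses only the standard library; the helper name `run_blocking` is made up, and this is a generic asyncio pattern, not a description of how Home Assistant integrates gTTS.

```python
import asyncio
import functools


async def run_blocking(func, *args, **kwargs):
    """Run a blocking callable in the default thread pool executor.

    Generic asyncio pattern; hypothetical helper name.
    """
    loop = asyncio.get_running_loop()
    # run_in_executor takes positional arguments only, so bind kwargs with partial
    return await loop.run_in_executor(None, functools.partial(func, *args, **kwargs))
```

Inside a coroutine, a call such as `await run_blocking(tts.save, "hello.mp3")` would keep the event loop free while the download runs in a worker thread.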

hmmbob commented 3 years ago

Seems like the old solution in HA finally broke completely now.... :(

https://github.com/home-assistant/core/issues/43801

pndurette commented 3 years ago

@hmmbob Ah yeah that was imminent. Sadly did not have any time to work on the above to make gTTS more HA-friendly. 😭

hmmbob commented 3 years ago

Just checked with one of the core devs (Frenck): there is no need for you to redo your package to use aiohttp instead of requests (basically, the answer was "he can use anything he likes" lol). Work will need to be done on integrating the package correctly in the HA Google TTS component.

Wish I could be of help, but I don't have the programming experience to build this myself....

marvin-w commented 3 years ago

@hmmbob Could you properly test it if I do it?

hmmbob commented 3 years ago

I'd say yes (it basically involves installing it as custom_component, I guess?)

Have a few people around me that could test too.

marvin-w commented 3 years ago

I'd create a feature branch - you'd basically have to check it out and run HA from there and then test google_translate.

hmmbob commented 3 years ago

That's ok. Maybe we should switch to your issue at HA for this effort?

devmaster-terian commented 3 years ago

Seems like the old solution in HA finally broke completely now.... :(

home-assistant/core#43801

Yes, now it is completely broken... I haven't been able to use the API since yesterday.

ghost commented 3 years ago

This is the error I get before I add the new gTTS master (that is, with the gtts file and the gtts_token):

Error
An error occurred. Please start Anki while holding down the shift key, which will temporarily disable the add-ons you have installed.
If the issue only occurs when add-ons are enabled, please use the Tools > Add-ons menu item to disable some add-ons and restart Anki, repeating until you discover the add-on that is causing the problem.
When you've discovered the add-on that is causing the problem, please report the issue on the add-on support site.
Debug info:
Anki 2.1.35 (84dcaa86) Python 3.8.0 Qt 5.14.2 PyQt 5.14.2
Platform: Windows 10
Flags: frz=True ao=True sv=1
Add-ons, last update check: 2020-12-02 18:26:16

Caught exception:
Traceback (most recent call last):
  File "aqt\webview.py", line 37, in cmd
  File "aqt\webview.py", line 123, in _onCmd
  File "aqt\webview.py", line 547, in _onBridgeCmd
  File "aqt\editor.py", line 403, in onBridgeCmd
  File "aqt\gui_hooks.py", line 1487, in call
  File "anki\hooks.py", line 594, in runFilter
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\edit.py", line 74, in onFocusLost
    if update_fields(note, field, allFields):
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\behavior.py", line 270, in update_fields
    fill_sound(hanzi, copy)
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\behavior.py", line 169, in fill_sound
    s = sound(hanzi, config['speech'])
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\sound.py", line 51, in sound
    return '[sound:%s]' % AudioDownloader(hanzi, source).download()
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\tts.py", line 50, in download
    self.func()
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\tts.py", line 56, in get_google
    tts.save(self.path)
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\lib\gtts\tts.py", line 243, in save
    self.write_to_fp(f)
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\lib\gtts\tts.py", line 183, in write_to_fp
    part_tk = self.token.calculate_token(part)
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\lib\gtts_token\gtts_token.py", line 28, in calculate_token
    seed = self._get_token_key()
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\lib\gtts_token\gtts_token.py", line 58, in _get_token_key
    raise ValueError(
ValueError: Unable to find token seed! Did https://translate.google.com change?

However, this is what I get when I add the new gTTS master one and delete the other two files:

Error
An error occurred. Please start Anki while holding down the shift key, which will temporarily disable the add-ons you have installed.
If the issue only occurs when add-ons are enabled, please use the Tools > Add-ons menu item to disable some add-ons and restart Anki, repeating until you discover the add-on that is causing the problem.
When you've discovered the add-on that is causing the problem, please report the issue on the add-on support site.
Debug info:
Anki 2.1.35 (84dcaa86) Python 3.8.0 Qt 5.14.2 PyQt 5.14.2
Platform: Windows 10
Flags: frz=True ao=True sv=1
Add-ons, last update check: 2020-12-02 18:26:16

Caught exception:
Traceback (most recent call last):
  File "aqt\webview.py", line 37, in cmd
  File "aqt\webview.py", line 123, in _onCmd
  File "aqt\webview.py", line 547, in _onBridgeCmd
  File "aqt\editor.py", line 403, in onBridgeCmd
  File "aqt\gui_hooks.py", line 1487, in call
  File "anki\hooks.py", line 594, in runFilter
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\edit.py", line 74, in onFocusLost
    if update_fields(note, field, allFields):
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\behavior.py", line 270, in update_fields
    fill_sound(hanzi, copy)
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\behavior.py", line 169, in fill_sound
    s = sound(hanzi, config['speech'])
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\sound.py", line 51, in sound
    return '[sound:%s]' % AudioDownloader(hanzi, source).download()
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\tts.py", line 50, in download
    self.func()
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\tts.py", line 56, in get_google
    tts.save(self.path)
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\lib\gtts\tts.py", line 243, in save
    self.write_to_fp(f)
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\lib\gtts\tts.py", line 183, in write_to_fp
    part_tk = self.token.calculate_token(part)
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\lib\gtts_token\gtts_token.py", line 28, in calculate_token
    seed = self._get_token_key()
  File "C:\Users\Justin Perez\AppData\Roaming\Anki2\addons21\1128979221\lib\gtts_token\gtts_token.py", line 58, in _get_token_key
    raise ValueError(
ValueError: Unable to find token seed! Did https://translate.google.com change?

Please help and explain in simple dumb terms, as I am not a coder in any way, shape or form but a simple man just trying to use anki to learn Chinese. Please help thank you