AwesomeTTS / awesometts-anki-addon

AwesomeTTS text-to-speech add-on for Anki
GNU General Public License v3.0

NAVER Translate doesn't work #61

Closed emvnuel closed 4 years ago

emvnuel commented 5 years ago

Request got text/html instead of audio/mpeg


krassowski commented 5 years ago

Thanks for reporting this. I can confirm that there is an issue with NAVER Translate service.

krassowski commented 5 years ago

It looks like NAVER Translate no longer exists. It was replaced by NAVER Papago, which offers a translation API (registration is required to get keys), but I do not see anything about a TTS API.

krassowski commented 5 years ago

Found it: it's now called NAVER Clova Speech Synthesis, and the keys can be purchased at ncloud.com/product/aiService/css. A new service to support Clova needs to be written, but it should be fairly simple as they provide a nice REST API.

Magneticdud commented 5 years ago

Can we just update the naver service to use the new Papago?

I saw that it sends a JSON-like string via POST:

UC,Uzdpitch":0,"speaker":"yuri","speed":0,"text":"幾分"}

(something is probably wrong with the first bytes of the base64?)

It is sent, encoded in base64, to https://papago.naver.com/apis/tts/makeID

which answers with another JSON:

{"id":"1.10.9_2815-1315b8318932e75896c1b070a03dd961-1546806478016"}

then the mpeg is downloaded from:

https://papago.naver.com/apis/tts/1.10.9_2815-1315b8318932e75896c1b070a03dd961-1546806478016

sjhuang26 commented 5 years ago

I did some investigation on reverse-engineering their "demo" page.

Here is an example of a working query using the Python Requests library.

requests.post('https://papago.naver.com/apis/tts/makeID',headers={"accept-language":"en-US"},data='data=rlWuoUObLFV6ZPjvcGl0Y2giOjAsInNwZWFrZXIiOiJreXVyaSIsInNwZWVkIjowLCJ0ZXh0Ijoi7ZqM7Zal7ZKAIn0%3D').content.decode()

The accept-language header is not strictly necessary, but my guess is that it will set the error message language to English.

The data string consists of a data= prefix followed by the percent-encoded base64 of the following: a constant sequence of bytes (b'\xaeU\xae\xa1C\x9b,Uzd\xf8\xef'), concatenated with a UTF-8 encoded JSON request whose opening brace and first quote are removed (pitch":0,"speaker":"kyuri","speed":0,"text":"회향풀"})

This request returns JSON like '{"id":"1.10.10_27182-f6f03420086ccc1e862d31ffbf569a0c-1547504480274"}'. Once the ID is retrieved, a simple GET request to https://papago.naver.com/apis/tts/<id> returns the audio file.
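
The steps described above can be sketched in Python roughly as follows. This is a minimal sketch, not the add-on's actual code: the names PREFIX, BASE_URL, build_payload and fetch_audio are made up for illustration, it uses the standard library's urllib instead of Requests, and fetch_audio has not been tested against the live service. Only the payload construction is checked against the sample query in this comment.

```python
import base64
import json
import urllib.parse
import urllib.request

# Constant byte prefix reported in this thread; its meaning is unknown.
PREFIX = b"\xaeU\xae\xa1C\x9b,Uzd\xf8\xef"
BASE_URL = "https://papago.naver.com/apis/tts/"


def build_payload(text, speaker="kyuri", pitch=0, speed=0):
    """Return the 'data=...' request body for the given text."""
    # Compact JSON with keys in the order seen in the captured request.
    body = json.dumps(
        {"pitch": pitch, "speaker": speaker, "speed": speed, "text": text},
        separators=(",", ":"),
        ensure_ascii=False,
    )
    # Drop the leading '{"' so the rest starts with 'pitch"', as in the sample.
    raw = PREFIX + body[2:].encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    # Percent-encode everything, so the '=' padding becomes '%3D'.
    return "data=" + urllib.parse.quote(encoded, safe="")


def fetch_audio(text, speaker="kyuri"):
    """Ask makeID for an ID, then download the audio (untested sketch)."""
    request = urllib.request.Request(
        BASE_URL + "makeID",
        data=build_payload(text, speaker).encode("ascii"),
        headers={"accept-language": "en-US"},
    )
    with urllib.request.urlopen(request) as response:
        audio_id = json.loads(response.read().decode("utf-8"))["id"]
    with urllib.request.urlopen(BASE_URL + audio_id) as response:
        return response.read()
```

Calling build_payload("회향풀") reproduces the data string from the working query above, which is a quick way to check the encoding before touching the network.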

Pet20q commented 5 years ago

Any news on the fix?

Magneticdud commented 5 years ago

The fix works great on Chinese words, but gives "error 501 not implemented" on Chinese sentences. Is this normal?

I ask because I sniffed the traffic on papago.naver.com, and the server responds with a single MPEG audio file, not many audio fragments joined together by the client.

YOnoda commented 4 years ago

@aquach That triggered an error - The following problem was encountered: a bytes-like object is required, not 'str' when generating the audio.
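
For reference, one common way to get this exact message in Python 3 is passing a str to base64.b64encode, which only accepts bytes. Whether that is what happened here is an assumption, since the code that triggered the error is not shown in this thread. A minimal sketch of the failure and the usual fix:

```python
import base64

# Hypothetical fragment of the request text, as a str.
text = 'pitch":0,"speaker":"kyuri"'

try:
    # b64encode rejects str input in Python 3.
    base64.b64encode(text)
    message = None
except TypeError as exc:
    message = str(exc)

# Encoding the str to bytes first avoids the error.
encoded = base64.b64encode(text.encode("utf-8"))
```

If the text is concatenated with the constant byte prefix before encoding, it has to be converted to bytes at that point as well.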