pgmichael / wavenet-for-chrome

Chrome extension that transforms highlighted text into high-quality natural sounding audio using Google Cloud's Text-to-Speech.
http://wavenet-for-chrome.com
MIT License
134 stars 50 forks

This voice currently only supports LINEAR16 output. #91

Open sandglass14 opened 1 week ago

sandglass14 commented 1 week ago

"Failed to synthesize text. This voice currently only supports LINEAR16 output."

I am using the Wavenet for Chrome extension to convert some text to speech with the Journey O voice, and for some reason I get the above error. I was using that voice last week and everything was perfect. Now, all of a sudden, I get this error. It seems to appear only for Journey voices. The other listed voices work, but in my opinion they are nowhere near the human-like quality of the Journey voices.

Is there a way to resolve the above error somehow?

Appreciate any help. Thank you in advance.

pgmichael commented 1 week ago

Hello! This error message is returned by Google itself. It means that the requested audio format isn't supported for this voice. I believe changing the playback or download audio format (e.g., MP3, OGG) in the UI should fix the issue.

Let me know!

superluig164 commented 1 week ago

It seems like Google changed something with the Journey voices. Changing the output format to WAV should work, but now the error message says that these voices don't support pitch and speed adjustments. They never did, but previously those settings were simply ignored; now the request fails and nothing is read.

sandglass14 commented 1 week ago

Yes, I believe you're right. The issue seems to stem from both Google and Wavenet. Google seems to have recently changed the audio encoding format for Journey voices, which might explain why pitch and speaking rate adjustments aren't supported for those voices yet. Unfortunately, switching the output format in Wavenet to WAV doesn't resolve the issue; it results in this error:

"Failed to synthesize text. This voice does not support speaking rate or pitch parameters at this time."

For example, this is what the JSON currently looks like in the official Google demo for Journey voices (no matter how I change the speaking rate or pitch, the demo always sends the request with zero values for those settings): https://cloud.google.com/text-to-speech?hl=en

{
  "audioConfig": {
    "audioEncoding": "LINEAR16",
    "effectsProfileId": [
      "small-bluetooth-speaker-class-device"
    ],
    "pitch": 0,
    "speakingRate": 0
  },
  "input": {
    "text": "Movies, oh my gosh, I just just absolutely love them. They're like time machines taking you to different worlds and landscapes, and um, and I just can't get enough of it."
  },
  "voice": {
    "languageCode": "en-US",
    "name": "en-US-Journey-O"
  }
}

The WAV format should theoretically work as the encoding, but the Wavenet extension UI does not allow setting the speed to zero (the minimum allowed value is 0.5x). Maybe this is the reason for the error "Failed to synthesize text. This voice does not support speaking rate or pitch parameters at this time."?

Is this something that Wavenet needs to fix asap?
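
To make the two constraints concrete, here is a rough sketch of how a request body could be built so that Journey voices always get LINEAR16 and never carry rate or pitch values. This is not the extension's actual code. The field names are taken from the JSON above; the function name `build_synthesize_body`, the `"-Journey-"` name check, and the choice to omit the two fields entirely (the Google demo sends zeros instead) are assumptions for illustration only.

```python
import json


def build_synthesize_body(text, voice_name, language_code="en-US",
                          encoding="OGG_OPUS", speaking_rate=1.0, pitch=0.0):
    """Build a synthesize request body as a plain dict (hypothetical helper)."""
    audio_config = {"audioEncoding": encoding}

    # Assumption: Journey voices can be recognised by their name.
    if "-Journey-" in voice_name:
        # Per the errors in this thread: LINEAR16 only, and no
        # speaking rate or pitch parameters, so leave those out.
        audio_config["audioEncoding"] = "LINEAR16"
    else:
        audio_config["speakingRate"] = speaking_rate
        audio_config["pitch"] = pitch

    return {
        "audioConfig": audio_config,
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
    }


if __name__ == "__main__":
    body = build_synthesize_body("Hello there.", "en-US-Journey-O",
                                 speaking_rate=1.5, pitch=2.0)
    print(json.dumps(body, indent=2))
```

With a Journey voice the user's speed and pitch settings are dropped rather than rejected, which matches the old "they just did nothing" behaviour described above.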

sandglass14 commented 1 week ago

So basically, Journey voices can no longer be generated or downloaded with the Wavenet for Chrome extension, which renders the extension pretty much useless (all the other voices in the list are outright no-go robot voices from the 1970s). Is that correct? Thanks.

sandglass14 commented 5 days ago

Hi there,

I’m pleased to see that Journey voices are now working again in the Wavenet Chrome extension for the “When reading aloud” feature. To get it working, simply switch the audio format from the default OGG to WAV.

However, Journey voices still do not function with the “When downloading” feature, which is a key point of this extension in my opinion. Currently, downloading is only supported in MP3 format. Please add support for downloading in WAV format to this feature.

Thank you!
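
For what it's worth, a WAV download may not need much beyond saving what Google returns. As far as I understand, the synthesize response carries the audio as a base64 string in `audioContent`, and LINEAR16 output already includes a WAV header. The sketch below assumes exactly that; `save_linear16_download` is a made-up name, not something from the extension, and it refuses to write anything that does not look like WAV data.

```python
import base64


def save_linear16_download(response_body, path):
    """Decode a synthesize response and write it to a .wav file.

    Assumes response_body is the parsed JSON response, with the audio
    base64-encoded under "audioContent". Returns the bytes written.
    """
    audio = base64.b64decode(response_body["audioContent"])

    # A WAV file starts with "RIFF", four size bytes, then "WAVE".
    if audio[:4] != b"RIFF" or audio[8:12] != b"WAVE":
        raise ValueError("audioContent does not look like WAV data")

    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)
```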

superluig164 commented 5 days ago

You also need to set the speed and pitch parameters back to their defaults (1x and 0); otherwise you will still get the error that they are not supported with Journey voices.

Thank you, Tommy Sebestyen
