Byron / google-apis-rs

A binding and CLI generator for all Google APIs
http://byron.github.io/google-apis-rs

Google TTS no longer works. #442

Open xd009642 opened 1 year ago

xd009642 commented 1 year ago

It doesn't matter what the request is we get responses like this:

Invalid byte 47, offset 18110. at line 3 column 1: {
    "audioContent": "UklGRuyoAQBXQVZFZm10IBAAAAABAAEA8FUAAOCrAAACABAAZGF0YcioAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Here is a snippet of the code; the error message comes from the result of awaiting .doit().

        let voice = VoiceSelectionParams {
            language_code: Some(language.to_string()),
            name: params.voice.clone(),
            ssml_gender: None,
            custom_voice: None,
        };

        // Set to s16le PCM https://cloud.google.com/speech-to-text/docs/encoding this returns a
        // wav file
        let audio = AudioConfig {
            audio_encoding: Some("LINEAR16".to_string()),
            effects_profile_id: None,
            pitch: params.pitch,
            sample_rate_hertz: Some(params.sample_rate as _),
            speaking_rate: params.speed,
            volume_gain_db: None,
        };

        self.tts
            .text()
            .synthesize(SynthesizeSpeechRequest {
                audio_config: Some(audio),
                input: Some(input),
                voice: Some(voice),
            })
            .doit()
            .await

xd009642 commented 1 year ago

Okay, we've done some bisecting and found that images built with version 3.1.0 work; only the 5.0.3 image is broken.

Byron commented 1 year ago

Thanks for reporting! I think there was a change related to the way certain kinds of data are represented. It should be possible to find this commit and maybe make it apply only to certain APIs. On the other hand, maybe there is an updated version of TTS available that would work with these changes, too.

xd009642 commented 1 year ago

I had a brief look for updates and didn't see any (but the Google docs are pretty poor for finding this kind of info). It seems to be having trouble when validating the base64 response, but since it doesn't decode it and leaves that to me, the user, such a check seems kind of worthless. I have to check the result when converting it to bytes anyway, so if this client won't decode the base64 for me, why check it? :man_shrugging:

xd009642 commented 1 year ago

Okay, I've done a bit of bisecting and can confirm the issue isn't present in 4.x and was introduced in 5.x. We're rolling back for now, but I'm willing to help PR a change for this if it could expedite things. I just need pointing in the right direction :eyes:

Byron commented 1 year ago

I recommend using the local source of the crate in question to make it editable, then comparing the working version with the non-working one. They will be close enough to make that possible. From there a local fix can be conceived and tested, and with it I can certainly guide you towards a more permanent fix, by some means supported by the generator. After all, per-API overrides are possible, should the fix be very specific.

Please also note that, in general, there should be a better way than using these crates; it's astonishing that Google doesn't officially support Rust yet.

emarcotte commented 1 year ago

I think I'm facing something similar with the secret manager secret access endpoint. An interesting thing I've found so far (I haven't figured out WHY yet) is that if you handle the JsonDecodeError, take the payload and pass it through serde_json::from_str, it will decode just fine. There must be some encoding parameters somewhere.

emarcotte commented 1 year ago

Aha. It seems the payload data was previously a String and is now a Vec<u8>, a change between v4 and v5. There's an encoder for base64 bytes, which I think is trying to use the 'url safe' variant of the encoding on the payload field. For me there's a + symbol in the text, which I think suggests that it's not a URL-safe encoding. It's possible this is related to https://github.com/Byron/google-apis-rs/pull/379.

Currently trying to find a reference somewhere for what encoding variant google uses.
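
To make the suspicion above concrete: byte 47 is ASCII '/', and byte 43 is '+'. Both belong to the standard base64 alphabet but not to the URL-safe one, which uses '-' and '_' instead. The sketch below uses only the standard library and is not the crate's actual decoder. The helper names are hypothetical. It reports the first byte that a strict URL-safe decoder would reject, as a (byte, offset) pair like the one in the error message from the original report.

```rust
/// Returns true if `b` belongs to the URL-safe base64 alphabet
/// (RFC 4648, section 5). '=' padding is treated as allowed.
fn is_url_safe_base64_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'-' || b == b'_' || b == b'='
}

/// Hypothetical helper (not part of any crate): finds the first byte that a
/// strict URL-safe decoder would reject, returned as (byte value, offset).
fn first_non_url_safe_byte(encoded: &str) -> Option<(u8, usize)> {
    encoded
        .bytes()
        .enumerate()
        .find(|&(_, b)| !is_url_safe_base64_byte(b))
        .map(|(offset, b)| (b, offset))
}
```

Running this over the "audioContent" or payload string from a failing response would show whether the offending byte is '/' or '+', which would be consistent with the server sending standard base64 rather than URL-safe base64.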

emarcotte commented 1 year ago

I can see some references online (e.g. here https://stackoverflow.com/questions/28100601/decode-url-safe-base64-in-javascript-browser-side) saying that some APIs use URL-safe encoding, which might mean an API-by-API choice is necessary for how to decode these values... grumble. If I find more I will post here.
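
If the encoding really does vary per API, one alternative to a per-API switch is a decoder that accepts both alphabets. The following is a minimal standard-library sketch of that idea, not what the generator does today. The function names are hypothetical, and a real fix would more likely configure the existing base64 dependency than hand-roll a decoder.

```rust
/// Maps one base64 character to its 6-bit value, accepting both the
/// standard ('+', '/') and URL-safe ('-', '_') alphabets.
fn sextet(b: u8) -> Option<u32> {
    match b {
        b'A'..=b'Z' => Some((b - b'A') as u32),
        b'a'..=b'z' => Some((b - b'a') as u32 + 26),
        b'0'..=b'9' => Some((b - b'0') as u32 + 52),
        b'+' | b'-' => Some(62),
        b'/' | b'_' => Some(63),
        _ => None,
    }
}

/// Hypothetical helper: decodes base64 in either alphabet, with or without
/// '=' padding. Returns None on an invalid character or impossible length.
/// For brevity it does not check that unused trailing bits are zero.
fn decode_base64_lenient(encoded: &str) -> Option<Vec<u8>> {
    let data = encoded.trim_end_matches('=').as_bytes();
    if data.len() % 4 == 1 {
        return None; // one leftover character cannot encode a whole byte
    }
    let mut out = Vec::with_capacity(data.len() * 3 / 4);
    for chunk in data.chunks(4) {
        let mut acc: u32 = 0;
        for &b in chunk {
            acc = (acc << 6) | sextet(b)?;
        }
        // Left-align a short final chunk so byte extraction is uniform.
        acc <<= 6 * (4 - chunk.len());
        let bytes = [(acc >> 16) as u8, (acc >> 8) as u8, acc as u8];
        // 4 chars -> 3 bytes, 3 chars -> 2 bytes, 2 chars -> 1 byte.
        out.extend_from_slice(&bytes[..chunk.len() - 1]);
    }
    Some(out)
}
```

For example, decode_base64_lenient("UklGRg==") yields the bytes of "RIFF", the WAV header that the response in the original report starts with, and "+/8=" and "-_8" both decode to the same two bytes. The trade-off is that a lenient decoder hides which variant the server actually sent.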