Open xiaoniu-578fa6bff964d005 opened 4 years ago
Thanks for the report. Sadly the text to speech api Discovery document says that resource is a type: "string" and format: "bytes". Google's documentation claims that should always be a urlsafe base64 encoding. Without any other indicator in the discovery document I'm not sure how to generate the correct logic. I'll take a look at what some of the other language binding generators do with this api to see if they've solved this in a reasonable way
Seems a lot of libraries under googleapis uses Protobuf, so they don't need to handle base64 problem.
I am not quite familiar with the design of this repo. Is there any specific design consideration that discourage us from using grpc interface or using async interface of reqwest?
A lot of google apis' now support both grpc and rest. grpc uses protocol buffers as the serialization mechanism, while rest apis use json. This generated is only concerned with rest api's (at least for now).
Now back to the issue of base64 encoding. I looked at python and golang and they both treat the data as just a series of bytes and the user is responsible for decoding it themselves. I wanted to be a little more convenient and handle the decoding internally since it was documented clearly that it should be urlsafe. I've thought for a while that we will need to have a mechanism to manually tweak some of the api's during generation to give us a way of handling these non-conforming api's. An alternative in this case is that we could provide a base64 decoder that seamlessly handles both the standard alphabet as well as the urlsafe alphabet, but that doesn't feel like the correct way to handle this to me.
As a stopgap measure to allow you to make progress using the texttospeech api, I would recommend that you use the generator to produce your own version of the texttospeech bindings. You can then hand edit the part that uses google_api_bytes::Bytes to instead just treat it as a Vec
Creating the bindings should be simple:
git clone https://github.com/google-apis-rs/generator.git
cd generator
cargo run --example any_api -- --name=texttospeech --version=v1 --output_dir=/tmp
The above commands will put a new crate in /tmp/google_texttospeech_v1/ (you can specify a different directory from /tmp using --output_dir or simply move the directory where you want it after creation). Open /tmp/google_texttospeech_v1/src/lib.rs in your favorite editor and substitute ::google_api_bytes::Bytes
with Vec<u8>
.
That's it. Now just use the local version of the crate in your app and your version of SynthesizeSpeechResponse
will have audio_content that's simply a Vec
I agree, handling both alphabets seamlessly seems tricky and not the correct way.
I see the value in doing it both of the ways you suggested—configure or even hard-code per library vs leaving it up to the user. Although, it might make sense to just punt and be consistent with the Python and Go libraries?
I followed @ggriffiniii's instructions, and created a library for datastore1
. However, I did not have any luck with Vec<u8>
.
AFAIK serde can only (de)serialize Vec<u8>
from an array of numbers e.g. { "myField": [1, 0, 2, 5] }
.
I chose to replace ::google_api_bytes::Bytes
with String
. Since I don't care about the actual transaction id (yet, anyway) I can just ignore that fact and let it be a String
all the time. If I need the actual data later I will pull in Base 64 decoding.
Thanks for the step by step, it worked (almost) flawlessly!
Current generator only support URL-safe version. ref
It will panic when some apis(like text-to-speech) return standard version of base64 data.