tihu-nlp / tihu

Persian Text-To-Speech
http://lilak-project.com/tihu_demo.php

Wave headers #45

Open shotor opened 4 years ago

shotor commented 4 years ago

I'm having issues using the actual responses from the gRPC API. From what I can tell, they don't have a wave header, so I have to add one myself to be able to play them.

In my case: if I pre-process the uint8 response in Node.js, add a wave header, send the full wave file over gRPC, and deserialize it as base64 on Expo/Android, it works.

So far, all my attempts to add a wave header dynamically in the app, as base64, have failed. Even though it's a mobile app, I only have access to web APIs and the native APIs that Expo exposes. I can't process the data with Node as I did previously, or use iOS/Java code.

So I wanted to ask: (1) How did you process wave files on lilak-project to make them web-playable? I might be able to reuse some of this. (2) Is it possible to add the wave header to the gRPC response, possibly behind a param? That would make my life easier, but I'm not sure how it affects streaming.

b00f commented 4 years ago

I understand the difficulties you are dealing with right now, because I had the same ones before. I purposely didn't add a wave header to the output, for these reasons:

1. The ultimate plan is to compress the audio data, e.g. as mp3, so we might end up with two different output types. #15
2. The output is a stream of data. I don't think adding a header to the stream is a good idea.
3. To add the audio header, we need to know the exact size of the output data, which we don't have while streaming.

Look at the tihu_console app as an example. It's a C++/Qt application. It plays the audio stream as it arrives, so there is no need to wait for the data to finish before playing it: here

About how lilak-project (the tihu demo page) does it: at that time I hadn't created a Docker image, so I used the native API. Later, when I added gRPC, I switched to gRPC. The client is in PHP. I don't think that code will help you, but in case you need to take a look, I can add you to that project, which is hosted on GitLab. It's a private repo with some sensitive data, so I don't want to share it publicly. I manually create a wave file, add the WAV header, append the sample data, update the header, and then play the wave file.
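The "add header, append samples, update header" flow described above can be sketched roughly as follows. This is not the PHP code from the demo page; `finalizeWav` is a hypothetical helper, and it assumes a canonical 44-byte PCM header in which the RIFF size sits at byte 4 and the data size at byte 40.

```typescript
// Hypothetical helper (not part of Tihu): joins a 44-byte header template and
// the streamed PCM chunks, then rewrites the two size fields once the total
// length is finally known.
function finalizeWav(headerTemplate: Uint8Array, chunks: Uint8Array[]): Uint8Array {
  if (headerTemplate.length !== 44) {
    throw new Error("expected a 44-byte canonical PCM WAV header");
  }

  // Total number of audio bytes collected from the stream.
  const dataLength = chunks.reduce((sum, chunk) => sum + chunk.length, 0);

  // Header first, then every chunk in arrival order.
  const wav = new Uint8Array(44 + dataLength);
  wav.set(headerTemplate, 0);
  let offset = 44;
  for (const chunk of chunks) {
    wav.set(chunk, offset);
    offset += chunk.length;
  }

  // "Update the header": both sizes are little-endian 32-bit integers.
  const view = new DataView(wav.buffer);
  view.setUint32(4, 36 + dataLength, true); // RIFF chunk size = file size - 8
  view.setUint32(40, dataLength, true);     // data chunk size
  return wav;
}
```

The size fields can only be filled in after the last chunk has arrived, which is the same constraint that makes a header awkward for a pure stream.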

I think you should find a library that can play a WAV stream. Did you take a look at this?

b00f commented 4 years ago

@shotor Do you need any help on playing raw data?

shotor commented 4 years ago

Sorry, I've been preoccupied with other stuff. Going to work on it this week.

Situation is still the same:

I have to generate a header in base64 using only the web and Expo standard libraries, add it to the beginning of the string, and play the result. It should be possible: I did the same in Node.js with a library I found for generating the header, and I was able to play back the resulting base64 string in Expo.
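One thing worth checking when prepending in base64 (an assumption about the failure, not something confirmed in this thread): a canonical PCM header is 44 bytes, which is not a multiple of 3, so its base64 form ends in `=` padding and cannot simply be string-concatenated in front of the base64 audio. A minimal sketch that decodes, joins the bytes, and re-encodes, using only plain TypeScript; all function names here are hypothetical.

```typescript
const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Decode standard base64 (with or without trailing "=") into bytes.
function base64ToBytes(b64: string): Uint8Array {
  const clean = b64.replace(/=+$/, "");
  const out = new Uint8Array(Math.floor((clean.length * 3) / 4));
  let buffer = 0;
  let bits = 0;
  let index = 0;
  for (let i = 0; i < clean.length; i++) {
    const value = B64.indexOf(clean.charAt(i));
    if (value < 0) throw new Error("invalid base64 character");
    buffer = ((buffer << 6) | value) & 0xffff; // keep only the bits still needed
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[index++] = (buffer >> bits) & 0xff;
    }
  }
  return out;
}

// Encode bytes as standard base64 with "=" padding.
function bytesToBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const has1 = i + 1 < bytes.length;
    const has2 = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = has1 ? bytes[i + 1] : 0;
    const b2 = has2 ? bytes[i + 2] : 0;
    out += B64.charAt(b0 >> 2);
    out += B64.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    out += has1 ? B64.charAt(((b1 & 0x0f) << 2) | (b2 >> 6)) : "=";
    out += has2 ? B64.charAt(b2 & 0x3f) : "=";
  }
  return out;
}

// Join header bytes and base64 audio at the byte level, then re-encode once.
function prependHeaderBase64(header: Uint8Array, pcmBase64: string): string {
  const pcm = base64ToBytes(pcmBase64);
  const wav = new Uint8Array(header.length + pcm.length);
  wav.set(header, 0);
  wav.set(pcm, header.length);
  return bytesToBase64(wav);
}
```

Decoding the whole payload costs memory on a phone; for short utterances that is probably acceptable, but it is a trade-off rather than a recommendation.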

My first attempt was to blindly convert the wave header library I found to use the web standard library but it didn't work. So I need to dig a little deeper and find out what a wave header actually is and how to build one.
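For reference, a canonical PCM wave header is 44 bytes: a `RIFF` chunk descriptor, a `fmt ` sub-chunk describing the audio, and the start of the `data` sub-chunk. A minimal sketch using only `DataView`, which is available in web-style runtimes; `buildWavHeader` is a hypothetical helper, and the sample rate, channel count, and bit depth must match whatever the server actually emits, which this thread does not state.

```typescript
// Hypothetical helper (not part of Tihu): builds a 44-byte canonical PCM WAV
// header. All multi-byte numbers in a WAV header are little-endian.
function buildWavHeader(
  dataLength: number,    // number of raw audio bytes that follow the header
  sampleRate: number,    // assumption: must match the server's output
  channels: number,      // assumption: 1 = mono, 2 = stereo
  bitsPerSample: number  // assumption: e.g. 16 for 16-bit PCM
): Uint8Array {
  const buffer = new ArrayBuffer(44);
  const view = new DataView(buffer);
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataLength, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);             // fmt sub-chunk size for PCM
  view.setUint16(20, 1, true);              // audio format: 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, dataLength, true);     // size of the audio data itself
  return new Uint8Array(buffer);
}
```

A call such as `buildWavHeader(pcm.length, 22050, 1, 16)` uses placeholder values for illustration only; a wrong sample rate typically plays back at the wrong speed and pitch rather than failing outright.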

b00f commented 4 years ago

Did you check this library: https://www.npmjs.com/package/webaudio-wav-stream-player

shotor commented 4 years ago

Might work, but I don't know if it's compatible with React Native. It also requires a more complex proxy in between to convert the gRPC data.

I want to check whether I can fix it using native libs first.

b00f commented 4 years ago

gRPC was a good choice for streaming data (it uses HTTP/2).

shotor commented 4 years ago

Yep, but the first thing I had to do was set up a proxy that converts it to HTTP/1.1 so I can use it in Expo: https://github.com/tihu-nlp/tihu-native/tree/master/proxy

If I implement it in native modules (Java for Android, Swift for iOS), I will have HTTP/2 (and no more problems with headers).

b00f commented 4 years ago

I am not familiar with those libraries, but I don't think converting to HTTP/1.1 is a good idea. That is really bad! If you don't have any other choice, we can think about adding other RPC methods to Tihu, such as Capnp. It supports HTTP/1.1.