Azure-Samples / Cognitive-Speech-TTS

Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
https://azure.microsoft.com/en-us/services/cognitive-services/text-to-speech/

curl_setopt_array(): cannot represent a stream of type Output as a STDIO FILE* #190

Closed by Benoit1980 4 years ago

Benoit1980 commented 4 years ago

Hello,

I am trying to get TTS to work with Guzzle/Laravel, without luck:

        $response = Http::withToken($access_token)
            ->withOptions([
                'debug' => true,
                'verify' => false
            ])
            ->withHeaders([
                'Content-type' => 'application/ssml+xml',
                'cache-control' => 'no-cache',
                'X-Microsoft-OutputFormat' => 'riff-24khz-16bit-mono-pcm',
                'X-Search-AppId' => '07D3234E49CE426DAA29772419F436CA',
                'X-Search-ClientID' => '1ECFAE91408841A480F00935DC390960',
                'User-Agent' => 'Text-to-speech', //My resource name
                'content-length' => strlen($data),
            ])
            ->post($ttsServiceUri, [
                'content' => $data,
            ]);

I keep getting this error:

curl_setopt_array(): cannot represent a stream of type Output as a STDIO FILE*

Any idea what could cause this issue please? Is there anything missing in my header? I am not really sure about those:

                'X-Search-AppId' => '07D3234E49CE426DAA29772419F436CA',
                'X-Search-ClientID' => '1ECFAE91408841A480F00935DC390960',

I kept the numbers that are set in your example, is this correct?

I have also tried the way you showed it in your PHP example as:

            $options = array(
                'http' => array(
                    'header'  => "Content-type: application/ssml+xml\r\n" .
                        "X-Microsoft-OutputFormat: riff-24khz-16bit-mono-pcm\r\n" .
                        "Authorization: "."Bearer ".$access_token."\r\n" .
                        "X-Search-AppId: 07D3234E49CE426DAA29772419F436CA\r\n" .
                        "X-Search-ClientID: 1ECFAE91408841A480F00935DC390960\r\n" .
                        "User-Agent: Text-to-speech-video-maker\r\n" .
                        "content-length: ".strlen($data)."\r\n",
                    'method'  => 'POST',
                    'content' => $data,
                ),
            );

            $context  = stream_context_create($options);
            $result = file_get_contents($ttsServiceUri, false, $context);

What confuses me is how to download the file: at the moment, if I return $result, I get unreadable output, something like this: sasasasa

In your PHP demo, would it be possible to have an example of how to return the file as a "download", and also how to return the file as a response for a front-end script like Vue.js, please?

Thank you.
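For readers with the same question: the value in $result is most likely the raw audio itself (a RIFF/WAV file, given the riff-24khz-16bit-mono-pcm output format requested above), which looks like gibberish when printed as text. Below is a minimal sketch of saving those bytes or sending them as a download in plain PHP. The helper names saveAudio(), downloadHeaders() and sendAsDownload() are hypothetical and not part of the sample repository, and the audio/wav content type is an assumption that only fits the RIFF output formats.

```php
<?php
// Sketch only: these helpers are illustrative, not part of the official sample.

/** Write the raw bytes returned by the TTS request to a file. */
function saveAudio(string $audioBytes, string $path): int
{
    $written = file_put_contents($path, $audioBytes);
    if ($written === false) {
        throw new RuntimeException("Could not write $path");
    }
    return $written; // number of bytes written
}

/** Headers that ask the browser to download the body as a WAV file. */
function downloadHeaders(string $fileName, int $length): array
{
    return [
        'Content-Type: audio/wav', // assumption: a riff-* output format was requested
        'Content-Disposition: attachment; filename="' . $fileName . '"',
        'Content-Length: ' . $length,
    ];
}

/** Emit the audio as a download from a plain PHP script. */
function sendAsDownload(string $audioBytes, string $fileName): void
{
    foreach (downloadHeaders($fileName, strlen($audioBytes)) as $header) {
        header($header);
    }
    echo $audioBytes;
}
```

With the stream-context code above, this could be called as sendAsDownload($result, time() . '.wav') after checking that $result is not false. For a front-end script, one option is to return the same bytes with only the Content-Type header and let the browser play the response through an audio element.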

boltomli commented 4 years ago

The app ID and client ID parts are not mandatory. You can remove them (and I can remove them from all samples).
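As a rough illustration of the request without those two headers, here is a minimal sketch based on the stream-context code from the question. The helper names buildTtsOptions() and synthesize() are hypothetical, and the User-Agent value is a placeholder.

```php
<?php
// Sketch only: helper names and the User-Agent value are placeholders.

/**
 * Build stream-context options for a TTS request, leaving out the
 * optional X-Search-AppId and X-Search-ClientID headers.
 */
function buildTtsOptions(string $accessToken, string $ssml): array
{
    $headers = [
        'Content-type: application/ssml+xml',
        'X-Microsoft-OutputFormat: riff-24khz-16bit-mono-pcm',
        'Authorization: Bearer ' . $accessToken,
        'User-Agent: Text-to-speech',
        'Content-length: ' . strlen($ssml),
    ];

    return [
        'http' => [
            'header'  => implode("\r\n", $headers) . "\r\n",
            'method'  => 'POST',
            'content' => $ssml, // raw SSML body, not a form or JSON payload
        ],
    ];
}

/** Send the SSML and return the raw audio bytes, or false on failure. */
function synthesize(string $ttsServiceUri, string $accessToken, string $ssml)
{
    $context = stream_context_create(buildTtsOptions($accessToken, $ssml));
    return file_get_contents($ttsServiceUri, false, $context);
}
```

This sketch does not use Guzzle at all. In the Laravel snippet, a plausible source of the curl_setopt_array() error is the 'debug' => true option: Guzzle points cURL's verbose output at a PHP stream, and under a web server that stream can be php://output, which cURL cannot use as a file handle. Removing that option is worth trying first. Note also that Laravel's HTTP client sends an array passed to post() as JSON by default, so the raw SSML probably needs to go through withBody($data, 'application/ssml+xml') instead.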

All demos are intentionally kept as simple as possible to leave the choice of front-end, service, app design, etc. to developers and users. But it's a good suggestion to provide vue or other samples if possible.

Benoit1980 commented 4 years ago

Thank you for the reply. Let me give you a good example. I have used the Google Text-to-Speech API for 2 years, and I integrated it very quickly because everything was simple and included in the API.

To get the wave output in PHP was simply:

        $response = $client->synthesizeSpeech($synthesisInputText, $voice, $audioConfig);
        $audioContent = $response->getAudioContent();
        $fileName = time() . '.'.$request->input('format');

This is included in their example because it really makes sense to have it: nearly everyone wants to listen to their text-to-speech output. If we do not have examples in PHP and JS that show us how to listen to the output efficiently, we struggle and may not even do it the right way. From what I have seen in different forums, this can leave people with output that does not sound right, which in turn gives the product a bad name.

I think you should follow the path of the Google Text-to-Speech API on GitHub, where they also give a very simple example but at least include the output processing with it (no headaches for developers, occasional coders like myself, kids learning how to code, and so on).

Three technologies that would need samples are Laravel, Vue.js and React, all of which are becoming really popular.

If things are integrated quickly, people use them more. I am still using the Google API for now because I cannot switch to Azure: I cannot work it out. I have spent 3 days on it and opened a lot of forum and Stack Overflow posts, with no replies or help. Completely stuck :-)

boltomli commented 4 years ago

Thanks, these are valuable suggestions. I'll see if we can get some examples for trending frameworks. There are a few listed in the wiki. You are welcome to suggest more suitable projects to add.

Benoit1980 commented 4 years ago

You are welcome :-) I think if you had full examples with:

- Angular
- Vue
- React
- Laravel

you would target a big chunk of the market.