CatalystCode / cordova-plugin-cogsvcsspeech

Cordova plugin for Microsoft Cognitive Services speech services.
MIT License

Async Functionality in Cordova #6

Open esgraham opened 4 years ago

esgraham commented 4 years ago

Description

As an architect, I want to understand how Cordova implements async functionality, so that I can determine whether the plugin should implement the Speech SDK async functions.

Acceptance Criteria

rozele commented 4 years ago

Android

Recognize from microphone

Recommendation: recognizeOnceAsync

Stop recognizing from microphone

Recommendation: stopContinuousRecognitionAsync

Play text to speech audio

Recommendation: SpeakTextAsync and SpeakSsmlAsync

Stop playing text to speech audio

Recommendation: AudioTrack

iOS

Recognize from microphone

Recommendation: recognizeOnce

Stop recognizing from microphone

Recommendation: stopContinuousRecognition

Play text to speech audio

Recommendation: speakText and speakSsml

Stop playing text to speech audio

Recommendation: AVAudioPlayer.stop
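
On Android, the `...Async` methods recommended above return a `Future`, while a Cordova plugin reports results back to JavaScript through a callback object. A minimal sketch of bridging the two might look like the following. It is plain Java so it runs without Cordova or the Speech SDK: `ResultCallback` is a hypothetical stand-in for Cordova's `CallbackContext`, `AsyncBridge` is a hypothetical helper, and the private thread pool plays the role that `cordova.getThreadPool()` would play in a real plugin.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for Cordova's CallbackContext (success/error only). */
interface ResultCallback {
    void success(String message);
    void error(String message);
}

/** Hypothetical helper: waits on a Future off the calling thread and reports the outcome. */
class AsyncBridge {
    // Assumption: in a Cordova plugin, cordova.getThreadPool() would be used instead.
    private final ExecutorService pool = Executors.newCachedThreadPool();

    /** Blocks on the pending result on a worker thread, then invokes the callback. */
    Future<?> forward(Future<String> pending, ResultCallback callback) {
        return pool.submit(() -> {
            try {
                // get() blocks, so it must not run on the thread that called forward().
                callback.success(pending.get());
            } catch (Exception e) {
                callback.error(String.valueOf(e.getMessage()));
            }
        });
    }

    /** Releases the worker threads once no more work is expected. */
    void shutdown() {
        pool.shutdown();
    }
}
```

In a real plugin, `pending` would be the `Future` returned by something like `recognizeOnceAsync`, with the recognized text extracted from the result before calling `success`.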

rozele commented 4 years ago

A minor modification to the above: I actually recommend we do not use the Speech SDK for text-to-speech playback, because it does not support cancellation. Using the Speech SDK to fetch the audio data and then play it on something like AVAudioPlayer (iOS) or AudioTrack (Android) is not as efficient as piping the bytes directly from a REST call to Cognitive Services.

I believe we can efficiently use the Speech SDK to stream audio, if we leverage the PushAudioOutputStream and direct the bytes written to the output stream directly to the AVAudioPlayer / AudioTrack.

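// Android (Java). CustomPushAudioOutputStreamCallback is an app-defined subclass of PushAudioOutputStreamCallback.
SpeechConfig speechConfig = SpeechConfig.fromSubscription(...);
CustomPushAudioOutputStreamCallback callback = new CustomPushAudioOutputStreamCallback();
// The SDK pushes synthesized audio bytes to the callback through this stream.
PushAudioOutputStream outputStream = PushAudioOutputStream.create(callback);
AudioConfig audioConfig = AudioConfig.fromStreamOutput(outputStream);
SpeechSynthesizer synthesizer = new SpeechSynthesizer(speechConfig, audioConfig);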
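
The thread does not show what `CustomPushAudioOutputStreamCallback` contains. As a rough sketch only, a callback that forwards bytes to a player and supports cancellation could take the shape below. To keep it runnable without the SDK, `ForwardingAudioCallback` is a hypothetical plain class rather than a subclass of `PushAudioOutputStreamCallback`, and a `java.io.OutputStream` stands in for AudioTrack / AVAudioPlayer. The `write(byte[])` and `close()` methods mirror the shape of the SDK callback; `cancel()` is an addition for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical sketch of a push-stream callback that forwards audio bytes to a sink.
 * In the plugin this would extend the SDK's PushAudioOutputStreamCallback.
 */
class ForwardingAudioCallback {
    // Assumption: the sink stands in for the platform audio player.
    private final OutputStream sink;
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    ForwardingAudioCallback(OutputStream sink) {
        this.sink = sink;
    }

    /** Receives one chunk of audio and reports how many bytes were handled. */
    public int write(byte[] dataBuffer) {
        if (!cancelled.get()) {
            try {
                sink.write(dataBuffer);
            } catch (IOException e) {
                // Treat a failed sink as a cancellation so later chunks are dropped.
                cancelled.set(true);
            }
        }
        // Assumption: cancelled chunks are reported as handled so the producer keeps draining.
        return dataBuffer.length;
    }

    /** Called when no more audio will arrive. */
    public void close() {
        try {
            sink.close();
        } catch (IOException ignored) {
            // Nothing useful to do if the sink fails to close.
        }
    }

    /** Stops forwarding; chunks that arrive afterwards are silently discarded. */
    public void cancel() {
        cancelled.set(true);
    }
}
```

The design choice here is that cancellation happens on the playback side: synthesis may continue, but no further bytes reach the player once `cancel()` has been called.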