Closed: Mahesh5645 closed this issue 4 years ago.
Hi Mahesh,
Thanks for the note about adding `JSON.stringify` to the README; that's an oversight on my part, and the fix will help others in the future!
With regard to not getting an `onRecognize` event: I see there is some confusion about what each event means.
In your pipeline activation closure:

```javascript
_startRecognizing = async () => {
  console.log("Inside voice recognising")
  try {
    // Start and stop the speech pipeline. All methods can be called repeatedly.
    Spokestack.start(); // start speech pipeline. can only start after initialize is called.
    console.log("Log Spokestack.start(); ");
    Spokestack.activate();
```

you start the pipeline, which begins listening for a wakeword, and then immediately activate the pipeline, which stops listening for a wakeword and begins a streaming ASR request to Google's cloud speech-to-text API.
The debug traces you're seeing are `vad`, which means that voice activity detection has triggered, and then `activate`, which means that the ASR has been activated.
Assuming that you don't want wakeword-activated ASR, this is fine so far. If you instead want to avoid streaming ASR to Google until a wakeword is heard, then just `start`ing the pipeline is sufficient. For more information on the speech pipeline, this (Android-specific) doc might help: https://spokestack.io/docs/Android/speech-pipeline.
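To make the `start`/`activate` distinction concrete, here is a toy state machine in plain JavaScript. This is not the Spokestack library or its API, just a sketch of the lifecycle described above: `start` begins passive listening, `activate` opens a streaming ASR request, `deactivate` closes it while keeping passive listening alive.

```javascript
// Toy model of the speech pipeline lifecycle (NOT the Spokestack API).
// States: "stopped" -> "listening" (passive, wakeword/VAD only)
//                   -> "recognizing" (streaming ASR in flight)
class ToyPipeline {
  constructor() {
    this.state = "stopped";
  }
  start() {
    // Begin passive listening; no ASR traffic is sent yet.
    this.state = "listening";
  }
  activate() {
    // Open a streaming ASR request; normally triggered by a wakeword,
    // or called manually as in the snippet above.
    if (this.state === "listening") this.state = "recognizing";
  }
  deactivate() {
    // Close the ASR request but keep listening passively.
    if (this.state === "recognizing") this.state = "listening";
  }
  stop() {
    this.state = "stopped";
  }
}

const p = new ToyPipeline();
p.start();
console.log(p.state); // "listening" -- no ASR yet
p.activate();
console.log(p.state); // "recognizing" -- ASR streaming
p.deactivate();
console.log(p.state); // "listening" -- back to passive
```

The point of the model: calling `activate()` immediately after `start()` skips the passive phase entirely, which is why ASR begins streaming right away in the closure above.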
These lines are concerning:

```javascript
Spokestack.onSpeechRecognized = this.speechDetected;
Spokestack.onRecognize = e => {
  logEvent(e);
  console.log("onRecognize :: ", e.transcript); // "Hello Spokestack"
};
```
Spokestack does not emit an `onSpeechRecognized` event, so your `speechDetected` function will never be called. I think what you want is to rewrite the above lines into just:

```javascript
Spokestack.onRecognize = this.speechDetected;
```
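As a sanity check, here is a minimal mock in plain JavaScript (not the real native module) showing why a handler assigned under an event name the library never emits produces silence, while a supported handler fires:

```javascript
// Minimal event mock. A native module only invokes the handlers it
// documents; a handler assigned under any other name simply sits unused.
const MockSpokestack = {
  onRecognize: null,
  onSpeechRecognized: null, // assigned by the app, but never emitted
  _emitRecognize(transcript) {
    if (typeof this.onRecognize === "function") {
      this.onRecognize({ transcript });
    }
    // Nothing in the module ever calls onSpeechRecognized.
  },
};

let heard = null;
MockSpokestack.onSpeechRecognized = () => { heard = "wrong handler"; };
MockSpokestack.onRecognize = (e) => { heard = e.transcript; };
MockSpokestack._emitRecognize("Hello Spokestack");
console.log(heard); // "Hello Spokestack" -- only onRecognize fired
```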
Finally, I see from your pasted Google API request log that your latency is extremely high. It should be in the millisecond range, not the minute range. Perhaps network conditions are interfering with the Google cloud speech-to-text API requests? There was a similar issue with long-latency requests: https://github.com/spokestack/react-native-spokestack/issues/52
Hello @noelweichbrodt ,
Thanks for your earnest reply. I apologize for the typo, which I copied from the example at https://github.com/rtmalone/spokestack-example/blob/master/App.js.
I have replaced the code as you suggested, and below is the code I am using now. I have also added a stop function, because the trace was running forever.
```javascript
import React, { Component } from 'react';
import { StyleSheet, Text, View, Image, TouchableOpacity, ImageBackground, TouchableHighlight, Platform } from 'react-native';
import Spokestack from "react-native-spokestack";

class VoiceTest extends Component {
  state = {
    spoken: "",
    recording: false,
    message: null
  };

  constructor(props) {
    super(props);
    const logEvent = e => console.log("Log is ::", e);
    Spokestack.onRecognize = e => {
      logEvent(e);
      console.log("onRecognize :: ", e.transcript); // "Hello Spokestack"
    };
    Spokestack.onError = e => {
      Spokestack.deactivate();
      Spokestack.stop();
      logEvent("onError " + e);
    };
    Spokestack.onTrace = e => { // subscribe to tracing events according to the trace-level property
      logEvent(e);
      if (e.message == "vad: true") {
        console.log("Activating speech");
        Spokestack.activate();
      }
      console.log(e.message);
    };
  }

  _startRecognizing = async () => {
    console.log("Inside voice recognising");
    try {
      // Start and stop the speech pipeline. All methods can be called repeatedly.
      Spokestack.start(); // start speech pipeline. can only start after initialize is called.
      console.log("Log Spokestack.start(); ");
    } catch (e) {
      console.log("error", e);
    }
  };

  stopAudio() {
    Spokestack.deactivate();
    Spokestack.stop();
  }

  startAudio() {
    if (Spokestack && Platform.OS === "android") {
      console.log("inside component");
      Spokestack.initialize({
        input: "com.pylon.spokestack.android.MicrophoneInput", // required, provides audio input into the stages
        stages: [
          "com.pylon.spokestack.webrtc.VoiceActivityDetector", // voice activity detection. necessary to trigger speech recognition.
          "com.pylon.spokestack.google.GoogleSpeechRecognizer" // one of the two supplied speech recognition services
          // 'com.pylon.spokestack.microsoft.BingSpeechRecognizer'
        ],
        properties: {
          "vad-mode": "aggressive",
          "vad-rise-delay": 30,
          "vad-fall-delay": 40,
          "sample-rate": 16000,
          "frame-width": 20,
          "buffer-width": 20,
          "locale": "en-US",
          // note: the credentials file must actually be loaded (e.g. via require)
          // before stringifying; a bare `google-credentials.json` is not valid JS
          "google-credentials": JSON.stringify(require("./google-credentials.json")), // Android-supported api
          // "google-api-key": "", // iOS-supported google api
          // 'bing-speech-api-key': YOUR_BING_VOICE_CREDENTIALS,
          "trace-level": Spokestack.TraceLevel.DEBUG
        }
      });
    }
  }

  render() {
    const { recording, message, spoken } = this.state;
    console.log("state is ", this.state);
    return (
      <ImageBackground
        resizeMode={'cover'} // or cover
        style={{ flex: 1, justifyContent: "center", alignItems: "center" }} // must be passed from the parent, the number may vary depending upon your screen size
        source={require('../../../assets/images/meditate.jpg')}
      >
        {/* the second argument to setState must be a function, so the
            start/stop call is wrapped in an arrow rather than invoked inline */}
        <TouchableOpacity style={styles.MessageBox} onPress={() => { this.setState({ recording: !recording }, () => (recording ? this.stopAudio() : this.startAudio())) }} >
          <Text style={{ paddingLeft: 10, paddingBottom: 10, marginBottom: 10 }}>Please click to Start Voice initialise</Text>
        </TouchableOpacity>
        {recording && <TouchableOpacity style={styles.MessageBox} onPress={() => { this._startRecognizing() }} >
          <Text style={{ paddingLeft: 10, paddingBottom: 10, marginBottom: 10 }}>Say "Start" to start listening</Text>
        </TouchableOpacity>}
        <View style={[styles.MessageBox, { height: 300, justifyContent: "center", alignContent: "center", alignItems: "center" }]} >
          {recording && <Text>Heard: "{spoken}"</Text>}
          {message && <Text style={styles.message}>{message}</Text>}
        </View>
      </ImageBackground>
    );
  }
}

const styles = StyleSheet.create({
  button: {
    width: 50,
    height: 50,
  },
  container: {
    flex: 1,
    justifyContent: 'center',
    alignItems: 'center',
    backgroundColor: '#F5FCFF',
  },
  welcome: {
    fontSize: 20,
    textAlign: 'center',
    margin: 10,
  },
  action: {
    textAlign: 'center',
    color: '#0000FF',
    marginVertical: 5,
    fontWeight: 'bold',
  },
  instructions: {
    textAlign: 'center',
    color: '#333333',
    marginBottom: 5,
  },
  stat: {
    textAlign: 'center',
    color: '#B0171F',
    marginBottom: 1,
  },
  MessageBox: {
    backgroundColor: "rgba(255,255,255,0.7)",
    minWidth: '90%',
    paddingTop: 10,
    // height: 100,
    margin: 20,
    paddingBottom: 15,
    borderBottomLeftRadius: 10,
    // borderBottomRightRadius: number
    borderTopLeftRadius: 10,
    borderTopRightRadius: 10,
    shadowColor: "#000",
    shadowOffset: {
      width: 0,
      height: 4,
    },
    justifyContent: "center",
    alignItems: 'center',
    shadowOpacity: 0.30,
    shadowRadius: 4.65,
    elevation: 8,
    marginBottom: 10,
  },
});

VoiceTest.navigationOptions = ({ navigation }) => {
  return {
    header: null,
  };
};

export default VoiceTest;
```
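An aside on one line in the component above: passing `recording ? this.stopAudio() : this.startAudio()` directly as the second argument to `setState` invokes the method during argument evaluation and hands `setState` its return value (`undefined`) as the callback. The sketch below uses a hypothetical `fakeSetState` stand-in (not React itself) to show the ordering difference:

```javascript
// Stand-in for React's setState(update, callback), used only to show
// why `setState(update, fn())` differs from `setState(update, () => fn())`.
const order = [];

function fakeSetState(update, callback) {
  order.push("state-updated"); // pretend the state update happens here
  if (typeof callback === "function") callback();
}

function startAudio() {
  order.push("startAudio");
}

// Buggy form: startAudio() runs BEFORE the state update, and the
// "callback" fakeSetState receives is undefined.
fakeSetState({ recording: true }, startAudio());

// Correct form: the arrow runs AFTER the state update, as a real callback.
fakeSetState({ recording: true }, () => startAudio());

console.log(order);
// ["startAudio", "state-updated", "state-updated", "startAudio"]
```

The same fix (wrapping the call in an arrow function) is what the `onPress` handler in the component needs.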
After making the above changes, `Spokestack.onRecognize` still does not detect anything I say. It still gives me the trace logs mentioned above. In the trace handler I have kept a condition to activate listening when VAD is true.
Sorry to bother you so much, as I am new to this IT world. I want to build an app with a feature like this: the app will listen to what I am saying, but will activate/start an event only when it detects a wake word. For example: I will start the app and keep it on while someone speaks, but when I say "Party time" it will start playing songs.
I hope I am not troubling you much, and request you to kindly take some of your valuable time to help me out with this.
Currently my code is not able to listen/detect/recognize anything I am speaking; there is no transcript I am receiving from my speech.
Hello @noelweichbrodt, hope I am not bothering you too much with my questions and requirements. I have gone through the Spokestack documentation and found that the wakeup word is "Spokestack", which will get recognized and will activate. Am I correct on this? Request you to kindly help me out with my app; I am stuck at this point.
Hi Mahesh,
> the wakeup word is "Spokestack" which will get recognized and will activate. Am I correct on this?
That is correct. To create a custom wakeword model (such as "Party time"), get in touch for an estimate by using the "Talk to us" button on https://spokestack.io. If you do have the machine learning background, https://spokestack.io/docs/Concepts/wakeword-models provides the specifications for creating your own model that can be dropped into Spokestack.
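Until a custom wakeword model is available, one interim approach is to keep ASR active and scan each recognized transcript for your trigger phrase in application code. The helper below is a hypothetical sketch, not a Spokestack feature; note that a real on-device wakeword model is preferable, since this approach streams audio to the cloud continuously.

```javascript
// Hypothetical helper: check an ASR transcript for a trigger phrase.
// This is an application-level workaround sketch, not part of Spokestack.
function containsTrigger(transcript, phrase) {
  return transcript.toLowerCase().includes(phrase.toLowerCase());
}

// Sketch of an onRecognize-style handler that fires an app event
// (e.g. start playing songs) when the phrase is heard.
function handleRecognize(e, onTrigger) {
  if (containsTrigger(e.transcript, "party time")) {
    onTrigger();
  }
}

let triggered = false;
handleRecognize({ transcript: "okay Party Time everyone" }, () => { triggered = true; });
console.log(triggered); // true
```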
> After making the above changes still Spokestack.onRecognize is not detecting anything I am speaking.
Given what you've mentioned so far, the Google Speech API latency indicates that Spokestack is sending speech successfully but never getting a response from Google. That would be the first place to begin troubleshooting why you never receive an `onRecognize` event.
Hope this helps!
Hello @noelweichbrodt ,
Thanks for your help, really appreciated. I will try the ML model for the wake word, and if I run into any problems I will reopen this issue.
Hello,
Really appreciate this plugin now that I have studied it (I am new to this IT world, so I might sound childish; please bear with me). My environment is:
I was looking for something like this (streaming audio & wake word detection) for my app. (Tip: please edit the README to replace google-credential with JSON.stringify(google-credential.json), as it took me a long time to realise this.)
I want to build an app where audio will be streaming for an hour or so, and if it catches a wake word like "App Listen this" then it starts an event. The problem I am facing is that after configuring everything and putting the code in place, the Start & Activate commands work, but there is no response after that. I am getting the below log for the onActivated event:
```
{isActive: true, error: "", message: null, transcript: "", event: "ACTIVATE"}
```
I am getting the below log for the onTrace event. I don't know how to proceed further so that the app detects my wake word. Please help me achieve my goal; I will really appreciate your efforts.
I have also followed the example given in issue #14: https://github.com/rtmalone/spokestack-example/blob/master/App.js
Below is my voiceTest code.
Below is my android/app/build.gradle file:
My android/build.gradle is as below:
I don't see where I went wrong, even after following everything correctly. I have enabled the Cloud Speech-to-Text API in my Google project console too. After resolving all errors I am stuck with no recognition, though the API console shows requests are going through:
| Method | Requests | Errors | Avg latency | Latency |
| --- | --- | --- | --- | --- |
| google.cloud.speech.v1.Speech.StreamingRecognize | 26 | 57.69% | 2 minutes | 8 minutes |
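For what it's worth, that error percentage is consistent with 15 of the 26 StreamingRecognize requests failing, a quick arithmetic check in plain JavaScript:

```javascript
// Sanity-check the API console numbers: 26 requests at a 57.69%
// error rate corresponds to 15 failed requests.
const requests = 26;
const errorRate = 57.69 / 100;

const failed = Math.round(requests * errorRate);
console.log(failed); // 15
console.log(((failed / requests) * 100).toFixed(2)); // "57.69"
```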
Any help with this is really appreciated, and I am looking forward to an early reply. In the meantime I am going through the Java code of "com.pylon.spokestack".