TalAter / annyang

💬 Speech recognition for your site
https://www.talater.com/annyang/
MIT License

Can annyang be used to output text dictation [aka "what the user said"] onto the page in addition to voice commands? #293

Closed popbijoux closed 7 years ago

popbijoux commented 7 years ago

GOAL: Add text dictation (aka "what the user said") to a div, in addition to the annyang commands I already use.
I tried using the webkit speech recognition API in addition to annyang, but it did not work. Maybe there is a way I can accomplish this easily with annyang?

CURRENTLY WORKING, PART 1: annyang works perfectly until I try adding dictation. I have a separate page with dozens of annyang commands, which works perfectly. Below is a small sample. [SEE BELOW]

if (annyang) {
  // Let's define our first command. First the text we expect, and then the function it should call
  var commands = {
    'block (the wall)': function() {
      $("#blockTheWall_001").show().delay(4000).fadeOut();
    }
  };

  annyang.addCommands(commands);
  annyang.start();
}

CURRENTLY WORKING, PART 2: the webkit speech recognition API outputting text dictation. I have also tested a page that outputs simple dictation only (no voice commands or annyang), which also works perfectly. [SEE BELOW]

//javascript voice to text API  

var recognition = new webkitSpeechRecognition(); //get new instance
recognition.start(); //start it
recognition.onend = function() { //a function to restart it when it stops
    recognition.start();
}
recognition.onresult = function(event) { 
    var whatWasHeard = event.results[0][0].transcript; //get what was heard    
    //document.body.innerHTML = whatWasHeard; //original version. Update below
    document.getElementById("voiceToText").innerHTML = whatWasHeard; //updated version
};

HTML

<div>

   <form>
     <input id="voiceToText" type="text" name="lastname">
   </form>

 </div>

COMBINING THE TWO DOES NOT WORK, UNFORTUNATELY: I love annyang, but I do need the dictated text to show on the page for a demo I am putting together. Combining them into one page does NOT work. [SEE BELOW]

//javascript voice to text API  

var recognition = new webkitSpeechRecognition(); //get new instance
recognition.start(); //start it
recognition.onend = function() { //a function to restart it when it stops
    recognition.start();
}
recognition.onresult = function(event) { 
    var whatWasHeard = event.results[0][0].transcript; //get what was heard    
    //document.body.innerHTML = whatWasHeard; //original version. Update below
    document.getElementById("voiceToText").innerHTML = whatWasHeard; //updated version
};

//annyang start
if (annyang) {
  // Let's define our first command. First the text we expect, and then the function it should call
  var commands = {
    'block (the wall)': function() {
      $("#blockTheWall_001").show().delay(4000).fadeOut();
    }
  };
  // Add our commands to annyang
  annyang.addCommands(commands);
  // Start listening. You can call this here, or attach this call to an event, button, etc.
  annyang.start();
}

All I want is a div that shows the output of dictated text. annyang and the webkit API work separately, but when I try to combine the code, neither of them works. I love annyang and would love to keep using annyang for commands and webkit for dictation on the same page, if this is at all possible.

If anyone knows a way to work around it, I'd be super grateful:)

UPDATE: Just wondering if I could accomplish what I am after using annyang alone? Via callbacks? I will test and come back here to report if this works.

annyang.addCallback('result', function(userSaid) {
  document.getElementById("voiceToText").innerHTML = userSaid;
});

BetaStacks commented 7 years ago

Yes it can. Do you have a link to published versions of the code you posted?

I'm away from my code at the moment, so I can't look at exactly what I did. If you want to look at the source code of my project, it is at https://maia.site (creating an account will give you access to the demo). All available commands are in the console.

Also, what browser and operating system are you using?

BetaStacks commented 7 years ago

One of my commands is "repeat *splat", so you can make it say anything you want.

popbijoux commented 7 years ago

Thank you :) I will try it tomorrow, re-post the page on GitHub, and add a link here. My problem is that I am not sure how to write the text into the HTML (I want to output it to the element with the "voiceToText" ID below). It doesn't have to be a text input box; any div would work.

<div>

   <form>
     <input id="voiceToText" type="text" name="lastname">
   </form>

 </div>

I am not sure what I should write with annyang. It's probably something simple. I am a relative beginner, so I will make simple mistakes :). It could also be the CSS; I will double-check everything to make sure.

(What I am looking for is the annyang equivalent of this regular Web Speech API code.)

recognition.onresult = function(event) { 
    var whatWasHeard = event.results[0][0].transcript; //get what was heard    
    document.getElementById("voiceToText").innerHTML = whatWasHeard; //updated version
};

BetaStacks commented 7 years ago

Gotcha, sorry, I thought you wanted to trigger speech synthesis. The demo on my site has a "Log" that outputs what was said when a command is matched. I've been meaning to add output for when a command isn't recognized.

So do you want to show only the most recently heard speech, or do you want to keep an ongoing list?

TalAter commented 7 years ago

The browser does not support creating more than one instance of webkitSpeechRecognition on the same page.

Have you tried adding a result callback to annyang to get the raw text of what was said?

annyang.addCallback('result', function(whatWasHeardArray) {
  console.log(whatWasHeardArray)
});

You can read more about these callbacks in the API docs.

P.S. - I am closing the issue because I believe this should help. Please feel free to reopen it if it does not.

popbijoux commented 7 years ago

Hi, thank you so much. I managed to make the text show, BUT the text is displayed as an array, as explained in your FAQ: "This event will fire with a list of possible phrases the user may have said, regardless of whether any of them matched an annyang command or not."

This is the code:

annyang.addCallback('result',function(whatWasHeard) {
document.getElementById("voiceToText").innerHTML = whatWasHeard;
});

It displays both the actual matched command and other possible phrases.
Is there a way to display everything the user said, NOT in an array, regardless of whether it is a match or not?

For example, if I say the word "toast", I get: first, toast, 1st, post.

When I say "I had toast for breakfast", I get "ask for breakfast", "can I ask for breakfast", "contact for breakfast" as a result, which is not what I said :(

All I want is this: if I say "I had toast for breakfast", I get exactly the same as a result: "I had toast for breakfast".

I tried using splats and such but it did not work for me:(

Thank you again!

TalAter commented 7 years ago

Yes, the result callback returns an array of the things the speech recognition thinks you may have said (for example, when you say "toast" it might be 60% confident you said "toast", and 30% confident you said "first").

If you want to return just the most likely recognition, just output the first element in the array.

annyang.addCallback('result',function(whatWasHeard) {
  document.getElementById("voiceToText").innerHTML = whatWasHeard[0];
});
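
Putting the pieces of this thread together, here is a minimal sketch (not the maintainer's code) of running commands and dictation through annyang alone, so that only one recognition instance is in use. `bestGuess` and `showDictation` are hypothetical helper names introduced for illustration. Because `voiceToText` is an `<input>` in the HTML posted earlier, the sketch writes to its `value` property; assigning `innerHTML` on an input does not change the text it displays.

```javascript
// Hypothetical helper: pick the most likely phrase from the array
// that annyang passes to 'result' callbacks (first entry = best guess).
function bestGuess(phrases) {
  return Array.isArray(phrases) && phrases.length > 0 ? phrases[0].trim() : '';
}

// Hypothetical helper: show text in either a form field or a plain element.
function showDictation(element, text) {
  if ('value' in element) {
    element.value = text;        // form fields such as <input> display their value
  } else {
    element.textContent = text;  // divs, spans, and similar elements
  }
}

// Browser-only wiring; skipped when annyang is not loaded.
if (typeof annyang !== 'undefined' && annyang) {
  // Commands keep working exactly as before.
  annyang.addCommands({
    'block (the wall)': function () {
      $('#blockTheWall_001').show().delay(4000).fadeOut();
    }
  });

  // Dictation: fires for every recognized utterance, matched or not.
  annyang.addCallback('result', function (phrases) {
    showDictation(document.getElementById('voiceToText'), bestGuess(phrases));
  });

  annyang.start();
}
```

With this in place, each utterance should put the recognizer's top candidate into the input, whether or not it also matched a command.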

uc-ndh commented 7 years ago

After capturing what was spoken, how do I call a function based on it? I tried this, but it is not working:

annyang.addCallback('result', function(whatWasHeard) {
  var str = whatWasHeard[0];
});

if (annyang) {
  var commands = {
    str: function() {
      // function to perform
    }
  };
}
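
Two details of plain JavaScript explain why the attempt above does nothing: `str` is a local variable that exists only inside the callback, and the object literal `{ str: ... }` defines a command literally named "str" rather than using the variable's value. A minimal sketch of one workaround, assuming a fixed table of phrases (the `actions` table and `runFirstMatch` are hypothetical names, not part of annyang): look up each candidate phrase inside the `result` callback and call the matching function.

```javascript
// Hypothetical lookup table: spoken phrase -> function to run.
var actions = {
  'hello': function () { return 'greeted'; },
  'stop': function () { return 'stopped'; }
};

// Walk the candidate phrases in order and run the first one found in the table.
// Returns the action's result, or null if no candidate matched.
function runFirstMatch(phrases, table) {
  if (!Array.isArray(phrases)) return null;
  for (var i = 0; i < phrases.length; i++) {
    var key = phrases[i].trim().toLowerCase();
    if (Object.prototype.hasOwnProperty.call(table, key)) {
      return table[key]();
    }
  }
  return null;
}

// Browser-only wiring; skipped when annyang is not loaded.
if (typeof annyang !== 'undefined' && annyang) {
  annyang.addCallback('result', function (whatWasHeard) {
    runFirstMatch(whatWasHeard, actions);
  });
  annyang.start();
}
```

For phrases known in advance, registering them with `annyang.addCommands`, as in the first post of this thread, achieves the same thing without a custom lookup.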

SachinDev108 commented 6 years ago

@TalAter It returns undefined inside the console.log:

annyang.addCallback('result', function(whatWasHeard) {
  console.log(whatWasHeard);
});