JamesBrill / react-speech-recognition

💬Speech recognition for your React app
https://webspeechrecognition.com/
MIT License
645 stars, 116 forks

Adding prefixes to command array #109

Closed Nitzahon closed 2 years ago

Nitzahon commented 3 years ago

I'm trying to allow users to answer survey questions with speech recognition, but currently, I can only get the command to work as intended if every question has unique choices (the choices are sent as an array for speech recognition).

    {
      command: [...commandArr],
      callback: (command) => videoCommandCallback(command),
      isFuzzyMatch: true,
      fuzzyMatchingThreshold: 0.7,
      bestMatchOnly: true
    },

However, I want to let the user choose a question by number, or by some other identifier, as part of the same spoken command. Something like this, perhaps:

    {
      command: 'question :number' [...commands],
      callback: (command, number) => videoCommandCallback(number, command),
      isFuzzyMatch: true,
      fuzzyMatchingThreshold: 0.7,
      bestMatchOnly: true
    },

I don't think this is currently possible, so I'm trying to think of other solutions. I thought about mapping a prefix onto each element of the array, but that would require my code to deconstruct the prefix later, and I assume commands in an array can't contain splats or named variables?

    const array = [...commands];
    const prefix = 'question :number '
    // Prepend the prefix to every command phrase
    const prefixArray = (array, prefix) => array.map(e => prefix + e);

    {
      command: prefixArray(array, prefix),
      callback: (command) => prefixedVideoCommandCallback(command),
      isFuzzyMatch: true,
      fuzzyMatchingThreshold: 0.7,
      bestMatchOnly: true
    },

Another idea I have, probably the one with the most promise, is chaining commands:

    {
      command: 'question :number *',
      callback: (number, command) => sendAnsComplex(number,command),
      isFuzzyMatch: true,
      fuzzyMatchingThreshold: 0.7,
      bestMatchOnly: true
    },
    function sendAnsComplex(number, command) {
      setActiveQuestion(number);
      sendCommandBackTroughSpeechRecognitionSomehow(command);
    }

The point of sendCommandBackTroughSpeechRecognitionSomehow is to have speech recognition check the remaining text against the original command:

    {
      command: [...commandArr],
      callback: (command) => videoCommandCallback(command),
      isFuzzyMatch: true,
      fuzzyMatchingThreshold: 0.7,
      bestMatchOnly: true
    },

Do any of these options have any hope of succeeding? I want everything to be done with one voice command if possible. I could make a command 'question :number' that sets an active question in state, but then the user would need to wait for the first command to take effect before trying the second.

JamesBrill commented 2 years ago

Hi again @Nitzahon. You're right, this use case isn't currently possible due to the limitations of combining splats and fuzzy matching (which I assume you need to reliably match your commands). I'll see what I can come up with.

In the meantime, you'll need to have separate commands for selecting questions and processing answers. From a user experience perspective, this might not be the end of the world: (a) if I say "question one", it doesn't take long for the Speech Recognition engine to produce a final result and be ready to process the answer, and (b) it's probably natural for the user to pause between selecting questions and giving their answers. You could show some visual indicator in the brief pause where the question is being selected and answers cannot be processed yet.
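A minimal sketch of the two-command approach described above. The helper `buildSurveyCommands` and the `choices`, `onSelectQuestion`, and `onAnswer` names are hypothetical, and the active question is kept in a closure purely for illustration; in a React component it would more likely live in state or a ref. The returned array is meant to be passed as `commands` to `useSpeechRecognition`.

```javascript
// Hypothetical helper: builds the list handed to useSpeechRecognition({ commands }).
// None of these names come from the library; they are illustrative only.
function buildSurveyCommands({ choices, onSelectQuestion, onAnswer }) {
  let activeQuestion = null // set by the first command, read by the second

  return [
    {
      // Non-fuzzy command with a named variable
      command: 'question :number',
      callback: (number) => {
        activeQuestion = number
        onSelectQuestion(number) // e.g. show the visual indicator here
      }
    },
    {
      // Fuzzy command matched against the answer choices
      command: choices,
      callback: (matchedChoice) => {
        if (activeQuestion === null) return // no question selected yet
        onAnswer(activeQuestion, matchedChoice)
      },
      isFuzzyMatch: true,
      fuzzyMatchingThreshold: 0.7,
      bestMatchOnly: true
    }
  ]
}
```

With this shape, an answer spoken before any question has been selected is simply ignored, which is one way to handle the brief pause mentioned above.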

Be warned, transcribing numbers is not particularly reliable - the engine will arbitrarily switch between "4", "four", and "for" (if you're unlucky). It is smart enough to guess that the user is saying "four" when it's preceded by the word "question", so that might mean this isn't an issue. From my little test just now, letters seemed a bit more predictable - i.e. "question a", "question b", etc.
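If the number still comes through inconsistently, one option is to normalize whatever the named variable captured before using it. The sketch below is an assumption about how that could look: `normalizeQuestionNumber` is a hypothetical helper, and the word list (including homophones other than "for") is illustrative and would need extending for a real survey.

```javascript
// Illustrative mapping from transcribed tokens to question numbers.
// Only "4", "four" and "for" come from the discussion; the rest are assumed.
const SPOKEN_NUMBERS = new Map([
  ['one', 1], ['won', 1],
  ['two', 2], ['to', 2], ['too', 2],
  ['three', 3],
  ['four', 4], ['for', 4],
  ['five', 5]
])

// Returns a number, or null if the token cannot be interpreted.
function normalizeQuestionNumber(raw) {
  const token = String(raw).trim().toLowerCase()
  if (/^\d+$/.test(token)) return Number(token) // already digits, e.g. "4"
  return SPOKEN_NUMBERS.has(token) ? SPOKEN_NUMBERS.get(token) : null
}
```

The callback for 'question :number' could then call `normalizeQuestionNumber(number)` and ignore the command when the result is null.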

JamesBrill commented 2 years ago

Another short-term solution while I figure this out is for you to use the non-fuzzy command 'question :number *' and do the fuzzy matching yourself on the *. The fuzzy matching logic ultimately comes from here, which you can reuse.
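A rough sketch of that workaround, assuming a Sørensen-Dice similarity over character bigrams; the implementation linked above may differ in its details. `diceSimilarity`, `bestFuzzyMatch`, `answerChoices`, and `questionCommand` are hypothetical names, and the answer choices are example values.

```javascript
// Sørensen-Dice similarity over character bigrams (0 = no overlap, 1 = identical).
function diceSimilarity(a, b) {
  const first = a.replace(/\s+/g, '').toLowerCase()
  const second = b.replace(/\s+/g, '').toLowerCase()
  if (first === second) return 1
  if (first.length < 2 || second.length < 2) return 0

  // Count the bigrams of the first string
  const bigrams = new Map()
  for (let i = 0; i < first.length - 1; i++) {
    const bigram = first.slice(i, i + 2)
    bigrams.set(bigram, (bigrams.get(bigram) || 0) + 1)
  }

  // Count how many bigrams of the second string also occur in the first
  let overlap = 0
  for (let i = 0; i < second.length - 1; i++) {
    const bigram = second.slice(i, i + 2)
    const count = bigrams.get(bigram) || 0
    if (count > 0) {
      bigrams.set(bigram, count - 1)
      overlap++
    }
  }
  return (2 * overlap) / (first.length + second.length - 2)
}

// Returns the closest choice at or above the threshold, or null if none qualifies.
function bestFuzzyMatch(spoken, choices, threshold = 0.7) {
  let best = null
  for (const choice of choices) {
    const similarity = diceSimilarity(spoken, choice)
    if (similarity >= threshold && (best === null || similarity > best.similarity)) {
      best = { choice, similarity }
    }
  }
  return best
}

const answerChoices = ['strongly agree', 'agree', 'disagree', 'strongly disagree']

// Non-fuzzy command; the splat is fuzzy matched by hand in the callback.
const questionCommand = {
  command: 'question :number *',
  callback: (number, spokenAnswer) => {
    const match = bestFuzzyMatch(spokenAnswer, answerChoices)
    if (match) console.log(`Question: ${number}, Answer: ${match.choice}`)
  }
}
```

Keeping only the single best match mirrors what bestMatchOnly does for a fuzzy command array.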

JamesBrill commented 2 years ago

I've published a prerelease version 3.9.0-rc1 for you to try out. Be warned, this is experimental and only manually tested. I've not decided if this is a good idea yet, so this is subject to change!

This version allows you to specify a fuzzy matcher on part of a non-fuzzy command. The part of the command to which the fuzzy matcher applies is marked by a special command symbol I'm calling a "fuzzy splat". These fuzzy splats are surrounded by angle brackets and look a bit like this: <answer>. Here, "answer" is the name of the fuzzy splat. For each fuzzy splat, you must provide a fuzzy matching function with the same name. Fuzzy matchers are specified in a new command option called fuzzyMatchers. This object has a key for the name of each fuzzy splat, mapped to the phrases to match against, very similar to a normal fuzzy command.

Fuzzy splats function like normal splats, but have a fuzzy matching function applied to them after the value of the splat has been computed. If you specify any of these fuzzy splats, the command callback will only trigger if all of them get matched within the given thresholds. The fuzzy matching functions look a bit like fuzzy commands, though if you provide an array, bestMatchOnly is always applied.

The results of the fuzzy matching functions are returned to the callback as a field in the final argument called fuzzyResults. Like fuzzyMatchers, this object has a key for the name of each fuzzy splat, mapped to the result.

Best I provide an example, in the context of your use case:

    {
      command: 'question :number <answer>',
      callback: (number, transcribedAnswer, { fuzzyResults }) => {
        const { answer } = fuzzyResults
        console.log(`Question: ${number}, Answer: ${answer.command}`)
      },
      fuzzyMatchers: {
        answer: {
          command: ['strongly agree', 'agree', 'disagree', 'strongly disagree'],
          fuzzyMatchingThreshold: 0.7,
        }
      }
    }

Let me know how you get on with this. If it's useful for you, I'll consider adding it to the library.

JamesBrill commented 2 years ago

Closed due to inactivity.