Add keywords dynamically to MRTK speech commands

johntran-git commented 4 years ago

Hello, I'm trying to play a game of chess using voice commands and realised I'd need to create hundreds of commands, one for each possible move. An example of a keyword I'm using right now is "Pawn to A 4". Is there a way to reduce the number of commands I'd need? One possible solution I've thought of is having the speech command be just "Move Pawn", then catching the last part of the phrase ("to A 4") and passing it to a script, but I'm not sure how to do this.

Any ideas?

Unity 2018.4.7f1 MRTK 2.0

Troy-Ferrell commented 4 years ago

Our documentation for handling speech is here: https://microsoft.github.io/MixedRealityToolkit-Unity/Documentation/Input/Speech.html

Unfortunately, MRTK's voice providers and infrastructure use pre-configured voice commands from the speech profile, so it would be quite tedious to set up every keyword permutation you want.

Under the covers, the speech data provider uses Unity's KeywordRecognizer class: https://docs.unity3d.com/ScriptReference/Windows.Speech.KeywordRecognizer.html

According to Unity's docs, you can create multiple KeywordRecognizer instances: "There can be many keyword recognizers active at any given time, but no two keyword recognizers may be listening for the same keyword."

The easiest solution might be to create a custom class that instantiates a KeywordRecognizer with a dynamically created list. You would write code to take a list of all pieces {"Pawn", "Knight"...} and all spaces {"A1", "A2" ...whatever} and then create the full keyword strings, such as "Pawn to A 4", for every combination. You provide this keyword list to your KeywordRecognizer when initializing it.

Then you respond to the event callback and parse the recognized keyword string to find out which piece to move and where.

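private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
{
    // e.g. "Pawn to A 4" splits into "Pawn" and "A 4"
    string[] parts = args.text.Split(new[] { " to " }, System.StringSplitOptions.None);
    string piece = parts[0];
    string square = parts[1].Replace(" ", "");
}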
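A minimal sketch of the keyword-list idea above, kept free of Unity types so it can be tried outside the editor. The class name `ChessKeywords`, the word lists, and the "<piece> to <file> <rank>" phrase shape are illustrative assumptions based on the "Pawn to A 4" example, not anything MRTK provides.

```csharp
using System;
using System.Collections.Generic;

public static class ChessKeywords
{
    // Hypothetical word lists; adjust them to the phrases you want to support.
    public static readonly string[] Pieces = { "Pawn", "Knight", "Bishop", "Rook", "Queen", "King" };
    public static readonly string[] Files = { "A", "B", "C", "D", "E", "F", "G", "H" };

    // Builds every "<piece> to <file> <rank>" phrase: 6 * 8 * 8 = 384 keywords.
    public static string[] BuildKeywords()
    {
        var keywords = new List<string>();
        foreach (string piece in Pieces)
            foreach (string file in Files)
                for (int rank = 1; rank <= 8; rank++)
                    keywords.Add($"{piece} to {file} {rank}");
        return keywords.ToArray();
    }

    // Splits a recognized phrase back into piece and square,
    // e.g. "Pawn to A 4" gives piece "Pawn" and square "A4".
    public static bool TryParse(string phrase, out string piece, out string square)
    {
        piece = null;
        square = null;
        string[] parts = phrase.Split(new[] { " to " }, StringSplitOptions.None);
        if (parts.Length != 2) return false;
        piece = parts[0].Trim();
        square = parts[1].Replace(" ", "");
        return piece.Length > 0 && square.Length == 2;
    }
}
```

In Unity, the array from `BuildKeywords()` would be passed to `new KeywordRecognizer(keywords)`, followed by subscribing to `OnPhraseRecognized` and calling `Start()`; the handler can then call `TryParse(args.text, out piece, out square)`.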

keveleigh commented 4 years ago

I think the best course of action would be using a GrammarRecognizer. We don't currently have support for it in MRTK, but that'd be a good feature addition!

It allows you to define an SRGS XML file with speech recognition rules. Holograms 212 has an example of this, with an SRGS file defining colors and shapes. It allows you to go through each SemanticMeaning in the recognized grammar and parse out the individual recognitions.

In this case, I'd think you'd define a rule with a list of chess piece names and a rule with a list of chess board spot names (or maybe two, one for letters and one for numbers, to help cut down on how many list items you have to type out). Then, your root rule would have something like

<item>
    <one-of>
        <item>move</item>
        <item/>
    </one-of>
    <ruleref uri="#chess_piece"/>
    to
    <ruleref uri="#board_position"/>
</item>

(I'm not positive that'll be valid SRGS, but something similar, at least!)
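On the C# side, reading the result could look roughly like the sketch below. It assumes the SRGS rules emit semantic keys named "piece" and "position" through their tag elements, and that the grammar is saved as `ChessGrammar.xml` under StreamingAssets; both names are hypothetical, as are the `ChessMove` and `ChessGrammarListener` classes. The Unity part is wrapped in platform defines because `UnityEngine.Windows.Speech` targets Windows platforms.

```csharp
using System.Collections.Generic;
#if UNITY_WSA || UNITY_STANDALONE_WIN || UNITY_EDITOR_WIN
using System.IO;
using UnityEngine;
using UnityEngine.Windows.Speech;
#endif

public static class ChessMove
{
    // Turns semantic key/value pairs into a readable move.
    // The keys "piece" and "position" are assumptions; they must match
    // whatever the tags in your SRGS file actually emit.
    public static string Describe(IDictionary<string, string> meanings)
    {
        string piece, position;
        if (meanings.TryGetValue("piece", out piece) &&
            meanings.TryGetValue("position", out position))
            return piece + " -> " + position;
        return null; // the expected keys were missing
    }
}

#if UNITY_WSA || UNITY_STANDALONE_WIN || UNITY_EDITOR_WIN
public class ChessGrammarListener : MonoBehaviour
{
    private GrammarRecognizer recognizer;

    private void Start()
    {
        // Hypothetical file name; the grammar must be readable at this path.
        string path = Path.Combine(Application.streamingAssetsPath, "ChessGrammar.xml");
        recognizer = new GrammarRecognizer(path);
        recognizer.OnPhraseRecognized += OnPhraseRecognized;
        recognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // Keep the first value reported for each semantic key.
        var meanings = new Dictionary<string, string>();
        if (args.semanticMeanings != null)
            foreach (SemanticMeaning m in args.semanticMeanings)
                if (m.values != null && m.values.Length > 0)
                    meanings[m.key] = m.values[0];
        // Fall back to the raw text when the keys are not present.
        Debug.Log(ChessMove.Describe(meanings) ?? args.text);
    }

    private void OnDestroy()
    {
        if (recognizer != null)
        {
            recognizer.OnPhraseRecognized -= OnPhraseRecognized;
            recognizer.Dispose();
        }
    }
}
#endif
```

Keeping `Describe` separate from the MonoBehaviour means the move logic can be exercised without a microphone or a device.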

Troy-Ferrell commented 4 years ago

Also there is this documentation which is a bit simpler: https://docs.microsoft.com/en-us/windows/mixed-reality/voice-input-in-unity

Not sure how outdated Holograms 212 is.

johntran-git commented 4 years ago

@keveleigh @Troy-Ferrell Thanks for the responses! I'd love to contribute to the repo by helping add this feature once I've got it working myself. I created a script (shown below) which instantiates a GrammarRecognizer, similar to the Holograms 212 example. I've also created an SRGS grammar for the moves.

However, nothing is recognized when I say any of the phrases. I've tried turning off the MRTK speech commands in case the underlying KeywordRecognizer was interfering with the GrammarRecognizer. I've also tried a very simple grammar with only one word, and that doesn't work either. The same happens in a new Unity project. Any ideas?

powerschaf commented 3 years ago

@keveleigh @davidkline-ms @Troy-Ferrell @johntran-git

Hello everyone, I also tried to use the GrammarRecognizer, with simple grammars and with examples from the web, but nothing worked. I am programming an app for my bachelor thesis and voice recognition is a core feature, so I need to get this to work :(

For example, in my app the user says "Bloodpressure 120 to 80", "Temperature 36,5°c"... and I need the numbers out of that sentence. The user should be able to say any valid number.
I got the DictationHandler to work, but in my app dictation is used as a separate feature where the nurse can dictate optional things within the care routine like "Patient is feeling sick" (it works with a Start/Stop button).
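For the number-extraction step only, a sketch along these lines might help once some recognizer returns the phrase as text. It assumes the text contains digits (as in "120 to 80" or "36,5") rather than spelled-out numbers; the `SpokenNumbers` class is hypothetical and independent of MRTK and Unity.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class SpokenNumbers
{
    // Matches integers and decimals written with '.' or ',' (e.g. "120", "36,5").
    private static readonly Regex NumberPattern = new Regex(@"[0-9]+(?:[.,][0-9]+)?");

    // Returns every number found in the phrase, in order of appearance.
    public static List<double> Extract(string phrase)
    {
        var numbers = new List<double>();
        foreach (Match match in NumberPattern.Matches(phrase))
        {
            // Normalize a decimal comma so parsing does not depend on the device locale.
            string normalized = match.Value.Replace(',', '.');
            numbers.Add(double.Parse(normalized, CultureInfo.InvariantCulture));
        }
        return numbers;
    }
}
```

Note that this treats every comma between digits as a decimal separator, so a value written with thousands separators such as "1,000" would be read as 1.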

Here is a picture of the care routine part:

Can anybody please tell me why the SRGS file doesn't work with the GrammarRecognizer? I get no error and the file is set correctly, but nothing is recognized, not even the simplest word.

My app is for the HoloLens 2 (UWP). Everything is set up: the microphone is checked and all input providers are configured. I use Unity 2020.3.6f1 and MRTK 2.6.1.

Please anybody help me out

IssueSyncBot commented 9 months ago

We appreciate your feedback and thank you for reporting this issue.

Microsoft Mixed Reality Toolkit version 2 (MRTK2) is currently in limited support. This means that Microsoft is only fixing high priority security issues. Unfortunately, this issue does not meet the necessary priority and will be closed. If you strongly feel that this issue deserves more attention, please open a new issue and explain why it is important.

Microsoft recommends that all new HoloLens 2 Unity applications use MRTK3 instead of MRTK2.

Please note that MRTK3 was released in August 2023. It features an all new architecture for developing rich mixed reality experiences and has a minimum requirement of Unity 2021.3 LTS. For more information about MRTK3, please visit https://www.mixedrealitytoolkit.org.

Thank you for your continued support of the Mixed Reality Toolkit!

microsoft / MixedRealityToolkit-Unity

Add keywords dynamically to MRTK speech commands #6369