synesthesiam / voice2json

Command-line tools for speech and intent recognition on Linux
MIT License

Limit valid transcriptions to only a subset during each speech input phase #32

Closed ghost closed 3 years ago

ghost commented 3 years ago

First of all, sorry for posting this as an issue as it's not an actual issue, but I didn't spot another way to talk with voice2json users/contributors.

I am currently trying to control a chess game with voice2json (with pocketsphinx as the backend). A proof-of-concept does work and it recognizes my intents if I speak very clearly, but sometimes it's also slightly off. From the chess board it's clear that the recognized sentence is not a valid move, but of course the speech recognition engine cannot know that. For example the engine might recognize "Move pawn from g3 to g4", but the pawn is on g2, so a valid move would be "g2 to g4".

Tracing how audio is transcribed, it seems that rhasspyasr_pocketsphinx/transcribe.py already returns only a single recognized sentence, not the second-, third-, and further-best recognitions.
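If an n-best list were exposed at that point, post-filtering could pick the first hypothesis that corresponds to a legal move. A hypothetical sketch of that idea (the n-best list itself is the assumption here, since transcribe.py currently returns only the top result; the sentence pattern and helper name are illustrative):

```python
import re

def pick_legal_transcription(hypotheses, legal_moves):
    """Return the first hypothesis whose (from, to) squares form a legal move.

    hypotheses: list of transcription strings, best first (hypothetical n-best).
    legal_moves: set of (from_square, to_square) pairs from the chess engine.
    """
    pattern = re.compile(r"from (\w\d) to (\w\d)")
    for text in hypotheses:
        match = pattern.search(text)
        if match and (match.group(1), match.group(2)) in legal_moves:
            return text
    return None  # no hypothesis corresponds to a legal move

# Example from the issue: top hypothesis "g3 to g4" is illegal because the
# pawn is on g2, so the second hypothesis would be accepted instead.
hyps = ["move pawn from g3 to g4", "move pawn from g2 to g4"]
print(pick_legal_transcription(hyps, {("g2", "g4")}))  # → "move pawn from g2 to g4"
```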

I saw that it is possible to limit the intents in recognize-intent, but if I understand correctly, it still has to work with the single transcription produced by transcribe-stream, so it comes too late in the pipeline.

From my point of view, there seem to be two ways I could solve my problem:

Is either of these methods possible with voice2json, or could I plug in somewhere to achieve it? I know Python, but of course if you say it goes completely against the core design of voice2json, I won't even try.

synesthesiam commented 3 years ago

Sorry to have gotten back to you so late! For now, I would go with the re-training method.
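The re-training method could amount to regenerating sentences.ini from the current set of legal moves before each listening phase, then re-running `voice2json train-profile` (a real voice2json command). A minimal sketch, assuming the chess engine supplies the legal (from, to) square pairs; the intent name, sentence wording, and file path are illustrative:

```python
# Sketch: rebuild sentences.ini so the grammar covers only currently-legal moves.

def build_sentences_ini(legal_moves):
    """Return sentences.ini text whose [Move] intent accepts only legal moves."""
    lines = ["[Move]"]
    for src, dst in legal_moves:
        # voice2json tag syntax: (value){slot_name}
        lines.append(f"move pawn from ({src}){{from}} to ({dst}){{to}}")
    return "\n".join(lines) + "\n"

# Before each listening phase (paths and engine API are assumptions):
# with open("/path/to/profile/sentences.ini", "w") as f:
#     f.write(build_sentences_ini(engine.legal_moves()))
# subprocess.run(["voice2json", "train-profile"], check=True)
```

Re-training on every move adds latency, but with a grammar this small it keeps impossible moves out of the recognizer entirely rather than filtering them out afterwards.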

In the near future, I will be investigating dynamic grammars in the Kaldi backend (something similar in spirit to kaldi-active-grammars).