talonvoice / talon

Issue Tracker for the main Talon app

expose or log secondary audio parse hypotheses besides chosen recognition #530

Open lahwran opened 2 years ago

lahwran commented 2 years ago

This would be most useful if it could also activate when there wasn't an actual recognition yet, but it would still be very useful if it only showed how close a recognition was to being confused with other possible recognitions. Ideally it would expose sentence-level or even word-level probabilities, but even a plain list of alternate parses would be useful. I imagine this information would be exposed as a simple list attribute on some event context or something.

I leave the implementation details completely open; I just want something that can show me how much audio conflict there is between what I said and what I could have said.

lahwran commented 2 years ago

original idea:

# idk how talon api does events tbh, I'm just making this up
@talon.events.all_recognitions
def on_recognition(event):
    print(event.truncated_match_distribution)
    # does not add up to 1 on purpose
    # -> {"eggs": 0.8, "x": 0.1, "ex": 0.07}
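To make the "does not add up to 1 on purpose" part concrete, here is a minimal sketch of how a truncated match distribution like the one in the example could be produced from raw per-hypothesis scores: softmax the scores into probabilities, then keep only the top few hypotheses up to a cumulative-probability cutoff. All names here are hypothetical; this is not the Talon API, just an illustration of the data shape being requested.

```python
import math

def truncate_distribution(raw_scores, max_results=6, cumulative_cutoff=0.9):
    """Turn raw hypothesis scores into probabilities (softmax) and keep only
    the top hypotheses, stopping at max_results or once the kept probability
    mass passes cumulative_cutoff. The result intentionally sums to < 1."""
    # softmax over the raw scores (subtract the max for numerical stability)
    m = max(raw_scores.values())
    exps = {text: math.exp(score - m) for text, score in raw_scores.items()}
    total = sum(exps.values())
    probs = {text: e / total for text, e in exps.items()}

    # keep the most probable hypotheses until either limit is hit
    truncated = {}
    cumulative = 0.0
    for text, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        if len(truncated) >= max_results or cumulative >= cumulative_cutoff:
            break
        truncated[text] = p
        cumulative += p
    return truncated
```

With scores like `{"eggs": 2.0, "x": 0.0, "ex": -0.5, "hex": -3.0}` this keeps "eggs" and "x" and drops the long tail, which is the shape the event callback above would print.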

Alternate implementation idea: expose this as a type of capture that can be used to search, rather than trying to expose it for every command as a separate event.

help voice search <audio_clip>: user.help_search_by_audio(audio_clip)
def help_search_by_audio(audio_clip):
    similar_commands = registry.search_commands_by_audio(audio_clip)
    # does not add up to 1 on purpose, I imagine you'd truncate it at like,
    # six results and 90% prob, or something small like that
    # similar commands -> {"save": 0.7, "say": 0.1, "dave": 0.1}
    similar_words = registry.search_vocabulary_by_audio(audio_clip)
    # {"rave": ..., "sage": ..., "same": ..., ...}
    # set up help ui with results here or something, showing all plausible parses
    # the audio clip could have as a command or dictation
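Absent engine support, a crude text-level stand-in for the hypothetical `search_commands_by_audio` is possible today: score every registered command phrase by string similarity to the best-guess transcript and keep the top few. This only approximates audio confusability (it can't find "dave" vs "save" confusions the way acoustic scoring would), and nothing here is real Talon API; it just sketches the ranked-results shape.

```python
import difflib

def search_commands_by_text(transcript, command_phrases, max_results=6):
    """Crude stand-in for the hypothetical registry.search_commands_by_audio:
    score each registered phrase by string similarity to the recognized
    transcript. Real engine support would score against the audio instead."""
    scores = {
        phrase: difflib.SequenceMatcher(None, transcript, phrase).ratio()
        for phrase in command_phrases
    }
    # highest-similarity phrases first, truncated to max_results
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return dict(ranked[:max_results])
```

For example, `search_commands_by_text("save", ["save", "say", "dave", "quit"])` ranks "save" first, then "dave" and "say", roughly mirroring the `{"save": 0.7, "say": 0.1, "dave": 0.1}` example above.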