Add ability to decode words without language model context when using w2l

talonvoice / beta

Issue tracker for the private Talon Beta


Add ability to decode words without language model context when using w2l #1

Closed: ckamm closed this issue 4 years ago

ckamm commented 4 years ago

Currently, rules using <dgndictation> etc. apply a language model to the emissions that is trained on sentences. For some use cases, like dictating identifier names (e.g. "@snake word ordering formatter"), that's undesirable, and it would be preferable not to keep language model context between words.

On the DFA level this is already available via the TOKEN_LMWORD vs. TOKEN_LMWORD_CTX flags.

Suggestion: Add a new <standalone_word> or <word_no_context> that makes this behavior available to users.
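To illustrate why context between words can hurt identifier dictation, here is a rough toy sketch. It is not Talon or w2l code: the probabilities, the `FLOOR` back-off value, and both function names are invented for illustration. It only shows that a model conditioning each word on the previous one can penalize a word sequence that is rare in sentences, while scoring each word on its own does not.

```python
import math

# Toy probabilities, invented for illustration only (not from any real model).
UNIGRAM = {"word": 0.02, "ordering": 0.01, "formatter": 0.005}
BIGRAM = {("word", "ordering"): 0.001, ("ordering", "formatter"): 0.0005}
FLOOR = 1e-6  # assumed back-off probability for unseen words or pairs


def score_with_context(words):
    """Sum log-probabilities, conditioning each word on the previous one."""
    total = math.log(UNIGRAM.get(words[0], FLOOR))
    for prev, cur in zip(words, words[1:]):
        total += math.log(BIGRAM.get((prev, cur), FLOOR))
    return total


def score_without_context(words):
    """Sum log-probabilities, scoring every word independently."""
    return sum(math.log(UNIGRAM.get(w, FLOOR)) for w in words)


identifier = ["word", "ordering", "formatter"]
with_ctx = score_with_context(identifier)
without_ctx = score_without_context(identifier)
```

With these made-up numbers the context-free score is the higher of the two, which is the effect the request is after for identifier-style input.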

lunixbochs commented 4 years ago

Looks like <word> already does this. It compiles to the word token:

    token_dict['word'] = 0xff
    token_dict['context_word'] = 0xfe

<word> compiles to <word>

<phrase> compiles to <context_word>+
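As a rough sketch of that mapping (only the two `token_dict` entries come from the snippet above; the `CAPTURES` table and `compile_capture` helper are hypothetical and not Talon's actual compiler), the relationship could be modelled like this:

```python
# Token ids as quoted in the comment above.
token_dict = {}
token_dict['word'] = 0xff
token_dict['context_word'] = 0xfe

# Hypothetical table: capture -> (token name, whether it repeats one or more times).
CAPTURES = {
    '<word>': ('word', False),           # single word, no language model context
    '<phrase>': ('context_word', True),  # one or more words, context kept between them
}


def compile_capture(capture):
    """Return (token id, repeats) for a capture; illustrative only."""
    name, repeats = CAPTURES[capture]
    return token_dict[name], repeats
```

Under these assumptions, `compile_capture('<word>')` gives the plain word token with no repetition, and `compile_capture('<phrase>')` gives the context word token marked as repeating.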

ckamm commented 4 years ago

Hah, excellent!