synesthesiam / rhasspy

Rhasspy voice assistant for offline home automation
https://rhasspy.readthedocs.io
MIT License

_text and _raw_text are the same #151

Closed by frkos 4 years ago

frkos commented 4 years ago

I tried using Kaldi with open transcription mode enabled so that all words are recognized. But what I found is that _text and _raw_text are the same in the event…

The first line in my log shows the full sentence "please turn my desk lamp on", but in the event _raw_text contains only words from sentences.ini, ignoring the unknown words please, desk and my: 'raw_text': 'lamp on'… weird.

[DEBUG:950613] DialogueManager: {'text': 'lamp on', 'intent': {'name': 'TurnOn', 'confidence': 0.9}, 'entities': [{'entity': 'device', 'value': 'lamp', 'raw_value': 'lamp', 'start': 0, 'raw_start': 0, 'end': 4, 'raw_end': 4}], 'raw_text': 'lamp on', 'tokens': ['lamp', 'on'], 'raw_tokens': ['lamp', 'on'], 'speech_confidence': 1, 'wakeId': 'snowboy/snowboy.umdl', 'siteId': 'default'}
[INFO:950029] quart.serving: 192.168.1.99:56562 GET / 1.1 200 1029 92220
[DEBUG:946375] DialogueManager: decoding -> recognizing
[DEBUG:946295] DialogueManager: please turn my desk lamp on (confidence=1)
[DEBUG:946181] KaldiDecoder: please turn my desk lamp on

According to the docs, I expect _raw_text to contain the full sentence... Am I right?

_text - spoken voice command text with substitutions
_raw_text - literal transcription of voice command

Also mentioned this issue here: https://community.rhasspy.org/t/training-for-unknown-words/228/2?u=frkos

synesthesiam commented 4 years ago

That's an interesting bug! Let me take a look tonight and get it fixed. I think the intent recognizer is doing some filtering by default on the words, and it needs to have that disabled in the case of open transcription.
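
To illustrate the suspected cause, here is a minimal sketch of how filtering unknown words before the event is built would produce the output reported above. It is not Rhasspy's actual code: the function `recognize`, its `filter_unknown` flag, and the two-word vocabulary are all hypothetical and chosen only to mirror the log.

```python
from typing import Dict, List, Set


def recognize(
    transcription: str, vocabulary: Set[str], filter_unknown: bool = True
) -> Dict[str, object]:
    """Hypothetical stand-in for the recognizer's text handling."""
    raw_tokens: List[str] = transcription.split()

    if filter_unknown:
        # Words outside the trained vocabulary are dropped *before*
        # raw_text is recorded, so the literal transcription is lost.
        raw_tokens = [word for word in raw_tokens if word in vocabulary]

    # Substitutions would be applied here; this sketch applies none.
    tokens = list(raw_tokens)

    return {
        "text": " ".join(tokens),
        "raw_text": " ".join(raw_tokens),
        "tokens": tokens,
        "raw_tokens": raw_tokens,
    }


# Assumed vocabulary, for illustration only.
vocabulary = {"lamp", "on"}
sentence = "please turn my desk lamp on"

filtered = recognize(sentence, vocabulary)
unfiltered = recognize(sentence, vocabulary, filter_unknown=False)

print(filtered["raw_text"])
print(unfiltered["raw_text"])
```

With filtering left on, raw_text comes out as 'lamp on', matching the event in the log. With it switched off, the full decoded sentence survives, which is the behaviour expected in open transcription mode.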

synesthesiam commented 4 years ago

Fixed in 2.4.17