synesthesiam / voice2json

Command-line tools for speech and intent recognition on Linux
MIT License

Output contains "doors" #82

Open LukaHarambasic opened 1 year ago

LukaHarambasic commented 1 year ago

Heyho,

I'm running voice2json via Docker on an M1 Mac. I tried multiple .wav files, all exported from DaVinci Resolve, all in English and in perfect audio quality. I can't upload the .wav files directly, but the episodes are published as .mp3 here. Every time, I get output that is all about doors and lights... I'm very confused :D

{"text": "off open green open the living set on off hot set me door the door set to set temperature open the green open hot open living room lamp whats lamp hot how tell tell lamp set living turn is it door open set tell the set to is garage door open is it living me it whats it to red blue whats the temperature living blue me cold is it lamp off the living set cold make set lamp me whats door how hot is red on whats how off it turn off tell whats how whats turn the living what off garage light red living off is how on how turn on the living turn time living open to the on whats how lamp set to whats set what blue off closed whats the temperature is it living make room lamp whats me tell lamp cold room on time on whats room on off open door closed garage door open set turn off on whats the on time open make set on red the on living the what is it cold hot on on light to light to how blue green set living closed garage whats to the off the is light tell make bedroom light blue whats turn off tell door whats blue set living make the living room lamp the off red is lamp whats set living room lamp how temperature on the is is the time to off make the is is it open on cold it how hot on the the open closed living tell me on whats light to open closed red cold open cold is is what door it lamp cold the turn set garage make garage garage is cold bedroom living how on the open cold is on to living turn off open what turn off off hot is the door closed living garage whats red the me set the garage on the what is it green how blue off off whats time light the is on living garage light is it on turn off light it lamp turn it living room lamp off the whats it on living cold is the garage door set on living how the", "likelihood": 1, "transcribe_seconds": 9.57908892100022, "wav_seconds": 105.6426875, "tokens": null}

Do you have any idea what the problem could be? Thank you! Luka

synesthesiam commented 1 year ago

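By default, voice2json only transcribes sentences from the profile's sentences.ini (the profile comes pre-trained on them), so any audio gets mapped onto that small vocabulary of doors, lamps, and temperatures. Use open-ended transcription instead.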
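
A minimal sketch of what that could look like when scripted from Python. It assumes the `voice2json` command (or the Docker wrapper script) is on your PATH, that `transcribe-wav` accepts the `--open` flag followed by a WAV path, and that it prints one JSON object per file. The function names are made up for illustration and are not part of voice2json.

```python
import json
import subprocess
from typing import List


def build_open_transcription_command(wav_path: str) -> List[str]:
    """Build the command line for open-ended transcription.

    Assumption: `voice2json transcribe-wav --open <file.wav>` is available
    on the PATH (for Docker setups, via the wrapper script).
    """
    return ["voice2json", "transcribe-wav", "--open", wav_path]


def vocabulary(result_line: str) -> List[str]:
    """Return the sorted distinct words in the "text" field of one
    JSON line printed by transcribe-wav."""
    result = json.loads(result_line)
    return sorted(set(result.get("text", "").split()))


def transcribe_open(wav_path: str) -> dict:
    """Run open-ended transcription and parse the first JSON line.

    Hypothetical helper: raises CalledProcessError if the command fails.
    """
    completed = subprocess.run(
        build_open_transcription_command(wav_path),
        capture_output=True,
        text=True,
        check=True,
    )
    # Take the first non-empty output line as the result for this file.
    first_line = next(
        line for line in completed.stdout.splitlines() if line.strip()
    )
    return json.loads(first_line)


# Example (requires voice2json to be installed):
# result = transcribe_open("episode.wav")
# print(result["text"])
```

Running `vocabulary()` on the output you posted is a quick way to confirm the diagnosis: almost two minutes of podcast audio yielding only a few dozen distinct words, all from the example sentences, points to the closed sentences.ini model rather than a problem with the audio.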

Keep an eye out for Rhasspy3, which will have support for Whisper 🙂