mkiol / dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Mozilla Public License 2.0

Dictation Functionality #161

Open unithqcc opened 2 months ago

unithqcc commented 2 months ago
  1. Does functionality like saying "comma" or "period" to type "," or "." exist within the libraries or within dsnote/Speech Note? I would like to have this ability.
  2. Similarly, is there a way for the text to appear as one single paragraph until "new paragraph" or similar command is spoken? This would be preferred as opposed to a new line after each sentence.
  3. How are hotkeys assigned? I am using a Nuance PowerMic II, and would like to start/stop recording using this device.

I am using the English (Vosk Large)/en library in my testing. I use voice-to-text software for medical dictation and writing.

Thank you in advance.

mkiol commented 2 months ago

Hi, thanks for the question.

  1. Does functionality like saying "comma" or "period" to type "," or "." exist within the libraries or within dsnote/Speech Note? I would like to have this ability.
  2. Similarly, is there a way for the text to appear as one single paragraph until "new paragraph" or similar command is spoken? This would be preferred as opposed to a new line after each sentence.

Unfortunately, not yet. Currently there is no support for “voice commands”. This kind of function has been requested many times, so I think it is needed. Perhaps a simple replacement of “comma” with “,” and so on would be a good solution. Thanks for both ideas, I will try to implement something (very basic) in the next version.
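
For illustration only, here is a minimal Python sketch of the kind of "simple replacement" mentioned above, applied to the recognized text after transcription. It is not Speech Note code: the command table, the "new paragraph" keyword and the function name `apply_voice_commands` are all assumptions made for this example. It also joins everything into a single paragraph until "new paragraph" is spoken, as asked in question 2.

```python
import re

# Hypothetical table of spoken commands; not part of Speech Note.
PUNCTUATION = {
    "comma": ",",
    "period": ".",
    "question mark": "?",
}

# Match a command as a whole word, together with the whitespace before it,
# so that "stable comma" becomes "stable," and not "stable ,".
_COMMAND_RE = re.compile(
    r"\s*\b("
    + "|".join(re.escape(c) for c in sorted(PUNCTUATION, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)

# Assumed spoken keyword that starts a new paragraph.
_PARAGRAPH_RE = re.compile(r"\s*\bnew paragraph\b\s*", re.IGNORECASE)


def apply_voice_commands(text: str) -> str:
    """Replace spoken punctuation and keep one paragraph until 'new paragraph'."""
    paragraphs = []
    for chunk in _PARAGRAPH_RE.split(text):
        # Collapse line breaks and repeated spaces into a single paragraph.
        chunk = " ".join(chunk.split())
        # Swap each spoken command for its punctuation mark.
        chunk = _COMMAND_RE.sub(lambda m: PUNCTUATION[m.group(1).lower()], chunk)
        if chunk:
            paragraphs.append(chunk)
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    print(apply_voice_commands(
        "the patient is stable comma no fever period\n"
        "new paragraph follow up in two weeks period"
    ))
```

One limitation of such a basic approach: every occurrence of a command word is replaced, so a sentence that needs the literal word "period" or "comma" would come out wrong. A real implementation would probably need some way to escape these words.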

  3. How are hotkeys assigned? I am using a Nuance PowerMic II, and would like to start/stop recording using this device.

If these buttons can generate X11 key events, you can assign them to actions in the Accessibility->Use global keyboard shortcuts setting. To find out which key code a button generates, you can use the xev Linux tool. For instance, when I press the special "Audio stop" button on my keyboard, xev reports:

KeyRelease event, serial 40, synthetic NO, window 0x7800001,
    root 0x1e2, subw 0x0, time 1518656, (-986,770), root:(844,1744),
    state 0x0, keycode 174 (keysym 0x1008ff15, XF86AudioStop), same_screen YES,
    XLookupString gives 0 bytes: 
    XFilterEvent returns: False

In this example, XF86AudioStop is an X11 key event. To assign it to a Speech Note action, you have to enter "Stop" as the key combination. Mappings for other special keys are here.
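
If a device has many buttons, reading the xev output by eye gets tedious. The sketch below is a small Python helper that pulls the key code and keysym name out of xev text like the sample above. The function name `parse_xev_keys` and the regular expression are assumptions based only on the output format shown here. It does not translate keysym names into the key combination strings that Speech Note expects.

```python
import re

# Matches the "keycode 174 (keysym 0x1008ff15, XF86AudioStop)" part of an
# xev KeyPress/KeyRelease report, as seen in the sample output above.
_XEV_KEY_RE = re.compile(
    r"keycode\s+(\d+)\s+\(keysym\s+(0x[0-9a-fA-F]+),\s*([^)]+)\)"
)


def parse_xev_keys(output: str) -> list[tuple[int, int, str]]:
    """Return (keycode, keysym value, keysym name) for each key event found."""
    return [
        (int(keycode), int(keysym, 16), name.strip())
        for keycode, keysym, name in _XEV_KEY_RE.findall(output)
    ]


if __name__ == "__main__":
    sample = (
        "KeyRelease event, serial 40, synthetic NO, window 0x7800001,\n"
        "    root 0x1e2, subw 0x0, time 1518656, (-986,770), root:(844,1744),\n"
        "    state 0x0, keycode 174 (keysym 0x1008ff15, XF86AudioStop), same_screen YES,\n"
    )
    for keycode, keysym, name in parse_xev_keys(sample):
        print(keycode, hex(keysym), name)
```

The output of a saved xev session can be passed to the function as one string. If pressing a button on the device produces no KeyPress or KeyRelease lines at all, it is probably not generating X11 key events.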