savbell / whisper-writer

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.
GNU General Public License v3.0
326 stars 52 forks

Immediately abort typing when active window changed #38

Open Enyium opened 6 months ago

Enyium commented 6 months ago

I use the settings

    "activation_key": "ctrl+win",
    "recording_mode": "hold_to_record",

Unfortunately, this sometimes (too often) causes Windows to perform actions before Whisper Writer starts to auto-type:

Of course, it would be best if no other software, including the OS, even noticed that Ctrl+Win has been pressed while Whisper Writer does its thing, if that's possible. But you could also stop auto-typing as soon as the active window changes.

Just starting to type with the start menu open isn't bad, however, because a text box simply appears and receives the auto-typed text. Since it's desirable not to lose your recording, you should let this happen by accepting an active-window change when the start menu window becomes the active one. Then, one can at least copy-paste the text from the start menu text box. So, please make an exception for the start menu window (executable SearchApp.exe, window class Windows.UI.Core.CoreWindow, found with the AutoHotkey "Window Spy" tool).
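To make the request concrete, here is a minimal sketch of the idea, not Whisper Writer's actual code. It assumes two hypothetical callables supplied by the app: `get_active_window`, which reports the current foreground window (on Windows this could wrap the Win32 `GetForegroundWindow` call), and `type_char`, which sends a single character. `WindowInfo`, `type_guarded`, and `ALLOWED_SWITCH_TARGETS` are names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple


@dataclass(frozen=True)
class WindowInfo:
    """Hypothetical snapshot of the foreground window."""
    handle: int
    executable: str
    window_class: str


# Switching to these windows does not abort typing.
# The single entry is the start menu window named in this issue.
ALLOWED_SWITCH_TARGETS: FrozenSet[Tuple[str, str]] = frozenset(
    {("SearchApp.exe", "Windows.UI.Core.CoreWindow")}
)


def type_guarded(
    text: str,
    get_active_window: Callable[[], WindowInfo],
    type_char: Callable[[str], None],
    start_window: WindowInfo,
    allowed: FrozenSet[Tuple[str, str]] = ALLOWED_SWITCH_TARGETS,
) -> int:
    """Type text one character at a time and stop as soon as the active
    window differs from start_window, unless the new window is allowed.

    Returns the number of characters that were actually typed.
    """
    typed = 0
    for ch in text:
        current = get_active_window()
        changed = current.handle != start_window.handle
        if changed and (current.executable, current.window_class) not in allowed:
            break  # active window changed to something unexpected: abort
        type_char(ch)
        typed += 1
    return typed
```

`start_window` would be captured when the activation key is pressed. The return value lets the caller detect a partial result and, for example, keep the remaining text around instead of discarding the recording.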

It may also be appropriate to completely deny auto-typing when the taskbar or the desktop window is active, as that may have contributed to the opening of various windows after creating a new desktop, and this could still be a problem when, e.g., you click on the desktop before auto-typing. However, the real problem may also be that a modifier key stays pressed while Whisper Writer auto-types. In that case: Could Whisper Writer first wait until there isn't any interference anymore?
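The "wait until there isn't any interference" part could look roughly like the sketch below. It is an illustration under assumptions, not the app's implementation: `is_pressed` is a hypothetical callable that takes a key name and returns whether that key is currently held, backed by whatever key-state query the app's input library offers. The modifier names and the timeout value are also assumptions.

```python
import time
from typing import Callable

# Assumed modifier names; the real names depend on the input library.
MODIFIERS = ("ctrl", "shift", "alt", "win")


def wait_for_modifiers_released(
    is_pressed: Callable[[str], bool],
    timeout: float = 2.0,
    poll_interval: float = 0.01,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until no modifier key is held, or until the timeout expires.

    Returns True if all modifiers were released in time, False otherwise.
    """
    deadline = clock() + timeout
    while any(is_pressed(name) for name in MODIFIERS):
        if clock() >= deadline:
            return False  # still held: let the caller decide what to do
        sleep(poll_interval)
    return True
```

Calling this right before auto-typing would keep a still-held Ctrl or Win key from combining with the typed characters. The timeout keeps a stuck key from blocking the transcription forever.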

savbell commented 4 months ago

Hi, thank you for your issue and apologies for being late to respond!

I think that some of these problems are very specific to your use case and the keyboard shortcut you chose, and that some of the changes to fix these might end up breaking someone else's use case. For example, I'm hesitant to block typing on any specific application because someone else might want to use it for exactly that (e.g. I sometimes use WhisperWriter to search the Start menu on purpose).

However, I can see it being a problem if the active window is changed between when a user presses the keyboard shortcut to start recording and when the app actually finishes the transcription and types it out. It's hard to know if the user made that change on purpose though, so if I did put in a blocking feature, it would be under a setting that can be toggled on and off. The same would be true if the app waited for no modifier keys to be pressed before typing, because again I'd worry about interfering with someone else's use case.
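As a rough sketch of how such opt-in toggles might be read, assuming the JSON-style settings shown earlier in this thread: the two option names below (`abort_on_window_change` and `wait_for_modifier_release`) are hypothetical and do not exist in the app today. Both default to off so that existing behavior stays unchanged.

```python
import json

# Hypothetical option names; both default to off to preserve current behavior.
GUARD_DEFAULTS = {
    "abort_on_window_change": False,
    "wait_for_modifier_release": False,
}


def load_guard_options(raw_json: str) -> dict:
    """Return the guard options from a JSON settings string.

    Keys unrelated to the guards are ignored; missing keys keep their defaults.
    """
    user = json.loads(raw_json) if raw_json.strip() else {}
    options = dict(GUARD_DEFAULTS)
    for key in GUARD_DEFAULTS:
        if key in user:
            options[key] = bool(user[key])
    return options
```

For example, a settings file containing `"abort_on_window_change": true` next to the existing `"activation_key"` entry would enable only the window check.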

I'll think on this!

Thanks, Sav