dictation-toolbox / dragonfly

Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx
GNU Lesser General Public License v3.0

pynput can't do some OS shortcuts on X11 #79

Closed dwks closed 5 years ago

dwks commented 5 years ago

I switched from the xdotool Linux port to pynput (on master) and discovered that my ctrl-alt-m shortcut for launching a terminal no longer works, although ctrl-alt-right to switch desktops is fine. I made the keypress delay very large (1s) and it still didn't work. This might be a problem with pynput that we can't address. I used xev to observe the X11 events that are actually generated: the 'm' event is marked synthetic YES and same_screen NO, unlike the Control_L and Alt_L events. I'm guessing one of these is the problem.

KeyPress event, serial 49, synthetic NO, window 0x3a00001,
    root 0x1e1, subw 0x0, time 2089007475, (-323,425), root:(548,904),
    state 0x0, keycode 37 (keysym 0xffe3, Control_L), same_screen YES,
    XLookupString gives 0 bytes: 
    XmbLookupString gives 0 bytes: 
    XFilterEvent returns: False

KeyPress event, serial 50, synthetic NO, window 0x3a00001,
    root 0x1e1, subw 0x0, time 2089007477, (-323,425), root:(548,904),
    state 0x4, keycode 64 (keysym 0xffe9, Alt_L), same_screen YES,
    XLookupString gives 0 bytes: 
    XmbLookupString gives 0 bytes: 
    XFilterEvent returns: False

KeyPress event, serial 50, synthetic YES, window 0x3a00001,
    root 0x1e1, subw 0x0, time 0, (0,0), root:(0,0),
    state 0xc, keycode 58 (keysym 0x6d, m), same_screen NO,
    XLookupString gives 1 bytes: (0d)
    XmbLookupString gives 1 bytes: (0d)
    XFilterEvent returns: False
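
For comparing dumps like the one above, it can help to reduce each xev block to the few fields that differ. The following is a minimal Python 3 sketch, not part of the original report: `summarize_xev` is a hypothetical helper name, the regular expressions only cover the key event layout shown in this thread, and `SAMPLE` is a shortened copy of two of the events above.

```python
import re

# First line of an xev block, e.g.
# "KeyPress event, serial 50, synthetic YES, window 0x3a00001,"
HEADER = re.compile(r"^(\w+) event, serial \d+, synthetic (YES|NO),")
# Key line of the block, e.g.
# "state 0xc, keycode 58 (keysym 0x6d, m), same_screen NO,"
KEYLINE = re.compile(
    r"state (0x[0-9a-f]+), keycode (\d+) \(keysym 0x[0-9a-f]+, (\w+)\), "
    r"same_screen (YES|NO)"
)


def summarize_xev(text):
    """Return one dict per key event found in xev output."""
    events = []
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # Start a new event; the key line arrives a couple of lines later.
            current = {"type": header.group(1),
                       "synthetic": header.group(2) == "YES"}
            continue
        key = KEYLINE.search(line)
        if key and current is not None:
            current.update(state=int(key.group(1), 16),
                           keycode=int(key.group(2)),
                           keysym=key.group(3),
                           same_screen=key.group(4) == "YES")
            events.append(current)
            current = None
    return events


SAMPLE = """\
KeyPress event, serial 50, synthetic NO, window 0x3a00001,
    root 0x1e1, subw 0x0, time 2089007477, (-323,425), root:(548,904),
    state 0x4, keycode 64 (keysym 0xffe9, Alt_L), same_screen YES,

KeyPress event, serial 50, synthetic YES, window 0x3a00001,
    root 0x1e1, subw 0x0, time 0, (0,0), root:(0,0),
    state 0xc, keycode 58 (keysym 0x6d, m), same_screen NO,
"""

for event in summarize_xev(SAMPLE):
    print(event["keysym"],
          "synthetic" if event["synthetic"] else "real",
          "state=0x%x" % event["state"])
```

Running `xev` once under pynput and once under xdotool, then passing both captures through a helper like this, makes the synthetic and same_screen differences easy to spot side by side.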

I can use xdotool to press ctrl-alt-m, and then 'm' isn't sent to xev because it's captured by my window manager. The Control_L and Alt_L events are otherwise identical. Here is a small script demonstrating the problem.

#!/usr/bin/python
import os
import time

from pynput.keyboard import Controller, KeyCode

c = Controller()

# X11 keysyms, passed to pynput as virtual key codes.
CTRL_L = KeyCode(65507)  # 0xffe3, Control_L
ALT_L = KeyCode(65513)   # 0xffe9, Alt_L
RIGHT = KeyCode(65363)   # 0xff53, Right
M = KeyCode.from_char('m')

def test(events):
    # Wait so there is time to focus the target window, then replay the events.
    time.sleep(2)
    for key, is_press in events:
        c.touch(key, is_press)

def chord(key):
    # Press ctrl and alt, tap the key, then release the modifiers in reverse order.
    return [(CTRL_L, True), (ALT_L, True), (key, True),
            (key, False), (ALT_L, False), (CTRL_L, False)]

print("pynput ctrl-alt-m fails...")
test(chord(M))

print("xdotool key ctrl+alt+m works...")
time.sleep(2)
os.system("xdotool key ctrl+alt+m")

print("pynput ctrl-alt-right works, should move to next desktop")
test(chord(RIGHT))

I haven't read any docs on pynput, but perhaps there is a way to affect the types of keystrokes it generates.

dwks commented 5 years ago

P.S. This may be a simpler problem description. If I ask pynput to generate 'm', I get

KeyPress event, serial 37, synthetic YES, window 0x3a00001,
    root 0x1e1, subw 0x0, time 0, (0,0), root:(0,0),
    state 0x0, keycode 58 (keysym 0x6d, m), same_screen NO,
    XLookupString gives 1 bytes: (6d) "m"
    XmbLookupString gives 1 bytes: (6d) "m"
    XFilterEvent returns: False

But with xdotool I get different values for synthetic and same_screen. This is the behavior I want.

KeyRelease event, serial 38, synthetic NO, window 0x3a00001,
    root 0x1e1, subw 0x0, time 2090250097, (-469,-80), root:(402,399),
    state 0x0, keycode 58 (keysym 0x6d, m), same_screen YES,
    XLookupString gives 1 bytes: (6d) "m"
    XFilterEvent returns: False

drmfinlay commented 5 years ago

Thanks for reporting this problem. I can confirm the same behaviour occurs for me. It would be nice if this could be fixed upstream. There are some problems with OSX too. I'll have a look into it.

GoNZooo commented 5 years ago

Is this perhaps also why an "a-" key input doesn't seem to hold only Alt? "a-d" gives "^[d". I've been trying to make macros that let me interact with screen/workspace navigation in XMonad, and I assumed I was simply doing something odd. Then I noticed that the supposedly working bindings for browser navigation ("a-d" for the address bar) exhibit the same issue.
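
For context on the "^[d" output: "^[" is caret notation for the ESC byte (0x1b), and many terminal emulators encode Alt plus a key as ESC followed by that key. So "^[d" suggests the focused terminal received Alt+d itself, which would be consistent with the event reaching the window directly instead of being picked up by a window manager or application binding. The Python 3 sketch below only illustrates the notation; `caret` and `alt_sequence` are hypothetical helper names, and the ESC-prefix encoding is an assumption about the terminal's configuration.

```python
ESC = 0x1b


def caret(byte):
    """Render a byte the way terminals echo control characters (caret notation)."""
    if byte < 0x20 or byte == 0x7f:
        # Control characters are shown as '^' plus the character 0x40 away.
        return "^" + chr(byte ^ 0x40)
    return chr(byte)


def alt_sequence(char):
    """Bytes many terminal emulators send for Alt+char: ESC, then the character."""
    return bytes([ESC]) + char.encode("ascii")


print("".join(caret(b) for b in alt_sequence("d")))
```

The same notation explains the `(0d)` byte in the ctrl-alt-m dump earlier in the thread: 0x0d is carriage return, which is what Ctrl+m produces and which caret notation writes as "^M".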

I'm using the silvius version that takes advantage of dragonfly and can confirm that the previous version using xdotool managed to send these events properly.

In https://github.com/moses-palmer/pynput/issues/59 someone again points out that on Linux, xdotool behaves correctly with modifier keys. In https://github.com/moses-palmer/pynput/issues/4 the maintainer explains why some key events are synthetic and others are not (XTest fake input vs. XSendEvent).
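
If the split between fake input via the XTest extension and XSendEvent is what separates the working keys from the failing one, one experiment worth trying is to give pynput the 'm' as a virtual key code instead of a character. This is an untested sketch under a loud assumption: that pynput's Xorg backend sends any `KeyCode` carrying a `vk` through XTest, treating the `vk` as an X11 keysym. That would match the ctrl and alt events in this thread (built from keysym numbers) arriving as non-synthetic, but it may not hold for every pynput version. `chord_events` and `send_chord` are hypothetical helper names.

```python
# X11 keysyms; 0x6d is the value xev reports for 'm' in the dumps above.
CONTROL_L = 0xffe3
ALT_L = 0xffe9
KEY_M = 0x6d


def chord_events(modifiers, key):
    """Build (keysym, is_press) pairs: modifiers down, key tap, modifiers up."""
    presses = [(m, True) for m in modifiers]
    releases = [(m, False) for m in reversed(modifiers)]
    return presses + [(key, True), (key, False)] + releases


def send_chord(modifiers, key):
    """Replay a chord through pynput using only vk-based KeyCodes."""
    # Imported lazily because pynput needs a running X display on Linux.
    from pynput.keyboard import Controller, KeyCode

    controller = Controller()
    for keysym, is_press in chord_events(modifiers, key):
        # Assumption: a KeyCode built from a vk is sent via XTest, not XSendEvent.
        controller.touch(KeyCode.from_vk(keysym), is_press)


# Example (requires an X session, so it is left commented out here):
# send_chord([CONTROL_L, ALT_L], KEY_M)  # ctrl-alt-m
```

Watching the result in xev shows whether the 'm' event now arrives with synthetic NO. If it is still synthetic, the assumption above does not hold for the installed pynput version, and xdotool or libxdo remains the safer route.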

drmfinlay commented 5 years ago

@GoNZooo Sorry I've taken a little while to get to this.

This would be the reason that "a-d" gives "^[d". Fixing this is a little beyond me really. I didn't anticipate these issues when choosing pynput over xdotool, so I'm happy to just use Aenea's xdotool or libxdo implementations instead.

In hindsight, I should have expected these issues given how complicated xdo's source code is! pynput couldn't possibly work as well as xdo given the age difference.