asweigart / pyautogui

A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
BSD 3-Clause "New" or "Revised" License
10.06k stars 1.22k forks

Support for Wayland #111

Open k4j8 opened 7 years ago

k4j8 commented 7 years ago

A series of errors occurred while trying to import the pyautogui module, but only while using Wayland.

>>> import pyautogui
Traceback (most recent call last):
  File "/home/k4j8/Python/env/lib/python3.5/site-packages/Xlib/xauth.py", line 43, in __init__
    raw = open(filename, 'rb').read()
FileNotFoundError: [Errno 2] No such file or directory: '/home/k4j8/.Xauthority'

Can Wayland be supported? Or is Wayland so vastly different from X11 that a whole new module is needed? If it's the latter, I'd like to modify the README to include X11 in the dependencies and then close this issue.

Thanks for creating this awesome module! (And pyperclip too!)

zfsamzfsam commented 7 years ago

+1 Support for Wayland

asweigart commented 7 years ago

I've taken a cursory look at pywayland, and found this documentation about generating click events: https://pywayland.readthedocs.io/en/latest/module/protocol/wayland/pointer.html?highlight=click

I personally don't have time to implement this, but I could see it being done in the future. Really, it comes down to finding code snippets to generate mouse/keyboard events and combining them into a new _pyautogui_wayland.py (similar to the current _pyautogui_x11.py).

If you'd like to start work on this, we can create a new "wayland" branch to work on this feature.
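One way the split between an X11 backend and a Wayland backend could be chosen at import time is sketched below. This is only an illustration: `detect_display_server` and `backend_module_name` are hypothetical helpers, `_pyautogui_wayland` is the module proposed above and does not exist yet, and the sketch assumes the session exports the usual `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY`, and `DISPLAY` variables.

```python
import os


def detect_display_server(environ=None):
    """Guess the display server from common session environment variables."""
    env = os.environ if environ is None else environ
    session_type = env.get("XDG_SESSION_TYPE", "").lower()
    if session_type in ("wayland", "x11"):
        return session_type
    # Check WAYLAND_DISPLAY first: XWayland also sets DISPLAY on Wayland sessions.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


def backend_module_name(environ=None):
    """Pick a backend module name; '_pyautogui_wayland' is hypothetical."""
    if detect_display_server(environ) == "wayland":
        return "_pyautogui_wayland"  # proposed in this thread, not implemented
    return "_pyautogui_x11"  # the existing Linux backend
```

Checking the environment before importing Xlib would also allow a clearer error message than the `.Xauthority` traceback in the original report.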

k4j8 commented 5 years ago

After some research on the subject, I think it best pyautogui not support Wayland. From what I can tell, supporting Wayland would require a very different approach that might split the project and efforts would be better spent on existing projects. For these reasons, I'm closing the issue.

For those looking for keyboard and mouse emulation on Wayland... I found 3 resources I verified work on Wayland. evemu and python-evdev lack a pyautogui.typewrite() feature to convert strings to commands. keyboard requires sudo to run and has been inconsistent in use (but maybe that's just me). Of the 3, keyboard looks to be the most similar to pyautogui.

Each of the three examples below types the letter "a".

evemu

#!/bin/bash
device="/dev/input/event7" # event number is computer-specific

function press {
evemu-event ${device} --type EV_KEY --code $1 --value 1 --sync
evemu-event ${device} --type EV_KEY --code $1 --value 0 --sync
sleep 0.01
}

sleep 0.5
press KEY_A

python-evdev

#!/usr/bin/env python3
from evdev import UInput, ecodes as e

ui = UInput()

# accepts only KEY_* events by default
ui.write(e.EV_KEY, e.KEY_A, 1)  # KEY_A down
ui.write(e.EV_KEY, e.KEY_A, 0)  # KEY_A up
ui.syn()

ui.close()
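The string-to-events conversion that evemu and python-evdev lack could look roughly like the sketch below. It is a minimal illustration, not an existing API: `KEY_NAMES` and `string_to_key_events` are hypothetical names, a US keyboard layout is assumed, and only letters, digits, space, and newline are covered.

```python
import string

# Hypothetical helper table (not part of python-evdev or pyautogui): map each
# character to the name of the evdev key that produces it on a US layout.
KEY_NAMES = {c: "KEY_" + c.upper() for c in string.ascii_lowercase}
KEY_NAMES.update({d: "KEY_" + d for d in string.digits})
KEY_NAMES.update({" ": "KEY_SPACE", "\n": "KEY_ENTER"})


def string_to_key_events(text):
    """Return a list of (key_name, value) pairs; value 1 is down, 0 is up."""
    events = []
    for char in text:
        needs_shift = char.isupper()
        key = KEY_NAMES[char.lower()]  # KeyError for unsupported characters
        if needs_shift:
            events.append(("KEY_LEFTSHIFT", 1))
        events.append((key, 1))
        events.append((key, 0))
        if needs_shift:
            events.append(("KEY_LEFTSHIFT", 0))
    return events
```

With python-evdev, each pair could then be sent as `ui.write(e.EV_KEY, getattr(e, name), value)`, followed by `ui.syn()`, using the same `ui` and `e` objects as in the example above.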

keyboard

#!/usr/bin/env python3
import keyboard
keyboard.write('a')

asweigart commented 5 years ago

Thanks for taking a look at this! We can reopen this issue if we figure out a way to seamlessly add this.

cjbassi commented 4 years ago

For Wayland there's a tool called ydotool that is similar to xdotool:

https://github.com/ReimuNotMoe/ydotool

Would you be open to adding wayland support using this or does this not fit into the design of this library? Note that it also requires sudo.

k4j8 commented 3 years ago

In case anyone comes across this issue, I've found a very reliable way to get pyautogui.typewrite() functionality on Wayland, although it only works by using subprocess: wl-clipboard. It cannot do hotkeys, unfortunately.

Example of printing "Hello world":

import subprocess
subprocess.call(['wl-copy', "Hello world"])
subprocess.call(['wl-paste'])

tbm commented 3 years ago

Can this ticket be reopened? Wayland will only get more important over time.

I just tried to use pyautogui and it didn't work. Having an open issue about lack of support might reduce future duplicate reports.

I read @KyleWJohnston's argument against supporting Wayland. I don't know how pyautogui works but could you use these other tools as a backend in pyautogui? In any case, if there are no plans to support Wayland, I think this should at least be documented since you claim to support "Linux" (maybe that should be changed to X11).

mystiquewolf commented 3 years ago

Wayland is the default session in Ubuntu 21.04 and will be the default in Kubuntu 21.10. Wayland is also the default for Fedora.

Please see:

https://bugs.kde.org/show_bug.cgi?id=439971
https://gitlab.com/dogtail/dogtail
https://gitlab.gnome.org/ofourdan/gnome-ponytail-daemon
https://github.com/autokey/autokey/issues/87
https://lists.freedesktop.org/archives/wayland-devel/2017-July/034459.html

pendragons-code commented 2 years ago

I agree, wayland support should be something the devs might wanna consider

asweigart commented 2 years ago

Yes, pyautogui should support Wayland. I haven't had time to look at the issue myself though. Although for pyscreeze (which pyautogui uses to take screenshots), the thing holding back screenshots on Wayland was that the gnome screenshot tool had a visual flash effect when run and there was no way to disable it. I looked at the source code and it's an easy fix, but I can't seem to get a hold of the developers to have them add it.

asweigart commented 2 years ago

Thanks to everyone who is contributing to this discussion by the way. There have been a lot of projects taking up my attention in the past few months.

pendragons-code commented 2 years ago

If we look at the Wayland security module, we can see how pyautogui may not work on Wayland at all.

If there is a workaround, I would like to see what it is.

pendragons-code commented 2 years ago

> Thanks to everyone who is contributing to this discussion by the way. There have been a lot of projects taking up my attention in the past few months.

Thank you too! You created a simple-to-use and fascinating project that is open to everyone! Thank you for making this tool!

VictorGimenez commented 1 year ago

+1 Support for Wayland

Jianshui commented 1 year ago

+1 thanks

pendragons-code commented 1 year ago

That is very interesting...

On Fri, Nov 4, 2022, 03:07 Victor Borghi Giménez @.***> wrote:

Today I was able to adapt the screenshot function (local/lib/python/pyscreeze/__init__.py) so that it calls the gnome-screenshot tool instead of scrot, and I was able to make the screenshot function work on Wayland, plus the locate functions. Now the difficulty is finding any mouse handling function or tool that works on Wayland. Has anyone found one? I tried ydotool, but I can't test it on its own because I am getting an error about the ydotoold backend being unavailable.


mikigo commented 1 year ago

+1

support for wayland

thanks

glowingsword commented 1 year ago

+1

support for wayland

asweigart commented 1 year ago

Ooof. Okay, so I have pyscreeze's screenshot functionality working on Wayland (it uses gnome-screenshot for the screenshotting) but I can't figure out how to get the keyboard & mouse operations working. I'd appreciate any help or pointers people have.
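The tool switch described here (gnome-screenshot on Wayland, scrot otherwise) might be expressed as a small command builder. This is a sketch rather than pyscreeze's actual code: `screenshot_command` is a hypothetical name, and the `which` parameter exists only so the lookup can be replaced in tests.

```python
import shutil


def screenshot_command(output_path, session_type, which=shutil.which):
    """Return an argv list for a screenshot tool, or None if none is found."""
    if session_type == "wayland":
        # scrot relies on X11, so only gnome-screenshot is tried here.
        candidates = [["gnome-screenshot", "-f", output_path]]
    else:
        candidates = [
            ["scrot", output_path],
            ["gnome-screenshot", "-f", output_path],
        ]
    for argv in candidates:
        if which(argv[0]):  # skip tools that are not installed
            return argv
    return None
```

The returned list can be handed to `subprocess.call()`; a `None` result means no supported tool is installed.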

asweigart commented 1 year ago

I've found wtype as a Wayland version of xdotool, but the problem is that 1) it only does typing and not mouse or window operations that xdotool can do and 2) it causes a "Compositor does not support the virtual keyboard protocol" error when I run it on Ubuntu 23.04.

pendragons-code commented 1 year ago

> I've found wtype as a Wayland version of xdotool, but the problem is that 1) it only does typing and not mouse or window operations that xdotool can do and 2) it causes a "Compositor does not support the virtual keyboard protocol" error when I run it on Ubuntu 23.04.

Hey that's pretty cool, hopefully we get to see more features from that soon.

mritunjaymusale commented 1 year ago

As @cjbassi mentioned, ydotool might be the better option, since it's display-server agnostic and works just fine as a virtual input device. In terms of getting it to work without sudo, here's what I did:

sudo chmod u+s /usr/bin/ydotool /bin/ydotoold

There is a Python binding for it, which currently has a bug that I have submitted a patch for; hopefully they accept it.

Heath123 commented 1 year ago

https://flatpak.github.io/xdg-desktop-portal/#gdbus-org.freedesktop.portal.RemoteDesktop

This seems like a reasonable way to fake inputs on Wayland, though it's designed for remote desktop. Here's a proof-of-concept of using this from Python that presses the Start/Super key and works on KDE at least. I'm probably doing many things wrong here (like the handler function being a hack because I couldn't work out how to remove handlers), so I'll try to clean it up later. It's roughly based on https://github.com/KDE/krfb/blob/master/framebuffers/pipewire/pw_framebuffer.cpp and https://github.com/python-pillow/Pillow/issues/6392

import dbus
import dbus.mainloop.glib
from dbus.mainloop.glib import DBusGMainLoop

from gi.repository import GLib

import random
import time

step = 0

handle = ""

def handler(*args, **kwargs):
  global step
  if step == 0:
    handle_xdp_session_created(*args, **kwargs)
  elif step == 1:
    handle_xdp_devices_selected(*args, **kwargs)
  elif step == 2:
    handle_session_start(*args, **kwargs)
  else:
    print(args, kwargs)
  step += 1

def handle_session_start(code, results, object_path):
  global handle

  if code != 0:
    raise Exception("Could not start session")

  # https://www.cl.cam.ac.uk/~mgk25/ucs/keysymdef.h
  inter.NotifyKeyboardKeysym(handle, {}, 0xffeb, 1)
  time.sleep(0.1)
  inter.NotifyKeyboardKeysym(handle, {}, 0xffeb, 0)

def handle_xdp_devices_selected(code, results, object_path):
  global handle

  if code != 0:
    raise Exception("Could not select devices")

  start_options = {
      "handle_token": "krfb" + str(random.randint(0, 999999999))
  }
  reply = inter.Start(handle, "", start_options)
  print(reply)  

def handle_xdp_session_created(code, results, object_path):
  global handle

  if code != 0:
    raise Exception("Could not create session")
  print(results)
  handle = results["session_handle"]

  # select sources for the session
  selection_options = {
      "types": dbus.UInt32(7),  # request all (KeyBoard, Pointer, TouchScreen)
      "handle_token": "krfb" + str(random.randint(0, 999999999))
  }
  selector_reply = inter.SelectDevices(handle, selection_options)
  print(selector_reply)

def main():
  global bus
  global inter
  loop = GLib.MainLoop()
  dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
  bus = dbus.SessionBus()
  obj = bus.get_object("org.freedesktop.portal.Desktop", "/org/freedesktop/portal/desktop")
  inter = dbus.Interface(obj, "org.freedesktop.portal.RemoteDesktop")

  bus.add_signal_receiver(
    handler,
    signal_name="Response",
    dbus_interface="org.freedesktop.portal.Request",
    bus_name="org.freedesktop.portal.Desktop",
    path_keyword="object_path")

  print(inter)
  result = inter.CreateSession({
    "session_handle_token": "abcdefg",
    "handle_token": "hijklmnop"
  })
  print(result)
  loop.run()

main()

It doesn't need root access or anything but it causes this popup to open:

image

I might be able to make a PR to add this to pyautogui
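The step counter in `handler` above could be replaced by a small dispatcher that consumes one callback per `Response` signal. The sketch below is one possible shape, not part of dbus-python or the portal API; `ResponseSequence` is a hypothetical name.

```python
class ResponseSequence:
    """Call one handler per incoming response, in the order given."""

    def __init__(self, *handlers):
        self._pending = iter(handlers)

    def __call__(self, *args, **kwargs):
        handler = next(self._pending, None)
        if handler is None:
            # More responses than handlers: just log them, like the
            # final else branch of the original handler.
            print("unhandled response:", args, kwargs)
            return None
        return handler(*args, **kwargs)
```

An instance such as `ResponseSequence(handle_xdp_session_created, handle_xdp_devices_selected, handle_session_start)` could then be passed to `bus.add_signal_receiver()` in place of `handler`, which removes the need for the global `step` variable.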

saifeiLee commented 1 year ago

ydotool requires root permission. It may be not a good design.

mak448a commented 1 year ago

> ydotool requires root permission. It may be not a good design.

For some reason it works without root for me, but it could just be my system

wang-qa commented 11 months ago

Unfortunately, ydotool v1.0.4 (the latest release) currently only has relative mouse movements and cannot obtain absolute positions.

JermellB commented 9 months ago

Found myself here after trying this on Wayland w/ Ubuntu 22 myself.

JermellB commented 9 months ago

> ydotool requires root permission. It may be not a good design.

https://github.com/ReimuNotMoe/ydotool/issues/207#issuecomment-1724204933

Not actually true. I have it working currently without root.

JermellB commented 9 months ago

The python bindings that exist are very bad.

What would it take to modify pyautogui to use ydotool underneath?
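As a rough idea of the smallest possible ydotool-backed layer, the sketch below wraps only the two subcommands that appear elsewhere in this thread (`ydotool type` and `ydotool key`). The function names are hypothetical, a running ydotoold daemon is assumed, and the key-name syntax accepted by `ydotool key` may differ between ydotool releases.

```python
import subprocess


def ydotool_argv(action, value):
    """Build an argv list for the two ydotool subcommands seen in this thread."""
    if action not in ("type", "key"):
        raise ValueError("unsupported action: %r" % (action,))
    return ["ydotool", action, str(value)]


def typewrite(text, run=subprocess.call):
    """Rough stand-in for pyautogui.typewrite(); returns ydotool's exit code."""
    # A list argv (no shell=True) keeps spaces and shell metacharacters literal.
    return run(ydotool_argv("type", text))


def press(key, run=subprocess.call):
    """Rough stand-in for pyautogui.press(); returns ydotool's exit code."""
    return run(ydotool_argv("key", key))
```

Mouse position queries are not covered, so a layer like this would only implement part of what pyautogui offers.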

ElectricRCAircraftGuy commented 9 months ago

> ydotool requires root permission. It may be not a good design.

From my comment here:

Tutorial: Getting started with ydotool to automate key presses (or mouse movements) in Linux

I go over installation, launching the daemon in a way that doesn't require your user to use root to connect to it, pressing key sequences, and where to find them, how to view the help menus, etc.

JermellB commented 9 months ago

> > ydotool requires root permission. It may be not a good design.
>
> From my comment here:
>
> Tutorial: Getting started with ydotool to automate key presses (or mouse movements) in Linux
>
> I go over installation, launching the daemon in a way that doesn't require your user to use root to connect to it, pressing key sequences, and where to find them, how to view the help menus, etc.

Yes, between you and the other commenter I've been able to get a reinforcement learning environment driving this tool. So thank you :)

saifeiLee commented 8 months ago

I found a GUI auto-test solution for both X11 and Wayland: deepin-autotest-framework. The basic idea is that KDE implements the KWayland::Client::FakeInput class in KWayland, which provides the ability to simulate mouse and keyboard input.

Hope to be helpful.

mak448a commented 7 months ago

> https://flatpak.github.io/xdg-desktop-portal/#gdbus-org.freedesktop.portal.RemoteDesktop
>
> This seems like a reasonable way to fake inputs on Wayland, though it's designed for remote desktop. Here's a proof-of-concept of using this from Python that presses the Start/Super key and works on KDE at least, I'm probably doing many things wrong here (like the handler function being a hack because I couldn't work out how to remove handlers) so I'll try to clean it up later. It's roughly based on https://github.com/KDE/krfb/blob/master/framebuffers/pipewire/pw_framebuffer.cpp and python-pillow/Pillow#6392


Well, I've figured out how to click mouse buttons with this:

inter.NotifyPointerButton(handle, {}, 0x110, 1)
time.sleep(0.1)
inter.NotifyPointerButton(handle, {}, 0x110, 0)

You can change which mouse button is clicked by using the map here: https://github.com/torvalds/linux/blob/master/include/uapi/linux/input-event-codes.h

Also, I recommend https://flatpak.github.io/xdg-desktop-portal/docs/doc-org.freedesktop.impl.portal.RemoteDesktop.html and https://github.com/flatpak/xdg-desktop-portal/blob/main/data/org.freedesktop.impl.portal.RemoteDesktop.xml for anyone trying to figure out more
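For choosing a different button, the codes can be collected in a small table and wrapped in a helper. This is a sketch only: `BUTTON_CODES` and `click` are hypothetical names, the values are the `BTN_LEFT`, `BTN_RIGHT`, and `BTN_MIDDLE` codes from input-event-codes.h, and `inter` and `handle` are assumed to be the portal interface and session handle from the proof-of-concept.

```python
import time

# Button codes from linux/input-event-codes.h.
BUTTON_CODES = {"left": 0x110, "right": 0x111, "middle": 0x112}


def click(inter, handle, button="left", hold=0.1):
    """Press and release a pointer button through the RemoteDesktop portal."""
    code = BUTTON_CODES[button]  # KeyError for unknown button names
    inter.NotifyPointerButton(handle, {}, code, 1)  # 1 = pressed
    time.sleep(hold)
    inter.NotifyPointerButton(handle, {}, code, 0)  # 0 = released
```

Calling `click(inter, handle, "right")` would then send a right click once the session has started.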

Heath123 commented 7 months ago

A problem I had when trying to add my idea was that there’s still no way to get the mouse pointer position, so it’s not enough to implement all of pyautogui’s functionality

mak448a commented 7 months ago

> A problem I had when trying to add my idea was that there’s still no way to get the mouse pointer position, so it’s not enough to implement all of pyautogui’s functionality

Ah ok. Do you know how to move the mouse based on absolute position in this method though? I'd like to know how for a program I'm making.

calllivecn commented 5 months ago

First of all, thanks to pyautogui. I started using it because it was easy to use, but then I found that it didn't work on Wayland. So I spent some time researching keyboard/mouse automation solutions that would work on Wayland.

keyboardmouse: https://github.com/calllivecn/keyboardmouse/tree/devel

Yes, it is the development branch. Since I am the only one using it so far, it is still a preview version, but it is already fully functional.

The working principle is roughly as follows:

Linux uinput simulates input function
libevdev (python library)

The principle is the same as the previously discussed ydotool project, which is also a C/S architecture (required by the working principle). However, keyboardmouse is a pure python implementation.

Installation:

git clone -b devel https://github.com/calllivecn/keyboardmouse
cd keyboardmouse && pip install keyboardmouse/

Permissions:

The current user needs to be in the input user group
Or run the server side with root

Usage:

mouse.py --server
mouse.py --help

list-exper.py:

Can view the keys and values of the currently active input devices. The corresponding key name of the key pressed on the keyboard will output the corresponding value, which can be used in mouse.py --key .

checkkey.py:

View the keys and values of a specified input device. Usage: checkkey.py /dev/input/eventX

Additional Information

The keyboardmouse project is still under development, but it is already fully functional. The project is a pure python implementation of the linux uinput and libevdev libraries. The project can be used to automate keyboard and mouse input on Wayland.

mak448a commented 5 months ago

Looks cool! But I can't read chinese......

totalnooob commented 4 months ago

My solution

Original Python code working on X11 (RPi 3)

import logging

import pyautogui

# convert_card_id() is defined elsewhere in the original program.
def process_card_id(card_id: str):
    # Type the converted card ID into the active window.
    converted_id = convert_card_id(card_id)
    logging.info(f"Typing Converted ID: {converted_id}")
    pyautogui.typewrite(converted_id)
    pyautogui.press('enter')
    logging.info("Typing completed")

Wayland RPI 4 Solution

sudo apt install ydotool

Create a new udev rule file: sudo nano /etc/udev/rules.d/70-ydotool.rules

Add KERNEL=="uinput", MODE="0660", GROUP="input"

Add your user to the input group: sudo usermod -aG input $USER

Reload the udev rules: sudo udevadm control --reload-rules && sudo udevadm trigger

Reboot
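After the reboot, a quick check like the one below could confirm that the udev rule and group change took effect before ydotool is involved. It is a minimal sketch: `can_open_uinput` is a hypothetical helper, and `/dev/uinput` is assumed to be the device node that the rule above applies to.

```python
import os


def can_open_uinput(path="/dev/uinput"):
    """Return True if this process may read and write the uinput device."""
    return os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if can_open_uinput():
        print("uinput is accessible without root")
    else:
        print("no access to /dev/uinput; check the udev rule and group membership")
```

A `False` result for a user who was just added to the input group usually means the session has not been restarted yet.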

import logging
import subprocess

def process_card_id(card_id: str):
    # Type the converted card ID into the active window.
    converted_id = convert_card_id(card_id)
    logging.info(f"Typing Converted ID: {converted_id}")
    # Pass argument lists instead of shell strings so the ID is never parsed by a shell.
    subprocess.call(["ydotool", "type", converted_id])
    subprocess.call(["ydotool", "key", "Enter"])
    logging.info("Typing completed")

Sunshine-dev-forever commented 2 months ago

After much struggle, I got screenshots working on Wayland perfectly

from jeepney.io.blocking import open_dbus_connection

import generated
import os

DEBUG = True

# Returns the filesystem path of the created screenshot.
def TakeScreenShot():
    screenshot = generated.Screenshot()
    msg = screenshot.Screenshot("", {})
    connection = open_dbus_connection()
    print("calling the bus")
    # The method reply only carries a request handle ...
    connection.send_and_get_reply(msg)
    # ... the result arrives afterwards in the portal's Response signal.
    reply = connection.receive()
    connection.close()
    file_string = str(reply.body[1]["uri"][1])
    # removes the leading "file://"
    file_string = file_string[7:]
    if DEBUG:
        print("Bus responded, IMG is at: " + file_string)
    return file_string

file = TakeScreenShot()
if os.path.exists(file):
    os.remove(file)
    print("clean up success")

quit()

I'm using the jeepney library here.

The "generated" module is created with:

python3 -m jeepney.bindgen --name org.freedesktop.portal.Desktop \
        --path /org/freedesktop/portal/desktop

I'm happy to report some success :)
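Slicing off the first seven characters works for plain paths, but a URI with percent-encoded characters (a space in the file name, for example) would come out wrong. A slightly more robust conversion, offered as a sketch with the hypothetical name `file_uri_to_path`, could use the standard library:

```python
from urllib.parse import unquote, urlparse


def file_uri_to_path(uri):
    """Convert a file:// URI into a local filesystem path."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError("not a file URI: %r" % (uri,))
    # unquote() turns escapes such as %20 back into literal characters.
    return unquote(parsed.path)
```

In the script above, the result of this function could be used in place of `file_string[7:]`.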

LunaBlueSky commented 1 month ago

> @mak448a commented on Mar 25, 2024, 1:22 AM GMT+1:
>
> Looks cool! But I can't read chinese......

Here is the translation of the keyboardmouse README file from Chinese to English:


Library for Simulating Keyboard and Mouse using libevdev

Running the Development Environment:

virtualenv Venv
or
python3 -m venv Venv

# Activate the virtual environment
. Venv/bin/activate

pip install .

Running the Test Demo

mouse.py --help
usage: mouse.py [option] <KeyName>

Virtual mouse and keyboard using C/S. You need to start the server first (user needs to be in the input user group or run as root).

options:
  -h, --help            Display this help message
  --list                List some example keys that can be used
  --secret SECRET       Specify the communication secret
  --server              Start the server; requires input user group or root permissions.
  --key KEY             Press a key and then release it.
  --keydown KEYDOWN     Press a key without releasing it.
  --keyup KEYUP         Release a key.
  --ctrlkey CTRLKEY     Press the ctrl key combined with this key.
  --altkey ALTKEY       Press the alt key combined with this key.
  --shiftkey SHIFTKEY   Press the shift key combined with this key.
  --mouseclick {left,right,wheel}
                        Click the mouse button: must be one of ('left', 'right', 'wheel').
  --mousedown {left,right,wheel}
                        Press the mouse button: must be one of ('left', 'right', 'wheel').
  --mouseup {left,right,wheel}
                        Release the mouse button: must be one of ('left', 'right', 'wheel').

Viewing Corresponding Key Names and Values

Run input-expoer.py and then press keys on the keyboard or move the mouse to see output similar to the following:

InputEvent(EV_KEY, KEY_C, 1) # When the C key is pressed
InputEvent(EV_KEY, KEY_C, 0) # When the C key is released

InputEvent(EV_KEY, KEY_B, 1) # When the B key is pressed
InputEvent(EV_KEY, KEY_B, 0) # When the B key is released

Packaging with pyproject.toml: https://packaging.python.org/en/latest/tutorials/packaging-projects/

frankhuurman commented 3 weeks ago

Have been using pyautogui ever since I picked up your book somewhere in 2016 @asweigart so thanks for your hard work on it :)

Just tried to build something for my mom using pyautogui but also noticed it's not working on Ubuntu 22.04 on Wayland. Would be amazing if we could find a well developed and maintained tool like xdotool to use on Wayland since almost every distro is switching or has switched to it.

Seems that ydotool is in for a big release this year, but I'm unsure whether it's feature-rich enough for us to use:

> Our ultra-lightweight JavaScript runtime, Resonance, will be released in Q2 2024 in LGPL license. ydotool will then be rewritten in JavaScript afterwards, to enable more people to understand the code & contribute.