BoboTiG / python-mss

An ultra fast cross-platform multiple screenshots module in pure Python using ctypes.
https://pypi.org/project/mss/
MIT License
1.03k stars, 94 forks

Mouse support #55

Open itsdax opened 6 years ago

itsdax commented 6 years ago

Any way to include mouse on screen grab?



BoboTiG commented 6 years ago

As of now, no. But it may be an interesting feature to add.

jamespreed commented 6 years ago

As an interim solution, could you capture the mouse position on screen and add it to the .png file metadata on write out?

A good example of getting the mouse position is given in the 2nd answer here: https://stackoverflow.com/questions/3698635/getting-cursor-position-in-python
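A minimal sketch of the metadata idea, assuming the PNG is already available in memory as bytes. It inserts a standard `tEXt` chunk right after the `IHDR` chunk using only the standard library. The keyword `CursorPosition` and the helper names are made up for illustration and are not part of mss.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def add_cursor_metadata(png: bytes, x: int, y: int) -> bytes:
    """Return a copy of `png` with a tEXt chunk holding the cursor position."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    # IHDR must be the first chunk; the tEXt chunk is inserted right after it.
    (ihdr_len,) = struct.unpack(">I", png[8:12])
    ihdr_end = 8 + 4 + 4 + ihdr_len + 4
    # "CursorPosition" is a made-up keyword, not a registered PNG keyword.
    text = b"CursorPosition\x00" + f"{x},{y}".encode("latin-1")
    return png[:ihdr_end] + make_chunk(b"tEXt", text) + png[ihdr_end:]


def read_text_chunks(png: bytes) -> dict:
    """Collect all tEXt chunks of a PNG as {keyword: text}."""
    found = {}
    offset = len(PNG_SIGNATURE)
    while offset < len(png):
        (length,) = struct.unpack(">I", png[offset:offset + 4])
        chunk_type = png[offset + 4:offset + 8]
        data = png[offset + 8:offset + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, value = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = value.decode("latin-1")
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC
    return found


# Smallest useful demo: a 1x1 black RGB PNG built by hand.
ihdr = make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0))
idat = make_chunk(b"IDAT", zlib.compress(b"\x00\x00\x00\x00"))
demo_png = PNG_SIGNATURE + ihdr + idat + make_chunk(b"IEND", b"")
tagged = add_cursor_metadata(demo_png, 640, 480)
print(read_text_chunks(tagged))  # {'CursorPosition': '640,480'}
```

The pixels stay untouched with this approach, so a viewer will not show the cursor; only tools that read the chunk can recover the position.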

BoboTiG commented 6 years ago

It could be a good start.

Perhaps by adding CLI arguments like --with-cursor and --cursor-file=FILE, and by adding keywords/attributes with_cursor=False and cursor_file='' to the MSS class.

I think it would be interesting to merge the cursor directly into the pixels, and not only into the final PNG.

What do you think @itsdax and @jamespreed ?
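For illustration only, the two proposed options could be parsed with `argparse` roughly as below. The option names come from the comment above; the parser itself (`build_parser`, the `mss-sketch` program name) is hypothetical and is not the actual mss command line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser, not the actual mss command line.
    parser = argparse.ArgumentParser(prog="mss-sketch")
    parser.add_argument("--with-cursor", action="store_true",
                        help="draw the mouse cursor into the screenshot")
    parser.add_argument("--cursor-file", default="", metavar="FILE",
                        help="image to use as the cursor")
    return parser


args = build_parser().parse_args(["--with-cursor", "--cursor-file=arrow.png"])
# These map onto the proposed keywords with_cursor=False and cursor_file=''.
options = {"with_cursor": args.with_cursor, "cursor_file": args.cursor_file}
print(options)  # {'with_cursor': True, 'cursor_file': 'arrow.png'}
```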

itsdax commented 6 years ago

Yes! That sounds like the right approach.

In Java, there's a method for getting the cursor's state (text hover, resize, move, etc.): java.awt.Cursor.getType()

If there's a similar one in Python, it could be used to select different icons.

jamespreed commented 6 years ago

Integrating a cursor directly into the pixels would be great. You could have several choices: cursor, crosshair, or concentric circles.
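A rough sketch of two of those marker styles, drawn into a flat RGB `bytearray` such as `bytearray(sct.grab(monitor).rgb)`. The function names, default sizes, and colour are illustrative assumptions, not anything mss provides.

```python
def put_pixel(img: bytearray, width: int, height: int, x: int, y: int, rgb) -> None:
    """Write one RGB pixel, silently skipping coordinates outside the frame."""
    if 0 <= x < width and 0 <= y < height:
        pos = (y * width + x) * 3
        img[pos:pos + 3] = bytes(rgb)


def draw_crosshair(img, width, height, cx, cy, arm=8, rgb=(255, 0, 0)):
    """Draw a horizontal and a vertical line crossing at (cx, cy)."""
    for d in range(-arm, arm + 1):
        put_pixel(img, width, height, cx + d, cy, rgb)
        put_pixel(img, width, height, cx, cy + d, rgb)


def draw_circles(img, width, height, cx, cy, radii=(4, 8), rgb=(255, 0, 0)):
    """Draw concentric circle outlines centred on (cx, cy)."""
    reach = max(radii) + 1
    for y in range(cy - reach, cy + reach + 1):
        for x in range(cx - reach, cx + reach + 1):
            dist = ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
            # A pixel belongs to an outline if it lies within half a pixel of a radius.
            if any(abs(dist - r) < 0.5 for r in radii):
                put_pixel(img, width, height, x, y, rgb)


width, height = 32, 32
frame = bytearray(width * height * 3)  # all-black stand-in for a screenshot
draw_crosshair(frame, width, height, 16, 16)
draw_circles(frame, width, height, 2, 2)  # partly off-screen, gets clipped
```

Clipping in `put_pixel` matters because the pointer is often near a screen edge, where part of the marker falls outside the frame.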

BoboTiG commented 6 years ago

FYI, we can find some clues in the pyinput module.

BoboTiG commented 5 years ago

This is how we could do it for GNU/Linux: https://github.com/MaartenBaert/ssr/blob/master/src/AV/Input/X11Input.cpp

skjerns commented 4 years ago

I've written a solution for Windows.

The script converts the current mouse cursor icon to a bitmap and draws it onto the screenshot. It's very fast, so there are no performance problems.

However, I'm not quite sure how to integrate it. The dependency on win32gui could surely be replaced with calls through ctypes. Maybe someone else can add the code in a cleaner fashion?

import numpy as np
import win32gui, win32ui
import mss
from PIL import Image

def set_pixel(img, w, row, col, rgb=(0, 0, 0)):
    """
    Set one pixel in a flat RGB byte array that is w pixels wide
    """
    if row < 0 or col < 0 or col >= w:
        return img  # avoid wrapping onto a neighbouring row
    pos = (row * w + col) * 3
    if pos >= len(img):
        return img  # avoid setting a pixel outside of the frame
    img[pos:pos + 3] = rgb
    return img

def add_mouse(img, w):
    # GetCursorInfo returns (flags, handle, (x, y)) in screen coordinates
    flags, hcursor, (cx, cy) = win32gui.GetCursorInfo()
    cursor = get_cursor(hcursor)
    # Pure-black pixels are treated as transparent background
    rows, cols = np.where(cursor.mean(-1) > 0)
    for row, col in zip(rows, cols):
        rgb = cursor[row, col].tolist()
        img = set_pixel(img, w, row + cy, col + cx, rgb=rgb)
    return img

def get_cursor(hcursor):
    # Draw the cursor into an off-screen 36x36 bitmap and return it as an RGB array
    hdc = win32ui.CreateDCFromHandle(win32gui.GetDC(0))
    hbmp = win32ui.CreateBitmap()
    hbmp.CreateCompatibleBitmap(hdc, 36, 36)
    hdc = hdc.CreateCompatibleDC()
    hdc.SelectObject(hbmp)
    hdc.DrawIcon((0, 0), hcursor)

    bmpinfo = hbmp.GetInfo()
    bmpstr = hbmp.GetBitmapBits(True)
    im = np.array(Image.frombuffer(
        'RGB',
        (bmpinfo['bmWidth'], bmpinfo['bmHeight']),
        bmpstr, 'raw', 'BGRX', 0, 1))

    win32gui.DestroyIcon(hcursor)
    win32gui.DeleteObject(hbmp.GetHandle())
    hdc.DeleteDC()
    return im

with mss.mss() as sct:
    screen = sct.monitors[0]
    img = bytearray(sct.grab(screen).rgb)
    img_with_mouse = add_mouse(img, screen['width'])

matanox commented 4 years ago

Maybe the best reference for Linux would be ffmpeg, where mouse capture is implemented as part of screen grabbing (taking a screenshot).

For just getting the mouse coordinates at the time of making a screenshot, using python-xlib:

from Xlib import display
pointer = display.Display().screen().root.query_pointer()
position = (pointer.root_x, pointer.root_y)

But then you have to draw a cursor yourself into the screen grab, and you do not know the actual cursor image used by the display at the particular moment. Cursor rendering is its own little kingdom in X11 ...

matanox commented 4 years ago

Here is how ffmpeg goes about it. It seems, however, that python-xlib does not expose the XFixes API as of now, nor does ffmpeg-python expose the ffmpeg function that retrieves the pointer image.

matanox commented 4 years ago

It should be possible to use ctypes to access the XFixesGetCursorImage function from libXfixes.so and get back an XFixesCursorImage C structure, like here, but in Python.

See also the structure of the image struct here or in the original X source ...
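A sketch of what the ctypes route could look like. The field layout mirrors the `XFixesCursorImage` struct from `X11/extensions/Xfixes.h` (XFixes version 2 or later); the helper names `pixels_to_bgra` and `grab_cursor_bgra` are hypothetical, and the live grab has only been reasoned through here, so treat it as a starting point rather than a tested implementation.

```python
import ctypes
import ctypes.util


class XFixesCursorImage(ctypes.Structure):
    """Mirror of the XFixesCursorImage struct from X11/extensions/Xfixes.h."""
    _fields_ = [
        ("x", ctypes.c_short),
        ("y", ctypes.c_short),
        ("width", ctypes.c_ushort),
        ("height", ctypes.c_ushort),
        ("xhot", ctypes.c_ushort),
        ("yhot", ctypes.c_ushort),
        ("cursor_serial", ctypes.c_ulong),
        ("pixels", ctypes.POINTER(ctypes.c_ulong)),
        ("atom", ctypes.c_ulong),
        ("name", ctypes.c_char_p),
    ]


def pixels_to_bgra(image: XFixesCursorImage) -> bytes:
    """Pack the ARGB values (one per unsigned long) into BGRA bytes."""
    out = bytearray()
    for i in range(image.width * image.height):
        # Only the low 32 bits of each unsigned long carry pixel data.
        out += (image.pixels[i] & 0xFFFFFFFF).to_bytes(4, "little")
    return bytes(out)


def grab_cursor_bgra():
    """Return (width, height, xhot, yhot, bgra) for the current X11 cursor."""
    x11_path = ctypes.util.find_library("X11")
    xfixes_path = ctypes.util.find_library("Xfixes")
    if not x11_path or not xfixes_path:
        raise OSError("libX11 or libXfixes not found")
    x11, xfixes = ctypes.CDLL(x11_path), ctypes.CDLL(xfixes_path)
    x11.XOpenDisplay.argtypes = [ctypes.c_char_p]
    x11.XOpenDisplay.restype = ctypes.c_void_p
    x11.XCloseDisplay.argtypes = [ctypes.c_void_p]
    x11.XFree.argtypes = [ctypes.c_void_p]
    xfixes.XFixesGetCursorImage.argtypes = [ctypes.c_void_p]
    xfixes.XFixesGetCursorImage.restype = ctypes.POINTER(XFixesCursorImage)

    dpy = x11.XOpenDisplay(None)  # None means "use $DISPLAY"
    if not dpy:
        raise OSError("cannot open X display")
    try:
        ptr = xfixes.XFixesGetCursorImage(dpy)
        if not ptr:
            raise OSError("XFixesGetCursorImage returned NULL")
        img = ptr.contents
        result = (img.width, img.height, img.xhot, img.yhot, pixels_to_bgra(img))
        x11.XFree(ptr)  # the image is allocated by Xlib and must be freed
        return result
    finally:
        x11.XCloseDisplay(dpy)
```

Two details are easy to miss: each pixel is stored in an `unsigned long`, which is 64 bits wide on most 64-bit Linux systems even though only 32 bits are used, and the colour components are premultiplied by alpha.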

zorvios commented 4 years ago

I found a solution, but for now it depends on GTK2 and is written in Python 2, using x.py and xlib.py from http://code.google.com/p/pyxlib-ctypes/

Here is a working example: https://github.com/zorvios/X11CursorImagePy2

zorvios commented 4 years ago

I've written a working module for Linux that gets the cursor image using ctypes: https://github.com/zorvios/PyXCursor

Is there anything we could do to add this feature?

thiagoribeirodamotta commented 4 years ago

Just to add my two cents: when displaying the mouse via the "grab the mouse position on screen and paste a mouse icon" method, how would one handle applications that use custom mouse icons? Would one need to check each application to see whether it uses a custom cursor? Some applications even have more cursor states (or different ones) than the OS does. I know ffmpeg is agnostic to that issue, but I don't know how the library does it.

amadeok commented 3 years ago

If you are using PIL for processing, a workaround could be the one here: https://github.com/swharden/pyScreenCapture/blob/master/go.py

BoboTiG commented 1 year ago

v8.0.0 added Linux support. Thanks to @zorvios, it introduces all the building blocks needed for the upcoming macOS and Windows support (like MSSBase._merge(screenshot, cursor)).
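To make the merge step concrete, here is an independent pure-Python sketch of alpha-blending a cursor into a screenshot. It is not the code behind `MSSBase._merge`; the function name `merge_cursor`, its signature, and the choice of straight (non-premultiplied) alpha are assumptions made for this example.

```python
def merge_cursor(screen: bytearray, screen_w: int, screen_h: int,
                 cursor: bytes, cursor_w: int, cursor_h: int,
                 left: int, top: int) -> None:
    """Alpha-blend a straight-alpha BGRA cursor onto a BGRA screenshot in place."""
    for cy in range(cursor_h):
        sy = top + cy
        if not 0 <= sy < screen_h:
            continue  # row falls outside the screenshot
        for cx in range(cursor_w):
            sx = left + cx
            if not 0 <= sx < screen_w:
                continue  # column falls outside the screenshot
            c = (cy * cursor_w + cx) * 4
            s = (sy * screen_w + sx) * 4
            alpha = cursor[c + 3]
            for ch in range(3):  # B, G, R
                blended = cursor[c + ch] * alpha + screen[s + ch] * (255 - alpha)
                screen[s + ch] = (blended + 127) // 255  # rounded division
            screen[s + 3] = 255  # keep the screenshot opaque


screen = bytearray([10, 20, 30, 255] * 4)             # 2x2 screenshot
cursor = bytes([200, 100, 0, 255, 200, 100, 0, 0])    # 2x1: opaque, then transparent
merge_cursor(screen, 2, 2, cursor, 2, 1, left=1, top=0)
print(list(screen[4:8]))  # [200, 100, 0, 255]
```

In practice `left` and `top` would be the pointer position minus the cursor hotspot, and a cursor image with premultiplied alpha would need the `cursor[c + ch] * alpha` term replaced by `cursor[c + ch] * 255`.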

basket-ball commented 1 year ago

with_cursor=True is only supported on Linux, not on Windows. Can it be supported on Windows? Thanks.
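As a possible first step on Windows, the cursor handle and position can be read with ctypes alone, without pywin32. This is a sketch under that assumption, not the code from the mss repository; `get_cursor_info` is a hypothetical helper, while `CURSORINFO`, `CURSOR_SHOWING`, and `GetCursorInfo` are the Win32 names.

```python
import ctypes
import sys


class POINT(ctypes.Structure):
    """Win32 POINT: two 32-bit signed integers."""
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


class CURSORINFO(ctypes.Structure):
    """Mirror of the Win32 CURSORINFO struct."""
    _fields_ = [
        ("cbSize", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("hCursor", ctypes.c_void_p),
        ("ptScreenPos", POINT),
    ]


CURSOR_SHOWING = 0x00000001


def get_cursor_info():
    """Return (visible, handle, (x, y)) via user32.GetCursorInfo."""
    if sys.platform != "win32":
        raise OSError("GetCursorInfo is only available on Windows")
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    user32.GetCursorInfo.argtypes = [ctypes.POINTER(CURSORINFO)]
    user32.GetCursorInfo.restype = ctypes.c_int  # BOOL
    info = CURSORINFO()
    info.cbSize = ctypes.sizeof(CURSORINFO)  # must be set before the call
    if not user32.GetCursorInfo(ctypes.byref(info)):
        raise ctypes.WinError(ctypes.get_last_error())
    visible = bool(info.flags & CURSOR_SHOWING)
    return visible, info.hCursor, (info.ptScreenPos.x, info.ptScreenPos.y)


if sys.platform == "win32":
    print(get_cursor_info())
```

Turning the returned handle into pixels still requires drawing it into a bitmap, as the pywin32 script earlier in this thread does with DrawIcon.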

BoboTiG commented 9 months ago

Windows support is ongoing with #272.

tamnguyenvan commented 1 month ago

@BoboTiG, I'd like to suggest a solution.

- Linux:

# Excerpt from inside a function; assumes `from Xlib import display`,
# `import numpy as np` and a configured `logger`.
d = display.Display()
if not d.has_extension('XFIXES'):
    logger.error('XFIXES extension not supported.')
    return

xfixes_version = d.xfixes_query_version()

root = d.screen().root

image = d.xfixes_get_cursor_image(root)
cursor_image = image.cursor_image
width, height = image.width, image.height

cursor_data = np.array(cursor_image, dtype=np.uint32).reshape(height, width)

bgra = np.zeros((height, width, 4), dtype=np.uint8)
bgra[..., 0] = cursor_data & 0xFF           # Blue
bgra[..., 1] = (cursor_data >> 8) & 0xFF    # Green
bgra[..., 2] = (cursor_data >> 16) & 0xFF   # Red
bgra[..., 3] = (cursor_data >> 24) & 0xFF   # Alpha

- Windows:
```python
import win32gui
import win32con

cursor_info = win32gui.GetCursorInfo()
cursor_handle = cursor_info[1]

# Define a dictionary mapping cursor handles to their states
cursor_states = {
    win32gui.LoadCursor(0, win32con.IDC_ARROW): "arrow",
    win32gui.LoadCursor(0, win32con.IDC_IBEAM): "ibeam",
    win32gui.LoadCursor(0, win32con.IDC_WAIT): "wait",
    win32gui.LoadCursor(0, win32con.IDC_CROSS): "cross",
    win32gui.LoadCursor(0, win32con.IDC_UPARROW): "uparrow",
    win32gui.LoadCursor(0, win32con.IDC_SIZENWSE): "sizenwse",
    win32gui.LoadCursor(0, win32con.IDC_SIZENESW): "sizenesw",
    win32gui.LoadCursor(0, win32con.IDC_SIZEWE): "sizewe",
    win32gui.LoadCursor(0, win32con.IDC_SIZENS): "sizens",
    win32gui.LoadCursor(0, win32con.IDC_SIZEALL): "sizeall",
    win32gui.LoadCursor(0, win32con.IDC_NO): "no",
    win32gui.LoadCursor(0, win32con.IDC_HAND): "hand",
    win32gui.LoadCursor(0, win32con.IDC_APPSTARTING): "appstarting",
    win32gui.LoadCursor(0, win32con.IDC_HELP): "help",
}
# Unknown (for example application-defined) cursors fall back to "arrow"
state = cursor_states.get(cursor_handle, "arrow")

# We can now use the state to load the appropriate cursor image for rendering
```

- macOS:

# Get current cursor
cursor = AppKit.NSCursor.currentSystemCursor()
image = cursor.image()
size = image.size()
width, height = int(size.width), int(size.height)
bitmap_rep = NSBitmapImageRep.imageRepWithData_(image.TIFFRepresentation())

png_data = bitmap_rep.representationUsingType_properties_(NSPNGFileType, None)

buffer = io.BytesIO(png_data)
img_array = Image.open(buffer)
rgba = np.array(img_array)
bgra = cv2.cvtColor(rgba, cv2.COLOR_RGBA2BGRA)



I have implemented it in my project, and it has worked perfectly.
Refs: https://github.com/tamnguyenvan/screenvivid/blob/main/screenvivid/models/utils/cursor/loader.py