ra1nty / DXcam

A Python high-performance screen capture library for Windows using Desktop Duplication API
MIT License
504 stars, 70 forks

Feature: Allow capturing a specific window only #30

Open zawlin opened 2 years ago

zawlin commented 2 years ago

It should also work when the window is minimized, as OBS Studio does.

Relevant parts of the OBS implementation:

https://github.com/obsproject/obs-studio/blob/master/plugins/win-capture/window-capture.c#L563

https://github.com/obsproject/obs-studio/blob/master/plugins/win-capture/dc-capture.c#L158

zawlin commented 2 years ago

I have figured out how to do this. Here is a minimal example. It's not very fast, though: around 50-60 fps. I wonder if it's possible to make it faster. It also produces some weird corrupted regions on Calculator and Notepad, although it works without any issues on the 3D games I have tested.

```python
import ctypes

import cv2
import numpy as np
import win32con
import win32gui
import win32ui


def set_dpi_awareness():
    # 2 == PROCESS_PER_MONITOR_DPI_AWARE, so window rects are reported in physical pixels
    ctypes.windll.shcore.SetProcessDpiAwareness(2)
    ctypes.windll.user32.SetProcessDPIAware()


def capture(win_title='', win_cls=None):
    hwnd = win32gui.FindWindow(win_cls, win_title)
    if not hwnd:
        raise RuntimeError('Window not found: {}'.format(win_title))
    x1, y1, x2, y2 = win32gui.GetClientRect(hwnd)
    width = x2 - x1
    height = y2 - y1

    wx1, wy1, wx2, wy2 = win32gui.GetWindowRect(hwnd)
    # normalize to origin
    wx1, wx2 = wx1 - wx1, wx2 - wx1
    wy1, wy2 = wy1 - wy1, wy2 - wy1
    # compute border width and title bar height
    bw = int((wx2 - x2) / 2.)
    th = wy2 - y2 - bw
    # offset of the client area inside the window, past the border and title bar
    sx = bw
    sy = th

    wndc = win32gui.GetWindowDC(hwnd)
    imdc = win32ui.CreateDCFromHandle(wndc)
    # create a memory based device context
    memdc = imdc.CreateCompatibleDC()
    # create a bitmap object
    screenshot = win32ui.CreateBitmap()
    screenshot.CreateCompatibleBitmap(imdc, width, height)
    oldbmp = memdc.SelectObject(screenshot)
    # copy the client area of the window into our memory device context
    memdc.BitBlt((0, 0), (width, height), imdc, (sx, sy), win32con.SRCCOPY)
    memdc.SelectObject(oldbmp)
    # raw BGRA bytes, one row after another
    bmpstr = screenshot.GetBitmapBits(True)
    img = np.frombuffer(bmpstr, dtype='uint8')
    # release the GDI objects
    win32gui.DeleteObject(screenshot.GetHandle())
    imdc.DeleteDC()
    win32gui.ReleaseDC(hwnd, wndc)
    memdc.DeleteDC()
    img.shape = (height, width, 4)
    return cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)


set_dpi_awareness()

im = capture('*Untitled - Notepad')
cv2.namedWindow('im', 0)
cv2.imshow('im', im)
cv2.waitKey(0)
```

ra1nty commented 2 years ago

DC capture is slower than the Desktop Duplication API (the API that DXcam uses). However, the Desktop Duplication API is designed to capture the entire monitor with little processing overhead, so region cropping is handled client-side. DXcam has an API to capture a specific region, so you can use win32gui.FindWindow and win32gui.GetClientRect to determine the window location and pass it to DXcam to do the capture.
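
A minimal sketch of that suggestion, assuming the window sits on the primary monitor. `GetClientRect` reports client-relative coordinates (its left and top are always 0), so the sketch converts the origin with `win32gui.ClientToScreen` before building the `(left, top, right, bottom)` tuple that `camera.grab(region=...)` takes. The helper names `client_region` and `grab_window` are made up for illustration, and the Windows-only imports are kept inside `grab_window` so the region arithmetic can be checked on any platform.

```python
def client_region(origin, client_size, screen_size):
    """Build a (left, top, right, bottom) region clamped to the screen.

    origin      -- screen coordinates of the client area's top-left corner
    client_size -- (width, height) of the client area
    screen_size -- (width, height) of the monitor being captured
    """
    left, top = origin
    width, height = client_size
    screen_w, screen_h = screen_size
    # Clamp so a partially off-screen window still yields a valid region.
    region = (
        max(0, left),
        max(0, top),
        min(screen_w, left + width),
        min(screen_h, top + height),
    )
    if region[0] >= region[2] or region[1] >= region[3]:
        raise ValueError("window is entirely off-screen")
    return region


def grab_window(title):
    """Grab the client area of the window called `title` (Windows only)."""
    import dxcam      # imported lazily: these packages exist only on Windows
    import win32gui

    hwnd = win32gui.FindWindow(None, title)
    if not hwnd:
        raise RuntimeError(f"window not found: {title!r}")
    _, _, width, height = win32gui.GetClientRect(hwnd)
    origin = win32gui.ClientToScreen(hwnd, (0, 0))
    camera = dxcam.create()
    # Assumption: camera.width / camera.height hold the output resolution.
    region = client_region(origin, (width, height), (camera.width, camera.height))
    # grab() may return None when there is no new frame since the last call.
    return camera.grab(region=region)
```

Because this still crops the composited desktop image, anything drawn on top of the window inside that rectangle ends up in the frame too.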

zawlin commented 2 years ago

Hmm, but I assume it can't handle overlap, right? It will capture everything in the region, not just the game? And if the game window is hidden by other windows, I guess it wouldn't work?

I saw some other methods here: https://learn.microsoft.com/en-us/archive/blogs/dsui_team/ways-to-capture-the-screen. It seems BitBlt is the fastest available way to capture a specific window. I'm not sure about the mirror driver, but it seems to be similar to the Desktop Duplication API and to have similar drawbacks.

crackwitz commented 2 years ago

if you need to handle overlap, i.e. not be bothered by it, there's the "thumbnail" API from the DWM. it gives you a full-resolution picture of any window. I don't know if DXcam implements that.

https://learn.microsoft.com/en-us/windows/win32/dwm/thumbnail-ovw

(no, it's not for generating a thumbnail of your own window. it's specifically for getting a view of any window.)

zawlin commented 2 years ago

I should have checked OBS more carefully: there's another capture method that is not listed on the Microsoft page. https://github.com/obsproject/obs-studio/blob/master/libobs-winrt/winrt-capture.cpp

ra1nty commented 2 years ago

> I should have checked OBS more carefully: there's another capture method that is not listed on the Microsoft page. https://github.com/obsproject/obs-studio/blob/master/libobs-winrt/winrt-capture.cpp

This is the newer Windows Graphics Capture API. To my knowledge, it would be the best API to use. However, it seems to require a non-trivial amount of work to use from Python (it is not as simple as using Desktop Duplication, at least). I haven't had a chance to look into it in depth.

ra1nty commented 2 years ago

> Hmm, but I assume it can't handle overlap, right? It will capture everything in the region, not just the game? And if the game window is hidden by other windows, I guess it wouldn't work?
>
> I saw some other methods here: https://learn.microsoft.com/en-us/archive/blogs/dsui_team/ways-to-capture-the-screen. It seems BitBlt is the fastest available way to capture a specific window. I'm not sure about the mirror driver, but it seems to be similar to the Desktop Duplication API and to have similar drawbacks.

Yes, it can't handle overlap. And in my own tests BitBlt (DC capture) is much slower than the Desktop Duplication API. The best I can do with BitBlt is under 70 fps.

zawlin commented 2 years ago

> > I should have checked OBS more carefully: there's another capture method that is not listed on the Microsoft page. https://github.com/obsproject/obs-studio/blob/master/libobs-winrt/winrt-capture.cpp
>
> This is the newer Windows Graphics Capture API. To my knowledge, it would be the best API to use. However, it seems to require a non-trivial amount of work to use from Python (it is not as simple as using Desktop Duplication, at least). I haven't had a chance to look into it in depth.

I think it might be easier to wrap libobs-winrt.dll via ctypes and avoid making most of the API calls in Python. The OBS library may need to be modified a bit to make it as simple as possible for the wrapper. Anyway, if I figure it out, I will post a snippet here.

zawlin commented 2 years ago

It seems someone has figured it out in a somewhat questionable project. I took the relevant parts out and made a simple usage example here.

I am not sure whether the author wrote the bindings manually or generated them, since I couldn't find any trace of the wrapped code (the rotypes stuff) in any other public repository. It also seems to be incomplete: it is missing the APIs for removing the yellow border that shows up in graphics capture. I also found more official-looking WinRT bindings here, which seem to have everything needed for this functionality.

ra1nty commented 2 years ago

That seems like a full-featured binding. BTW, on Windows 10 you cannot remove the yellow border when using the Windows.Graphics.Capture API. On Windows 11 it can be removed. If you are on Windows 10 and don't want the border, then DC capture and the Desktop Duplication API are the only choices.

Thanks for the find! I will take a look to see if I can borrow the bindings and make the win capture api available in dxcam when I have free time. Meanwhile, feel free to submit a PR : )

JustinHenderson98 commented 2 years ago

> I have figured out how to do this. Here is a minimal example. It's not very fast, though: around 50-60 fps. I wonder if it's possible to make it faster. It also produces some weird corrupted regions on Calculator and Notepad, although it works without any issues on the 3D games I have tested.

After some modifications, namely blindly trusting that the window size will not change during use, this works well for me: about 100 fps on average, with 55 fps lows and 130 fps peaks. I also dedicate a thread to capturing in a loop, writing the latest image to a lock-protected buffer from which the newest frame can be pulled. This serves my needs and will hopefully help others looking for a solution until further work is done on the project. Note: the constructor is a bit messy, as I have been playing with a few different implementations, and the window border is not properly cropped.

```python
import copy
import time
from threading import Lock, Thread

import cv2 as cv
import numpy as np
import win32con
import win32gui
import win32ui


class WindowCapture:

    # constructor
    def __init__(self, window_name):
        self.__lock = Lock()
        self.__newestImage = np.zeros((100, 100, 3), dtype=np.uint8)
        self.__intermediaryImage = np.zeros((100, 100, 3), dtype=np.uint8)

        # find the handle for the window we want to capture
        self.hwnd = win32gui.FindWindow(None, window_name)
        self.window_name = window_name
        if not self.hwnd:
            raise Exception('Window not found: {}'.format(window_name))

        # get the window size
        window_rect = win32gui.GetWindowRect(self.hwnd)
        self.w = window_rect[2] - window_rect[0]
        self.h = window_rect[3] - window_rect[1]
        print(f"self.w: {self.w}; self.h: {self.h}")

        # account for the window border and titlebar and cut them off
        border_pixels = 8
        titlebar_pixels = 30
        #self.w = self.w - (border_pixels * 2)
        #self.h = self.h - titlebar_pixels - border_pixels
        self.cropped_x = border_pixels
        self.cropped_y = titlebar_pixels

        # set the cropped coordinates offset so we can translate screenshot
        # images into actual screen positions
        self.offset_x = window_rect[0] + self.cropped_x
        self.offset_y = window_rect[1] + self.cropped_y

        # start capturing only after every attribute the worker reads is set
        t1 = Thread(target=self.__doWork)
        t1.start()

    def get_screenshot(self):
        wndc = win32gui.GetWindowDC(self.hwnd)
        imdc = win32ui.CreateDCFromHandle(wndc)
        # create a memory based device context
        memdc = imdc.CreateCompatibleDC()
        # create a bitmap object
        screenshot = win32ui.CreateBitmap()
        screenshot.CreateCompatibleBitmap(imdc, self.w, self.h)
        oldbmp = memdc.SelectObject(screenshot)
        # copy the window into our memory device context
        memdc.BitBlt((0, 0), (self.w, self.h), imdc, (0, 0), win32con.SRCCOPY)
        memdc.SelectObject(oldbmp)
        bmpstr = screenshot.GetBitmapBits(True)
        img = np.frombuffer(bmpstr, dtype='uint8')
        win32gui.DeleteObject(screenshot.GetHandle())
        imdc.DeleteDC()
        win32gui.ReleaseDC(self.hwnd, wndc)
        memdc.DeleteDC()
        img.shape = (self.h, self.w, 4)
        return cv.cvtColor(img, cv.COLOR_BGRA2BGR)

    def __doWork(self):
        loop_time = time.time()
        while True:
            try:
                self.__intermediaryImage = self.get_screenshot()
                self.__lock.acquire()
                self.__newestImage = self.__intermediaryImage
                self.__lock.release()
            except Exception as ex:
                print(ex, flush=True)
                continue
            now = time.time()
            if now > loop_time:
                print(f'Raw FPS {1 / (now - loop_time)}', flush=True)
            loop_time = now

    def GetLatestImage(self):
        self.__lock.acquire()
        copyImage = copy.copy(self.__newestImage)
        self.__lock.release()
        return copyImage


# untested insertion of my main
if __name__ == '__main__':
    windowCap = WindowCapture('Spotify Premium')
    loop_time = time.time()

    while True:
        now = time.time()
        #print(f'FPS {1 / (now - loop_time)}', flush=True)
        loop_time = now
        img = windowCap.GetLatestImage()
        if img is None:
            continue
        #time.sleep(5//100)
        cv.imshow("hi", img)
        cv.waitKey(1)
```
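
The average, low and peak figures quoted above can be derived from per-frame timestamps instead of being printed one frame at a time. A small sketch using only the standard library; `fps_stats` and `measure` are hypothetical helpers, not part of DXcam or of the class above:

```python
import time


def fps_stats(timestamps):
    """Return (average, lowest, highest) FPS from a list of frame timestamps."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    # Frame-to-frame intervals; skip zero-length ones to avoid dividing by zero.
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:]) if b > a]
    if not intervals:
        raise ValueError("timestamps do not advance")
    average = len(intervals) / sum(intervals)
    lowest = 1.0 / max(intervals)    # slowest frame
    highest = 1.0 / min(intervals)   # fastest frame
    return average, lowest, highest


def measure(grab, frames=100):
    """Call `grab()` `frames` times and report FPS statistics."""
    stamps = [time.perf_counter()]
    for _ in range(frames):
        grab()
        stamps.append(time.perf_counter())
    return fps_stats(stamps)
```

With the class above, `measure(windowCap.get_screenshot)` would time the raw capture call itself, independent of the display loop.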

crackwitz commented 1 year ago

    self.__intermediaryImage = self.get_screenshot()
    self.__lock.acquire()
    self.__newestImage = self.__intermediaryImage
    self.__lock.release()

that is pointless, and so is the locking in GetLatestImage. python variables are references. the assignment sets a reference to the object. this operation requires no locks at all. drop the locking.

your get_screenshot always creates a new object. nothing in your code ever "writes into" these objects, after they've been created and returned from that function.
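
A small stand-alone illustration of that point, using byte strings in place of captured frames. The producer thread only ever rebinds an attribute to a freshly created object, so a reader sees either the previous frame or the next one, never a half-written frame. This relies on attribute assignment being atomic, which holds in CPython; `LatestFrame`, `frame_is_consistent` and `demo` are hypothetical names for this sketch:

```python
import threading


class LatestFrame:
    """Publishes the newest frame by rebinding a reference, without a lock."""

    def __init__(self):
        self.frame = bytes(1024)          # placeholder until the first capture
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._produce, daemon=True)

    def _produce(self):
        counter = 0
        while not self._stop.is_set():
            counter = (counter + 1) % 256
            # Build a complete new object first, then publish it in one step.
            new_frame = bytes([counter]) * 1024
            self.frame = new_frame

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


def frame_is_consistent(frame):
    """A frame is intact if every byte carries the same counter value."""
    return len(frame) == 1024 and len(set(frame)) == 1


def demo(reads=10_000):
    """Read the shared reference many times while the producer is running."""
    latest = LatestFrame()
    latest.start()
    try:
        return all(frame_is_consistent(latest.frame) for _ in range(reads))
    finally:
        latest.stop()
```

A lock only becomes necessary if a published frame is modified in place afterwards, for example when capturing into a preallocated array that readers also hold.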

Avasam commented 1 year ago

Didn't read through the entire thread, but DirectX Desktop Duplication, BitBlt and the Windows Graphics Capture API are all completely different capture methods that serve their own purposes and have their own limitations.

Summary from my experience on AutoSplit where I had to implement BitBlt and WGC myself: https://github.com/Avasam/Auto-Split#capture-method

Implementation details if you need some inspiration: https://github.com/Avasam/Auto-Split/tree/2.0.0/src/capture_method

lucasmonstrox commented 1 year ago

> > I should have checked OBS more carefully: there's another capture method that is not listed on the Microsoft page. https://github.com/obsproject/obs-studio/blob/master/libobs-winrt/winrt-capture.cpp
>
> This is the newer Windows Graphics Capture API. To my knowledge, it would be the best API to use. However, it seems to require a non-trivial amount of work to use from Python (it is not as simple as using Desktop Duplication, at least). I haven't had a chance to look into it in depth.

@ra1nty have you worked on it?

crackwitz commented 1 year ago

This issue is tagged as "help wanted"

xiaobaixuejava commented 6 months ago

> It should also work when the window is minimized, as OBS Studio does.
>
> Relevant parts of the OBS implementation:
>
> https://github.com/obsproject/obs-studio/blob/master/plugins/win-capture/window-capture.c#L563
>
> https://github.com/obsproject/obs-studio/blob/master/plugins/win-capture/dc-capture.c#L158

Have you implemented D3D11 capture of a specified window while it is in the background? If so, could you share your code?