ra1nty / DXcam

A Python high-performance screen capture library for Windows using Desktop Duplication API
MIT License
457 stars 67 forks

Performance improvements #62

Open Agade09 opened 1 year ago

Agade09 commented 1 year ago

Profiling grab() on a 1440x2560 screen, the share of total time spent in ctypes.string_at() went from ~20% to ~0%. There is a further optimization when grabbing a subset of the screen: for a 480x640 region of a 1440x2560 screen, only 480 rows x 2560 pixels x 4 bytes of contiguous memory need to be copied. Overall FPS improvement on my machine, grabbing 1440x2560 in BGRA: ~271 FPS -> ~685 FPS.
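To make the region idea concrete, here is a minimal sketch of "copy only the rows the region touches, then crop the columns on the small copy". It is not the code from this pull request: `grab_region`, `make_fake_frame`, `base_addr` and `pitch` are hypothetical names, and the frame is a plain ctypes array standing in for the mapped DXGI surface. For the numbers above, the row copy is 480 x 2560 x 4 = 4,915,200 bytes instead of 1440 x 2560 x 4 = 14,745,600 bytes for the full frame.

```python
import ctypes

BYTES_PER_PIXEL = 4  # BGRA


def grab_region(base_addr, pitch, region):
    """Return the pixels of region=(left, top, right, bottom), one bytes object per row.

    base_addr is the address of the first byte of the frame and pitch is the
    number of bytes per frame row (both are hypothetical inputs for this sketch).
    """
    left, top, right, bottom = region
    nbytes = (bottom - top) * pitch

    # One contiguous copy covering only the rows the region touches.
    rows = bytearray(nbytes)
    dst = (ctypes.c_char * nbytes).from_buffer(rows)
    ctypes.memmove(dst, base_addr + top * pitch, nbytes)
    del dst  # release the buffer export on the bytearray

    # Column cropping happens on the small copy, not on the full frame.
    view = memoryview(rows)
    start, stop = left * BYTES_PER_PIXEL, right * BYTES_PER_PIXEL
    return [bytes(view[r * pitch + start : r * pitch + stop]) for r in range(bottom - top)]


def make_fake_frame(width, height):
    """Build a stand-in frame whose blue channel encodes 10 * row + column."""
    pitch = width * BYTES_PER_PIXEL
    frame = (ctypes.c_ubyte * (height * pitch))()
    for r in range(height):
        for c in range(width):
            frame[r * pitch + c * BYTES_PER_PIXEL] = 10 * r + c
    return frame, pitch
```

With `frame, pitch = make_fake_frame(8, 6)`, calling `grab_region(ctypes.addressof(frame), pitch, (3, 2, 5, 4))` copies two full rows (64 bytes) and returns two 8-byte rows for the 2x2 pixel region.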

ninjatall12 commented 1 year ago

Crazy optimisations, where did you get the experience to know how to improve this code?

ra1nty commented 1 year ago

Thanks for the commit! Interesting optimization. Let me take a look and do some benchmarks & merge

Agade09 commented 1 year ago

@ninjatall12 Is this with dxcam in pure python or with AI-M-BOT's pyd ? Because I also have issues with the py310 binary he made.

PS: Thanks, I did a lot of competitions on Codingame and just got experience over the years at work and in personal projects. I was profiling a project of mine with pprofile and noticed that dxcam was spending a lot of time in ctypes.string_at() which just copies memory according to its description; I found this very suspicious and investigated.
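For anyone who wants to repeat this kind of investigation, a rough sketch follows. It uses the standard library's cProfile rather than pprofile (a swapped-in tool, so the report format differs), and `copy_frame` and `profile_copies` are hypothetical helpers that only imitate a per-frame `ctypes.string_at()` copy on a dummy buffer.

```python
import cProfile
import ctypes
import io
import pstats


def copy_frame(buf, size):
    # ctypes.string_at copies `size` bytes starting at the buffer's address.
    return ctypes.string_at(ctypes.addressof(buf), size)


def profile_copies(n_copies=50, size=1_000_000):
    """Profile repeated buffer copies and return the cProfile report as text."""
    buf = ctypes.create_string_buffer(size)  # dummy stand-in for a frame
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_copies):
        copy_frame(buf, size)
    profiler.disable()

    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(10)
    return report.getvalue()
```

Printing `profile_copies()` lists the most expensive calls by cumulative time, which is where a suspiciously heavy copy would show up.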

AI-M-BOT commented 1 year ago

@Agade09 with AI-M-BOT's pyd, i am using his python 3.11 binary. thanks for the quick reply. will let him know about this

would you share the part of your script taking screenshots? was that area using windll.user32.SetWindowDisplayAffinity?

AI-M-BOT commented 1 year ago

image

reproduced, my bad

Also, I recommend using the grab() function instead of start(). start() creates a new thread, which brings no performance benefit in Python. Taking screenshots with DXGI repeatedly without sleeping will dramatically lower the game's FPS. The sleep function the author used in this project is also not precise enough (it sleeps for at least around 15 ms), so I recommend the following function:

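from ctypes import windll

def nanosleep(num: float) -> None:
    # Despite the name, num is in milliseconds; Sleep() takes a whole number of ms.
    windll.winmm.timeBeginPeriod(1)  # request 1 ms timer resolution
    windll.kernel32.Sleep(int(num))
    windll.winmm.timeEndPeriod(1)  # matches the timeBeginPeriod(1) call above

crackwitz commented 1 year ago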

nanosleep

that's not a nanosleep. that gives you whole milliseconds at best. and it affects the kernel.

https://learn.microsoft.com/en-us/windows/win32/api/timeapi/nf-timeapi-timebeginperiod

AI-M-BOT commented 1 year ago

I know, I just need to sleep as precisely as possible. The function name means nothing; why care about it?
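A possible alternative for sleeping precisely without changing the system timer resolution is a hybrid sleep: a coarse `time.sleep()` for most of the interval, then a short busy-wait on `time.perf_counter()`. This is a different technique from the `timeBeginPeriod` function above, `precise_sleep` and `spin_margin` are hypothetical names, and the busy-wait costs CPU time for the length of the margin.

```python
import time


def precise_sleep(seconds: float, spin_margin: float = 0.002) -> None:
    """Sleep for at least `seconds`, spinning for the last `spin_margin` seconds."""
    deadline = time.perf_counter() + seconds

    # Coarse part: let the OS scheduler handle most of the wait.
    coarse = seconds - spin_margin
    if coarse > 0:
        time.sleep(coarse)

    # Fine part: busy-wait until the deadline to absorb scheduler jitter.
    while time.perf_counter() < deadline:
        pass
```

A larger `spin_margin` tolerates a coarser OS timer at the price of more busy-waiting, so the 2 ms default is only a starting guess.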

AI-M-BOT commented 1 year ago

@AI-M-BOT Thanks for the tip about sleeping, but I still face this issue image after using camera.grab()

please use grab(region=region)
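Putting the two pieces of advice together (call grab with a region, and pause between calls), a capture loop could look roughly like the sketch below. `capture_frames` is a hypothetical helper. It takes the grab function as a parameter so it does not depend on dxcam being installed, and it assumes that a `None` return means "no new frame yet", which is my understanding of how dxcam's grab() behaves.

```python
import time


def capture_frames(grab, region, n_frames, target_fps=60.0, max_attempts=1000):
    """Collect up to n_frames from grab(region=...), pausing between calls."""
    interval = 1.0 / target_fps
    frames = []
    for _ in range(max_attempts):
        if len(frames) >= n_frames:
            break
        started = time.perf_counter()
        frame = grab(region=region)
        if frame is not None:  # None is treated as "no new frame yet"
            frames.append(frame)
        # Sleep off whatever is left of this frame interval.
        leftover = interval - (time.perf_counter() - started)
        if leftover > 0:
            time.sleep(leftover)
    return frames
```

With dxcam this would presumably be called as `capture_frames(camera.grab, (left, top, right, bottom), 100)` after `camera = dxcam.create()`; the plain `time.sleep()` can be swapped for a more precise sleep if pacing matters.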

Agade09 commented 1 year ago

@AI-M-BOT Could you detail how you compile these .pyd? With Nuitka, I can't get binaries as fast as yours. My binaries also don't reproduce the bug ninjatall12 and I have been seeing. I have been using python -m nuitka --lto=yes --module dxcam --include-package=dxcam.

AI-M-BOT commented 1 year ago

should be fine now, just replace with new file https://github.com/AI-M-BOT/DXcam/releases/tag/1.1

AI-M-BOT commented 1 year ago

Create a file named dxshot.py (or whatever you prefer), copy the content of dxcam/__init__.py into it, then run this command: python -m nuitka --mingw64 --module --show-progress --no-pyi-file --remove-output --follow-import-to=dxcam dxshot.py

AI-M-BOT commented 1 year ago

should be fine now, just replace with new file https://github.com/AI-M-BOT/DXcam/releases/tag/1.1

still the same image I get this error either using cam.grab(region=region) or cam.get_latest_frame()

are you using Python 3.9? Is your testing script open source on github?

Exception ignored in: <function _compointer_base.__del__ at 0x000002A710E128B0>
Traceback (most recent call last):
  File "E:\embed_python\python39\Lib\site-packages\comtypes\__init__.py", line 956, in __del__
    self.Release()
  File "E:\embed_python\python39\Lib\site-packages\comtypes\__init__.py", line 1211, in Release
    return self.__com_Release()
OSError: exception: access violation writing 0xFFFFFFFFFFFFFFFF

I only got this, and the script still works. I just tested all versions with plain grab() and get_latest_frame(); no issue on my laptop.

ninjatall12 commented 1 year ago

Moved on to using C++ instead of Python since it better fits my use case. Massive performance bump and a lot less resource intensive. It was quite a pain to get OpenCV to work with DXGI, but I got there in the end.

ParticleG commented 1 year ago

@AI-M-BOT Hi, could you upload your version to PyPI or provide a way to reference it from requirements.txt? PyCharm doesn't like .pyd files and keeps complaining about module-not-found errors.

lucasmonstrox commented 1 year ago

@ninjatall12 Is it possible for Python code to call your C++ code? Or to wrap it Cython-style?

lucasmonstrox commented 1 year ago

Also, can you show me CPU usage, memory usage and FPS?