Franklin-Zhang0 / Yolo-v8-Apex-Aim-assist

95 stars · 20 forks

dxcam seems to underperform #2

Closed Chalkeys closed 1 year ago

Chalkeys commented 1 year ago

Thank you for sharing the code. I noticed that you are using DXCam to capture screenshots. However, it seems that the performance is not living up to its claimed potential, as the FPS appears to be settling between 30-34. Have you encountered this issue before? If so, do you have any suggestions for a potential solution? Thanks!

Here is my platform info: OS name: Windows 11 Pro OS architecture: 64 bits Resolutions: Monitor 1: 2560x1440 144Hz Python version: 3.10.0 Videocard: Gigabyte 2080 CPU: AMD Ryzen 7 3900x

Franklin-Zhang0 commented 1 year ago

The maximum fps should be about 70, since I wait one frame so that the mouse movement from the previous frame is reflected on the screen, which avoids moving twice for the same detection. Your fps seems to be lower than 70. There are two possible reasons for this.

  1. The inference speed limits the fps. If this is the cause, you would see an fps increase when you try the smaller "8n.trt" model. The fps on my machine when using the "8s.trt" model while playing Apex at the same time is about 25. My graphics card is a Tesla P100, a little slower than yours, so it's very likely that inference is the limit.
  2. You are testing the program in a still scene; dxcam returns None when nothing on screen has changed. In that case, I keep recapturing until a new frame is acquired, which lowers the fps.
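
The recapture behavior in point 2 can be sketched as a small helper. This is a minimal illustration, not code from the repo: `grab` stands in for any zero-argument capture call (such as dxcam's `camera.grab`) that returns `None` when the screen has not changed, and the `timeout` parameter is a hypothetical safeguard so the loop cannot spin forever in a perfectly still scene.

```python
import time

def capture_latest(grab, timeout=0.5):
    """Keep recapturing until a new frame is acquired.

    `grab` is any zero-argument capture function that returns None
    when the screen has not changed since the last call.  In a still
    scene this loop spins until a new frame appears (or the timeout
    expires), which is why benchmarking against a static scene
    reports a lower FPS than the capture API can actually deliver.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        frame = grab()
        if frame is not None:
            return frame
    return None  # still scene: no new frame within the timeout
```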
Chalkeys commented 1 year ago

Well, I hadn't gotten to the inference step yet, which is why I believe it's a dxcam issue. It's not uncommon, as seen in some of these issues (this and this). The screenshots took 0.03 s on average, which put the screenshot FPS somewhere around 30 and the overall FPS around 20 (with the .pt weights, though). But now it's solved. Since I can't reproduce the issue, here are some potential solutions (or at least partial solutions):

  1. I rolled back the Pillow library to version 9.2.0.
  2. I reinstalled the ctypes lib and dxcam (after encountering an error importing ctypes.wintypes).

After these two steps I got 90 FPS when capturing the screen: [Screenshot 2023-04-17 114817]

  3. I replaced dxcam with this dxshot fork of it, and it's significantly faster: [Screenshot 2023-04-17 131632]

Therefore, I would suggest replacing dxcam with dxshot, or just using the Win32 API, so that screen capture is no longer the bottleneck on higher-performance machines and for high-refresh-rate users.
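
A library-agnostic way to compare the capture backends is to time a tight loop and count only the frames where the capture call actually returned data. This is a hypothetical benchmarking helper, not part of dxcam or dxshot; `grab` stands in for whichever backend's capture call is being measured.

```python
import time

def measure_capture_fps(grab, seconds=2.0):
    """Time a tight capture loop and report frames per second.

    `grab` stands in for any capture call (dxcam's or dxshot's
    camera.grab, or a Win32 BitBlt wrapper).  Frames where grab()
    returns None (i.e. the screen was unchanged) are not counted,
    matching how capture FPS is discussed in this thread.
    """
    frames = 0
    start = time.perf_counter()
    while (elapsed := time.perf_counter() - start) < seconds:
        if grab() is not None:
            frames += 1
    return frames / elapsed
```

Running this once per backend against the same moving scene gives a fair apples-to-apples number before the inference step is even involved.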

As for model selection, I'm still trying to find a better way to convert .onnx to .trt on Windows 11, since the method you provided is for Linux and Windows is giving me a headache... I'll start a new issue dedicated to it, I guess.

Franklin-Zhang0 commented 1 year ago

Thank you for the suggestion; I'll replace it later this week. I have provided the converted trt model in the model dir. If you want to convert it yourself, this repo might be helpful.

Chalkeys commented 1 year ago

I believe TensorRT engines are environment-specific, which means everyone has to build their own .trt or .engine from the PyTorch weights unless the dev environments are completely identical. The aforementioned repo, as well as this repo, are optimized for Linux, and both require a very specific combination of Windows, CUDA, cuDNN, and TensorRT versions to work. I'm still trying to find working cuDNN and TensorRT versions for CUDA 11.8 or 12.0, since I have other projects that need an up-to-date CUDA version.
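
For reference, once a matching CUDA/cuDNN/TensorRT combination is installed, the TensorRT SDK's `trtexec` tool is a common way to build an engine from an ONNX export directly on the target machine (which is exactly why engines can't be copied between environments). The file names below are placeholders; only the flags are standard `trtexec` options.

```shell
# Build a TensorRT engine from the exported ONNX model on the target
# machine itself -- the resulting engine is tied to the GPU, TensorRT,
# CUDA, and cuDNN versions it was built with, so this step cannot be
# skipped by copying a .trt file from another environment.
trtexec --onnx=yolov8s.onnx --saveEngine=yolov8s.trt --fp16
```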

Franklin-Zhang0 commented 1 year ago

Oh, I've learned a lot. Thanks for the explanation. Good luck with the environment setup.