raspberrypi / bookworm-feedback


Mouse laggy under GPU load #65

Open Botspot opened 10 months ago

Botspot commented 10 months ago

This has been an issue since the earliest Bookworm beta image release, and it continues to happen on the latest Bookworm public release. When video- or WebGL-related tasks are heavily used in Chromium, the mouse cursor becomes very difficult to move. Tested on a Raspberry Pi 4. It's easy to detect when first loading a 1080p YouTube video. In such situations, the mouse cursor cannot be moved for multiple seconds at a time, while the rest of my 1080p screen continues to refresh as normal.

It is also easily noticeable on the WebGL aquarium: http://webglsamples.org/aquarium/aquarium.html

This issue does not occur on Bullseye - mouse movement under any load is always perfectly smooth.

I'd imagine that what needs to happen is for mouse movements to be given priority.

qrp73 commented 10 months ago

I think this is caused not by high GPU load, but by some issue in the firmware/driver.

I say this because my OpenGL app puts a high load on the GPU with no freezes. But at the same time I see major freezes when I start hardinfo and select the Sensors tab. The same freezes happen in chromium-browser on Telegram pages with many videos. They also happen in Firefox when hardware acceleration is enabled.

Steps to reproduce:

1) install hardinfo:

sudo apt install hardinfo

2) start hardinfo

3) select the Sensors tab

Expected result: no freezes

Actual result: there are major freezes; the mouse cursor and all apps are frozen for 0.5 sec every second - BUG

With Bullseye there are no such freezes on the hardinfo Sensors tab.

lurch commented 10 months ago

I see the same "pauses" in mouse-movement (when hardinfo is displaying the Sensors tab) on a Pi 5 too, but the pauses seem to be 0.25 second rather than the 0.5 second on a Pi 4. Same behaviour on 32-bit / 64-bit Bookworm, and makes no difference whether the mouse is plugged into a USB2 port or a USB3 port.

And on the Pi 5, there's no mouse slowdown whatsoever when viewing the WebGL aquarium.

ManOfDiamond commented 9 months ago

Hello, anything new for this particular issue?

Botspot commented 9 months ago

Nope, it still happens.

Botspot commented 8 months ago

I have found that the same issue occurs on RPi5, though it is harder to spot as I cannot cause any multi-second freezes. But during any window open/close animation, the mouse pointer slows down to maybe 5 fps. It is easy to spot if you are moving the mouse while a window is opening.

This is being discussed on the Bookworm feedback thread right now. User dom says:

Disabling acceleration in the browser will avoid the mouse lags.

The 3d block does one job at a time, and if the browser requests a "big" job, then desktop composition (that includes mouse pointer updates) will be held up.

The simple solution of allowing mouse updates to run asynchronously is technically feasible (we actually composite the mouse pointer as a separate hardware plane), but while that makes the mouse pointer smooth, any part of the desktop attached to that (e.g. a window being dragged) will still lag behind, so it may not actually be preferable.

We are looking into whether sub-dividing "big" 3d jobs is feasible, to keep the higher priority interactive updates responsive. It will be a trade-off (increasing overhead on the "big" job vs reducing latency of smaller jobs).
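The trade-off described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how the Mesa driver or the 3D block actually schedules work: it assumes the 3D block runs one job at a time without preemption, that a "big" job can be cut into equal slices, and that each slice costs a fixed overhead. The function names and all numbers are hypothetical.

```python
import math


def pointer_latency_ms(big_job_ms, slice_ms, overhead_ms, compose_ms=3.0):
    """Worst-case delay for a composition job queued just after a slice starts.

    Toy model (an assumption, not the real scheduler): the composition job
    must wait for the slice currently running, then runs itself.
    """
    slice_cost = min(slice_ms, big_job_ms) + overhead_ms
    return slice_cost + compose_ms


def total_big_job_ms(big_job_ms, slice_ms, overhead_ms):
    """Total time for the big job once per-slice overhead is included."""
    slices = math.ceil(big_job_ms / slice_ms)
    return big_job_ms + slices * overhead_ms


if __name__ == "__main__":
    # Hypothetical 200 ms job: unsliced versus cut into 4 ms slices.
    for slice_ms in (200.0, 4.0):
        print(slice_ms,
              pointer_latency_ms(200.0, slice_ms, 0.5),
              total_big_job_ms(200.0, slice_ms, 0.5))
```

In this model, smaller slices shorten the wait for interactive updates but lengthen the big job, which is the overhead-versus-latency trade-off dom describes.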

ghollingworth commented 8 months ago

Yes, this is a "many-month" issue not a quick patch...

sudo-splinter-cell commented 8 months ago

This is a Wayland-specific issue, as I do not notice any mouse lag when I am using X11. If I switch to Wayland, just like Botspot said, the mouse cursor starts to lag (or, for a better term, "stutter"), especially during web browsing (Chromium or Firefox, it doesn't matter) and especially if a graphical web page is open, even when it is running in the background. This issue happens regardless of the desktop environment and compositor used (I am on KDE Plasma, where the compositor is KWin, and I still experience it, but only on Wayland). On my x86 laptop (which is quite old), Wayland does not cause any such issue. The mouse is very responsive. So I am assuming this is an issue that is specific to the interaction between the Wayland compositor and the RPi firmware (GPU?).

popcornmix commented 8 months ago

So I am assuming this is an issue that is specific to the interaction between the Wayland compositor and the RPi firmware (GPU?).

Firmware is not involved. The 3D hardware (which handles composition under Wayland) is purely controlled by Mesa and the kernel, which run on the ARM.

qrp73 commented 8 months ago

I think it may be related to memory allocation, because exactly the same mouse lag with system freezes happened when I tested aggressive memory allocation.

Sometimes a malloc call freezes the system for up to several seconds, and then it continues to run. Interestingly, it happens even when there is free memory available and only about half of the memory is allocated. Probably it happens when memory is fragmented and all malloc calls have to be frozen until memory defragmentation is done.

Another finding is that processes which keep their memory and don't call malloc for new allocations are still able to run during these freezes... So it seems the freeze affects only processes that are calling malloc...

I also found that the pcmanfm and wayfire processes request memory allocations very often: it happens each time some window on the desktop changes. I think this is the reason why the whole desktop freezes when the system runs out of memory.

So I think it is possible that this freeze issue with hardware acceleration is related to memory allocation...

There is definitely something strange happening during the malloc call, because sometimes it leads to very long delays...
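The test program used for these observations is not shown in the thread. As a rough sketch of how allocation stalls of this kind could be measured, the hypothetical helper below times each allocation while holding on to every buffer so that memory use keeps growing. Note the assumptions: it goes through Python's allocator rather than calling malloc directly, and the buffer count and size are arbitrary.

```python
import time


def time_allocations(count, size_bytes):
    """Allocate `count` zero-filled buffers and return seconds taken by each.

    Hypothetical measurement helper: buffers are kept alive in `held`
    so that total memory use grows with every iteration.
    """
    held = []
    durations = []
    for _ in range(count):
        start = time.perf_counter()
        buf = bytearray(size_bytes)  # zero-filled buffer of the requested size
        durations.append(time.perf_counter() - start)
        held.append(buf)
    return durations


if __name__ == "__main__":
    # Arbitrary example: 64 buffers of 16 MiB each (about 1 GiB in total).
    d = time_allocations(64, 16 * 1024 * 1024)
    print("slowest allocation: %.3f s" % max(d))
```

A single allocation taking far longer than the rest would be consistent with the freezes described above, though it would not by itself show the cause.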

sudo-splinter-cell commented 8 months ago

I don't know if this is related to what you say, but when I am copying files between drives or downloading a big file in Chromium, for example, I can sense the whole system slowing down or freezing occasionally. Is that what you are referring to? Also, I am on a 4GB Pi 5 and Chromium fills the whole memory in about 15 minutes of browsing. I actually increased my swap to 2GB just to prevent crashes (yes, once the memory is full and Chromium is still running, it will crash the whole system).

I don't know if this helps with this bug report, but I am just trying to provide as much info as I can.

qrp73 commented 8 months ago

I don't know if this is related to what you say, but when I am copying files between drives or downloading a big file in Chromium, for example, I can sense the whole system slowing down or freezing occasionally. Is that what you are referring to?

Yes, I think it is possible that this is also related to memory allocation.

I actually increased my swap to 2GB just to prevent crashes (yes, once the memory is full and Chromium is still running, it will crash the whole system).

I found that this is not a crash of the whole system; it just freezes the wayfire/pcmanfm processes, and as a result there is no way to do anything. But you can kill the process with the largest memory allocation using the Alt+SysRq+F shortcut, and the system will resume.

As I mentioned before, processes which don't call malloc can continue to run during these freezes with no issue.

TheOtherMarcus commented 6 months ago

I see this as well when running a Python Tk application with a lot of screen activity. Maybe I can contribute to the solution with a couple of observations.

I will be very happy when the solution is found.

ghollingworth commented 6 months ago

So, this occurs because:

The GPU is shared between the application (Chromium for example) and the compositor (Wayfire) to draw the output image that you want to send to the hardware through DRM. The mouse is submitted to the DRM planes separately, but it is done as part of the same update. For example to update the output frame you do something like (in Wayfire):

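while (always) {
  // sleep until as late as possible before the next VSYNC
  wait_for_composition_start_time()
  b = get_background_image()
  if ((c = chromium_image_updated()) != null) {
    // returns immediately; the GPU composes into new_frame in the background
    new_frame = start_new_composition(b, c)
    // both are part of the same DRM update, so the pointer waits for new_frame
    drm_submit(new_frame)
    drm_submit(mouse_pointer)
  }
}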

The system tries to make sure it starts the composition process at the last possible time before the VSYNC, so if last time it took 3ms to compose the image it will start the process 13.667ms after the last VSYNC (60fps == 16.667 ms). The idea is to make sure you get the 'most recent' frame at the output (to reduce the latency).
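The timing arithmetic above can be written out as a small sketch. The function name is hypothetical and this is not Wayfire's actual code; it only reproduces the calculation in the text, with a clamp to zero added as an assumption for the case where the last composition took longer than one frame.

```python
def composition_start_offset_ms(refresh_hz, last_compose_ms):
    """Delay after the last VSYNC before composition should start.

    Starts as late as possible while still leaving `last_compose_ms`
    before the next VSYNC. Clamping to zero is an added assumption.
    """
    frame_ms = 1000.0 / refresh_hz
    return max(0.0, frame_ms - last_compose_ms)


if __name__ == "__main__":
    # 60 fps with a 3 ms composition last time, as in the text.
    print(round(composition_start_offset_ms(60, 3.0), 3))  # 13.667
```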

The composition (i.e. drawing the Chromium image onto the desktop background) is done in the background into new_frame, so start_new_composition will complete immediately. The DRM subsystem, however, will wait until all buffers in the submitted update have been released (i.e. the compositor has finished writing to them) before applying the update. This means the mouse update also has to wait until the composition of the frame is finished.

Since the user application (Chromium in this example) and Wayfire are both using the GPU to do the work it means if the user is using the GPU to do something really complex, it will delay the composition of the new_frame which will then delay the update of the mouse pointer.

There are two possible solutions to this problem. The first is to change the way the GPU is loaded up, breaking up user application work so the compositor can get in. This means a fairly significant amount of Mesa driver work. The second is to implement some kind of hack that waits until the last few microseconds to check the frames and, if they're not ready to go, just submits the mouse update...
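The second option can be sketched as a decision made shortly before each VSYNC. This is a toy illustration only, not Wayfire or DRM code: the function name, the `margin_ms` safety margin, and the timestamps are all hypothetical.

```python
def plan_commit(frame_ready_at_ms, vsync_at_ms, margin_ms=0.5):
    """Decide what to submit for the upcoming VSYNC.

    If the composed frame will be ready before the deadline, submit the
    frame and the cursor together; otherwise submit only the cursor so
    the pointer still moves. `margin_ms` is a hypothetical safety margin.
    """
    deadline = vsync_at_ms - margin_ms
    if frame_ready_at_ms <= deadline:
        return ["frame", "cursor"]
    return ["cursor"]


if __name__ == "__main__":
    # Hypothetical case: the GPU is busy and the frame is ready only at 40 ms,
    # so the first two VSYNCs at 60 fps carry a cursor-only update.
    for n in (1, 2, 3):
        vsync = n * 1000.0 / 60
        print(n, plan_commit(40.0, vsync))
```

As noted earlier in the thread, a cursor-only update keeps the pointer smooth, but anything attached to it (such as a window being dragged) would still lag behind.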

It's a work in progress, but nothing simple...

sudo-splinter-cell commented 6 months ago

KDE Plasma 6 claims to have solved this problem. It is quite new (beta, even), so I cannot try it. Hopefully it will be available as Debian/Ubuntu packages so we can see for ourselves.

ghollingworth commented 6 months ago

Where do they suggest it is solved? Do you have a link?

sudo-splinter-cell commented 6 months ago

Where do they suggest it is solved? Do you have a link?

https://discuss.kde.org/t/does-wayland-really-break-everything/9083/35

https://discuss.kde.org/t/does-wayland-really-break-everything/9083/34

A KDE dev's blog post about it: https://zamundaaa.github.io/wayland/2023/08/29/getting-rid-of-cursor-stutter.html

tomachinz commented 4 months ago

The system tries to make sure it starts the composition process at the last possible time before the VSYNC, so if last time it took 3ms to compose the image it will start the process 13.667ms after the last VSYNC (60fps == 16.667 ms). The idea is to make sure you get the 'most recent' frame at the output (to reduce the latency).

Coming from the C=64 and Amiga concept of "sprites", which are icon-sized 32x32 pixel hardware-accelerated planes always rendered on top of the viewport, I was under the impression that SVGA and modern graphics cards would also have some kind of sprite support. Is that NOT the case? This would explain the terrible mouse lag in Windows 3.11 and Windows 95. NT 4.0 seemed to have no lag, mind.

Or are the hardware sprites in 2D and 3D graphics cards simply not used by Wayland? It makes me wonder how my 2009 iMac was able to always have a rock-solid 60 fps mouse at high CPU load, while my AMD Ryzen beast can become laggy (see video). On Motorola 68k-based Macs, Apple used the hardware interrupt line into the CPU to update the mouse sprite, resulting in an always-smooth mouse under load. I'm guessing USB does not have hardware interrupts with the same "sprite updating" capability, due to the lack of sprites. https://www.youtube.com/shorts/4CwU2u4gK6k

TheOtherMarcus commented 4 months ago

I switched to X11. Mouse works great.

Run sudo raspi-config, look for the Wayland option, and select X11.