paperwm / PaperWM

Tiled scrollable window management for Gnome Shell
GNU General Public License v3.0

Crash when a window locks and releases the cursor #947

Open aecsocket opened 3 weeks ago

aecsocket commented 3 weeks ago

Describe the bug When a window locks the cursor and later releases it, this causes the cursor to stay locked, or puts it into some sort of illegal state, which then causes gnome-shell to crash.

To Reproduce I've managed to reproduce it most consistently on Wayland with a Rust Bevy app. Here is a reproduction repo: https://github.com/aecsocket/paperwm-crash-repro

Expected behavior It shouldn't crash.

Screenshots I would try to attach a video here, but I can't record with OBS since as soon as gnome-shell crashes, all windows are torn down as well, and OBS crashes. If anyone knows how I can record a video of my entire desktop outside of GNOME somehow, please let me know.

System information:

Distribution: Bazzite 40.20240908.0 (Silverblue)
GNOME Shell: 46.4
Display server: Wayland
PaperWM version: 46.17.1
Enabled extensions:
- paperwm@paperwm.github.com

Additional context I'm not sure how to debug this issue properly, since it probably involves some interaction between PaperWM and gnome-shell, and probably isn't solely PaperWM's fault.

I haven't managed to reproduce this with 0 extensions running.

What I have managed to piece together is:

Core dump stack trace:

```
Stack trace of thread 22274:
#0  0x00007f6632ea8664 __pthread_kill_implementation (libc.so.6 + 0x99664)
#1  0x00007f6632e4fc4e raise (libc.so.6 + 0x40c4e)
#2  0x000055ab76859ba6 dump_gjs_stack_on_signal_handler (gnome-shell + 0x4ba6)
#3  0x00007f6632e4fd00 __restore_rt (libc.so.6 + 0x40d00)
#4  0x00007f6632ea8664 __pthread_kill_implementation (libc.so.6 + 0x99664)
#5  0x00007f6632e4fc4e raise (libc.so.6 + 0x40c4e)
#6  0x00007f6632e37902 abort (libc.so.6 + 0x28902)
#7  0x00007f6633ad210c g_assertion_message.cold (libglib-2.0.so.0 + 0x2010c)
#8  0x00007f6633b3f397 g_assertion_message_expr (libglib-2.0.so.0 + 0x8d397)
#9  0x00007f663315878c meta_wayland_pointer_constraint_deactivate (libmutter-14.so.0 + 0x15878c)
#10 0x00007f66330cf3af event_callback.lto_priv.0 (libmutter-14.so.0 + 0xcf3af)
#11 0x00007f6633387d5b _clutter_event_process_filters (libmutter-clutter-14.so.0 + 0x5dd5b)
#12 0x00007f66333bef8d clutter_stage_notify_grab_on_pointer_entry (libmutter-clutter-14.so.0 + 0x94f8d)
#13 0x00007f66333bf472 clutter_stage_notify_grab (libmutter-clutter-14.so.0 + 0x95472)
#14 0x00007f66333bfd41 clutter_stage_unlink_grab (libmutter-clutter-14.so.0 + 0x95d41)
#15 0x00007f66324e6056 ffi_call_unix64 (libffi.so.8 + 0x9056)
#16 0x00007f66324e26a0 ffi_call_int.lto_priv.0 (libffi.so.8 + 0x56a0)
#17 0x00007f66324e54ee ffi_call (libffi.so.8 + 0x84ee)
#18 0x00007f663348008e _ZN3Gjs8Function6invokeEP9JSContextRKN2JS8CallArgsENS3_6HandleIP8JSObjectEEP11_GIArgument.localalias.lto_priv.0 (l>
#19 0x00007f66334814b3 _ZN3Gjs8Function4callEP9JSContextjPN2JS5ValueE (libgjs.so.0 + 0x564b3)
#20 0x00007f663167d044 _ZN2js23InternalCallOrConstructEP9JSContextRKN2JS8CallArgsENS_14MaybeConstructENS_10CallReasonE (libmozjs-115.so.0>
#21 0x00007f6631686966 _ZN2js9InterpretEP9JSContextRNS_8RunStateE (libmozjs-115.so.0 + 0x86966)
#22 0x00007f663167ca5b _ZN2js9RunScriptEP9JSContextRNS_8RunStateE (libmozjs-115.so.0 + 0x7ca5b)
#23 0x00007f663167cf47 _ZN2js23InternalCallOrConstructEP9JSContextRKN2JS8CallArgsENS_14MaybeConstructENS_10CallReasonE (libmozjs-115.so.0>
#24 0x00007f663167d3bd _ZN2js4CallEP9JSContextN2JS6HandleINS2_5ValueEEES5_RKNS_13AnyInvokeArgsENS2_13MutableHandleIS4_EENS_10CallReasonE >
#25 0x00007f6631703391 _Z20JS_CallFunctionValueP9JSContextN2JS6HandleIP8JSObjectEENS2_INS1_5ValueEEERKNS1_16HandleValueArrayENS1_13Mutabl>
#26 0x00007f6633474dc5 _ZN3Gjs7Closure6invokeEN2JS6HandleIP8JSObjectEERKNS1_16HandleValueArrayENS1_13MutableHandleINS1_5ValueEEE (libgjs.>
#27 0x00007f66334b1434 _ZN3Gjs7Closure7marshalEP7_GValuejPKS1_PvS5_ (libgjs.so.0 + 0x86434)
#28 0x00007f66335e364a g_closure_invoke (libgobject-2.0.so.0 + 0x1164a)
#29 0x00007f66336135f3 signal_emit_unlocked_R.isra.0 (libgobject-2.0.so.0 + 0x415f3)
#30 0x00007f6633604104 signal_emit_valist_unlocked (libgobject-2.0.so.0 + 0x32104)
#31 0x00007f6633604361 g_signal_emit_valist (libgobject-2.0.so.0 + 0x32361)
#32 0x00007f6633604423 g_signal_emit (libgobject-2.0.so.0 + 0x32423)
#33 0x00007f66333cb592 clutter_text_activate (libmutter-clutter-14.so.0 + 0xa1592)
#34 0x00007f663335adfd _clutter_marshal_BOOLEAN__STRING_UINT_FLAGS (libmutter-clutter-14.so.0 + 0x30dfd)
#35 0x00007f66335e364a g_closure_invoke (libgobject-2.0.so.0 + 0x1164a)
#36 0x00007f66333810b5 clutter_binding_pool_activate (libmutter-clutter-14.so.0 + 0x570b5)
#37 0x00007f66333c2c03 clutter_text_key_press (libmutter-clutter-14.so.0 + 0x98c03)
#38 0x00007f663335b300 _clutter_marshal_BOOLEAN__BOXED.part.0 (libmutter-clutter-14.so.0 + 0x31300)
#39 0x00007f66335e364a g_closure_invoke (libgobject-2.0.so.0 + 0x1164a)
#40 0x00007f6633613bd0 signal_emit_unlocked_R.isra.0 (libgobject-2.0.so.0 + 0x41bd0)
#41 0x00007f6633603969 signal_emit_valist_unlocked (libgobject-2.0.so.0 + 0x31969)
#42 0x00007f6633604361 g_signal_emit_valist (libgobject-2.0.so.0 + 0x32361)
#43 0x00007f6633604423 g_signal_emit (libgobject-2.0.so.0 + 0x32423)
... and more
```
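A rough sketch for picking the interesting frames out of a trace like the one above, assuming Python 3 and only the standard library. The helper names (`parse_frames`, `first_frame_in`, `frame_before_ffi`) and the `SAMPLE` excerpt are made up for illustration. The idea that the frame directly above `ffi_call_unix64` is the native function called from JavaScript is a reading of this particular trace, not a general rule.

```python
import re

# Matches one frame such as
#   "#9 0x00007f663315878c meta_wayland_pointer_constraint_deactivate (libmutter-14.so.0 + 0x15878c)".
# The module group is optional: frames truncated by the pager end in ">"
# and are reported with module=None.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<addr>[0-9a-f]+)\s+(?P<symbol>\S+)"
    r"(?:\s+\((?P<module>[^\s)>]+)\s\+)?"
)


def parse_frames(trace):
    """Split a (possibly flattened) systemd-coredump trace into frame dicts."""
    return [
        {"index": int(m["index"]), "symbol": m["symbol"], "module": m["module"]}
        for m in FRAME_RE.finditer(trace)
    ]


def first_frame_in(frames, module_prefix):
    """Return the innermost frame whose module name starts with module_prefix."""
    for frame in frames:
        if frame["module"] and frame["module"].startswith(module_prefix):
            return frame
    return None


def frame_before_ffi(frames):
    """Return the frame directly above the first ffi_call* frame, if any."""
    for prev, cur in zip(frames, frames[1:]):
        if cur["symbol"].startswith("ffi_call"):
            return prev
    return None


# Hypothetical excerpt: a handful of frames copied from the trace above.
SAMPLE = (
    "#6 0x00007f6632e37902 abort (libc.so.6 + 0x28902) "
    "#7 0x00007f6633ad210c g_assertion_message.cold (libglib-2.0.so.0 + 0x2010c) "
    "#8 0x00007f6633b3f397 g_assertion_message_expr (libglib-2.0.so.0 + 0x8d397) "
    "#9 0x00007f663315878c meta_wayland_pointer_constraint_deactivate (libmutter-14.so.0 + 0x15878c) "
    "#14 0x00007f66333bfd41 clutter_stage_unlink_grab (libmutter-clutter-14.so.0 + 0x95d41) "
    "#15 0x00007f66324e6056 ffi_call_unix64 (libffi.so.8 + 0x9056) "
    "#18 0x00007f663348008e _ZN3Gjs8Function6invokeEP9JSContextRKN2JS8CallArgsENS3_6HandleIP8JSObjectEEP11_GIArgument.localalias.lto_priv.0 (l>"
)

frames = parse_frames(SAMPLE)
print(first_frame_in(frames, "libmutter")["symbol"])
print(frame_before_ffi(frames)["symbol"])
```

On this excerpt the first call reports the mutter function where the assertion fired, and the second reports the grab-release call that appears to have been made from GJS code.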

There's probably more useful info that I can give, e.g. GJS logs or some other stack trace, but I'm not sure what to look for since I'm not experienced with debugging GNOME. If there's any other info that would help debug this, please let me know.

Also, if you know how I can record a video of this even though gnome-shell and OBS crashes, please let me know.

Lythenas commented 3 weeks ago

> Also, if you know how I can record a video of this even though gnome-shell and OBS crashes, please let me know.

The only way to record a video that I can think of is either running it in a VM or a nested shell and recording that from the outside.

As for logs, you could try checking the logs of gnome-shell when it crashed in journalctl.

I also tried to reproduce it, but I couldn't get it to crash.

The only thing I noticed in the logs was that this was printed a bunch of times after quitting (not sure if it means anything):

Sep 09 21:09:14 anarchy gnome-shell[2191]: (../mutter/src/wayland/meta-wayland-pointer-constraints.c:478):should_constraint_be_enabled: runtime check failed: (meta_wayland_surface_is_xwayland (constraint->surface) || META_IS_WAYLAND_SUBSURFACE (constraint->surface->role))
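To see how often a message like this shows up, something along these lines could work. It is a minimal sketch assuming Python 3 and a systemd journal. `find_failed_checks`, `read_gnome_shell_journal` and `SAMPLE` are hypothetical names, and the regular expression is only fitted to the single log line quoted above, so other mutter warnings may be formatted differently.

```python
import re
import subprocess
from collections import Counter

# Fitted to the quoted line:
#   (<file>:<line>):<function>: runtime check failed: <expression>
CHECK_RE = re.compile(
    r"\((?P<file>[^\s():]+):(?P<line>\d+)\):(?P<func>\w+): "
    r"runtime check failed: (?P<expr>.+)"
)


def find_failed_checks(journal_text):
    """Count 'runtime check failed' messages per (file, line, function)."""
    counts = Counter()
    for line in journal_text.splitlines():
        m = CHECK_RE.search(line)
        if m:
            counts[(m["file"], int(m["line"]), m["func"])] += 1
    return counts


def read_gnome_shell_journal():
    """Return the current boot's gnome-shell journal as text (needs journalctl)."""
    result = subprocess.run(
        ["journalctl", "-b", "--no-pager", "_COMM=gnome-shell"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical input: the quoted log line repeated three times.
SAMPLE = (
    "Sep 09 21:09:14 anarchy gnome-shell[2191]: "
    "(../mutter/src/wayland/meta-wayland-pointer-constraints.c:478):"
    "should_constraint_be_enabled: runtime check failed: "
    "(meta_wayland_surface_is_xwayland (constraint->surface) || "
    "META_IS_WAYLAND_SUBSURFACE (constraint->surface->role))\n"
) * 3

for (path, lineno, func), n in find_failed_checks(SAMPLE).items():
    print(f"{n}x {func} ({path}:{lineno})")
```

On a real system, `find_failed_checks(read_gnome_shell_journal())` would run the same count over the live journal instead of the sample.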

I'm also on Gnome 46.4 wayland, but I'm on Arch. If it matters: I am using two monitors.

jtaala commented 3 weeks ago

Hey all, like @Lythenas mentioned, I can't reproduce this either. Let me know if I'm doing anything wrong:

Screencast from 2024-09-10 07-50-38.webm

Closing the window (with alt+f4):

Screencast from 2024-09-10 08-00-27.webm

I'm moving my mouse lots while the window is selected.

I suspect this relates to the video driver/card you're using. I note that cargo run wouldn't compile when I was running in Integrated mode (I have an Nvidia / prime laptop). After switching to Hybrid, it compiled fine.

Not seeing any crash, but like @Lythenas, am seeing that error output in logs.

aecsocket commented 3 weeks ago

I feel like that meta-wayland-pointer-constraints warning is very closely related to the crash, since I see the same message in journalctl right before the assertion fails and systemd-coredump prints the core dump info.

After a lot of trial and error, I have managed to reproduce the crash 2 more times, both times on Bazzite. In total, I can reproduce this on:

Video of the crash in a VM - you can see that I disabled all other gnome-shell extensions, and my mouse cursor disappears when I hover inside the VM after the crash, then it black screens. On my real machine, this kicks me back to the gdm login screen, but seems to crash even harder in the VM.

https://github.com/user-attachments/assets/be8a2772-582a-4e1d-ab7f-353d27426a58

Video of the same VM without PaperWM enabled:

https://github.com/user-attachments/assets/1e81e091-631c-4db8-b35b-6a356b9235ce

Reproduction steps:

Note that I could not reproduce the crash on a fresh image of Fedora Workstation 40 in a VM, on gnome-shell 46.0 (not 46.4).

Distribution: Fedora Linux 40 (Workstation Edition)
GNOME Shell: 46.0
Display server: Wayland
PaperWM version: 46.17.1
Enabled extensions:
- background-logo@fedorahosted.org
- paperwm@paperwm.github.com

This clearly points to Bazzite being the source of the issue here, but I'm confused about what kind of interaction Bazzite, PaperWM and gnome-shell could be having to cause this crash. I suppose the next step is to investigate what changes Bazzite makes to GNOME and see whether one of them conflicts with what PaperWM is doing. I feel like I'm starting to get out of my depth here, though.

KyleGospo commented 3 weeks ago

> This clearly points to Bazzite being the source of the issue here

Bazzite dev here, replying to say I ran your example commands on a GNOME build with the only enabled extension being PaperWM and could not reproduce the crash.

Our GNOME is patched only with Ubuntu's triple buffering MR, and a ~6 line patch for changing Switcheroo's behavior when selecting the "alternative" GPU.

aecsocket commented 3 weeks ago

It may not be a patch but something else that Bazzite does that causes this. But I can definitely reproduce it in a VM - here is another video of the crash.

https://github.com/user-attachments/assets/754ac942-3078-4952-884d-b1431210412b

(Note that I fail to get the crash on the first attempt, and the app closes fine. On the second attempt it does crash.)

The host in this video is my desktop, with a nearly out-of-the-box Bazzite install running on bare metal (I only set up sunshine + moonlight so that I can play games on it from my laptop, and set up virtualization to run the VM). I created another VM, installed Bazzite on it, and followed the same steps I outlined above. I ran ujust update to fully update the system before this test, so I can't see the problem being anything other than some weird interaction between Bazzite, gnome-shell and PaperWM. It's also possible that I have some weird hardware issue on both of my machines (and I have definitely had cursed hardware issues before), but I don't think that's likely.

> > This clearly points to Bazzite being the source of the issue here
>
> Bazzite dev here, replying to say I ran your example commands on a GNOME build with the only enabled extension being PaperWM and could not reproduce the crash.
>
> Our GNOME is patched only with Ubuntu's triple buffering MR, and a ~6 line patch for changing Switcheroo's behavior when selecting the "alternative" GPU.

Is it possible that you have some sort of extra settings changed related to GNOME which fixes this issue? The video above is from the stock Bazzite ISO - I selected "Virtual Machine" -> "AMD" -> "GNOME", and I downloaded that ISO today specifically to test this issue, so maybe you have some different configuration which causes this issue to not happen.