danvd / wlroots-eglstreams

A modular Wayland compositor library with EGLStreams support
MIT License

Various segfaults #31

Closed git-bruh closed 2 years ago

git-bruh commented 2 years ago

Hi, I've been encountering some weird segfaults for some time. The dumps are from the latest commits, 225adced7c3c66de3dde95068111c5e982dd9d78 of wlroots and 8fa7b99859066b9098acb158d08f7a060c3bf78e of sway. All of them seem to be related to bad values being passed to the linked list implementation in libwayland.

  1. Segfault on switching tty:
λ gdb -c glibc_tty_switch_crash /usr/bin/sway
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `sway --my-next-gpu-wont-be-nvidia'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f33f42866bc in wl_list_remove (elm=elm@entry=0x56460373e510) at src/wayland-util.c:55
55  src/wayland-util.c: No such file or directory.
[Current thread is 1 (Thread 0x7f33f38fbd00 (LWP 224))]
(gdb) backtrace 
#0  0x00007f33f42866bc in wl_list_remove (elm=elm@entry=0x56460373e510) at src/wayland-util.c:55
#1  0x00007f33f42133b3 in wlr_session_close_file (session=<optimized out>, dev=0x56460373e500) at ../backend/session/session.c:329
#2  0x00007f33f46e1514 in  () at /usr/lib/libinput.so.10
#3  0x00007f33f46e183e in  () at /usr/lib/libinput.so.10
#4  0x00007f33f46e1cf3 in  () at /usr/lib/libinput.so.10
#5  0x00007f33f46d9fcf in libinput_dispatch () at /usr/lib/libinput.so.10
#6  0x00007f33f420a441 in handle_libinput_readable (fd=<optimized out>, mask=<optimized out>, _backend=0x56460341df40)
    at ../backend/libinput/backend.c:50
#7  0x00007f33f42849ed in wl_event_loop_dispatch (loop=0x564603416f60, timeout=timeout@entry=-1) at src/event-loop.c:1027
#8  0x00007f33f42833a4 in wl_display_run (display=0x56460341d580) at src/wayland-server.c:1351
#9  0x0000564602362637 in server_run (server=server@entry=0x5646023b43e0 <server>) at ../sway/server.c:285
#10 0x00005646023599db in main (argc=2, argv=0x7ffc822dfa48) at ../sway/main.c:397

There seems to be a similar segfault with Firefox sometimes (it happens when just randomly clicking on stuff, but it is very hard to reproduce):

Core was generated by `sway --my-next-gpu-wont-be-nvidia'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fca58bcb6f4 in wl_list_remove (elm=elm@entry=0x7fca55853f40) at src/wayland-util.c:55
55  src/wayland-util.c: No such file or directory.
[Current thread is 1 (LWP 24208)]
(gdb) backtrace 
#0  0x00007fca58bcb6f4 in wl_list_remove (elm=elm@entry=0x7fca55853f40) at src/wayland-util.c:55
#1  0x00007fca58b6789c in client_buffer_destroy (buffer=<optimized out>) at ../types/wlr_buffer.c:128
#2  0x00007fca58b7adc6 in surface_handle_resource_destroy (resource=<optimized out>) at ../types/wlr_surface.c:717
#3  0x00007fca58bc7eb7 in destroy_resource (element=element@entry=0x7fca553bb250, data=data@entry=0x0, flags=0) at src/wayland-server.c:724
#4  0x00007fca58bc7eff in wl_resource_destroy (resource=0x7fca553bb250) at src/wayland-server.c:741
#5  0x00007fca5861268a in  () at /lib/libffi.so.8
#6  0x00007fca586117c3 in  () at /lib/libffi.so.8
#7  0x00007fca58bcaae5 in wl_closure_invoke (closure=0x7fca509dca70, flags=2, target=<optimized out>, opcode=0, data=<optimized out>)
    at src/connection.c:1018
#8  0x00007fca58bc81b2 in wl_client_connection_data (fd=<optimized out>, mask=<optimized out>, data=0x7fca553e54e0) at src/wayland-server.c:432
#9  0x00007fca58bc9a26 in wl_event_loop_dispatch (loop=0x7fca58e60a00, timeout=timeout@entry=-1) at src/event-loop.c:1027
#10 0x00007fca58bc83dd in wl_display_run (display=0x7fca58f26e50) at src/wayland-server.c:1351
#11 0x000055da5bc586cf in server_run (server=server@entry=0x55da5bcaa400 <server>) at ../sway/server.c:285
#12 0x000055da5bc4f9eb in main (argc=2, argv=0x7ffef8bef8f8) at ../sway/main.c:397
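
Both backtraces above end in wl_list_remove. As a rough illustration of why a bad element crashes there, here is a minimal sketch of a wl_list-style intrusive list. The names struct list, list_init, list_insert, list_remove and list_remove_once are stand-ins written for this example, not libwayland symbols. list_remove follows the unlink-then-clear behaviour of wl_list_remove, and list_remove_once is a hypothetical guard that libwayland does not provide. The sketch makes no claim about the actual root cause in wlroots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for libwayland's struct wl_list (same two fields). */
struct list {
    struct list *prev;
    struct list *next;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list *head) {
    head->prev = head;
    head->next = head;
}

/* Insert elm right after head. */
static void list_insert(struct list *head, struct list *elm) {
    elm->prev = head;
    elm->next = head->next;
    head->next = elm;
    elm->next->prev = elm;
}

/* Unlink elm, then clear its pointers, as wl_list_remove does.
 * The first statement dereferences elm->prev, so an element that was
 * already removed (prev == NULL) or already freed faults right here. */
static void list_remove(struct list *elm) {
    elm->prev->next = elm->next;
    elm->next->prev = elm->prev;
    elm->next = NULL;
    elm->prev = NULL;
}

/* Hypothetical guard (not part of libwayland): skip elements whose
 * links were already cleared. It cannot detect freed memory. */
static bool list_remove_once(struct list *elm) {
    assert(elm != NULL);
    if (elm->prev == NULL || elm->next == NULL) {
        return false;
    }
    list_remove(elm);
    return true;
}
```

Calling list_remove twice on the same element dereferences a NULL prev pointer on the second call, which is one way to end up with a crash inside the removal function. A use-after-free would look the same in a backtrace, and the guard does not help there.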

Also, Firefox doesn't seem to like resizing very much, especially when watching a video in picture-in-picture mode, and shows these errors in the log. This is with WebRender and can be reproduced by resizing rapidly:

Image: https://0x0.st/-ggt.png (Yellow color is the sway background)

Crash Annotation GraphicsCriticalError:
|[0][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=19.5938)
|[106][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=22.0092)
|[107][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=22.0094)
|[93][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.8425)
|[94][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.8428)
|[95][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.8589)
|[96][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.8756)
|[97][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.8759)
|[98][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.9091)
|[99][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.9094)
|[100][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.9256)
|[101][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.9424)
|[102][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.9427)
|[103][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.9758)
|[104][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.976)
|[105][GFX1]: Error in eglSetDamageRegion: 0x3009 (t=21.9924)
[GFX1]: Error in eglSetDamageRegion: 0x3009
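
For reference, the 0x3009 in the log above is the EGL error code EGL_BAD_MATCH. The sketch below decodes such codes from a log line. The table lists the core EGL error values from EGL/egl.h, while egl_error_name and first_damage_region_error are hypothetical helper names made up for this example and the marker string is simply copied from the log above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Map a core EGL error code to its symbolic name. */
static const char *egl_error_name(unsigned long code) {
    switch (code) {
    case 0x3000: return "EGL_SUCCESS";
    case 0x3001: return "EGL_NOT_INITIALIZED";
    case 0x3002: return "EGL_BAD_ACCESS";
    case 0x3003: return "EGL_BAD_ALLOC";
    case 0x3004: return "EGL_BAD_ATTRIBUTE";
    case 0x3005: return "EGL_BAD_CONFIG";
    case 0x3006: return "EGL_BAD_CONTEXT";
    case 0x3007: return "EGL_BAD_CURRENT_SURFACE";
    case 0x3008: return "EGL_BAD_DISPLAY";
    case 0x3009: return "EGL_BAD_MATCH";
    case 0x300A: return "EGL_BAD_NATIVE_PIXMAP";
    case 0x300B: return "EGL_BAD_NATIVE_WINDOW";
    case 0x300C: return "EGL_BAD_PARAMETER";
    case 0x300D: return "EGL_BAD_SURFACE";
    case 0x300E: return "EGL_CONTEXT_LOST";
    default:     return "unknown";
    }
}

/* Hypothetical helper: return the first hex code that follows
 * "eglSetDamageRegion: " in a log line, or 0 if the marker is absent. */
static unsigned long first_damage_region_error(const char *log) {
    static const char marker[] = "eglSetDamageRegion: ";
    assert(log != NULL);
    const char *hit = strstr(log, marker);
    if (hit == NULL) {
        return 0;
    }
    /* Base 16 lets strtoul accept the leading "0x". */
    return strtoul(hit + strlen(marker), NULL, 16);
}
```

Passing one of the log entries to first_damage_region_error and the result to egl_error_name yields the symbolic name. The name alone does not say which precondition of the call was violated.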

EDIT: Forgot to mention that sway says something along the lines of "Failed to close device 7 bad file descriptor" before the tty-switching segfault.

g4gg433 commented 2 years ago

I can switch ttys freely; it doesn't cause any problems here.

As for firefox:

This is with webrender,

Nvidia does not work well with GPU acceleration and Wayland, not even under GNOME (which is pretty much the reference implementation of Nvidia Wayland). The only program that kind of works with hardware acceleration is mpv. Everything else mostly fails silently. Disable WebRender (and anything to do with GPU acceleration) and it should work fine (it does for me, with git packages).

g4gg433 commented 2 years ago

Also, Firefox doesn't seem to like resizing very much, especially when watching a video in picture-in-picture mode, and shows these errors in the log. This is with WebRender and can be reproduced by resizing rapidly:

This is caused by the GPU running out of memory. It's a memory leak: if you resize a window, it leaks memory. Try resizing a window and watching GPU memory (nvidia-smi), then kill that window and watch GPU memory again. It works fine under GNOME.
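
The before/after check suggested above can be scripted. The following is a minimal sketch, assuming nvidia-smi is on PATH and that the first GPU is the one of interest. parse_used_mib and query_used_mib are hypothetical names chosen for this example. It only reads the used-memory figure, so it can show growth but cannot by itself prove a leak.

```c
#define _POSIX_C_SOURCE 200809L /* for popen() and pclose() */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one line of nvidia-smi output such as "1234\n" (MiB used).
 * Returns false if the line does not start with a non-negative number. */
static bool parse_used_mib(const char *line, long *out) {
    assert(line != NULL && out != NULL);
    char *end = NULL;
    long value = strtol(line, &end, 10);
    if (end == line || value < 0) {
        return false;
    }
    *out = value;
    return true;
}

/* Ask nvidia-smi for the used memory of the first GPU, in MiB.
 * Returns false if the tool is missing or prints nothing parsable. */
static bool query_used_mib(long *out) {
    FILE *pipe = popen(
        "nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits",
        "r");
    if (pipe == NULL) {
        return false;
    }
    char line[64];
    bool ok = fgets(line, sizeof line, pipe) != NULL
              && parse_used_mib(line, out);
    pclose(pipe);
    return ok;
}
```

Calling query_used_mib once before resizing, once after resizing, and once more after killing the window gives three numbers to compare. If the middle value grows with every resize and the last one drops back, that would be consistent with the leak described above.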