Closed johan-bjareholt closed 9 years ago
Any updates on this? It's the same case with velox.
I have been running the old a33ff2c commit for a while now and it works great, but I tried tackling this issue again today.
First I tried running swc with gdb like this: swc-launch -- gdb -batch -ex run ./wm 2> gdb-log
Running on /dev/tty1
[swc:libswc/drm.c:160] DEBUG: /dev/dri/card0 is the primary GPU
# find_driver: Trying DRM driver `intel'
glamor: EGL version 1.4 (DRI2):
That wasn't very interesting.
However, when I tried running it with valgrind like this: swc-launch -- valgrind ./wm 2> valgrind-log
Running on /dev/tty2
==1412== Memcheck, a memory error detector
==1412== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1412== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==1412== Command: ./wm
==1412==
[swc:libswc/drm.c:160] DEBUG: /dev/dri/card0 is the primary GPU
# find_driver: Trying DRM driver `intel'
==1412== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==1412== at 0x5B7F330: __sendmsg_nocancel (in /usr/lib/libc-2.21.so)
==1412== by 0x5063780: ??? (in /usr/lib/libwayland-server.so.0.1.0)
==1412== by 0x506192E: wl_display_flush_clients (in /usr/lib/libwayland-server.so.0.1.0)
==1412== by 0x5061987: wl_display_run (in /usr/lib/libwayland-server.so.0.1.0)
==1412== by 0x4019BC: main (in /home/johan/Dropbox/Programming/Linux/swc/example/wm)
==1412== Address 0xa7d757f is 4,127 bytes inside a block of size 16,424 alloc'd
==1412== at 0x4C29F90: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1412== by 0x50638B1: ??? (in /usr/lib/libwayland-server.so.0.1.0)
==1412== by 0x5061D12: wl_client_create (in /usr/lib/libwayland-server.so.0.1.0)
==1412== by 0x4E4BEE8: ??? (in /usr/lib/libswc.so.0.0)
==1412== by 0x4E48946: swc_initialize (in /usr/lib/libswc.so.0.0)
==1412== by 0x40197F: main (in /home/johan/Dropbox/Programming/Linux/swc/example/wm)
==1412==
glamor: EGL version 1.4 (DRI2):
The uninitialised bytes look interesting; at least they point to something.
If it is 28f2da3 that broke things for you, that means that swc-launch is not receiving the USR2 signal, which indicates that the VT has been switched to, and it therefore refuses to open any devices on behalf of the compositor.
Try adding some debugging statements in handle_usr2() in launch/launch.c to see if you can narrow things down. Is the swc-launch you are using up-to-date? Can you post some system details? Are you using systemd and systemd-logind? Maybe those are somehow messing with VT acquisition?
Also, to aid your debugging, I would make sure you have access to another system so you can ssh in and kill the ./wm process so you don't have to reboot all the time.
I’m having similar symptoms. git revision a33ff2c works for me when not using rendering nodes.
However, the git commit right after that one, i.e. 192d691e06c01ad9f99a42e6e04c0ea959a7b7ed, breaks things for me. I’m seeing the contents of tty1 (where I start swc-launch ./velox), but they are frozen, i.e. the mouse cursor does not blink anymore. There is no reaction to any input.
Here’s the output of swc-launch -- strace -f -tt -o /tmp/strace.log -s2048 ./velox in both cases:
http://t.zekjur.net/st.working.bz2 (commit a33ff2c)
http://t.zekjur.net/st.broken.bz2 (commit 192d691e06c01ad9f99a42e6e04c0ea959a7b7ed)
I’m testing this on a ThinkPad X200 (i.e. intel graphics card) with Debian testing.
These are the package versions of all shared libraries that velox depends on:
for lib in $(ldd ./velox | cut -d '>' -f 2 | sed 's/^\s*//g' | cut -d ' ' -f 1); do dpkg -S $lib 2>&- | cut -d ':' -f 1; done > /tmp/libs
for lib in $(sort /tmp/libs | uniq); do dpkg -l "${lib}:amd64" | tail -1; done
ii libc6:amd64 2.19-15 amd64 GNU C Library: Shared libraries
ii libdrm2:amd64 2.4.58-2 amd64 Userspace interface to kernel DRM services -- runtime
ii libdrm-intel1:amd64 2.4.58-2 amd64 Userspace interface to intel-specific kernel DRM services -- runtime
ii libdrm-nouveau2:amd64 2.4.58-2 amd64 Userspace interface to nouveau-specific kernel DRM services -- runtime
ii libevdev2 1.3+dfsg-1 amd64 wrapper library for evdev devices
ii libexpat1:amd64 2.1.0-6+b3 amd64 XML parsing C library - runtime library
ii libffi6:amd64 3.1-2+b2 amd64 Foreign Function Interface library runtime
ii libfontconfig1:amd64 2.11.0-6.3 amd64 generic font configuration library - runtime
ii libfreetype6:amd64 2.5.2-3 amd64 FreeType 2 font engine, shared library files
ii libinput5:amd64 0.6.0+dfsg-2 amd64 input device management and event handling library - shared library
ii libmtdev1:amd64 1.1.5-1 amd64 Multitouch Protocol Translation Library - shared library
ii libpciaccess0:amd64 0.13.2-3+b1 amd64 Generic PCI access library for X
ii libpixman-1-0:amd64 0.32.6-3 amd64 pixel-manipulation library for X and cairo
ii libpng12-0:amd64 1.2.50-2+b2 amd64 PNG library - runtime
ii libudev1:amd64 215-12 amd64 libudev shared library
ii libwayland-client0:amd64 1.6.0-2 amd64 wayland compositor infrastructure - client library
ii libwayland-server0:amd64 1.6.0-2 amd64 wayland compositor infrastructure - server library
ii libxau6:amd64 1:1.0.8-1 amd64 X11 authorisation library
ii libxcb1:amd64 1.10-3+b1 amd64 X C Binding
ii libxcb-composite0:amd64 1.10-3+b1 amd64 X C Binding, composite extension
ii libxcb-icccm4:amd64 0.4.1-1 amd64 utility libraries for X C Binding -- icccm
ii libxcb-render0:amd64 1.10-3+b1 amd64 X C Binding, render extension
ii libxcb-shape0:amd64 1.10-3+b1 amd64 X C Binding, shape extension
ii libxcb-xfixes0:amd64 1.10-3+b1 amd64 X C Binding, xfixes extension
ii libxdmcp6:amd64 1:1.1.1-1+b1 amd64 X11 Display Manager Control Protocol library
ii libxkbcommon0:amd64 0.4.3-2 amd64 library interface to the XKB compiler - shared library
ii zlib1g:amd64 1:1.2.8.dfsg-2+b1 amd64 compression library - runtime
Let me know if you need any more information.
Can you check whether or not the handle_usr2 function in launch/launch.c is being called? It should be called when the kernel sends SIGUSR2 to the launcher as the new VT gets switched to.
I’ve modified launch/launch.c like this:
--- i/launch/launch.c
+++ w/launch/launch.c
@@ -184,6 +184,8 @@ static void handle_usr2(int signal)
 {
     struct swc_launch_event event = { .type = SWC_LAUNCH_EVENT_ACTIVATE };
+fprintf(stderr, "handle_usr2\n");
+
     ioctl(launcher.tty_fd, VT_RELDISP, VT_ACKACQ);
     start_devices();
     send(launcher.socket, &event, sizeof event, 0);
With git revision a33ff2c, I don’t see that message in stderr. I’ve added another message to make sure I’m not doing something wrong in installing swc or printing messages, and I do see that other message. So, no, handle_usr2() is not being called.
I then did the same for git revision 192d691, and handle_usr2() isn’t called in that revision either.
Let me know if you need more information.
Simply calling handle_usr2(SIGUSR2) directly instead of having it called through sigaction works great. We should probably fix the sigaction path instead of relying on this dirty trick though, so we'll have to continue investigating.
Thanks, that really helps narrow it down.
I'm still not sure about the exact cause, but the VT mode that registers the acquire/release signals is set here, and the new VT is activated here.
The only thing I can think of is that maybe the new VT is the same as the old one, and the VT_ACTIVATE call doesn't trigger the acquire signal.
Could you try adding print statements for the values of vt and original_vt_state.vt in setup_tty? You should be able to debug with swc-launch -- sleep 2 without having to go to great lengths to recover your display.
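The comparison suggested above could be sketched roughly as follows. This is a hypothetical helper, not part of swc: vt_already_active and its parameters are made-up names, and it only uses the standard Linux VT_GETSTATE ioctl to read the active VT. Whether VT_ACTIVATE really skips the acquire signal in that case is the hypothesis being tested here, not something the sketch proves.

```c
#include <linux/vt.h>
#include <sys/ioctl.h>

/* Hypothetical helper: reports whether target_vt is already the active VT.
 * Returns 1 if it is, 0 if it is not, and -1 if the VT state could not be
 * queried (for example because tty_fd is not a virtual terminal). */
int vt_already_active(int tty_fd, unsigned target_vt)
{
    struct vt_stat state;

    if (ioctl(tty_fd, VT_GETSTATE, &state) != 0)
        return -1;
    return state.v_active == target_vt ? 1 : 0;
}
```

A result of 1 just before the VT_ACTIVATE call would be consistent with the "same VT" explanation.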
Can you check if the launch_activate_fix branch (e7582a3cd68c3512be0f4fef93e637daed68193e) fixes your issue?
@michaelforney That branch indeed does fix the issue for me. Thanks!
@michaelforney Works for me too! Would be nice if you could merge and close.
So after updating swc, swc freezes my computer when starting the example wm. I don't know if the whole program freezes, but my screen freezes and I cannot change tty and cannot see anything responding. After trying out different commits and rebooting my computer a few times, this is how they work respectively on my machine. The "partially broke" commit is a little strange: the cursor works and exiting with mod+q works, but nothing other than the mouse cursor is drawn, so I can still see the terminal text at the same time as the cursor, which is a pretty cool artifact. However, I cannot open the terminal (or any other application, I would guess, since I cannot get dmenu-wl to run either) like on the working commit. The broke commit fully freezes, as I said earlier.
broke: 28f2da3b561eb03384d5bdb3b3361dd2b47f4194
partially broke: 1bd1820e59f4a9af588f9917e923346bb7d06e6a
working: a33ff2c82819a30afed91d74feb9a7fde3ed9860
I could only try this out on my laptop with intel graphics, since the only other computer I have available has nvidia graphics, so wayland isn't available.
I have a pull request ready for #20 that compiles, but I cannot try it out because of this.