Closed colfrog closed 8 months ago
Does it work if you turn off SBCL's floating point exception handling? Add #+sbcl (sb-int:set-floating-point-modes :traps nil) as the first line of the run-server function in main.lisp:
(defun run-server ()
  #+sbcl
  (sb-int:set-floating-point-modes :traps nil)
  ;; rest of the code
  )
I've actually run into this problem before: https://github.com/swaywm/wlroots/issues/1170. SBCL uses stricter floating point error handling than is normally enabled, which triggers exceptions in code that runs fine in other programs. We will need to double-check our code to make sure it isn't doing anything to cause this, but my guess is that it is an "issue" with the radeon driver or wlroots.
You can check for certain by running sway, tinywl, or another wlroots-based compositor under gdb with floating point exception handling turned on. If the exception occurs, it's probably not something we are doing.
We will probably run into this again, so just turning off FPE traps is probably the way to go.
Disabling traps on floating point exceptions works! I can run mahogany with the GLES2 renderer on X11 without issues.
However, running Sway under LLDB with pass disabled for SIGFPE still works, so there is likely an issue with hrt_server, but I can't begin to understand what it is.
Running mahogany with LLDB (and GDB) produces segmentation faults (Invalid permissions for mapped object). What is your recommended way to debug heart?
I've honestly not had to debug an issue like this; last time, I just turned the signal trap off and left it as is.
I just remembered that there is an example file at https://github.com/stumpwm/mahogany/blob/master/heart/example/main.c that does the bare minimum initialization (the executable should be at build/heart/example/). You could try using that to narrow down the problem, but I bet it won't trigger the exception and you'll have to manually walk through stack frames. If it does work, I'd just attribute the issue to how CL is more strict about floating point behavior than C.
For regular Lisp code, I run mahogany interactively and use the debugger built into SBCL. Examining the stack frames might give you more information, but I don't think SBCL can read C debugger information. If you wanted to try something completely different, Clasp is supposed to play really nicely with gdb, but I haven't tried it myself and don't know if it runs on BSD.
The heart example runs without issues even when SIGFPE is set to nopass. I think #46 is the right approach to solving this.
This suggests that SBCL is more sensitive to floating point exceptions: it traps conditions for which C programs never receive a signal.
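To illustrate the difference, here is a minimal C sketch. It assumes a hosted C99 implementation that follows IEC 60559 (Annex F), where trapping is disabled on all floating point exceptions at program startup. The function names divide_unfolded and division_by_zero_is_masked are made up for this example and are not part of heart or wlroots. It shows that a floating point division by zero in plain C quietly yields infinity instead of raising SIGFPE, which is the behavior that changes once traps are enabled, as SBCL does.

```c
#include <math.h>

/* Hypothetical helper: divide at run time. The volatile copies keep the
   compiler from folding the division into a constant at compile time. */
double divide_unfolded(double num, double den)
{
    volatile double n = num;
    volatile double d = den;
    return n / d;
}

/* Hypothetical check: returns 1 if 1.0 / 0.0 produced +infinity.
   Assumption: the default floating point environment is in effect, so the
   divide-by-zero exception is masked. If the trap were enabled, the process
   would receive SIGFPE on the division and never reach the return. */
int division_by_zero_is_masked(void)
{
    double q = divide_unfolded(1.0, 0.0);
    return isinf(q) && q > 0.0;
}
```

Calling division_by_zero_is_masked() from a small main and printing the result should be enough to check a given platform. With the divide-by-zero trap enabled, the same division would kill the process with SIGFPE instead.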
Closed via #46.
The WM runs fine under X11 when the pixman renderer is forced, but with GLES2 it fails in run-server, specifically in heart's hrt_server_start. It initializes the state correctly, is using amdgpu, and reaches "Initialized heart state". The last wlroots log is "00:00:00.138 [render/gles2/renderer.c:149] Created GL FBO for buffer 1024x768". Then I get the division by zero error in hrt-server-start. The first stack trace:
Then it fails again in hrt-server-finish. The second stack trace:
What's weird to me is that it calls radeon_drm_winsys_create, while I'm on an amdgpu card. It should be calling amdgpu_winsys_create instead. But EGL was initialized for AMDGPU.
I think that this is an issue with Mahogany because sway runs just fine on X11 with the GLES2 renderer.
This seems to be an issue with the hrt_server in heart and the way it initializes the renderer and DRM.