resttime / cl-liballegro

Common Lisp bindings and interface to the Allegro 5 game programming library
zlib License

al:create-display doesn't work on Mac #27

Closed kchanqvq closed 2 years ago

kchanqvq commented 3 years ago

Following notes in https://github.com/resttime/cl-liballegro/pull/18 I tried the following

(ql:quickload "cl-liballegro")
(defvar display)
(cffi:defcallback main :void ()
  (al:init)                                    ; al_init();
  (al:init-primitives-addon)                   ; al_init_primitives_addon()
  (al:set-new-display-flags '(:windowed :resizable :opengl)) ; al_set_new_display_flags(ALLEGRO_WINDOWED | ALLEGRO_RESIZABLE | ALLEGRO_OPENGL);
  (al:set-new-display-option :vsync 0 :require) ; al_set_new_display_option(ALLEGRO_VSYNC, 0, ALLEGRO_REQUIRE);
  (setf display (al:create-display 800 600))
  (al:uninstall-system))
(al:run-main 0 (cffi:null-pointer) (cffi:callback main))

It fails with a FLOATING-POINT-INVALID-OPERATION error. Backtrace:

0: ("bogus stack frame")
1: ("foreign function: __invokeRunLoopInModeForDuration_block_invoke_2")
2: ("foreign function: invokeRunLoopInModeForDuration")
3: ("foreign function: __29-[NSCFRunLoopSemaphore wait:]_block_invoke_2")
4: ("foreign function: __29-[NSCFRunLoopSemaphore wait:]_block_invoke")
5: ("foreign function: +[NSCFRunLoopSemaphore _observe:whilePerforming:]")
6: ("foreign function: -[NSCFRunLoopSemaphore wait:]")
7: ("foreign function: -[NSCFRunLoopSemaphore wait]")
8: ("foreign function: _ensureAuxServiceAwareOfHostApp")
9: ("foreign function: _dispatch_call_block_and_release")
10: ("foreign function: _dispatch_client_callout")
11: ("foreign function: _dispatch_main_queue_callback_4CF")
12: ("foreign function: __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__")
13: ("foreign function: __CFRunLoopRun")
14: ("foreign function: CFRunLoopRunSpecific")
15: ("foreign function: RunCurrentEventLoopInMode")
16: ("foreign function: ReceiveNextEventCommon")
17: ("foreign function: _BlockUntilNextEventMatchingListInModeWithFilter")
18: ("foreign function: _DPSNextEvent")
19: ("foreign function: -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:]")
20: ("foreign function: -[NSApplication run]")
21: ("foreign function: _al_osx_run_main")
22: (CL-LIBALLEGRO:RUN-MAIN :INVALID-VALUE-FOR-UNESCAPED-REGISTER-STORAGE #.(SB-SYS:INT-SAP #X00000000) :INVALID-VALUE-FOR-UNESCAPED-REGISTER-STORAGE)
23: (SB-INT:SIMPLE-EVAL-IN-LEXENV (CL-LIBALLEGRO:RUN-MAIN 0 (CFFI-SYS:NULL-POINTER) (CFFI:CALLBACK MAIN)) #<NULL-LEXENV>)
24: (EVAL (CL-LIBALLEGRO:RUN-MAIN 0 (CFFI-SYS:NULL-POINTER) (CFFI:CALLBACK MAIN)))
25: (INTERACTIVE-EVAL (CL-LIBALLEGRO:RUN-MAIN 0 (CFFI-SYS:NULL-POINTER) (CFFI:CALLBACK MAIN)) :EVAL NIL)
26: (SB-IMPL::REPL-FUN NIL)
27: ((LAMBDA NIL :IN SB-IMPL::TOPLEVEL-REPL))
28: (SB-IMPL::%WITH-REBOUND-IO-SYNTAX #<CLOSURE (LAMBDA NIL :IN SB-IMPL::TOPLEVEL-REPL) {1002AB50DB}>)
29: (SB-IMPL::TOPLEVEL-REPL NIL)
30: (SB-IMPL::TOPLEVEL-INIT)
31: ((FLET SB-UNIX::BODY :IN SB-IMPL::START-LISP))
32: ((FLET "WITHOUT-INTERRUPTS-BODY-1" :IN SB-IMPL::START-LISP))
33: (SB-IMPL::START-LISP)

If I remove the create-display call, no error is raised.

The above was run from SBCL in a terminal. When run from SLY in Emacs, it drops into ldb with the following log:

2021-03-30 21:58:35.845 sbcl[77442:31477097] *** Assertion failure in +[NSUndoManager _endTopLevelGroupings], /BuildRoot/Library/Caches/com.apple.xbs/Sources/Foundation/Foundation-1575.300/Foundation/Misc.subproj/NSUndoManager.m:361
2021-03-30 21:58:35.845 sbcl[77442:31477097] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: '+[NSUndoManager(NSInternal) _endTopLevelGroupings] is only safe to invoke on the main thread.'
*** First throw call stack:
(
)
libc++abi.dylib: terminating with uncaught exception of type NSException
fatal error encountered in SBCL pid 77442 pthread 0x700001aee000:
SIGABRT received.

I'm on macOS 10.14.6 and liballegro 5.2.7.0.

@lockie do you have a minimal example working under macOS?

kchanqvq commented 3 years ago

Following https://github.com/resttime/cl-liballegro/issues/8 I tried using the lispy-interface example.

Running (main) gives a memory fault at 0x0, with this backtrace:

Backtrace for: #<SB-THREAD:THREAD "main thread" RUNNING {1001580143}>
0: (SB-DI::SUB-ACCESS-DEBUG-VAR-SLOT #.(SB-SYS:INT-SAP #X0E05F680) 72 #<SB-ALIEN-INTERNALS:ALIEN-VALUE :SAP #X0E05F160 :TYPE (* (SB-ALIEN:STRUCT SB-VM::OS-CONTEXT-T-STRUCT))>)
1: (SB-KERNEL:INTERNAL-ERROR #.(SB-SYS:INT-SAP #X0E05F160) #<unused argument>)
2: ("foreign function: call_into_lisp")
3: ("foreign function: funcall2")
4: ("foreign function: interrupt_internal_error")
5: ("foreign function: signal_emulation_wrapper")
6: ("bogus stack frame")
7: ("foreign function: _dispatch_sync_f_slow")
8: ("foreign function: create_display_win")
9: ("foreign function: al_create_display")
10: (CL-LIBALLEGRO:CREATE-DISPLAY :INVALID-VALUE-FOR-UNESCAPED-REGISTER-STORAGE :INVALID-VALUE-FOR-UNESCAPED-REGISTER-STORAGE)
11: ((:METHOD CL-LIBALLEGRO:INITIALIZE-DISPLAY (T)) #<WINDOW {10023F6163}>) [fast-method]
12: ((:METHOD CL-LIBALLEGRO::INITIALIZE-SYSTEM (T)) #<WINDOW {10023F6163}>) [fast-method]
13: (CL-LIBALLEGRO:RUN-SYSTEM #<WINDOW {10023F6163}>)
14: (SB-INT:SIMPLE-EVAL-IN-LEXENV (MAIN) #<NULL-LEXENV>)
15: (EVAL (MAIN))
16: (INTERACTIVE-EVAL (MAIN) :EVAL NIL)
17: (SB-IMPL::REPL-FUN NIL)
18: ((LAMBDA NIL :IN SB-IMPL::TOPLEVEL-REPL))
19: (SB-IMPL::%WITH-REBOUND-IO-SYNTAX #<CLOSURE (LAMBDA NIL :IN SB-IMPL::TOPLEVEL-REPL) {100227FB4B}>)
20: (SB-IMPL::TOPLEVEL-REPL NIL)
21: (SB-IMPL::TOPLEVEL-INIT)
22: ((FLET SB-UNIX::BODY :IN SB-IMPL::START-LISP))
23: ((FLET "WITHOUT-INTERRUPTS-BODY-1" :IN SB-IMPL::START-LISP))
24: (SB-IMPL::START-LISP)

Running it via al:run-main gives FLOATING-POINT-INVALID-OPERATION with a backtrace similar to the minimal example's.

resttime commented 3 years ago

Without an OSX machine I can't provide much support, unfortunately. OSX tends to be wonky with threads and GUI stuff. One thing that has come up before does come to mind, though.

First is experimenting with SB-INT:SET-FLOATING-POINT-MODES or SB-INT:WITH-FLOAT-TRAPS-MASKED, which I think might be the issue, since it looks like the interface hasn't been updated to use AL:RUN-MAIN on OSX. The offending forms need to have floating point traps masked for them. On SBCL:

;; Sets traps globally. :traps lists the traps left ENABLED,
;; so an empty list masks all of them.
(sb-int:set-floating-point-modes :traps '())
;; Do whatever after this

or

;; Masks traps for the forms that the macro wraps
(sb-int:with-float-traps-masked (:invalid :inexact :overflow)
  (cffi:defcallback main :void ()
    (al:init)                                    ; al_init();
    (al:init-primitives-addon)                   ; al_init_primitives_addon()
    (al:set-new-display-flags '(:windowed :resizable :opengl)) ; al_set_new_display_flags(ALLEGRO_WINDOWED | ALLEGRO_RESIZABLE | ALLEGRO_OPENGL);
    (al:set-new-display-option :vsync 0 :require) ; al_set_new_display_option(ALLEGRO_VSYNC, 0, ALLEGRO_REQUIRE);
    (setf display (al:create-display 800 600))
    (al:uninstall-system))
  (al:run-main 0 (cffi:null-pointer) (cffi:callback main)))
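
As a minimal, Allegro-free sketch of what masking changes (assuming SBCL; divide-masked is a hypothetical helper name, not part of any library): with the :divide-by-zero and :invalid traps masked, a float division by zero yields an IEEE infinity instead of signaling a Lisp error, which is presumably the behavior the foreign code expects.

```lisp
;; Sketch only: assumes SBCL. DIVIDE-MASKED is a hypothetical helper name.
(defun divide-masked (x y)
  "Divide X by Y with the :DIVIDE-BY-ZERO and :INVALID float traps masked."
  (sb-int:with-float-traps-masked (:divide-by-zero :invalid)
    (/ x y)))

;; With the traps masked, dividing by zero returns an IEEE infinity
;; instead of signaling DIVISION-BY-ZERO.
(print (sb-ext:float-infinity-p (divide-masked 1.0 0.0)))
```

The mask only applies for the dynamic extent of the wrapped forms, so it has to surround the call that actually enters the foreign code (here, al:run-main).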

Second, there happens to be a portability library called float-features, written by @Shinmera, which @lockie uses in d2clone-kit, for example. It can be used in place of the macro example I've written above.

Let me know how it goes, I'll do what I can to fix the library (or add documentation) if a solution is found :+1:

kchanqvq commented 3 years ago

Ha, this works, thanks! I read the lispy-interface source and thought it masked the traps, but it still somehow doesn't work. I think initialize-system calls some stuff that also needs to be wrapped in with-float-traps-masked (currently only system-loop is). I'll test it tomorrow.

lockie commented 3 years ago

Right, float-features to the rescue. The general rule is to mask those nasty FP traps as early as possible; I think the most appropriate way is just to do something like

(float-features:with-float-traps-masked
    (:divide-by-zero :invalid :inexact :overflow :underflow)
  (al:run-main 0 (cffi:null-pointer) (cffi:callback my-main)))

kchanqvq commented 3 years ago

I added the following to interface.lisp and lispy-interface now works from the terminal.

(defvar *system*)
(defcallback run-system-main :void ()
  (initialize-system *system*)
  (unwind-protect
       (system-loop *system*)
    (al:destroy-display (display *system*))
    (al:destroy-event-queue (event-queue *system*))
    (al:stop-samples)
    (cffi:foreign-free (event *system*))
    (al:uninstall-system)))
(defun run-system (system)
  (setq *system* system)
  (float-features:with-float-traps-masked
      (:divide-by-zero :invalid :inexact :overflow :underflow)
    (run-main 0 (null-pointer) (callback run-system-main))))

It still crashes from Emacs, complaining: *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: '+[NSUndoManager(NSInternal) _endTopLevelGroupings] is only safe to invoke on the main thread.'

Apparently something has to be run from the main thread. Let me try wrapping it with trivial-main-thread.

kchanqvq commented 3 years ago

Ha,

(defun run-system (system)
  (setq *system* system)
  (trivial-main-thread:with-body-in-main-thread ()
    (float-features:with-float-traps-masked
        (:divide-by-zero :invalid :inexact :overflow :underflow)
      (run-main 0 (null-pointer) (callback run-system-main)))))

works from Emacs now.

The only remaining problem is that it brings down the whole Lisp process after I close the window, which disrupts the normal live-system workflow a lot. Any ideas on that?

lockie commented 3 years ago

I would argue the initial problem (Terminating app due to uncaught exception 'NSInternalInconsistencyException') is related to the way SLY/SLIME handles child processes: most probably it creates an extra thread in which it feeds the commands from the REPL, the bits of code from C-c C-c etc., and al:run-main needs to be run in the main thread. I'm not a big elisp expert though. Try asking Shinmera about trivial-main-thread; perhaps it could be made more compatible with Emacs in that regard.
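
One way to check the thread hypothesis directly is to ask SBCL which thread a form is being evaluated on. The following is a minimal sketch, assuming SBCL; report-thread is a hypothetical helper name, not part of cl-liballegro or trivial-main-thread.

```lisp
;; Sketch only: assumes SBCL. REPORT-THREAD is a hypothetical helper name.
(defun report-thread ()
  "Print the current thread's name and whether it is SBCL's main thread."
  (let ((on-main (sb-thread:main-thread-p)))
    (format t "~&Thread ~S, main thread: ~:[no~;yes~]~%"
            (sb-thread:thread-name sb-thread:*current-thread*)
            on-main)
    on-main))

(report-thread)
```

Evaluated in a plain terminal REPL this would be expected to report the main thread. If it reports a different thread when evaluated from a SLY/SLIME REPL, that would be consistent with the NSInternalInconsistencyException above.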

kchanqvq commented 3 years ago

Yes, I've figured that out in https://github.com/resttime/cl-liballegro/issues/27#issuecomment-811369835

Now the only remaining issue is that the whole Lisp process dies after the window closes.

resttime commented 3 years ago

There are two things that I can think of to try.

First would be trying the Clozure Common Lisp (CCL) implementation. I dunno if it's true but I've always been under the impression CCL is better supported on OSX.

Second would be to run the Common Lisp image as a separate process, start a swank server in it, and then connect to that via SLY/SLIME. The documentation on connecting to SLIME remotely should have the info needed.
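
A minimal sketch of that second approach, assuming Quicklisp is installed and port 4005 is free (the file name start-swank.lisp is hypothetical):

```lisp
;; Sketch only: assumes Quicklisp is installed and that port 4005 is free.
;; Save as e.g. start-swank.lisp (hypothetical file name) and run:
;;   sbcl --load start-swank.lisp
(ql:quickload "swank")

;; :DONT-CLOSE T keeps the server listening after a client disconnects.
(swank:create-server :port 4005 :dont-close t)
```

From Emacs, M-x slime-connect with host 127.0.0.1 and port 4005 should then attach to the running image; SLY has an analogous slynk system and M-x sly-connect. Forms evaluated over the connection may still run on a server worker thread rather than the main thread, so the trivial-main-thread wrapper may still be needed.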

resttime commented 3 years ago

For now more documentation has been added to the gh-pages branch surrounding this: https://resttime.github.io/cl-liballegro/

(I should prob combine this with the master branch README at some point actually)

kchanqvq commented 3 years ago

I think the problem of the whole Lisp process dying is unrelated to SLY/SLIME, because it also happens when running from the terminal.

Just in case, does this also happen on other platforms (so is it a bug or a feature)?

resttime commented 3 years ago

Makes sense, and the issue doesn't appear on Linux with cl-liballegro. On Windows I believe it's the same.

What's left that I can think of is to investigate other Common Lisp libraries in the GUI/Graphics space that support OSX. They have also encountered OSX issues and may have workarounds. And as lockie suggested, Shinmera might be able to help.

resttime commented 2 years ago

Oh, I think the PR #28 fixes this so the issue can be closed.