massung / r-cade

Retro Game Engine for Racket
https://r-cade.io

Segmentation fault running r-cade on Raspbian #26

Open wu-lee opened 4 years ago

wu-lee commented 4 years ago

I've been attempting to get r-cade running on a Kano, which is essentially a Raspberry Pi running (a version of) Raspbian. Specifically:

The problem: I get a segmentation fault when I run any of the r-cade examples. See output below. The window opens, but nothing appears in it. After a few seconds, it disappears when the program dumps core.

I'm trying to track this down, but perhaps you might be able to offer some clues diagnosing this?

Here is the console output:

$ /opt/bin/racket twinkle.rkt 
Setting vertical sync not supported
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
SIGSEGV MAPERR si_code 1 fault on addr 0xac
Aborted (core dumped)

I think the "vertical sync not supported" message can be ignored, as working cases of SFML print it too (see below). Likewise, I think the "server socket err" message is caused by the absence of a Jack server. So I don't think either of these is related to the segmentation fault.

Aside: the stock version of Racket for Raspbian 9.6 is in the 6.x series and has other problems (no definition of vector-equals?, which I infer means r-cade needs a 7.x version). The stock version of libcsfml has problems too (see #5). Hence I've had to work around these by getting newer packages elsewhere.

I've attempted to get a stack trace from these crashes, without much success so far. For example (core from running the hello-world program example in the r-cade tutorials):

$ gdb /opt/bin/racket /var/tmp/core-hello.rkt-6-1003-1003-1590500685.dump
(gdb) 
#0  0x76d5c45c in raise () from /lib/arm-linux-gnueabihf/libc.so.6
No symbol table info available.
#1  0x76d5d824 in abort () from /lib/arm-linux-gnueabihf/libc.so.6
No symbol table info available.
#2  0x00000020 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
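
For anyone wanting to repeat this, the same information can be pulled from a core file non-interactively. The following is only a sketch: `core_backtrace` is a hypothetical helper name and the `GDB` override is an assumption of mine, not something Racket or r-cade provides. It also lists the shared libraries that were loaded, which may help show whether libcsfml or the Jack client library was involved. With a racket binary that has no symbol table, the Racket frames will probably still show up as `??`.

```shell
# Sketch only: core_backtrace is a hypothetical helper, not part of Racket or r-cade.
# GDB may be overridden (for example with a cross-debugger); it defaults to gdb.
core_backtrace() {
  prog=$1   # executable that crashed, e.g. /opt/bin/racket
  core=$2   # core file written by the crash
  # -batch runs the -ex commands in order and then exits.
  "${GDB:-gdb}" -batch \
    -ex 'info sharedlibrary' \
    -ex 'thread apply all bt full' \
    "$prog" "$core"
}
```

Usage would be along the lines of `core_backtrace /opt/bin/racket /var/tmp/core-hello.rkt-6-1003-1003-1590500685.dump > backtrace.txt`.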

Valgrind (this time running twinkle.rkt) seems to offer a bit more of a clue, but not much:

$ valgrind -v /opt/bin/racket twinkle.rkt 
==6177== Memcheck, a memory error detector
==6177== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==6177== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==6177== Command: /opt/bin/racket twinkle.rkt
==6177== 
--6177-- Valgrind options:
--6177--    -v
--6177-- Contents of /proc/version:
--6177--   Linux version 4.14.79-v7+ (dc4@dc4-XPS13-9333) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611)) #1159 SMP Sun Nov 4 17:50:20 GMT 2018
--6177-- 
--6177-- Arch and hwcaps: ARM, LittleEndian, ARMv8-neon-vfp
--6177-- Page sizes: currently 4096, max supported 4096
--6177-- Valgrind library directory: /usr/lib/valgrind
--6177-- Reading syms from /opt/bin/racket
--6177--    object doesn't have a symbol table
--6177--   Reading EXIDX entries: 11 available
==6177==   Warning: whilst reading EXIDX: ExtabEntryDecode: failed with error code: -10
==6177==   Warning: whilst reading EXIDX: ExtabEntryDecode: failed with error code: -10
==6177==   Warning: whilst reading EXIDX: ExtabEntryDecode: failed with error code: -10
==6177==   Warning: whilst reading EXIDX: ExtabEntryDecode: failed with error code: -10
==6177==   Warning: whilst reading EXIDX: Implausible EXIDX last entry size 161599; using 1 instead.
--6177--   Reading EXIDX entries: 7 attempted, 3 successful
--6177-- Reading syms from /lib/arm-linux-gnueabihf/ld-2.24.so
--6177--   Considering /usr/lib/debug/.build-id/97/ea942b1c123793352877a2fdb1197465de7fd7.debug ..
--6177--   .. build-id is valid
--6177-- Scheduler: using generic scheduler lock implementation.
--6177-- Reading suppressions file: /usr/lib/valgrind/default.supp
==6177== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-6177-by-Nick-on-???
==6177== embedded gdbserver: writing to   /tmp/vgdb-pipe-to-vgdb-from-6177-by-Nick-on-???
==6177== embedded gdbserver: shared mem   /tmp/vgdb-pipe-shared-mem-vgdb-6177-by-Nick-on-???
==6177== 
==6177== TO CONTROL THIS PROCESS USING vgdb (which you probably
==6177== don't want to do, unless you know exactly what you're doing,
==6177== or are doing some strange experiment):
==6177==   /usr/lib/valgrind/../../bin/vgdb --pid=6177 ...command...
==6177== 
==6177== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==6177==   /path/to/gdb /opt/bin/racket
==6177== and then give GDB the following command
==6177==   target remote | /usr/lib/valgrind/../../bin/vgdb --pid=6177
==6177== --pid is optional if only one valgrind process is running
==6177== 
--6177-- REDIR: 0x401af80 (ld-linux-armhf.so.3:strlen) redirected to 0x58057098 (???)
--6177-- REDIR: 0x401b9e0 (ld-linux-armhf.so.3:memcpy) redirected to 0x580570c4 (???)
--6177-- REDIR: 0x401ab6c (ld-linux-armhf.so.3:strcmp) redirected to 0x580571d0 (???)
--6177-- Reading syms from /usr/lib/valgrind/vgpreload_core-arm-linux.so
--6177--   Considering /usr/lib/valgrind/vgpreload_core-arm-linux.so ..
--6177--   .. CRC mismatch (computed e4037255 wanted 2e14c213)
--6177--    object doesn't have a symbol table
--6177-- Reading syms from /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so
--6177--   Considering /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so ..
--6177--   .. CRC mismatch (computed 62fde5dd wanted ec35a666)
--6177--    object doesn't have a symbol table
--6177-- Reading syms from /usr/lib/arm-linux-gnueabihf/libarmmem.so
--6177--    object doesn't have a symbol table
--6177-- Reading syms from /lib/arm-linux-gnueabihf/libdl-2.24.so
--6177--   Considering /usr/lib/debug/.build-id/bb/7b7177de87be11ac17ae040eae7ee041ad33f7.debug ..
--6177--   .. build-id is valid
--6177-- Reading syms from /lib/arm-linux-gnueabihf/libm-2.24.so
--6177--   Considering /usr/lib/debug/.build-id/a4/bad95ff6ac92945217b0bc96043090af61af91.debug ..
--6177--   .. build-id is valid
--6177-- Reading syms from /lib/arm-linux-gnueabihf/libgcc_s.so.1
--6177--    object doesn't have a symbol table
--6177--   Reading EXIDX entries: 34 available
--6177--   Reading EXIDX entries: 21 attempted, 21 successful
--6177-- Reading syms from /lib/arm-linux-gnueabihf/libc-2.24.so
--6177--   Considering /usr/lib/debug/.build-id/03/11755699fcc430cadc85f73d9aad326cd758a8.debug ..
--6177--   .. build-id is valid
--6177-- REDIR: 0x49c2f40 (libc.so.6:rindex) redirected to 0x484a9f4 (rindex)
--6177-- REDIR: 0x49be30c (libc.so.6:malloc) redirected to 0x48474dc (malloc)
--6177-- REDIR: 0x49c3f2c (libc.so.6:memchr) redirected to 0x484c704 (memchr)
--6177-- REDIR: 0x49c4450 (libc.so.6:memmove) redirected to 0x484ef9c (memmove)
--6177-- REDIR: 0x49c4790 (libc.so.6:memset) redirected to 0x484eeb0 (memset)
--6177-- REDIR: 0x49c4ae0 (libc.so.6:memcpy) redirected to 0x484ce44 (memcpy)
--6177-- REDIR: 0x49c2be0 (libc.so.6:strlen) redirected to 0x484b1e4 (strlen)
--6177-- REDIR: 0x49be964 (libc.so.6:free) redirected to 0x4848b00 (free)
--6177-- REDIR: 0x49c39c8 (libc.so.6:strstr) redirected to 0x4850220 (strstr)
--6177-- REDIR: 0x49c23fc (libc.so.6:strcmp) redirected to 0x484c434 (strcmp)
--6177-- REDIR: 0x49c2da8 (libc.so.6:strncmp) redirected to 0x484bb1c (strncmp)
--6177-- REDIR: 0x49bedb0 (libc.so.6:calloc) redirected to 0x4849c28 (calloc)
--6177-- REDIR: 0x49c667c (libc.so.6:strchrnul) redirected to 0x484f8ac (strchrnul)
--6177-- REDIR: 0x49c2c60 (libc.so.6:strnlen) redirected to 0x484b12c (strnlen)
==6177== Invalid write of size 4
==6177==    at 0xCDCE4: scheme_jit_add_symbol (in /opt/bin/racket)
==6177==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==6177== 
SIGSEGV MAPERR si_code 1 fault on addr 0x10
==6177== 
==6177== Process terminating with default action of signal 6 (SIGABRT): dumping core
==6177==    at 0x497845C: raise (raise.c:51)
==6177==    by 0x4979823: abort (abort.c:89)
==6177==    by 0x1CAF23: fault_handler (in /opt/bin/racket)
==6177== 
==6177== HEAP SUMMARY:
==6177==     in use at exit: 2,279,367 bytes in 1,830 blocks
==6177==   total heap usage: 1,852 allocs, 22 frees, 2,292,902 bytes allocated
==6177== 
==6177== Searching for pointers to 1,830 not-freed blocks
==6177== Checked 8,742,580 bytes
==6177== 
==6177== LEAK SUMMARY:
==6177==    definitely lost: 0 bytes in 0 blocks
==6177==    indirectly lost: 0 bytes in 0 blocks
==6177==      possibly lost: 81,920 bytes in 1 blocks
==6177==    still reachable: 2,197,447 bytes in 1,829 blocks
==6177==         suppressed: 0 bytes in 0 blocks
==6177== Rerun with --leak-check=full to see details of leaked memory
==6177== 
==6177== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 3)
==6177== 
==6177== 1 errors in context 1 of 1:
==6177== Invalid write of size 4
==6177==    at 0xCDCE4: scheme_jit_add_symbol (in /opt/bin/racket)
==6177==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==6177== 
--6177-- 
--6177-- used_suppression:      6 dl-hack3-cond-1 /usr/lib/valgrind/default.supp:1236
==6177== 
==6177== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 3)
Aborted (core dumped)
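
Most of the verbose log above is loader and redirect chatter; the only reported error is the invalid write in scheme_jit_add_symbol. As a rough sketch (the helper name `memcheck_errors` is made up for illustration), a saved log can be cut down to the error headlines, stack frames and address notes like this:

```shell
# Sketch only: memcheck_errors is a hypothetical helper name.
# Prints the invalid read/write headlines, plus any "at"/"by" stack-frame
# lines and "Address" notes, from a saved Valgrind log. Loader, REDIR and
# summary lines are dropped.
memcheck_errors() {
  grep -E '^==[0-9]+== +(Invalid (read|write) of size|at 0x|by 0x|Address 0x)' "$1"
}
```

This assumes the log was written to a file first, for example with `valgrind --log-file=vg.log /opt/bin/racket twinkle.rkt`, followed by `memcheck_errors vg.log`. Note that it also keeps the frames of the final SIGABRT trace, since those use the same "at"/"by" layout.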

Examples of SFML working: the C++ Pong example in SFML's source code runs when I compile it, as do the Racket CSFML library examples (although they might have locked up the system once). No seg fault.

I can also get a simple Racket HtDP program to run, which opens a window and renders into it, so the problem doesn't seem to be Racket or its usual graphics libraries.

However, I have run out of ideas of things to check easily for the moment!

massung commented 4 years ago

First off, this is really cool, and if you get this running I'd love to put some screenshots of it on the homepage (if you didn't mind)! :smile:

Since the Racket CSFML examples work, it most likely has to be something in the R-cade code. While bugs are possible, my gut tells me it's probably a feature being used that either isn't supported on the Raspberry Pi or that your setup isn't prepared for. Some of those features:

wu-lee commented 4 years ago

> First off, this is really cool, and if you get this running I'd love to put some screenshots of it on the homepage (if you didn't mind)! :smile:

Absolutely!

So I've checked:

[later] Hmm. Perhaps disabling audio doesn't even help with the segfault, as I get one reliably now even though I've done it.

Checking on my desktop: running ex_3.rkt in the csfml source code opens a window, and a yellow-bordered blue ball follows the mouse cursor. Examples 1 and 2 run too, although 1 doesn't even show anything and 2 just opens a window with a red background and waits for an Escape key press.

Checking these on the Pi: 1 and 2 still work as above. 3 opens a window, but the ball does not follow the mouse. Hacking in a debug print, I can see the event loop is running, and events of some sort do arrive when the mouse cursor enters the window. I've not yet worked out how to print the type of these events.

Are there any more pure csfml examples I could try running?