mrosset / scheme

evaluate guile scheme from go language

suspect sigsegv in scm_call_n #1

Open jhelberg opened 1 year ago

jhelberg commented 1 year ago

I occasionally see a crash in scm_call_n (called from scm_public_ref, which is called from Eval) and was wondering whether Guile's thread model collides with Go's goroutines. I can't reproduce the crash in a test program, but the production program runs 12 to 14 Scheme expressions and then crashes.

Before compiling Guile with -g to step through the code, I wanted to ask whether it is actually safe to mix the two kinds of coroutines/threads without taking special precautions.

The code in scm_call_n is (amd64 code):

Dump of assembler code for function scm_call_n:
   0x00007ffff7f25100 <+0>: endbr64 
   0x00007ffff7f25104 <+4>: push   %r14
   0x00007ffff7f25106 <+6>: push   %r13
   0x00007ffff7f25108 <+8>: push   %r12
   0x00007ffff7f2510a <+10>:    mov    %rdx,%r12
   0x00007ffff7f2510d <+13>:    push   %rbp
   0x00007ffff7f2510e <+14>:    mov    %rdi,%rbp
   0x00007ffff7f25111 <+17>:    push   %rbx
   0x00007ffff7f25112 <+18>:    mov    %rsi,%rbx
   0x00007ffff7f25115 <+21>:    sub    $0x100,%rsp
   0x00007ffff7f2511c <+28>:    data16 lea 0x5ad54(%rip),%rdi        # 0x7ffff7f7fe78
   0x00007ffff7f25124 <+36>:    data16 data16 rex.W call 0x7ffff7e82ab0 <__tls_get_addr@plt>
   0x00007ffff7f2512c <+44>:    lea    0x28(%rsp),%rdx
   0x00007ffff7f25131 <+49>:    mov    (%rax),%rax
   0x00007ffff7f25134 <+52>:    mov    %rax,(%rsp)
=> 0x00007ffff7f25138 <+56>:    mov    0x230(%rax),%rax
   0x00007ffff7f2513f <+63>:    mov    %rax,0x8(%rsp)
   0x00007ffff7f25144 <+68>:    sub    %rdx,%rax
   0x00007ffff7f25147 <+71>:    sar    $0x3,%rax
   0x00007ffff7f2514b <+75>:    cmp    0x5be7e(%rip),%rax        # 0x7ffff7f80fd0
   0x00007ffff7f25152 <+82>:    jbe    0x7ffff7f25165 <scm_call_n+101>
   0x00007ffff7f25154 <+84>:    mov    0x5ab45(%rip),%rax        # 0x7ffff7f7fca0

The code crashes at mov 0x230(%rax),%rax because rax contains 0. That value was loaded just before from the thread-local slot returned by __tls_get_addr, so the thread-local pointer is NULL on the thread that is executing. This looks like the stack-checking code, which uses current_thread.

If you have experienced this before, or can tell me straight away "don't use goroutines, it cannot work", please do. Otherwise I'll continue debugging, as the alternative (littlescheme) has too few features.

regards

jhelberg commented 1 year ago

I'm using go 1.20 and guile 3.0 (2.2 has the same symptoms).

jhelberg commented 1 year ago

I have a reproducible case now. Running the query is necessary, and so is iterating over the rows. The code:

    package main

    import (
        "log"
        "fmt"
        "os"
        "database/sql"

        _ "github.com/lib/pq" // registers the "postgres" driver
        "github.com/mrosset/scheme"
    )

    func main() {
        var globalschemefuncs = []string{
            "(define eerawrapper (lambda (res) (if (number? res) (number->string res) res)))",
            "(define eeravars '())",
        }
        res, _ := scheme.Eval("(version)")
        log.Printf("Scheme version %s", res.String())
        dbinfo := fmt.Sprintf("host=hh-pgsql-public.ebi.ac.uk port=5432 user=reader password=%s dbname=pfmegrnargs sslmode=disable", "NWDMCE5xdipIjRrp")
        if db, err := sql.Open("postgres", dbinfo); err != nil {
            log.Printf("(%s) Error preparing connection to database, err: %v", os.Args[0], err)
            return
        } else {
            defer db.Close()
            // Both the query and the iteration over its rows are needed to trigger the crash.
            if rows, err := db.Query("SELECT upi, taxid, ac FROM xref WHERE ac IN ('OTTHUMT00000106564.1', 'OTTHUMT00000416802.1')"); err != nil {
                log.Printf("(%s) Error running a query: %v", os.Args[0], err)
                return
            } else {
                for rows.Next() {
                }
            }
        }
        // Define the helper functions in the Scheme environment.
        for _, expr := range globalschemefuncs {
            if _, err := scheme.Eval(expr); err != nil {
                log.Printf("(%s) Error evaluating %s: %v", os.Args[0], expr, err)
            }
        }
        // Evaluate expressions repeatedly until the crash occurs.
        i := 0
        for i < 5000 {
            scheme.Eval(`(eerawrapper (set! eeravars (append eeravars '(("clock" . 1541768.762000)))))`)
            scheme.Eval(`(eerawrapper (set! eeravars (append eeravars '(("accelerometer/z" . -0.050780)))))`)
            scheme.Eval(`(eerawrapper (set! eeravars (append eeravars '(("accelerometer/y" . -1.037360)))))`)
            scheme.Eval(`(eerawrapper (set! eeravars (append eeravars '(("accelerometer/x" . -0.038940)))))`)
            scheme.Eval(`(eerawrapper (set! eeravars '()))`)
            i += 1
            log.Printf(".")
        }
    }

Running it gives:

    $ ./crash
    2023/03/27 10:02:31 Scheme version 3.0.7
    SIGSEGV: segmentation violation
    PC=0x7f3f53150138 m=5 sigcode=0
    signal arrived during cgo execution

    goroutine 1 [syscall]:
    runtime.cgocall(0x5f7590, 0xc000197d78)
        /usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc000197d50 sp=0xc000197d18 pc=0x406f9c
    github.com/mrosset/scheme._Cfunc_scm_c_public_ref(0x188b450, 0x7f3f14000bd0)
        _cgo_gotypes.go:130 +0x4d fp=0xc000197d78 sp=0xc000197d50 pc=0x5f606d
    github.com/mrosset/scheme.Eval({0x65ea95?, 0x6b2f08?})
        /media/usr/home/joost/go/src/github.com/mrosset/scheme/scheme.go:42 +0x9d fp=0xc000197e30 sp=0xc000197d78 pc=0x5f66bd
    main.main()
        /media/usr/home/joost/u/pm/proof4you/scripting/crash/schemecrash.go:33 +0x39e fp=0xc000197f80 sp=0xc000197e30 pc=0x5f72de
    runtime.main()
        /usr/local/go/src/runtime/proc.go:250 +0x207 fp=0xc000197fe0 sp=0xc000197f80 pc=0x439a67
    runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000197fe8 sp=0xc000197fe0 pc=0x467721

    goroutine 2 [force gc (idle)]:
    runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000052fb0 sp=0xc000052f90 pc=0x439e96
    runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:387
    runtime.forcegchelper()
        /usr/local/go/src/runtime/proc.go:305 +0xb0 fp=0xc000052fe0 sp=0xc000052fb0 pc=0x439cd0
    runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000052fe8 sp=0xc000052fe0 pc=0x467721
    created by runtime.init.6
        /usr/local/go/src/runtime/proc.go:293 +0x25

    goroutine 3 [GC sweep wait]:
    runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000053780 sp=0xc000053760 pc=0x439e96
    runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:387
    runtime.bgsweep(0x0?)
        /usr/local/go/src/runtime/mgcsweep.go:278 +0x8e fp=0xc0000537c8 sp=0xc000053780 pc=0x4265ee
    runtime.gcenable.func1()
        /usr/local/go/src/runtime/mgc.go:178 +0x26 fp=0xc0000537e0 sp=0xc0000537c8 pc=0x41bac6
    runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0000537e8 sp=0xc0000537e0 pc=0x467721
    created by runtime.gcenable
        /usr/local/go/src/runtime/mgc.go:178 +0x6b

    goroutine 4 [GC scavenge wait]:
    runtime.gopark(0xc00001c0e0?, 0x6af320?, 0x1?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000053f70 sp=0xc000053f50 pc=0x439e96
    runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:387
    runtime.(*scavengerState).park(0x7f51c0)
        /usr/local/go/src/runtime/mgcscavenge.go:400 +0x53 fp=0xc000053fa0 sp=0xc000053f70 pc=0x424533
    runtime.bgscavenge(0x0?)
        /usr/local/go/src/runtime/mgcscavenge.go:628 +0x45 fp=0xc000053fc8 sp=0xc000053fa0 pc=0x424b05
    runtime.gcenable.func2()
        /usr/local/go/src/runtime/mgc.go:179 +0x26 fp=0xc000053fe0 sp=0xc000053fc8 pc=0x41ba66
    runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000053fe8 sp=0xc000053fe0 pc=0x467721
    created by runtime.gcenable
        /usr/local/go/src/runtime/mgc.go:179 +0xaa

    goroutine 5 [finalizer wait]:
    runtime.gopark(0x1a0?, 0x7f5780?, 0x60?, 0x78?, 0xc000052770?)
        /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000052628 sp=0xc000052608 pc=0x439e96
    runtime.runfinq()
        /usr/local/go/src/runtime/mfinal.go:193 +0x107 fp=0xc0000527e0 sp=0xc000052628 pc=0x41ab07
    runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0000527e8 sp=0xc0000527e0 pc=0x467721
    created by runtime.createfing
        /usr/local/go/src/runtime/mfinal.go:163 +0x45

    goroutine 6 [select]:
    runtime.gopark(0xc000054788?, 0x2?, 0x8?, 0x7?, 0xc000054784?)
        /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000054610 sp=0xc0000545f0 pc=0x439e96
    runtime.selectgo(0xc000054788, 0xc000054780, 0x0?, 0x0, 0x0?, 0x1)
        /usr/local/go/src/runtime/select.go:327 +0x7be fp=0xc000054750 sp=0xc000054610 pc=0x44965e
    database/sql.(*DB).connectionOpener(0xc00010eb60, {0x6b2ed0, 0xc00007c0f0})
        /usr/local/go/src/database/sql/sql.go:1218 +0x8d fp=0xc0000547b8 sp=0xc000054750 pc=0x4cdf6d
    database/sql.OpenDB.func1()
        /usr/local/go/src/database/sql/sql.go:791 +0x2e fp=0xc0000547e0 sp=0xc0000547b8 pc=0x4ccece
    runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0000547e8 sp=0xc0000547e0 pc=0x467721
    created by database/sql.OpenDB
        /usr/local/go/src/database/sql/sql.go:791 +0x18d

    rax    0x0
    rbx    0x7f3f29ed0cf0
    rcx    0x4
    rdx    0x7f3f29ed0be8
    rdi    0x7f3f531aae78
    rsi    0x7f3f29ed0cf0
    rbp    0x7f3f28510a40
    rsp    0x7f3f29ed0bc0
    r8     0x4
    r9     0x0
    r10    0x188b455
    r11    0x4
    r12    0x3
    r13    0x7f3f285844e0
    r14    0xc0000061a0
    r15    0xc000088000
    rip    0x7f3f53150138
    rflags 0x10206
    cs     0x33
    fs     0x0
    gs     0x0
    $

mrosset commented 1 year ago

Thank you for reporting this. I apologize for the late reply. This is most likely a clash between Guile threads and Go's green threads (goroutines). I'll see if I can improve on this, since it is possible to join Guile threads.
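
One possible workaround is to funnel every evaluation through a single goroutine pinned to one OS thread with runtime.LockOSThread, so the interpreter always sees the same thread-local state. Below is a minimal sketch of that pattern, not a verified fix. The names Evaluator, NewEvaluator and evalFn are hypothetical and not part of this package. evalFn stands in for a call such as scheme.Eval, and the sketch assumes Guile gets initialized on the pinned thread, which has not been checked against this library.

```go
package main

import (
	"fmt"
	"runtime"
)

// evalResult is what the worker sends back for one expression.
type evalResult struct {
	value string
	err   error
}

// evalRequest carries one expression and the channel for its result.
type evalRequest struct {
	expr  string
	reply chan evalResult
}

// Evaluator (hypothetical name) serializes all evaluations onto one
// goroutine that stays locked to a single OS thread.
type Evaluator struct {
	requests chan evalRequest
}

// NewEvaluator starts the worker goroutine. evalFn is a stand-in for the
// real interpreter call; it always runs on the same OS thread.
func NewEvaluator(evalFn func(string) (string, error)) *Evaluator {
	e := &Evaluator{requests: make(chan evalRequest)}
	go func() {
		// Pin this goroutine to its current OS thread. It is never
		// unlocked, so the thread is dedicated to the interpreter.
		runtime.LockOSThread()
		for req := range e.requests {
			v, err := evalFn(req.expr)
			req.reply <- evalResult{value: v, err: err}
		}
	}()
	return e
}

// Eval may be called from any goroutine; the work runs on the pinned thread.
func (e *Evaluator) Eval(expr string) (string, error) {
	reply := make(chan evalResult, 1)
	e.requests <- evalRequest{expr: expr, reply: reply}
	r := <-reply
	return r.value, r.err
}

// Close stops the worker. Eval must not be called afterwards.
func (e *Evaluator) Close() { close(e.requests) }

func main() {
	// A fake evaluator is used here so the sketch runs without Guile.
	fake := func(expr string) (string, error) { return "evaluated: " + expr, nil }
	ev := NewEvaluator(fake)
	defer ev.Close()
	out, err := ev.Eval("(version)")
	fmt.Println(out, err)
}
```

In the reproduction program above, the equivalent change would be to replace the direct scheme.Eval calls with calls through such a wrapper. This would be worth testing because the crash only shows up after the database query, which gives the Go scheduler a chance to move the main goroutine to another OS thread.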