Closed dkegel-fastly closed 1 year ago
I'm seeing this locally too, but the project I'm trying to compile is big and I don't have a minimal reproduce case. I'll try to dig into it myself, but I'm a Go and Tinygo newbie.
It's crashing for me right after https://github.com/tinygo-org/tinygo/blob/dc449882ad09c60c11cef7c35914d5fbfe22a88e/compiler/symbol.go#L224 with fn.Synthetic == "Len"
(good old print debugging) and len(fn.Blocks) == 0
. The fn.Blocks
array is referenced, per the stack trace, at https://github.com/tinygo-org/tinygo/blob/dc449882ad09c60c11cef7c35914d5fbfe22a88e/compiler/compiler.go#L1219
I added print statements with https://github.com/Michael-S/tinygo/commit/55ba73851f3e4ade00d14521ed9bad1554adc716 and.. everything compiled. So my guess is that we're running into a race condition, and print statements slowed one of the coroutines enough for its dependency to finish. With my initial run that only printed to stdoud the fn.Synthetic
when len(fn.Blocks) == 0
, the output was "Len". But when I print every fn.Synthetic
then for "Len" the len(fn.Blocks) == 1
.
I'll continue to dig, but I'm guessing we've hit the limits of my knowledge right here.
Maybe this is a duplicate of https://github.com/tinygo-org/tinygo/issues/3215
oh, damn, good call. maybe we just need to update go.mod to pick up a newer golang/tools, https://github.com/golang/tools/commit/479f5c690dbddecd7cf0b961fb0f2eb035b499b4 ?
Your guess is better than mine. I'll try it later.
I could have made a mistake incorporating it into my build, but I rebuilt tinygo with https://github.com/Michael-S/tinygo/commit/3cb8aa3a35faacf4905a23b69cf32e7d009908d6 and tried to rebuild my project, and still got the same panic: runtime error: index out of range [0] with length 0
error. ( I'm building a local branch of https://github.com/Michael-S/tempest )
Edit: When I have it print the fn.Synthetic
it's about to crash on, so far I've gotten Len
, Lock
, and ToPtr
. I'm not sure if that's significant, but I'm going to keep digging from there.
Edit 2: (I don't want to keep spamming people with notifications. I'm hoping GitHub doesn't send alerts for this) I assumed Len, Lock, and ToPtr were Golang internals. But it looks like ToPtr is not. I'm not sure if that's important information.
@dkegel-fastly do you have a small reproducer for this crash? tempest is a bit big and has a bunch of dependencies
I don't, but see https://github.com/tinygo-org/tinygo/issues/3532 which this may be a dup of.
I've done more digging. The crash is in symbol.go
, as noted in the stack trace. The functions that get past that point appear to all have their definition run through go/tools:builder.go:builder.buildFunctionBody
in https://github.com/golang/tools/blob/866a6b0ff32adf32534b859fc6499a7e95f09054/go/ssa/builder.go#L2312 and the one that crashes has not.
I hope the stream of posts isn't too tedious, but I keep hoping a Go-guru will jump in with an "a-ha!"
With more print debugging, I am reasonably confident of the following:
capnproto.org/go/capnp/v3/internal/schema
, maybe because it's a large package. Again, sometimes it compiles without errors from a clean tinygo compile, sometimes it fails.tinygo:compiler/compiler.go:CompilePackage
call to ssaPackage.Build()
is still running when log statements I insert further down in that same function in compiler.go
are printing. I am seeing log statements I inserted into https://github.com/golang/tools/blob/866a6b0ff32adf32534b859fc6499a7e95f09054/go/ssa/builder.go#L2457 intermingling with log statements I inserted in compiler.go CompilePackage
, both for the same Go package (the one I identified).I think it's a bug in https://github.com/tinygo-org/tinygo/blob/dc449882ad09c60c11cef7c35914d5fbfe22a88e/src/sync/once.go
func (o *Once) Do(f func()) {
o.m.Lock()
defer o.m.Unlock()
if o.done {
return
}
o.done = true // problem here
f()
}
I put the // problem here
comment. If a second thread hits the Do
function, it will return even though f() may still be executing.
Am I wrong? I'm not an expert in concurrency.
Edit: No, I'm wrong, because the mutex would guarantee Do
is not entered again until f() is finished. Okay, I'm still confused.
Confirming this as a recipe to reproduce:
git clone https://github.com/capnproto/go-capnp
cd go-capnp
tinygo clean
good=0
bad=0
while true
do
if tinygo test -c 2>&1 | grep "index out of range"; then : $((bad++)); else : $((good++)); fi
echo good $good bad $bad
tinygo clean
done
Letting that run for a couple minutes here yields
panic: runtime error: index out of range [0] with length 0
good 0 bad 1
panic: runtime error: index out of range [0] with length 0
good 0 bad 2
panic: runtime error: index out of range [0] with length 0
good 0 bad 3
good 1 bad 3
panic: runtime error: index out of range [0] with length 0
good 1 bad 4
panic: runtime error: index out of range [0] with length 0
...
good 5 bad 15
That's with go 1.21 and tinygo 0.29 from the prebuilt tarball; results with go 1.20.7 are similar.
Interestingly, results with tinygo 0.28.1 and go 1.20.7 are also similar... so the bug was present in 0.28.1 as well.
Same basic recipe works on another, possibly smaller, package, but only with go 1.21, not go 1.20.7:
git clone github.com/fastly/compute-sdk-go
cd compute-sdk-go
tinygo clean; good=0; bad=0; while true; do if tinygo test -tags nofastlyhostcalls -target wasi -c ./... 2>&1 | grep "index out of range"; then : $((bad++)); else : $((good++)); fi; echo good $good bad $bad; tinygo clean; done
...
good 15 bad 5
That's with go 1.21 and tinygo 0.29 from the prebuilt tarball.
Interestingly, with this reproducer, the problem didn't occur in 15 tries with go 1.20.7 installed, with either tinygo 0.29 or tinygo 0.28.1.
For the go-capng case, running tinygo test -c
in a debugger revealed the crash was related to function Lock[capnproto.org/go/capnp/v3.promiseState]
That looks like this is the most likely suspect (https://github.com/capnproto/go-capnp/blob/ee48d50f34d6ce9245623101d78f41117c8676a6/answer.go#L37C1-L37C33):
state mutex.Mutex[promiseState]
mutex.Mutex is an external template wrapping sync.Mutex
(https://github.com/zenhack/go-util/blob/main/sync/mutex/mutex.go)
Specifically: func (m *Mutex[T]) Lock() *Locked[T]
I assume this is just another instance of https://github.com/tinygo-org/tinygo/issues/3215
Should we close this issue as a duplicate since the error message and code structure seem to replicate #3215 ?
Of note, I can repro after doing this to upgrade golang tools and rebuild tinygo, so i'm not sure the upstream fix is helping:
$ cd tinygo
$ go get golang.org/x/tools@latest
go: downloading golang.org/x/tools v0.13.0
go: downloading golang.org/x/sys v0.12.0
go: upgraded golang.org/x/sys v0.11.0 => v0.12.0
go: upgraded golang.org/x/tools v0.12.0 => v0.13.0
$ make
Further note, it looks like it's failing in example_test.go
compiling Example
. Notably, that's in a package capnp_test
- I wonder if there's something here specific to xxxx_test
packages and their dependency modelling?
When I reproduce this locally, it's failing in different parts of the package - not consistently on the same function. The most recent crash for me was Len[capnproto.org/go/capnp/v3/internal/schema.Node]
@Michael-S - I like your analysis about concurrency... For go-capnp
, using -p
to control the concurrency, I see:
tinygo clean && tinygo test -c -p 1 -tags purego
No crash (unit test 'fails', but that's a separate issue)tinygo clean && tinygo test -c -p 2 -tags purego
No crash (unit test 'fails', but that's a separate issue)tinygo clean && tinygo test -c -p 4 -tags purego
Crash ~ 50% of the timetinygo clean && tinygo test -c -p 8 -tags purego
Crashes reliablyNice sleuthing! I didn't think to do that. I have been dumping log.Println
all over the place and digging through. And I can't be sure I'm understanding the errors correctly, because - I am guessing - once I have enough log statements in place, there may be a queue of messages for stdout that are still in memory when the process exits. My tinygo
runs that succeed have 100k lines of output. (I go berserk with logging when debugging like this 😁)
This PR worked for me. Please give it a try. You'll need to rebuild tinygo itself: https://github.com/tinygo-org/tinygo/pull/3915
That appeared to fix it, thank you so much!
I rebuilt tinygo and then building my local copy of the tempest project did:
for i in seq 25; do make clean; tinygo clean; sleep 3; ./configure --with-go-capnp=../deps/go-capnp; make > junk$i.txt 2>&1
It succeeded 24 times and had a different error once. I'm putting the error here for the sake of completeness, but I think it's a separate issue. I've seen it before, but rarely.
SIGSEGV: segmentation violation
PC=0x57f1fc5 m=9 sigcode=2
signal arrived during cgo execution
goroutine 490 [syscall]:
runtime.cgocall(0x8e3420, 0xc00aab3b30)
/usr/lib/go-1.21/src/runtime/cgocall.go:157 +0x4b fp=0xc00aab3b08 sp=0xc00aab3ad0 pc=0x50c8eb
tinygo.org/x/go-llvm._Cfunc_LLVMVerifyModule(0x7fd5c400b2a0, 0x1, 0xc00fa57588)
_cgo_gotypes.go:9328 +0x4b fp=0xc00aab3b30 sp=0xc00aab3b08 pc=0x75aa8b
tinygo.org/x/go-llvm.VerifyModule.func1({0xc00aab3b98?}, 0xaab3b98?, 0xc00aab3bc0?)
/home/spongebob/go/pkg/mod/tinygo.org/x/go-llvm@v0.0.0-20230918183930-9edb6403d0bc/analysis.go:38 +0x77 fp=0xc00aab3b78 sp=0xc00aab3b30 pc=0x75af97
tinygo.org/x/go-llvm.VerifyModule({0xc000400f20?}, 0x5e?)
/home/spongebob/go/pkg/mod/tinygo.org/x/go-llvm@v0.0.0-20230918183930-9edb6403d0bc/analysis.go:38 +0x3d fp=0xc00aab3bd0 sp=0xc00aab3b78 pc=0x75ae5d
github.com/tinygo-org/tinygo/builder.Build.func4(0xc008cbb680)
/home/spongebob/tempest_and_deps/deps/tinygo/builder/build.go:385 +0x370 fp=0xc00aab3f78 sp=0xc00aab3bd0 pc=0x88f5f0
github.com/tinygo-org/tinygo/builder.runJob(0xc008cbb680, 0x0?)
/home/spongebob/tempest_and_deps/deps/tinygo/builder/jobs.go:222 +0x4d fp=0xc00aab3fc0 sp=0xc00aab3f78 pc=0x89cdad
github.com/tinygo-org/tinygo/builder.runJobs.func2()
/home/spongebob/tempest_and_deps/deps/tinygo/builder/jobs.go:123 +0x25 fp=0xc00aab3fe0 sp=0xc00aab3fc0 pc=0x89c705
runtime.goexit()
/usr/lib/go-1.21/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00aab3fe8 sp=0xc00aab3fe0 pc=0x570061
created by github.com/tinygo-org/tinygo/builder.runJobs in goroutine 1
/home/spongebob/tempest_and_deps/deps/tinygo/builder/jobs.go:123 +0x5bd
I'm pretty sure this is a duplicate of #3532. See that issue for details.
You can compile with -race
to more clearly see where the race happens.
I recompiled tinygo
with -race
by modifying https://github.com/tinygo-org/tinygo/blob/dc449882ad09c60c11cef7c35914d5fbfe22a88e/Makefile#L284 (it made the tinygo
binary about 10 MB larger, and tinygo version
reports the correct commit hash) and started running the build, and the build is crashing with the original stack trace. I'm not seeing any new error information.
The tempest
build runs tinygo
with tinygo build -target wasm -panic trap -no-debug
, if any of that would affect the compile-time race condition capture. https://github.com/zenhack/tempest/blob/fd1ce1af15453afdae57ae06d43b797dbf055dc8/internal/make/make.go#L292
(Edit: I don't mean to be bombarding the maintainers with detail in my comments. I just want to make sure I'm not making newbie errors.)
and the build is crashing with the original stack trace
You mean with the "index out of range" error? That's odd, I'd expect to see a race condition warning. Maybe it's not a duplicate after all.
Yes, tinygo
compiled with -race
still crashes with the "index out of range" error, repeatedly. grep -i race
doesn't find anything in any of the build outputs.
@Michael-S can you try with the current dev (or release) branch? I hope this bug is fixed now.
With dev branch tinygo
, compiling tempest.
for i in $(seq 100); do make clean; tinygo clean; sleep 3; ./configure --with-go-capnp=../deps/go-capnp; make > ../junk_$i.txt 2>&1; done
100 successes.
Edit: as a sanity check, I took dev branch tinygo
, reverted 13a8eae0d4bce114405f175d4e73ec5a8b07a3aa, and started the tempest build ten times. I got the failure in this issue six times. So as far as I can tell, the fix works.
Thank you everyone for figuring this out!
This is an intermittent error compiling a small "hello, http" app with tinygo 0.29 [EDIT: also with 0.28.1, see below]: