golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

unique: Fatal errors (found bad pointer in Go heap, found pointer to free object) and memory corruption #69643

Closed connorszczepaniak-wk closed 2 weeks ago

connorszczepaniak-wk commented 1 month ago

Go version

go version go1.23.1 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/root/.cache/go-build'
GOENV='/root/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.1'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/root/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/go/src/github.com/my/repo/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3070165830=/tmp/go-build -gno-record-gcc-switches'

What did you do?

In our deployed application, we started using the unique package to intern some strings. It's hard to provide a complete example: the application is large, and the usage graph of the unique.Handle values would be difficult to compute. We also haven't been able to reproduce the problem with a more trivial example yet. The gist of it is this:

type resource struct {
    target unique.Handle[string]
}

type identifierKindA unique.Handle[string]

type identifierKindB string

func (r *resource) getIDA() identifierKindA {
    return identifierKindA(r.target)
}

func (r *resource) getIDB() identifierKindB {
    return identifierKindB(r.target.Value())
}

getIDA and getIDB are getting called in many disparate places, and potentially across many goroutines if that's important.
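
For readers who want something to run, the pattern above can be fleshed out into a minimal, self-contained sketch. This is not a reproducer for the crash. The constructor `newResource`, the helper `readConcurrently`, and the sample string are hypothetical, and creating handles with `unique.Make` is an assumption, since the report does not show how the handles are created.

```go
package main

import (
	"fmt"
	"sync"
	"unique"
)

// resource mirrors the struct from the report; target holds an interned string.
type resource struct {
	target unique.Handle[string]
}

type identifierKindA unique.Handle[string]

type identifierKindB string

func (r *resource) getIDA() identifierKindA {
	return identifierKindA(r.target)
}

func (r *resource) getIDB() identifierKindB {
	return identifierKindB(r.target.Value())
}

// newResource is a hypothetical constructor. Using unique.Make here is an
// assumption; the report does not show where its handles come from.
func newResource(target string) *resource {
	return &resource{target: unique.Make(target)}
}

// readConcurrently calls both getters from n goroutines and returns how many
// goroutines read back something other than want.
func readConcurrently(r *resource, want string, n int) int {
	var (
		wg         sync.WaitGroup
		mu         sync.Mutex
		mismatches int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// identifierKindA has no Value method of its own, so convert back.
			a := unique.Handle[string](r.getIDA()).Value()
			b := string(r.getIDB())
			if a != want || b != want {
				mu.Lock()
				mismatches++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return mismatches
}

func main() {
	r := newResource("tenant/42")
	fmt.Println("mismatches:", readConcurrently(r, "tenant/42", 64))
	// Equal strings intern to equal handles.
	fmt.Println("same handle:", unique.Make("tenant/42") == r.target)
}
```

On a healthy runtime every read should return the original string. The corruption described below would show up as a nonzero mismatch count, though a program this small may never trigger it.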

What did you see happen?

We had a few different fatal errors occur. We also saw memory corruption; getting the value from the handle seems to have returned a string from elsewhere in the program that could never have been an input when creating this particular type.

fatal error: found pointer to free object

Stack Trace

```
goroutine 15 gp=0xc000d80540 m=13 mp=0xc000300e08 [running]:
runtime.throw({0x556fda3?, 0xc0de54d710?})
	/usr/local/go/src/runtime/panic.go:1067 +0x48 fp=0xc04d702b60 sp=0xc04d702b30 pc=0x473e08
runtime.(*mspan).reportZombies(0x7fa5c5a60688)
	/usr/local/go/src/runtime/mgcsweep.go:890 +0x2ea fp=0xc04d702be0 sp=0xc04d702b60 pc=0x428aea
runtime.(*sweepLocked).sweep(0x56cb850?, 0x0)
	/usr/local/go/src/runtime/mgcsweep.go:658 +0xb54 fp=0xc04d702d00 sp=0xc04d702be0 pc=0x428134
runtime.(*mspan).ensureSwept(0xc0d53cf3c8?)
	/usr/local/go/src/runtime/mgcsweep.go:474 +0xc5 fp=0xc04d702d38 sp=0xc04d702d00 pc=0x427565
internal/weak.runtime_makeStrongFromWeak(0xc0ddeb1f30)
	/usr/local/go/src/runtime/mheap.go:2069 +0xa9 fp=0xc04d702d58 sp=0xc04d702d38 pc=0x472a69
internal/weak.Pointer[...].Strong(...)
	/usr/local/go/src/internal/weak/pointer.go:74
unique.addUniqueMap[...].func1.1({0xc17f6cab40})
	/usr/local/go/src/unique/handle.go:130 +0x39 fp=0xc04d702da8 sp=0xc04d702d58 pc=0x167de99
internal/concurrent.(*HashTrieMap[...]).iter(0x60824c0, 0xc0d53cf360, 0xc04d702f48)
	/usr/local/go/src/internal/concurrent/hashtriemap.go:298 +0xe3 fp=0xc04d702de8 sp=0xc04d702da8 pc=0x167c163
internal/concurrent.(*HashTrieMap[...]).iter(0x60824c0, 0xc041930280, 0xc04d702f48)
	/usr/local/go/src/internal/concurrent/hashtriemap.go:291 +0x65 fp=0xc04d702e28 sp=0xc04d702de8 pc=0x167c0e5
internal/concurrent.(*HashTrieMap[...]).iter(0x60824c0, 0xc03f937860, 0xc04d702f48)
	/usr/local/go/src/internal/concurrent/hashtriemap.go:291 +0x65 fp=0xc04d702e68 sp=0xc04d702e28 pc=0x167c0e5
internal/concurrent.(*HashTrieMap[...]).iter(0x60824c0, 0xc048c73a40, 0xc04d702f48)
	/usr/local/go/src/internal/concurrent/hashtriemap.go:291 +0x65 fp=0xc04d702ea8 sp=0xc04d702e68 pc=0x167c0e5
internal/concurrent.(*HashTrieMap[...]).iter(0x60824c0, 0xc03fe74be0, 0xc04d702f48)
	/usr/local/go/src/internal/concurrent/hashtriemap.go:291 +0x65 fp=0xc04d702ee8 sp=0xc04d702ea8 pc=0x167c0e5
internal/concurrent.(*HashTrieMap[...]).iter(0x60824c0, 0xc040b94f00, 0xc04d702f48)
	/usr/local/go/src/internal/concurrent/hashtriemap.go:291 +0x65 fp=0xc04d702f28 sp=0xc04d702ee8 pc=0x167c0e5
unique.addUniqueMap[...]).All.2(...)
	/usr/local/go/src/internal/concurrent/hashtriemap.go:280
unique.addUniqueMap[...].func1()
	/usr/local/go/src/unique/handle.go:129 +0x45 fp=0xc04d702f70 sp=0xc04d702f28 pc=0x167de45
unique.registerCleanup.func1()
	/usr/local/go/src/unique/handle.go:157 +0xd0 fp=0xc04d702fb8 sp=0xc04d702f70 pc=0x5cb170
runtime.unique_runtime_registerUniqueMapCleanup.func1(...)
	/usr/local/go/src/runtime/mgc.go:1733
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1735 +0x39 fp=0xc04d702fe0 sp=0xc04d702fb8 pc=0x41e7b9
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc04d702fe8 sp=0xc04d702fe0 pc=0x47c461
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1730 +0x96
[originating from goroutine 1]:
runtime.systemstack_switch(...)
	/usr/local/go/src/runtime/asm_amd64.s:479 +0x8
runtime.newproc(...)
	/usr/local/go/src/runtime/proc.go:4977 +0x3f
unique.runtime_registerUniqueMapCleanup(...)
	/usr/local/go/src/runtime/mgc.go:1736 +0x96
unique.registerCleanup(...)
	/usr/local/go/src/unique/handle.go:169 +0x1a
sync.(*Once).doSlow(...)
	/usr/local/go/src/sync/once.go:75 +0xb4
sync.(*Once).Do(...)
	/usr/local/go/src/sync/once.go:69 +0x19
unique.Make[...](...)
	/usr/local/go/src/unique/handle.go:40 +0x8d
net/netip.init(...)
	/usr/local/go/src/net/netip/netip.go:70 +0x25
runtime.doInit1(...)
	/usr/local/go/src/runtime/proc.go:7287 +0xe8
runtime.doInit(...)
	/usr/local/go/src/runtime/proc.go:7256 +0x345
runtime.main(...)
	/usr/local/go/src/runtime/proc.go:254 +0x22e
runtime.goexit(...)
	/usr/local/go/src/runtime/asm_amd64.s:1701 +0x1
```

fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?)

Stack Trace

```
runtime.throw({0x5656b0e?, 0x6?})
	/usr/local/go/src/runtime/panic.go:1067 +0x48 fp=0xc00021fdf8 sp=0xc00021fdc8 pc=0x473e08
runtime.badPointer(0x7f0b918ec6e8, 0xc1683e6f00, 0xc53fe46000, 0x86f8)
	/usr/local/go/src/runtime/mbitmap.go:1247 +0x165 fp=0xc00021fe48 sp=0xc00021fdf8 pc=0x4162e5
runtime.findObject(0xc686822bd0?, 0xc1fc048a20?, 0x1?)
	/usr/local/go/src/runtime/mbitmap.go:1299 +0xa6 fp=0xc00021fe80 sp=0xc00021fe48 pc=0x472526
runtime.scanobject(0xc0000d1750?, 0xc0000d1750)
	/usr/local/go/src/runtime/mgcmark.go:1464 +0x14c fp=0xc00021ff10 sp=0xc00021fe80 pc=0x42204c
runtime.gcDrain(0xc0000d1750, 0x2)
	/usr/local/go/src/runtime/mgcmark.go:1230 +0x1f4 fp=0xc00021ff78 sp=0xc00021ff10 pc=0x421914
runtime.gcDrainMarkWorkerDedicated(...)
	/usr/local/go/src/runtime/mgcmark.go:1112
runtime.gcBgMarkWorker.func2()
	/usr/local/go/src/runtime/mgc.go:1455 +0x14a fp=0xc00021ffc8 sp=0xc00021ff78 pc=0x41dfca
runtime.systemstack(0x0)
	/usr/local/go/src/runtime/asm_amd64.s:514 +0x4a fp=0xc00021ffd8 sp=0xc00021ffc8 pc=0x47a62a
```

What did you expect to see?

No fatal errors and no memory corruption.

gabyhelp commented 1 month ago

Related Issues and Documentation

connorszczepaniak-wk commented 1 month ago

It looks like this issue might have been the same: https://github.com/golang/go/issues/69210

It looks like a fix may be included in 1.23.2. Can someone confirm that this issue could have caused the memory corruption that we were seeing?

MikeMitchellWebDev commented 1 month ago

@connorszczepaniak-wk you can apply the fix now and test it out yourself: https://go-review.googlesource.com/c/go/+/610696

mknyszek commented 1 month ago

@connorszczepaniak-wk Yeah, that's almost certainly it. And indeed, it's fixed at tip and the fix is already on the Go 1.23 release branch. It will be in the next minor release.

It may be worth giving the patch a try as @MikeMitchellWebDev suggests, to confirm it does resolve the issue for you. Leaving this issue open for that reason, for now.

Apologies for the breakage.

connorszczepaniak-wk commented 1 month ago

Thanks for the quick reply. Testing this ahead of a proper patch release may be tricky for us: it's unclear how I'd apply a patch to the stdlib that we build against in a Docker image without significant changes to our build process. We can test once 1.23.2 is out to confirm that we no longer see the issue.
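
When testing after the release, one way to confirm that a deployed binary was really built with a new enough toolchain is to compare `runtime.Version()` against the expected release. The sketch below assumes the fix ships in go1.23.2, as this thread anticipates; the constant `minFixed` and the helper `atLeastFixedRelease` are hypothetical names, not part of any Go API.

```go
package main

import (
	"fmt"
	"go/version"
	"runtime"
	"strings"
)

// minFixed is an assumption taken from the thread: the first release
// expected to carry the fix.
const minFixed = "go1.23.2"

// atLeastFixedRelease reports whether toolchain is a release at or after
// minFixed. known is false when the string is not a plain release version
// (for example a development build), in which case fixed is meaningless.
func atLeastFixedRelease(toolchain string) (fixed, known bool) {
	// runtime.Version may carry extra fields after a space; keep the first.
	if i := strings.IndexByte(toolchain, ' '); i >= 0 {
		toolchain = toolchain[:i]
	}
	if !version.IsValid(toolchain) {
		return false, false
	}
	return version.Compare(toolchain, minFixed) >= 0, true
}

func main() {
	v := runtime.Version()
	fixed, known := atLeastFixedRelease(v)
	switch {
	case !known:
		fmt.Printf("%s: not a release version, check manually\n", v)
	case fixed:
		fmt.Printf("%s: at or after %s\n", v, minFixed)
	default:
		fmt.Printf("%s: older than %s\n", v, minFixed)
	}
}
```

Logging this once at startup would make it easy to correlate any remaining crashes with the toolchain that built the binary.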

gopherbot commented 2 weeks ago

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)