golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

runtime: sporadic memory allocation issues when sufficiently large dependencies are brought in by a program that sets memory rlimits #69105

Closed akalenyu closed 2 weeks ago

akalenyu commented 2 months ago

Go version

go version go1.22.6 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/akalenyu/.cache/go-build'
GOENV='/home/akalenyu/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/akalenyu/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/akalenyu/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.6'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/akalenyu/Work/kubernetes/go.mod'
GOWORK='/home/akalenyu/Work/kubernetes/go.work'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3633518003=/tmp/go-build -gno-record-gcc-switches'

What did you do?

I ran a simple program that sets a 1G memory rlimit and does essentially nothing else, but depends on a sufficiently large library (k8s, for the purpose of demonstration): https://github.com/akalenyu/kubernetes/commit/b88d05b4892ce16634200b54cf84a7e1396f32cd

To reproduce:

git clone https://github.com/kubernetes/kubernetes.git 
# apply the change from the commit above (i.e. add this simple program, which brings in a large k8s dependency)
cd cmd/virt-chroot
go build
for i in {1..2000}; do ./virt-chroot; done
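
The linked commit is not reproduced in this thread. As a rough sketch of the kind of program described (an assumption, not the commit's actual code), a reproducer could lower `RLIMIT_AS` to 1 GiB and then print something. The names `oneGiB` and `newASLimit` are hypothetical, and the large dependency is only indicated by a comment.

```go
package main

import (
	"fmt"
	"syscall"
	// A sufficiently large dependency would be imported here (assumption).
)

// oneGiB is the address-space limit mentioned in the report.
const oneGiB = 1 << 30

// newASLimit builds an Rlimit with the soft and hard limits both set to limit bytes.
func newASLimit(limit uint64) syscall.Rlimit {
	return syscall.Rlimit{Cur: limit, Max: limit}
}

func main() {
	rl := newASLimit(oneGiB)
	if err := syscall.Setrlimit(syscall.RLIMIT_AS, &rl); err != nil {
		fmt.Println("setrlimit failed:", err)
		return
	}
	// From here on, any new address-space reservation by the runtime may fail.
	fmt.Println("limit set")
}
```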

What did you see happen?

0x2649320
fatal error: runtime: cannot allocate memory

runtime stack:
runtime.throw({0x2ea8f47?, 0x0?})
    /usr/local/go/src/runtime/panic.go:1023 +0x5c fp=0x7fcfcd3ffa68 sp=0x7fcfcd3ffa38 pc=0x44055c
runtime.persistentalloc1(0x800, 0x412745?, 0x4df10d8)
    /usr/local/go/src/runtime/malloc.go:1576 +0x24d fp=0x7fcfcd3ffab8 sp=0x7fcfcd3ffa68 pc=0x414c2d
runtime.persistentalloc.func1()
    /usr/local/go/src/runtime/malloc.go:1529 +0x28 fp=0x7fcfcd3ffae8 sp=0x7fcfcd3ffab8 pc=0x4149c8
runtime.persistentalloc(0x4da4330?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/malloc.go:1528 +0x5c fp=0x7fcfcd3ffb30 sp=0x7fcfcd3ffae8 pc=0x41495c
runtime.(*spanSet).push(0x4da4ab0, 0x7fcfcf320778)
    /usr/local/go/src/runtime/mspanset.go:104 +0xcd fp=0x7fcfcd3ffb98 sp=0x7fcfcd3ffb30 pc=0x43a2ed
runtime.(*sweepLocked).sweep(0x20001f800100948?, 0x0)
    /usr/local/go/src/runtime/mgcsweep.go:765 +0x795 fp=0x7fcfcd3ffcb0 sp=0x7fcfcd3ffb98 pc=0x42d8f5
runtime.(*mcentral).uncacheSpan(0x2030002?, 0xda?)
    /usr/local/go/src/runtime/mcentral.go:236 +0x98 fp=0x7fcfcd3ffcd8 sp=0x7fcfcd3ffcb0 pc=0x41d778
runtime.(*mcache).releaseAll(0x7fd016947a68)
    /usr/local/go/src/runtime/mcache.go:291 +0x13e fp=0x7fcfcd3ffd40 sp=0x7fcfcd3ffcd8 pc=0x41d0be
runtime.(*mcache).prepareForSweep(0x7fd016947a68)
    /usr/local/go/src/runtime/mcache.go:328 +0x35 fp=0x7fcfcd3ffd68 sp=0x7fcfcd3ffd40 pc=0x41d1b5
runtime.gcMarkTermination.func4(0xc000078a08)
    /usr/local/go/src/runtime/mgc.go:1125 +0x25 fp=0x7fcfcd3ffd90 sp=0x7fcfcd3ffd68 pc=0x472ac5
runtime.forEachPInternal(0x2ffdd98)
    /usr/local/go/src/runtime/proc.go:1947 +0x12b fp=0x7fcfcd3ffe10 sp=0x7fcfcd3ffd90 pc=0x4467eb
runtime.gcMarkTermination.forEachP.func6()
    /usr/local/go/src/runtime/proc.go:1906 +0x3f fp=0x7fcfcd3ffe40 sp=0x7fcfcd3ffe10 pc=0x42333f
runtime.systemstack(0x800000)
    /usr/local/go/src/runtime/asm_amd64.s:509 +0x4a fp=0x7fcfcd3ffe50 sp=0x7fcfcd3ffe40 pc=0x47950a

goroutine 53 gp=0xc000502540 m=7 mp=0xc000580008 [flushing proc caches]:
runtime.systemstack_switch()
    /usr/local/go/src/runtime/asm_amd64.s:474 +0x8 fp=0xc000509cb0 sp=0xc000509ca0 pc=0x4794a8
runtime.forEachP(...)
    /usr/local/go/src/runtime/proc.go:1895
runtime.gcMarkTermination({0xe0?, 0x7d101de2ec7?})
    /usr/local/go/src/runtime/mgc.go:1124 +0x5d3 fp=0xc000509ec0 sp=0xc000509cb0 pc=0x422ad3
runtime.gcMarkDone()
    /usr/local/go/src/runtime/mgc.go:927 +0x305 fp=0xc000509f50 sp=0xc000509ec0 pc=0x4222e5
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1446 +0x345 fp=0xc000509fe0 sp=0xc000509f50 pc=0x423885
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000509fe8 sp=0xc000509fe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 1 gp=0xc0000061c0 m=0 mp=0x4d89700 [syscall, locked to thread]:
syscall.Syscall(0x1, 0x2, 0xc000b80030, 0x3)
    /usr/local/go/src/syscall/syscall_linux.go:69 +0x25 fp=0xc000b27d58 sp=0xc000b27cf8 pc=0x4d4b05
syscall.write(0xc0000de120?, {0xc000b80030?, 0x2e63588?, 0x2ffdb60?})
    /usr/local/go/src/syscall/zsyscall_linux_amd64.go:964 +0x3b fp=0xc000b27d98 sp=0xc000b27d58 pc=0x4d2c3b
syscall.Write(...)
    /usr/local/go/src/syscall/syscall_unix.go:209
internal/poll.ignoringEINTRIO(...)
    /usr/local/go/src/internal/poll/fd_unix.go:736
internal/poll.(*FD).Write(0xc0000de120, {0xc000b80030, 0x3, 0x10})
    /usr/local/go/src/internal/poll/fd_unix.go:380 +0x368 fp=0xc000b27e48 sp=0xc000b27d98 pc=0x4f1a68
os.(*File).write(...)
    /usr/local/go/src/os/file_posix.go:46
os.(*File).Write(0xc0000b4030, {0xc000b80030?, 0x3, 0x4cdb38?})
    /usr/local/go/src/os/file.go:189 +0x51 fp=0xc000b27ea8 sp=0xc000b27e48 pc=0x4fb1f1
fmt.Fprintln({0x3301e00, 0xc0000b4030}, {0xc000b27f30, 0x1, 0x1})
    /usr/local/go/src/fmt/print.go:305 +0x6f fp=0xc000b27ef8 sp=0xc000b27ea8 pc=0x50802f
main.main()
    /home/akalenyu/Work/kubernetes/cmd/virt-chroot/main.go:31 +0xc5 fp=0xc000b27f50 sp=0xc000b27ef8 pc=0x2653585
runtime.main()
    /usr/local/go/src/runtime/proc.go:271 +0x29d fp=0xc000b27fe0 sp=0xc000b27f50 pc=0x44315d
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000b27fe8 sp=0xc000b27fe0 pc=0x47b4c1

goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000b0fa8 sp=0xc0000b0f88 pc=0x4435ae
runtime.goparkunlock(...)
    /usr/local/go/src/runtime/proc.go:408
runtime.forcegchelper()
    /usr/local/go/src/runtime/proc.go:326 +0xb3 fp=0xc0000b0fe0 sp=0xc0000b0fa8 pc=0x443413
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b0fe8 sp=0xc0000b0fe0 pc=0x47b4c1
created by runtime.init.6 in goroutine 1
    /usr/local/go/src/runtime/proc.go:314 +0x1a

goroutine 3 gp=0xc000007180 m=nil [runnable]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000b1780 sp=0xc0000b1760 pc=0x4435ae
runtime.goparkunlock(...)
    /usr/local/go/src/runtime/proc.go:408
runtime.bgsweep(0xc0000da000)
    /usr/local/go/src/runtime/mgcsweep.go:318 +0xdf fp=0xc0000b17c8 sp=0xc0000b1780 pc=0x42cc5f
runtime.gcenable.gowrap1()
    /usr/local/go/src/runtime/mgc.go:203 +0x25 fp=0xc0000b17e0 sp=0xc0000b17c8 pc=0x421545
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b17e8 sp=0xc0000b17e0 pc=0x47b4c1
created by runtime.gcenable in goroutine 1
    /usr/local/go/src/runtime/mgc.go:203 +0x66

goroutine 4 gp=0xc000007340 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x32f4058?, 0x0?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000b1f78 sp=0xc0000b1f58 pc=0x4435ae
runtime.goparkunlock(...)
    /usr/local/go/src/runtime/proc.go:408
runtime.(*scavengerState).park(0x4d87e80)
    /usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc0000b1fa8 sp=0xc0000b1f78 pc=0x42a5e9
runtime.bgscavenge(0xc0000da000)
    /usr/local/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc0000b1fc8 sp=0xc0000b1fa8 pc=0x42ab99
runtime.gcenable.gowrap2()
    /usr/local/go/src/runtime/mgc.go:204 +0x25 fp=0xc0000b1fe0 sp=0xc0000b1fc8 pc=0x4214e5
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b1fe8 sp=0xc0000b1fe0 pc=0x47b4c1
created by runtime.gcenable in goroutine 1
    /usr/local/go/src/runtime/mgc.go:204 +0xa5

goroutine 5 gp=0xc000007c00 m=nil [finalizer wait]:
runtime.gopark(0xc0000b0648?, 0x414665?, 0xa8?, 0x1?, 0xc0000061c0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000b0620 sp=0xc0000b0600 pc=0x4435ae
runtime.runfinq()
    /usr/local/go/src/runtime/mfinal.go:194 +0x107 fp=0xc0000b07e0 sp=0xc0000b0620 pc=0x420507
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b07e8 sp=0xc0000b07e0 pc=0x47b4c1
created by runtime.createfing in goroutine 1
    /usr/local/go/src/runtime/mfinal.go:164 +0x3d

goroutine 6 gp=0xc00041f880 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000b2750 sp=0xc0000b2730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000b27e0 sp=0xc0000b2750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b27e8 sp=0xc0000b27e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 7 gp=0xc00041fa40 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d8a987?, 0x3?, 0x78?, 0x6e?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000b2f50 sp=0xc0000b2f30 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000b2fe0 sp=0xc0000b2f50 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b2fe8 sp=0xc0000b2fe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 8 gp=0xc00041fc00 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d89be7?, 0x0?, 0x0?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000b3750 sp=0xc0000b3730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000b37e0 sp=0xc0000b3750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b37e8 sp=0xc0000b37e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 9 gp=0xc00041fdc0 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d89d5c?, 0x0?, 0x0?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000b3f50 sp=0xc0000b3f30 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000b3fe0 sp=0xc0000b3f50 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b3fe8 sp=0xc0000b3fe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 10 gp=0xc0004da000 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d8a5f4?, 0x0?, 0x0?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000ac750 sp=0xc0000ac730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000ac7e0 sp=0xc0000ac750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000ac7e8 sp=0xc0000ac7e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 11 gp=0xc0004da1c0 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d9b4a3?, 0x3?, 0x23?, 0x36?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000acf50 sp=0xc0000acf30 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000acfe0 sp=0xc0000acf50 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000acfe8 sp=0xc0000acfe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 12 gp=0xc0004da380 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d89ba8?, 0x3?, 0x6e?, 0xd8?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000ad750 sp=0xc0000ad730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000ad7e0 sp=0xc0000ad750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000ad7e8 sp=0xc0000ad7e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 13 gp=0xc0004da540 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d86d83?, 0xc000479640?, 0x1a?, 0xa?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000adf50 sp=0xc0000adf30 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000adfe0 sp=0xc0000adf50 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000adfe8 sp=0xc0000adfe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 14 gp=0xc0004da700 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d89a8a?, 0x1?, 0xa7?, 0x4a?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000ae750 sp=0xc0000ae730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000ae7e0 sp=0xc0000ae750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000ae7e8 sp=0xc0000ae7e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 15 gp=0xc0004da8c0 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d89bf8?, 0x3?, 0xcc?, 0x11?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000aef50 sp=0xc0000aef30 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000aefe0 sp=0xc0000aef50 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000aefe8 sp=0xc0000aefe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 16 gp=0xc0004daa80 m=nil [GC worker (idle)]:
runtime.gopark(0x4dec4e0?, 0x1?, 0xe4?, 0xd0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000af750 sp=0xc0000af730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000af7e0 sp=0xc0000af750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000af7e8 sp=0xc0000af7e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 18 gp=0xc0004dac40 m=nil [GC worker (idle)]:
runtime.gopark(0x4dec4e0?, 0x3?, 0x8b?, 0x0?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0000aff50 sp=0xc0000aff30 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000affe0 sp=0xc0000aff50 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000affe8 sp=0xc0000affe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 19 gp=0xc0004dae00 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d8ac7e?, 0x3?, 0x18?, 0x91?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0004e0750 sp=0xc0004e0730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0004e07e0 sp=0xc0004e0750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0004e07e8 sp=0xc0004e07e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 34 gp=0xc000102380 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d86b96?, 0x1?, 0x26?, 0xd5?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0004dc750 sp=0xc0004dc730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0004dc7e0 sp=0xc0004dc750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0004dc7e8 sp=0xc0004dc7e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 50 gp=0xc000502000 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d89bdc?, 0x1?, 0x40?, 0x4a?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000508750 sp=0xc000508730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0005087e0 sp=0xc000508750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0005087e8 sp=0xc0005087e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 51 gp=0xc0005021c0 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d89b42?, 0x1?, 0x7d?, 0x73?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000508f50 sp=0xc000508f30 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc000508fe0 sp=0xc000508f50 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000508fe8 sp=0xc000508fe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 52 gp=0xc000502380 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d89ab8?, 0x1?, 0xdf?, 0x80?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000509750 sp=0xc000509730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0005097e0 sp=0xc000509750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0005097e8 sp=0xc0005097e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 54 gp=0xc000502700 m=nil [GC worker (idle)]:
runtime.gopark(0x7d101d8a830?, 0x1?, 0x88?, 0x27?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc00050a750 sp=0xc00050a730 pc=0x4435ae
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc00050a7e0 sp=0xc00050a750 pc=0x423625
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 55 gp=0xc0005028c0 m=nil [runnable]:
runtime.gopark(0x0?, 0x0?, 0x80?, 0x67?, 0x0?)
    /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc00050ae58 sp=0xc00050ae38 pc=0x4435ae
runtime.goparkunlock(...)
    /usr/local/go/src/runtime/proc.go:408
runtime.semacquire1(0x4d8939c, 0x0, 0x0, 0x0, 0x12)
    /usr/local/go/src/runtime/sema.go:160 +0x225 fp=0xc00050aec0 sp=0xc00050ae58 pc=0x456465
runtime.semacquire(...)
    /usr/local/go/src/runtime/sema.go:111
runtime.gcMarkDone()
    /usr/local/go/src/runtime/mgc.go:807 +0x2f fp=0xc00050af50 sp=0xc00050aec0 pc=0x42200f
runtime.gcBgMarkWorker()
    /usr/local/go/src/runtime/mgc.go:1446 +0x345 fp=0xc00050afe0 sp=0xc00050af50 pc=0x423885
runtime.goexit({})
    /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00050afe8 sp=0xc00050afe0 pc=0x47b4c1
created by runtime.gcBgMarkStartWorkers in goroutine 1
    /usr/local/go/src/runtime/mgc.go:1234 +0x1c

What did you expect to see?

No sporadic ENOMEM

ianlancetaylor commented 2 months ago

If a program runs out of memory, it can't continue. 1G is not going to be enough for a large program. What are you suggesting that we do here?

akalenyu commented 2 months ago

I am suggesting that something leaks, since it's very unlikely that a simple program would run out of 1G of memory.

ianlancetaylor commented 2 months ago

I don't think it's unlikely at all. You are limiting the size of virtual memory, but the program itself takes up virtual memory.

akalenyu commented 2 months ago

Hmm, the core dump size is also nowhere near 1G. What am I missing here?

ls -sh ./virt-chroot.core
136M ./virt-chroot.core

prattmic commented 2 months ago

The Go runtime reserves large portions of address space, see https://go.dev/doc/gc-guide#A_note_about_virtual_memory. RLIMIT_AS is not a good way to limit the memory of Go programs.

cc @mknyszek

fweimer-rh commented 2 months ago

> Hmm, the core dump size is also nowhere near 1G. What am I missing here?

RLIMIT_AS is about address space reservation, not actually committed memory. I see 24 mappings that are larger than 50 megabytes. The address space limit is already exceeded at the time of the syscall.Setrlimit call. Linux does not return failure in this case. The limit is still applied, but future allocation (actually: address space reservation) from the kernel will fail. But there is still plenty of unused Go heap, so most of the time, the program succeeds. There's probably some heap expansion heuristic that kicks in very rarely, and that produces the sporadic failure. But in truth, the process is in a bad state during every run.
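
To make the distinction between reserved address space and committed memory concrete, here is a minimal sketch that prints `VmSize` (total mapped address space, roughly what `RLIMIT_AS` constrains) next to `VmRSS` (resident memory) for the current process. It assumes Linux and a readable `/proc/self/status`; the helper name `statusKB` is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// statusKB returns the value in kB of a field such as "VmSize" from the text
// of /proc/<pid>/status. ok is false if the field is missing or malformed.
func statusKB(status, field string) (kb uint64, ok bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		name, rest, found := strings.Cut(sc.Text(), ":")
		if !found || name != field {
			continue
		}
		parts := strings.Fields(rest) // e.g. ["2048", "kB"]
		if len(parts) == 0 {
			return 0, false
		}
		v, err := strconv.ParseUint(parts[0], 10, 64)
		return v, err == nil
	}
	return 0, false
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	size, _ := statusKB(string(data), "VmSize")
	rss, _ := statusKB(string(data), "VmRSS")
	fmt.Printf("VmSize (mapped address space): %d kB\n", size)
	fmt.Printf("VmRSS  (resident memory):      %d kB\n", rss)
}
```

For a Go binary, `VmSize` is typically far larger than `VmRSS`, which is consistent with a small core dump alongside a large address-space footprint.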

akalenyu commented 2 months ago

> The Go runtime reserves large portions of address space, see https://go.dev/doc/gc-guide#A_note_about_virtual_memory. RLIMIT_AS is not a good way to limit the memory of Go programs.
>
> cc @mknyszek

Thank you! This link is super useful. From a quick search, I see that we unfortunately have no better options: SetMemoryLimit is a soft limit, and we are looking to protect against indirect malicious usage of a binary we call that is out of our control.

> RLIMIT_AS is about address space reservation, not actually committed memory. I see 24 mappings that are larger than 50 megabytes. The address space limit is already exceeded at the time of the syscall.Setrlimit call. Linux does not return failure in this case. The limit is still applied, but future allocation (actually: address space reservation) from the kernel will fail. But there is still plenty of unused Go heap, so most of the time, the program succeeds. There's probably some heap expansion heuristic that kicks in very rarely, and that produces the sporadic failure. But in truth, the process is in a bad state during every run.

Thank you. I am assuming those come from the "sufficiently large dependencies", but I am particularly interested in the 2 mmaps of 512Mi each that, IIUC, already exceed the disclaimers in https://go.dev/doc/gc-guide#A_note_about_virtual_memory

prattmic commented 2 months ago

If you are running on Linux, I recommend using a memory cgroup to limit memory use of an application. Memory cgroups measure actual memory usage more precisely and aren't quite so trivially circumvented (RLIMIT_AS is immediately circumvented by fork()).

akalenyu commented 2 months ago

> If you are running on Linux, I recommend using a memory cgroup to limit memory use of an application. Memory cgroups measure actual memory usage more precisely and aren't quite so trivially circumvented (RLIMIT_AS is immediately circumvented by fork()).

We follow up the Setrlimit call with a syscall.Exec, so it should not be circumvented. I agree that cgroups are probably our best bet, but if we don't go that way, I guess we could do some nasty things with a short cgo func that calls setrlimit and execve directly.
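
A minimal sketch of the set-then-exec pattern described here, assuming Linux; `execWithASLimit` is a hypothetical name, not code from the linked commit. Note that between `Setrlimit` and `Exec` the Go runtime itself runs under the new limit, which is the window in which the reported failure can occur.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// execWithASLimit lowers RLIMIT_AS for this process and then replaces the
// process image with argv[0]. On success it does not return.
func execWithASLimit(limit uint64, argv []string) error {
	if len(argv) == 0 {
		return fmt.Errorf("no command given")
	}
	// Resolve the binary first so lookup errors occur before the limit applies.
	path, err := exec.LookPath(argv[0])
	if err != nil {
		return err
	}
	rl := syscall.Rlimit{Cur: limit, Max: limit}
	if err := syscall.Setrlimit(syscall.RLIMIT_AS, &rl); err != nil {
		return err
	}
	// The new program inherits the limit across execve.
	return syscall.Exec(path, argv, os.Environ())
}

func main() {
	if err := execWithASLimit(1<<30, os.Args[1:]); err != nil {
		fmt.Fprintln(os.Stderr, "exec failed:", err)
		os.Exit(1)
	}
}
```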

prattmic commented 2 months ago

Ah, do you actually want to apply the rlimit to a (non-Go) process you are exec'ing to? I think there is reasonable room for a proposal to add rlimits to https://pkg.go.dev/syscall#SysProcAttr so that os.StartProcess / os/exec could set rlimits for the child process.

ianlancetaylor commented 2 months ago

Note that at least on Linux you can also do that by executing the process via the prlimit command.

akalenyu commented 2 months ago

So playing around a bit with this new finding, I am getting larger VSS numbers in the core dump than I was expecting. I acknowledge that I might be doing something terribly wrong, but, would 1.4Gi be expected with the below program?

package main

import (
    "fmt"

    "net/http"
)

func main() {
    fmt.Println(http.Client{})
    panic("test")
}

$ eu-readelf -l reprducer-cgo-pthread.core | awk '{sz=strtonum($6); if (sz > 50 * 1024 * 1024) {print sz}}' | awk '{n += $1}; END{print n}'
1505267712
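
A live analogue of this measurement (not the same data source as the core dump) can be sketched by summing every mapping larger than 50 MiB in `/proc/self/maps` from inside the process. This assumes Linux; `sumLargeMappings` is a hypothetical helper name.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// sumLargeMappings adds up the sizes of all mappings in the text of
// /proc/<pid>/maps that are strictly larger than threshold bytes.
func sumLargeMappings(maps string, threshold uint64) uint64 {
	var total uint64
	sc := bufio.NewScanner(strings.NewReader(maps))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		lo, hi, ok := strings.Cut(fields[0], "-") // "start-end" in hex
		if !ok {
			continue
		}
		start, err1 := strconv.ParseUint(lo, 16, 64)
		end, err2 := strconv.ParseUint(hi, 16, 64)
		if err1 != nil || err2 != nil || end < start {
			continue
		}
		if size := end - start; size > threshold {
			total += size
		}
	}
	return total
}

func main() {
	data, err := os.ReadFile("/proc/self/maps")
	if err != nil {
		fmt.Println("cannot read /proc/self/maps:", err)
		return
	}
	fmt.Println("bytes in mappings > 50 MiB:", sumLargeMappings(string(data), 50<<20))
}
```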

gopherbot commented 2 weeks ago

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)