Closed — pio2398 closed this issue 1 year ago
Thanks, I have looked at your profile and only 3.8 GiB of memory is live on the heap. I guess you run with the default GOGC value of 100, which means Go will only attempt a GC once the heap has grown to twice the live size left by the previous GC run. However, 3.8 GiB * 2 = 7.6 GiB, and I guess the remaining ~400 MiB isn't enough to run the rest of your system. In other words, IPFS is only using half of the RAM; the other half is dead values that haven't been reclaimed by Go yet.
Go recently introduced https://pkg.go.dev/runtime/debug#SetMemoryLimit; you can set it with the GOMEMLIMIT environment variable, e.g. GOMEMLIMIT=6GiB when starting a Go program (6GiB because you have 8 GiB of RAM, so it leaves ~1 GiB free for the rest of the OS). This forces a GC to happen whenever you use more than 6 GiB. It can be a performance killer if you actually need more than 6 GiB, because then you essentially run the GC permanently, but if you use, say, 5.5 GiB it will just run the GC more often to compensate (instead of OOMing). It's like dynamically reducing GOGC when you are about to reach the memory limit.
My test to confirm this behaviour was:
```go
package main

import (
	"os"
	"runtime"
	"runtime/debug"
)

var leak []byte // lots of memory kept alive to bias GOGC towards higher thresholds

func main() {
	// Update freeMemory to the free memory on your system.
	const freeMemory = 40 * 1024 * 1024 * 1024
	// Try to use two thirds of system memory (with the default GOGC the
	// program will OOM before GCing).
	const target = freeMemory / 3 * 2
	const garbage = 1024 * 1024

	leak = make([]byte, target-garbage)
	for i := range leak {
		leak[i] = 1 // memset to force page commit
	}

	debug.SetMemoryLimit(freeMemory) // comment out this line to test the default GOGC behaviour
	os.Stdout.WriteString("initial leak set up, now generating garbage!\n")

	var keepAlive []byte
	for i := freeMemory / garbage * 3; i != 0; i-- {
		// Run this for a while; try to generate 3 times more garbage
		// than the memory we have.
		keepAlive = make([]byte, garbage)
		for i := range keepAlive {
			keepAlive[i] = 1 // memset to force page commit
		}
		runtime.Gosched() // simulate some IO, give the GC a chance to run
	}
	leak = keepAlive
}
```
Using debug.SetMemoryLimit did fix OOMs in this synthetic test by making the GC run more often (verified by running with GODEBUG=gctrace=1).
For now the mitigation I'll recommend is manually setting the GOMEMLIMIT environment variable to slightly less than the free memory on your system when starting Kubo.
In the future hopefully we can configure this automagically with https://github.com/ipfs/kubo/issues/8798/.
Checklist
Installation method
third-party binary
Version
Config
Description
My IPFS was unstable, so I decided to remove all config and data and start a fresh new instance. I started by adding some local content and pinning some remote content, and IPFS was killed by oomd. The next try also ended with usage of more than 8 GB of RAM.
diag: ipfs/QmU3EWqCxYsMN3EkuuMgeMnSsvPGW55NPfn3i9jU7BAJ93