allegro / bigcache

Efficient cache for gigabytes of data written in Go.
http://allegro.tech/2016/03/writing-fast-cache-service-in-go.html
Apache License 2.0
7.54k stars 597 forks source link

Benchmark paniced with go1.14 #208

Open Shitaibin opened 4 years ago

Shitaibin commented 4 years ago

What is the issue you are having?

Benchmark paniced with go1.14 in centos7.

What is BigCache doing that it shouldn't?

Should not panic when allocating memory.

Minimal, Complete, and Verifiable Example

bigcache git:(master) go test -bench=. -benchmem -benchtime=4s ./... -timeout 30m
2020/03/13 02:08:26 Allocated new queue in 1.202335ms; Capacity: 585000
2020/03/13 02:08:26 Allocated new queue in 320.745µs; Capacity: 1170000
2020/03/13 02:08:26 Allocated new queue in 296ns; Capacity: 94
goos: linux
goarch: amd64
pkg: github.com/allegro/bigcache/v2
BenchmarkWriteToCacheWith1Shard-4                                4106458          1520 ns/op         608 B/op          3 allocs/op
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-4         6213543           909 ns/op          41 B/op          3 allocs/op
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-4      fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x564930, 0x16)
    /home/centos/go/src/runtime/panic.go:1112 +0x72
runtime.sysMap(0xc1cc000000, 0xc8000000, 0x6a2598)
    /home/centos/go/src/runtime/mem_linux.go:169 +0xc5
runtime.(*mheap).sysAlloc(0x68d5c0, 0xc8000000, 0x68d5c8, 0x64000)
    /home/centos/go/src/runtime/malloc.go:715 +0x1cd
runtime.(*mheap).grow(0x68d5c0, 0x64000, 0x0)
    /home/centos/go/src/runtime/mheap.go:1286 +0x11c
runtime.(*mheap).allocSpan(0x68d5c0, 0x64000, 0xc000030100, 0x6a25a8, 0x200)
    /home/centos/go/src/runtime/mheap.go:1124 +0x6a0
runtime.(*mheap).alloc.func1()
    /home/centos/go/src/runtime/mheap.go:871 +0x64
runtime.(*mheap).alloc(0x68d5c0, 0x64000, 0x450101, 0x68d5c0)
    /home/centos/go/src/runtime/mheap.go:865 +0x81
runtime.largeAlloc(0xc8000000, 0x101, 0xc085ab0b38)
    /home/centos/go/src/runtime/malloc.go:1152 +0x92
runtime.mallocgc.func1()
    /home/centos/go/src/runtime/malloc.go:1047 +0x46
runtime.systemstack(0x0)
    /home/centos/go/src/runtime/asm_amd64.s:370 +0x66
runtime.mstart()
    /home/centos/go/src/runtime/proc.go:1041

goroutine 222 [running]:
runtime.systemstack_switch()
    /home/centos/go/src/runtime/asm_amd64.s:330 fp=0xc085ab0ad8 sp=0xc085ab0ad0 pc=0x461830
runtime.mallocgc(0xc8000000, 0x52c520, 0x70c4d01, 0xd6987420e770)
    /home/centos/go/src/runtime/malloc.go:1046 +0x895 fp=0xc085ab0b78 sp=0xc085ab0ad8 pc=0x40c1b5
runtime.makeslice(0x52c520, 0xc8000000, 0xc8000000, 0xc085ab0c00)
    /home/centos/go/src/runtime/slice.go:49 +0x6c fp=0xc085ab0ba8 sp=0xc085ab0b78 pc=0x44934c
github.com/allegro/bigcache/v2/queue.(*BytesQueue).allocateAdditionalMemory(0xc000060008, 0x421)
    /home/centos/gopath/src/github.com/allegro/bigcache/queue/bytes_queue.go:103 +0xbd fp=0xc085ab0c38 sp=0xc085ab0ba8 pc=0x50165d
github.com/allegro/bigcache/v2/queue.(*BytesQueue).Push(0xc000060008, 0xc0000c2480, 0x41d, 0x41d, 0xc0000ca400, 0x400, 0x400)
    /home/centos/gopath/src/github.com/allegro/bigcache/queue/bytes_queue.go:81 +0xda fp=0xc085ab0c78 sp=0xc085ab0c38 pc=0x50151a
github.com/allegro/bigcache/v2.(*cacheShard).set(0xc000060000, 0xc051d49b10, 0xb, 0x84b5b6a195780c0b, 0xc0000ca400, 0x400, 0x400, 0xc051d49b10, 0xc00009c0d0)
    /home/centos/gopath/src/github.com/allegro/bigcache/shard.go:140 +0x1b2 fp=0xc085ab0d28 sp=0xc085ab0c78 pc=0x507192
github.com/allegro/bigcache/v2.(*BigCache).Set(0xc01432e0d0, 0xc051d49b10, 0xb, 0xc0000ca400, 0x400, 0x400, 0xb, 0x0)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:134 +0xac fp=0xc085ab0d80 sp=0xc085ab0d28 pc=0x503d7c
github.com/allegro/bigcache/v2.BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard(0xc083a7c1c0)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache_bench_test.go:44 +0x1f4 fp=0xc085ab0ee8 sp=0xc085ab0d80 pc=0x509854
testing.(*B).runN(0xc083a7c1c0, 0x19f652)
    /home/centos/go/src/testing/benchmark.go:191 +0xe8 fp=0xc085ab0f58 sp=0xc085ab0ee8 pc=0x4bcdc8
testing.(*B).launch(0xc083a7c1c0)
    /home/centos/go/src/testing/benchmark.go:321 +0xea fp=0xc085ab0fd8 sp=0xc085ab0f58 pc=0x4bd45a
runtime.goexit()
    /home/centos/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc085ab0fe0 sp=0xc085ab0fd8 pc=0x4637c1
created by testing.(*B).doBench
    /home/centos/go/src/testing/benchmark.go:276 +0x55

goroutine 1 [chan receive]:
testing.(*B).doBench(0xc083a7c1c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /home/centos/go/src/testing/benchmark.go:277 +0x73
testing.(*benchContext).processBench(0xc03477ec00, 0xc083a7c1c0)
    /home/centos/go/src/testing/benchmark.go:568 +0x207
testing.(*B).run(0xc083a7c1c0)
    /home/centos/go/src/testing/benchmark.go:268 +0x63
testing.(*B).Run(0xc083a7c000, 0x56a5b9, 0x38, 0x56b1b0, 0x4bca00)
    /home/centos/go/src/testing/benchmark.go:652 +0x3fd
testing.runBenchmarks.func1(0xc083a7c000)
    /home/centos/go/src/testing/benchmark.go:533 +0x78
testing.(*B).runN(0xc083a7c000, 0x1)
    /home/centos/go/src/testing/benchmark.go:191 +0xe8
testing.runBenchmarks(0x56675f, 0x1e, 0xc03477ebe0, 0x672ee0, 0xb, 0xb, 0xc0000a4e01)
    /home/centos/go/src/testing/benchmark.go:539 +0x390
testing.(*M).Run(0xc0000d4000, 0x0)
    /home/centos/go/src/testing/testing.go:1206 +0x42d
main.main()
    _testmain.go:152 +0x135

goroutine 154 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x4, 0x3b9aca00, 0x3b9aca00, 0x1, 0x100, 0x0, 0x58fc40, 0x69fec0, 0x0, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 180 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x400, 0x0, 0x3b9aca00, 0x927c0, 0x1f4, 0x101, 0x58fc40, 0x69fec0, 0x0, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 198 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x400, 0x8bb2c97000, 0x3b9aca00, 0x927c0, 0x1f4, 0x100, 0x58fc40, 0x69fec0, 0x0, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 200 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x400, 0x8bb2c97000, 0x45d964b800, 0x927c0, 0x1f4, 0x100, 0x58fc40, 0x69fec0, 0x2000, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 194 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x1, 0x12a05f200, 0x45d964b800, 0x1, 0x1, 0x100, 0x58fc40, 0x69fec0, 0x1, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 193 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x400, 0x3b9aca00, 0x3b9aca00, 0x927c0, 0x1f4, 0x100, 0x58fc40, 0x69fec0, 0x0, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 153 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x400, 0x12a05f200, 0x3b9aca00, 0x927c0, 0x1f4, 0x100, 0x58fc40, 0x69fec0, 0x0, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 15 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x400, 0xdf8475800, 0x3b9aca00, 0x927c0, 0x1f4, 0x100, 0x58fc40, 0x69fec0, 0x0, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 195 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x1, 0x12a05f200, 0x3b9aca00, 0x927c0, 0x1f4, 0x101, 0x58fc40, 0x69fec0, 0x1, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f

goroutine 196 [select]:
github.com/allegro/bigcache/v2.newBigCache.func1(0x400, 0x12a05f200, 0x3b9aca00, 0x927c0, 0x1f4, 0x100, 0x58fc40, 0x69fec0, 0x0, 0x0, ...)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:91 +0x131
created by github.com/allegro/bigcache/v2.newBigCache
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache.go:87 +0x46f
exit status 2
FAIL    github.com/allegro/bigcache/v2  24.119s
PASS
ok      github.com/allegro/bigcache/v2/queue    0.007s
2020/03/13 02:08:49 Entry not found
2020/03/13 02:08:49 stored "testkey" in cache.
2020/03/13 02:08:49 empty request.
2020/03/13 02:08:49 test read error
2020/03/13 02:08:49  not found.
2020/03/13 02:08:49 empty request.
2020/03/13 02:08:49 stored "putKey" in cache.
2020/03/13 02:08:49 invalidDeleteKey not found.
2020/03/13 02:08:49 Entry not found
2020/03/13 02:08:49 entry is bigger than max shard size
PASS
ok      github.com/allegro/bigcache/v2/server   0.130s
FAIL

Environment:

CENTOS_MANTISBT_PROJECT="CentOS-7" CENTOS_MANTISBT_PROJECT_VERSION="7" REDHAT_SUPPORT_PRODUCT="centos" REDHAT_SUPPORT_PRODUCT_VERSION="7"


- go version: go version go1.14 linux/amd64
cristaloleg commented 4 years ago

fatal error: runtime: out of memory

@janisz should we do something with it?

siennathesane commented 4 years ago

@Shitaibin can you give some stats about the testing box? If the runtime is out of memory I'm not inherently sure there's much we can do about that (doesn't mean we can't try). It could also be permissions, too. Does this happen on the same box, but with go1.13?

Shitaibin commented 4 years ago

@mxplusb same error with go1.13 in this host. It passed all benchmark in my MacBook Pro with go1.13

Shitaibin commented 4 years ago

@mxplusb @cristaloleg the cause is -benchtime flag. If -benchtime using the default value 1s, all benchmark will passed. If -benchtime > 1s, it will panic. But I don't know why this flag leads to the panic. Does it because of not free memory after each iteration of benchmark?

Info from: https://golang.org/cmd/go/#hdr-Testing_flags

-benchtime t
    Run enough iterations of each benchmark to take t, specified
    as a time.Duration (for example, -benchtime 1h30s).
    The default is 1 second (1s).
    The special syntax Nx means to run the benchmark N times
    (for example, -benchtime 100x).
siennathesane commented 4 years ago

IIRC the internal default time resolution is 1s, but that shouldn't really affect -benchtime too much. Does go test -v -bench=. not provide what you need?

Shitaibin commented 4 years ago

bigcache-benchmark-panic.txt

Comparing without -v flag, it just a few more TEST PAUSE/ PASS information. These information is useless.

My host has 8G memory, does it matter?

siennathesane commented 4 years ago

8G is more than enough. Looking at what you've uploaded, it's throwing the same error, but in a different location:

goroutine 601 [running]:
runtime.systemstack_switch()
    /home/centos/go/src/runtime/asm_amd64.s:330 fp=0xc04a220bc8 sp=0xc04a220bc0 pc=0x459ab0
runtime.(*mheap).alloc(0x678480, 0x1, 0x7f10a0010005, 0x7f10aecaa668)
    /home/centos/go/src/runtime/mheap.go:1085 +0x8a fp=0xc04a220c18 sp=0xc04a220bc8 pc=0x42472a
runtime.(*mcentral).grow(0x678960, 0x0)
    /home/centos/go/src/runtime/mcentral.go:255 +0x7b fp=0xc04a220c58 sp=0xc04a220c18 pc=0x41678b
runtime.(*mcentral).cacheSpan(0x678960, 0x1bf)
    /home/centos/go/src/runtime/mcentral.go:106 +0x2fe fp=0xc04a220cb8 sp=0xc04a220c58 pc=0x4162ae
runtime.(*mcache).refill(0x7f10b479fd98, 0x205)
    /home/centos/go/src/runtime/mcache.go:138 +0x85 fp=0xc04a220cd8 sp=0xc04a220cb8 pc=0x415d55
runtime.(*mcache).nextFree(0x7f10b479fd98, 0xc1cbc3d505, 0xc1cbfffbf0, 0x7f10aecaa668, 0x67be00)
    /home/centos/go/src/runtime/malloc.go:854 +0x87 fp=0xc04a220d10 sp=0xc04a220cd8 pc=0x40b6e7
runtime.mallocgc(0x7, 0x0, 0x502400, 0xc028c51320)
    /home/centos/go/src/runtime/malloc.go:998 +0x5ae fp=0xc04a220db0 sp=0xc04a220d10 pc=0x40be3e
runtime.slicebytetostring(0x0, 0xc04a220e71, 0x7, 0x7, 0x0, 0x0)
    /home/centos/go/src/runtime/string.go:102 +0x9f fp=0xc04a220de0 sp=0xc04a220db0 pc=0x446c0f
strconv.formatBits(0x0, 0x0, 0x0, 0x15c467, 0xa, 0x7ff8f78752670000, 0xc04a220ef0, 0x4fe575, 0xc0000741b0, 0x87f008, ...)
    /home/centos/go/src/strconv/itoa.go:200 +0x318 fp=0xc04a220e98 sp=0xc04a220de0 pc=0x484d08
strconv.FormatInt(0x15c467, 0xa, 0x15c467, 0x0)
    /home/centos/go/src/strconv/itoa.go:29 +0xdb fp=0xc04a220f00 sp=0xc04a220e98 pc=0x48499b
strconv.Itoa(...)
    /home/centos/go/src/strconv/itoa.go:35
github.com/allegro/bigcache/v2.readFromCacheNonExistentKeys.func1(0xc033a26000)
    /home/centos/gopath/src/github.com/allegro/bigcache/bigcache_bench_test.go:175 +0x75 fp=0xc04a220f60 sp=0xc04a220f00 pc=0x510f75
testing.(*B).RunParallel.func1(0xc01fe4c8b0, 0xc01fe4c8a8, 0xc01fe4c8a0, 0xc0ac710380, 0xc0339842c0)
    /home/centos/go/src/testing/benchmark.go:742 +0xa1 fp=0xc04a220fb8 sp=0xc04a220f60 pc=0x4c5621
runtime.goexit()
    /home/centos/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc04a220fc0 sp=0xc04a220fb8 pc=0x45ba01
created by testing.(*B).RunParallel
    /home/centos/go/src/testing/benchmark.go:735 +0x192

Before it was the bytesQueue that was failing to allocate more memory, now it's an anonymous function that can't convert an integer to a string. I don't think this is an issue with bigcache, I think this is an issue with your testing machine, I can't replicate the issue and to my knowledge there's nothing happening in the CI system.

@janisz what do you think?

janisz commented 4 years ago

I think this is an issue with your testing machine,

@Shitaibin can you share your ulimit -a configuration?

Shitaibin commented 4 years ago

The different location is because of using -benchtime=2s

Using -benchtime=4s, it always show the same location. ulimit and details see the log file: centos-benchtime=4s-ulimit.txt

I run the benchmark using the same parameters. It always output another panic error: panic: runtime error: slice bounds out of range [16:0]. ulimit and details see the log file: macos-benchtime=4s-ulimit.txt

Shitaibin commented 4 years ago

I made a group tests using -benchtime and -run=^$ .. The result is :

With -run=^$, it always pass. Without -run=^$, it alway outputs these logs before benchmark:

2020/03/18 02:11:29 Allocated new queue in 264ns; Capacity: 94
2020/03/18 02:11:29 Allocated new queue in 749.477µs; Capacity: 585000
2020/03/18 02:11:29 Allocated new queue in 119.722µs; Capacity: 1170000

Does it matter? Why -run=^$ affect the benchmark behavior.

Shitaibin commented 4 years ago

Running benchmark will running benchmarks after running tests.

-run=^$: not match to any tests. So the cause of panic may be related to tests.

janisz commented 4 years ago

Myabe it's related to https://github.com/allegro/bigcache/issues/148