osrg / gobgp

BGP implemented in the Go Programming Language
https://osrg.github.io/gobgp/
Apache License 2.0

goBGP VM crash #2532

Open bakul-khanna opened 2 years ago

bakul-khanna commented 2 years ago

Hello,

On a VM that has been running the goBGP daemon (gobgpd) for about 6 months, we saw a crash with the following log. The VM appears to have been down from Feb 21 23:00:02 through Feb 21 23:10:33; gobgpd then logged "fatal error: workbuf is not empty", followed by the runtime stack trace below.

Has anyone seen this issue? Any suggestions on how to avoid this?

Feb 21 23:00:02 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io bgp-status[20514]: {"level": "info", "message": "mem_total_kb: 16423604", "timestamp": "2022-02-21 23:00:02,664"}
Feb 21 23:00:02 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io bgp-status[20514]: {"level": "info", "message": "mem_used_kb: 203088", "timestamp": "2022-02-21 23:00:02,665"}
Feb 21 23:00:02 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io bgp-status[20514]: {"level": "info", "message": "", "timestamp": "2022-02-21 23:00:02,665"}
Feb 21 23:00:02 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io bgp-status[20514]: {"level": "info", "message": "", "timestamp": "2022-02-21 23:00:02,665"}
Feb 21 23:00:02 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io bgp-status[20514]: {"level": "info", "message": "CURRENT MEMORY  USAGE: 1.24%", "timestamp": "2022-02-21 23:00:02,665"}
Feb 21 23:00:02 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io bgp-status[20514]: {"level": "info", "message": "", "timestamp": "2022-02-21 23:00:02,665"}
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: **fatal error: workbuf is not empty**
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: runtime stack:
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: runtime.throw(0xd68f0e, 0x14)
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: #011/usr/local/go/src/runtime/panic.go:608 +0x72
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: runtime.(*workbuf).checkempty(0xc000748800)
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: #011/usr/local/go/src/runtime/mgcwork.go:359 +0x4c
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: runtime.getempty(0x7fa5ae7673f8)
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: #011/usr/local/go/src/runtime/mgcwork.go:371 +0x1f7
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: runtime.(*gcWork).init(0xc00003d270)
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: #011/usr/local/go/src/runtime/mgcwork.go:96 +0x22
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: runtime.(*gcWork).putBatch(0xc00003d270, 0xc00003d2a0, 0x1, 0x200)
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: #011/usr/local/go/src/runtime/mgcwork.go:165 +0x19c
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: runtime.wbBufFlush1(0xc00003c000)
Feb 21 23:10:33 gc-rm-gen3alpha-01-alpha.nae07.gi-nw-spprod.viasat.io gobgpd[610]: #011/usr/local/go/src/runtime/mwbbuf.go:272 +0x1e5

Thanks.

serejkus commented 2 years ago

This looks like a Go runtime issue: https://github.com/golang/go/issues/37793