hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

panic: cannot free page 0 or 1: 0 #3771

Open hehailong5 opened 6 years ago

hehailong5 commented 6 years ago

version: 0.8.4
scenario: disaster recovery with peers.json

==> Starting Consul agent...
panic: cannot free page 0 or 1: 0

goroutine 1 [running]:
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*freelist).free(0xc4203afe90, 0x455, 0x7f36c702a000)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/freelist.go:109 +0x3f0
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Tx).Commit(0xc4201abdc0, 0x18df07c, 0x4)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/tx.go:176 +0x205
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb.(*BoltStore).initialize(0xc4203ad760, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb/bolt_store.go:75 +0x139
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb.NewBoltStore(0xc4203b2800, 0x33, 0x2, 0xc4203b2800, 0x33)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb/bolt_store.go:51 +0xc4
github.com/hashicorp/consul/consul.(*Server).setupRaft(0xc420358780, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/consul/server.go:492 +0xabc
github.com/hashicorp/consul/consul.NewServerLogger(0xc420358500, 0xc42023ee60, 0x0, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/consul/server.go:320 +0xc1f
github.com/hashicorp/consul/command/agent.(*Agent).makeServer(0xc4201a06c0, 0x1, 0xc420153c80, 0x0)
        /gopath/src/github.com/hashicorp/consul/command/agent/agent.go:860 +0x112
github.com/hashicorp/consul/command/agent.(*Agent).Start(0xc4201a06c0, 0xc4201a06c0, 0x0)
        /gopath/src/github.com/hashicorp/consul/command/agent/agent.go:238 +0x4f6
github.com/hashicorp/consul/command/agent.(*Command).run(0xc4201ac0f0, 0xc42000e140, 0xf, 0x10, 0x0)
        /gopath/src/github.com/hashicorp/consul/command/agent/command.go:720 +0x4db
github.com/hashicorp/consul/command/agent.(*Command).Run(0xc4201ac0f0, 0xc42000e140, 0xf, 0x10, 0xc420290140)
        /gopath/src/github.com/hashicorp/consul/command/agent/command.go:669 +0x56
github.com/hashicorp/consul/vendor/github.com/mitchellh/cli.(*CLI).Run(0xc420170840, 0xc420170840, 0x40, 0xc42028e240)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/mitchellh/cli/cli.go:160 +0x1cc
main.realMain(0xc4200001a0)
        /gopath/src/github.com/hashicorp/consul/main.go:54 +0x40d
main.main()
        /gopath/src/github.com/hashicorp/consul/main.go:18 +0x22
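
For readers reproducing the peers.json recovery scenario, here is a minimal sketch of generating the file. It assumes the format described in Consul's outage recovery guide: a JSON array of "ip:port" strings for Raft protocol 2, and an array of objects with "id" and "address" for Raft protocol 3. The helper name `build_peers_json`, the node IDs, and the server addresses are hypothetical and not taken from this report.

```python
import json


def build_peers_json(servers, raft_protocol=2):
    """Return the text of a peers.json file for the given servers.

    servers: list of dicts with an "address" key ("ip:port") and,
    for Raft protocol 3, an "id" key holding the server's node ID.
    """
    if raft_protocol <= 2:
        # Raft protocol 2 and earlier: a plain list of "ip:port" strings.
        entries = [s["address"] for s in servers]
    else:
        # Raft protocol 3: one object per server with its node ID and address.
        entries = [{"id": s["id"], "address": s["address"]} for s in servers]
    return json.dumps(entries, indent=2)


# Hypothetical servers; 8300 is Consul's default server RPC port.
servers = [
    {"id": "node-id-a", "address": "10.0.1.8:8300"},
    {"id": "node-id-b", "address": "10.0.1.6:8300"},
    {"id": "node-id-c", "address": "10.0.1.7:8300"},
]

print(build_peers_json(servers))
```

The output would be written to raft/peers.json under the data directory of each stopped server before restarting it. Whether this step is related to the panic above is not established in this thread.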

slackpad commented 6 years ago

We did update Bolt for 1.0, so we need to check whether any of the changes there look related.

slackpad commented 6 years ago

It doesn't look like any of the BoltDB updates that came in Consul 1.0.0 would fix this; it needs further investigation.

hehailong5 commented 6 years ago

Hi,

Any progress on this issue? We recently encountered a similar one in 0.8.4. Here is the crash log:

==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> Starting Consul agent...
panic: invalid page type: 215: 10

goroutine 1 [running]:
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Cursor).search(0xc4200a4ca8, 0xc4200a4d10, 0x8, 0x8, 0xd7)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/cursor.go:256 +0x405
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Cursor).searchPage(0xc4200a4ca8, 0xc4200a4d10, 0x8, 0x8, 0x7faa6027e000)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/cursor.go:314 +0x13e
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Cursor).search(0xc4200a4ca8, 0xc4200a4d10, 0x8, 0x8, 0x15a)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/cursor.go:271 +0x1bb
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Cursor).searchPage(0xc4200a4ca8, 0xc4200a4d10, 0x8, 0x8, 0x7faa602a4000)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/cursor.go:314 +0x13e
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Cursor).search(0xc4200a4ca8, 0xc4200a4d10, 0x8, 0x8, 0x180)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/cursor.go:271 +0x1bb
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Cursor).seek(0xc4200a4ca8, 0xc4200a4d10, 0x8, 0x8, 0x0, 0x0, 0x968053, 0xc4203f9cf8, 0xc4203fc880, 0x1, ...)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/cursor.go:159 +0xb1
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Bucket).Get(0xc42058bbc0, 0xc4200a4d10, 0x8, 0x8, 0xc42058bbc0, 0x0, 0x7faa60564960)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/bucket.go:260 +0xef
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb.(*BoltStore).GetLog(0xc42041c7e0, 0x946, 0xc420558060, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb/bolt_store.go:124 +0xfd
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.(*LogCache).GetLog(0xc420419e00, 0x946, 0xc420558060, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/log_cache.go:46 +0x110
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.NewRaft(0xc42027c240, 0x18eac80, 0xc4202f19a0, 0x18f1040, 0xc420419e00, 0x18ed1c0, 0xc42041c7e0, 0x18eb380, 0xc42041c920, 0x18f2360, ...)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/api.go:489 +0x9ae
github.com/hashicorp/consul/consul.(*Server).setupRaft(0xc4203ea500, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/consul/server.go:595 +0x5e9
github.com/hashicorp/consul/consul.NewServerLogger(0xc4203ea280, 0xc4202f0be0, 0x0, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/consul/server.go:320 +0xc1f
github.com/hashicorp/consul/command/agent.(*Agent).makeServer(0xc42024e6c0, 0x1, 0xc4201f9d40, 0x0)
        /gopath/src/github.com/hashicorp/consul/command/agent/agent.go:860 +0x112
github.com/hashicorp/consul/command/agent.(*Agent).Start(0xc42024e6c0, 0xc42024e6c0, 0x0)
        /gopath/src/github.com/hashicorp/consul/command/agent/agent.go:238 +0x4f6
github.com/hashicorp/consul/command/agent.(*Command).run(0xc420298000, 0xc42000e110, 0xd, 0xd, 0x0)
        /gopath/src/github.com/hashicorp/consul/command/agent/command.go:720 +0x4db
github.com/hashicorp/consul/command/agent.(*Command).Run(0xc420298000, 0xc42000e110, 0xd, 0xd, 0xc420286440)
        /gopath/src/github.com/hashicorp/consul/command/agent/command.go:669 +0x56
github.com/hashicorp/consul/vendor/github.com/mitchellh/cli.(*CLI).Run(0xc420248240, 0xc420248240, 0x40, 0xc420233dc0)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/mitchellh/cli/cli.go:160 +0x1cc
main.realMain(0xc4200001a0)
        /gopath/src/github.com/hashicorp/consul/main.go:54 +0x40d
main.main()
        /gopath/src/github.com/hashicorp/consul/main.go:18 +0x22
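
A note on reading the panic message, with a small sketch. As far as I can tell from Bolt's cursor.go, the message is formatted as "invalid page type: <page id in decimal>: <page flags in hex>", which fits the trace above: the last argument of the top search frame is 0xd7, i.e. 215. The flag values below are the ones I recall from Bolt's page.go (branch 0x01, leaf 0x02, meta 0x04, freelist 0x10); treat both the message format and the flag table as assumptions to verify against the vendored source. The helper name `decode_invalid_page_type` is hypothetical.

```python
import re

# Page flag bits as assumed from Bolt's page.go.
PAGE_FLAGS = {0x01: "branch", 0x02: "leaf", 0x04: "meta", 0x10: "freelist"}

# Assumed format: "invalid page type: <decimal page id>: <hex flags>".
PANIC_RE = re.compile(r"invalid page type: (\d+): ([0-9a-fA-F]+)")


def decode_invalid_page_type(message):
    """Return (page_id, flags, flag_names) parsed from a panic line, or None."""
    match = PANIC_RE.search(message)
    if match is None:
        return None
    page_id = int(match.group(1))
    flags = int(match.group(2), 16)
    names = [name for bit, name in PAGE_FLAGS.items() if flags & bit]
    return page_id, flags, names or ["unknown"]


print(decode_invalid_page_type("panic: invalid page type: 215: 10"))
```

Under these assumptions, page 215 carries the freelist flag while the cursor expected a branch or leaf page, which would point at on-disk corruption of raft.db rather than a logic error in Consul itself.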

EugenMayer commented 6 years ago

We see a very similar issue on our side with 1.0.7:

goroutine 1 [running]:
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*node).spill(0xc42015e7e0, 0x10, 0x10)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/node.go:375 +0x677
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Bucket).spill(0xc4200650b8, 0xa1bed90, 0x1ea8880)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/bucket.go:570 +0x4b8
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Tx).Commit(0xc4200650a0, 0x1d4509c, 0x4)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/tx.go:163 +0x11f
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb.(*BoltStore).initialize(0xc42015b160, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb/bolt_store.go:75 +0x12a
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb.NewBoltStore(0xc420338740, 0x19, 0x2, 0xc420338740, 0x19)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb/bolt_store.go:51 +0xc4
github.com/hashicorp/consul/agent/consul.(*Server).setupRaft(0xc420598000, 0x0, 0x0)
        /gopath/src/github.com/hashicorp/consul/agent/consul/server.go:495 +0x902
github.com/hashicorp/consul/agent/consul.NewServerLogger(0xc42017a280, 0xc4203a6be0, 0xc4204b9f20, 0x0, 0xc4203a6be0, 0xc4202e6a80)
        /gopath/src/github.com/hashicorp/consul/agent/consul/server.go:336 +0xbae
github.com/hashicorp/consul/agent.(*Agent).Start(0xc420528000, 0xc420528000, 0x0)
        /gopath/src/github.com/hashicorp/consul/agent/agent.go:303 +0x33b
github.com/hashicorp/consul/command/agent.(*cmd).run(0xc420250800, 0xc42002e0a0, 0x2, 0x2, 0x0)
        /gopath/src/github.com/hashicorp/consul/command/agent/agent.go:337 +0x3e5
github.com/hashicorp/consul/command/agent.(*cmd).Run(0xc420250800, 0xc42002e0a0, 0x2, 0x2, 0xc420253ba0)
        /gopath/src/github.com/hashicorp/consul/command/agent/agent.go:77 +0x50
github.com/hashicorp/consul/vendor/github.com/mitchellh/cli.(*CLI).Run(0xc4201c5560, 0xc4201c5560, 0x40, 0xc420253e80)
        /gopath/src/github.com/hashicorp/consul/vendor/github.com/mitchellh/cli/cli.go:242 +0x1eb
main.realMain(0xc42007a058)
        /gopath/src/github.com/hashicorp/consul/main.go:52 +0x3ee
main.main()
        /gopath/src/github.com/hashicorp/consul/main.go:19 +0x22
consul --version
Consul v1.0.7
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

We are using the official consul docker image

FROM consul:1.0.7

We have a setup with

if that helps in any regard

EugenMayer commented 6 years ago

This blocks us entirely. Here is our server_config without the gossip/ACL/TLS parts:

{
  "datacenter": "stable",
  "data_dir": "/consul/data",
  "ui": true,
  "dns_config": {
    "allow_stale": false
  },
  "node_name": "dwconsul",
  "client_addr": "0.0.0.0",
  "server": true,
  "bootstrap_expect": 1,
  "check_update_interval": "0s",
  "disable_update_check": true
}

Oneiroi commented 6 years ago

Consul version 1.2.2, running a test setup with the CLI:

sudo -u consul consul agent -server -bind 127.0.0.1 -data-dir /tmp/consul/ -bootstrap

This was for testing purposes only; Consul throws the error:

bootstrap = true: do not enable unless necessary
==> Starting Consul agent...
panic: invalid page type: 0: 4

goroutine 1 [running]:
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Cursor).search(0xc420026be8, 0x2be30b0, 0x4, 0x4, 0x4)
        /go/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/cursor.go:256 +0x388
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Cursor).seek(0xc420026be8, 0x2be30b0, 0x4, 0x4, 0x0, 0x0, 0xc420480560, 0xc420026c10, 0xa97166, 0xc420480560, ...)
        /go/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/cursor.go:159 +0xa5
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Bucket).Bucket(0xc4204c4398, 0x2be30b0, 0x4, 0x4, 0xc420026c58)
        /go/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/bucket.go:112 +0xde
github.com/hashicorp/consul/vendor/github.com/boltdb/bolt.(*Tx).Bucket(0xc4204c4380, 0x2be30b0, 0x4, 0x4, 0x0)
        /go/src/github.com/hashicorp/consul/vendor/github.com/boltdb/bolt/tx.go:101 +0x4f
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb.(*BoltStore).Get(0xc420141fa0, 0x2be3628, 0xb, 0xb, 0x0, 0x0, 0x0, 0x0, 0x0)
        /go/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb/bolt_store.go:210 +0xc5
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb.(*BoltStore).GetUint64(0xc420141fa0, 0x2be3628, 0xb, 0xb, 0x1de6f80, 0x2be3558, 0xc42020ef70)
        /go/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft-boltdb/bolt_store.go:226 +0x4d
github.com/hashicorp/consul/vendor/github.com/hashicorp/raft.HasExistingState(0x1df9e00, 0xc4201b2ec0, 0x1df4560, 0xc420141fa0, 0x1df1be0, 0xc420322ec0, 0x0, 0x0, 0xc42006cc00)
        /go/src/github.com/hashicorp/consul/vendor/github.com/hashicorp/raft/api.go:354 +0x62
github.com/hashicorp/consul/agent/consul.(*Server).setupRaft(0xc4200de6c0, 0x0, 0x0)
        /go/src/github.com/hashicorp/consul/agent/consul/server.go:602 +0x44e
github.com/hashicorp/consul/agent/consul.NewServerLogger(0xc4200f2280, 0xc4200c2190, 0xc420222120, 0x0, 0xc4200c2190, 0xc420228080)
        /go/src/github.com/hashicorp/consul/agent/consul/server.go:362 +0xbe1
github.com/hashicorp/consul/agent.(*Agent).Start(0xc4204801e0, 0xc4204801e0, 0x0)
        /go/src/github.com/hashicorp/consul/agent/agent.go:330 +0x36b
github.com/hashicorp/consul/command/agent.(*cmd).run(0xc420261800, 0xc4200c4020, 0x6, 0x6, 0x0)
        /go/src/github.com/hashicorp/consul/command/agent/agent.go:219 +0x454
github.com/hashicorp/consul/command/agent.(*cmd).Run(0xc420261800, 0xc4200c4020, 0x6, 0x6, 0xc420322da0)
        /go/src/github.com/hashicorp/consul/command/agent/agent.go:74 +0x50
github.com/hashicorp/consul/vendor/github.com/mitchellh/cli.(*CLI).Run(0xc420394120, 0xc420394120, 0x40, 0xc420322e40)
        /go/src/github.com/hashicorp/consul/vendor/github.com/mitchellh/cli/cli.go:242 +0x1eb
main.realMain(0xc4200b0058)
        /go/src/github.com/hashicorp/consul/main.go:53 +0x3ee
main.main()
        /go/src/github.com/hashicorp/consul/main.go:20 +0x22

Other projects are having issues with BoltDB too; see:

https://github.com/odeke-em/drive/issues/828
https://github.com/influxdata/influxdb/issues/6165

I'm unable to proceed with the test case at this time.
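
For anyone who wants to check whether a raft.db file is damaged before starting the agent, here is a rough sketch of validating Bolt's two meta pages offline. It assumes the layout I recall from Bolt's source: a 16-byte page header, then a meta struct of magic (0xED0CDAED), version (2), page size, flags, root bucket, freelist page, high-water page ID, transaction ID, and an FNV-1a 64-bit checksum over the preceding 56 bytes, all little-endian on x86-64. The 4096-byte page size, the function names, and the file location are assumptions as well. Run it only against a copy of the file while the agent is stopped.

```python
import struct

MAGIC = 0xED0CDAED         # Bolt file magic (assumed from Bolt's db.go)
PAGE_HEADER_SIZE = 16      # id (u64), flags (u16), count (u16), overflow (u32)
# magic, version, pageSize, flags, root pgid, root sequence,
# freelist pgid, high-water pgid, txid, checksum
META_FMT = "<IIIIQQQQQQ"
META_SIZE = struct.calcsize(META_FMT)


def fnv64a(data):
    """FNV-1a 64-bit hash, the algorithm behind Go's hash/fnv New64a."""
    h = 0xCBF29CE484222325
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h


def check_meta(raw_page):
    """Return a list of problems found in one meta page (empty if none)."""
    body = raw_page[PAGE_HEADER_SIZE:PAGE_HEADER_SIZE + META_SIZE]
    if len(body) < META_SIZE:
        return ["truncated meta page"]
    fields = struct.unpack(META_FMT, body)
    magic, version, checksum = fields[0], fields[1], fields[-1]
    problems = []
    if magic != MAGIC:
        problems.append("bad magic")
    if version != 2:
        problems.append("unexpected version")
    # A zero checksum is skipped here, mirroring Bolt's own validation as I recall it.
    if checksum != 0 and checksum != fnv64a(body[:META_SIZE - 8]):
        problems.append("checksum mismatch")
    return problems


def check_file(path, page_size=4096):
    """Check meta pages 0 and 1 of a Bolt file; page_size is an assumption."""
    with open(path, "rb") as f:
        data = f.read(2 * page_size)
    return [check_meta(data[i * page_size:(i + 1) * page_size]) for i in range(2)]
```

A hypothetical call would be `check_file("/tmp/consul-copy/raft/raft.db")`. An empty list for at least one of the two meta pages suggests the file header is intact; it says nothing about the data pages, which is where the "invalid page type" panics above originate.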

pierresouchay commented 5 years ago

FYI, we had this issue when we had incorrect ACLs.

jsosulska commented 4 years ago

Hello all!

To update this thread: I have created a top-level issue to track upgrading BoltDB to bbolt here. Please follow that work as a precursor to the issues mentioned here.

Thank you all for your patience!