Closed taariq closed 7 months ago
This is the full error we are seeing:
2:13PM ERR CONSENSUS FAILURE!!! err="non-recoverable error when signing vote (15277751/0)" module=consensus stack="goroutine 837 [running]:
runtime/debug.Stack()
/opt/hostedtoolcache/go/1.21.8/x64/src/runtime/debug/stack.go:24 +0x5e
github.com/cometbft/cometbft/consensus.(*State).receiveRoutine.func2()
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:800 +0x46
panic({0x2d03080?, 0xc00a719cc0?})
/opt/hostedtoolcache/go/1.21.8/x64/src/runtime/panic.go:914 +0x21f
github.com/cometbft/cometbft/consensus.(*State).signVote(0xc001961500, 0x2, {0xc007d79180, 0x20, 0x20}, {0xe0eb5c?, {0xc007d791a0?, 0xc00ca1f2ec?, 0x3035160?}}, 0xc00126af00)
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:2388 +0x639
github.com/cometbft/cometbft/consensus.(*State).signAddVote(0xc001961500, 0x0?, {0xc007d79180, 0x20, 0x20}, {0x1?, {0xc007d791a0?, 0xc008c41ec0?, 0x20?}}, 0xc00126af00)
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:2439 +0x212
github.com/cometbft/cometbft/consensus.(*State).enterPrecommit(0xc001961500, 0xe91eb7, 0x0)
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:1536 +0x1337
github.com/cometbft/cometbft/consensus.(*State).addVote(0xc001961500, 0xc000f6f040, {0xc002df0690, 0x28})
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:2296 +0x186f
github.com/cometbft/cometbft/consensus.(*State).tryAddVote(0xc001961500, 0xc000f6f040, {0xc002df0690?, 0xc00ca1fc08?})
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:2056 +0x26
github.com/cometbft/cometbft/consensus.(*State).handleMsg(0xc001961500, {{0x3c0ebc0, 0xc0009e8a90}, {0xc002df0690, 0x28}})
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:928 +0x3ce
github.com/cometbft/cometbft/consensus.(*State).receiveRoutine(0xc001961500, 0x0)
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:835 +0x3d1
created by github.com/cometbft/cometbft/consensus.(*State).OnStart in goroutine 321
/home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.5/consensus/state.go:397 +0x10c"
2:13PM INF service stop impl=baseWAL module=consensus msg="Stopping baseWAL service" wal=/home/validator/.paloma/data/cs.wal/wal
2:13PM INF service stop impl=Group module=consensus msg="Stopping Group service" wal=/home/validator/.paloma/data/cs.wal/wal
2:13PM INF Timed out dur=3000 height=15277751 module=consensus round=0 step=RoundStepPropose
2:13PM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=4 numToDial=6
The node catches up to the current height, but immediately halts once it is unable to sign. It afterwards shows peering info/errors. It happens irrespective of which height it is trying to sign.
Also noting that this issue is not causing the validator to get jailed as far as I can tell
@frangrech Are you using a remote signer? Are you using the release build or are you building from source?
Yes, tmkms.
@frangrech COO! COO! Heads up. If you are using a remote signer like tmkms on Paloma with the upgrade to cosmos SDK 0.50.5, please make sure you are upgraded on the signer as well. https://github.com/iqlusioninc/tmkms/tags
According to @frangrech
we tried updating tmkms to v0.14.0, but the issue is still persisting. We're getting the same error as I previously reported.
https://github.com/cometbft/cometbft/issues/2711#issuecomment-2037466222 we need to upgrade and implement vote extensions
Superseded by https://github.com/palomachain/paloma/issues/1105
What is happening?
Section description
Provide as much context as you can. Give as much context as you can to make it easier for the developers to figure what is happening.Paloma and pigeon versions and logs
Section description
Write down paloma version. Write down pigeon version. Copy and paste pigeon config file as well as relevant ENV variables.paloma: 1.13.0 pigeons: 1.11.0
How to reproduce?
Section description
Please write detailed steps of what you were doing for this bug to appear.Unable to repro with other validators. Validator unable to sign blocks.
What is the expected behaviour?
Section description
If you know, please write down what is the expected behaviour. If you don't know, that's ok. We can have a discussion in comments.Sign blocks without error.