Closed yulidai closed 5 years ago
It seems when KmsVal proposal, there is a high chance that PreCommit
will not be signed and get error info: (attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>))
.
Edit: It should be that this situation will cause problems:
signed Proposal:40A4050BB4 at h/r/s 308/0/3 (0 ms)
signed PreCommit:<nil> at h/r/s 308/0/6 (0 ms)
signed PreVote:CEB28EBB90 at h/r/s 308/1/6 (0 ms)
signed PreCommit:CEB28EBB90 at h/r/s 308/1/6 (0 ms)
Any time you run multiple concurrent validators against the same KMS, you will see errors like this, as the validators are uncoordinated and sending inconsistent signing requests to the KMS. The KMS will do its best to prevent double signing in these cases. @mdyring and @liangping helped test it, but I wouldn't say it's rock solid yet.
For more information, check out this blog post:
Note this in particular:
For this reason we recommend validators don’t run this sort of configuration in perpetuity, but use the functionality to failover between validators. Based on this incident, and others we’ll describe below, we in fact recommend you only run in this configuration on testnets for now because we think this operational mode needs a lot more testing to be safe.
If I modify the code to ensure that I sign only once at the same height and same round and same type(proposal/prevote/procommit)
. Then change code in state.rs
:
/**** old code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
!(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
{...}
/**** new code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
!(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
&& new_state.block_id.is_some() && self.consensus_state.block_id.is_some()
{...}
It seems to solve this problem.
The test seems to avoid the missed signage caused by the error message like attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>)
What do you think of this way?
I believe that would reintroduce the double signing vulnerability from #333 which was fixed in #334.
Note that we've already made a considerable effort to allow signing (and re-signing) in all cases which are not considered double signing.
This is covered in this blog post: https://iqlusion.blog/postmortem-2019-08-08-tendermint-kms-related-cosmos-hub-validator-incident#tendermint-consensus-and-double-signing_1
So sign only once at the same height and same round and same type(proposal/prevote/procommit)
can't bring security guarantee, right?
Correct: there needs to be special logic for handling proof-of-lock-change (PoLC) which is expressed as the <nil>
votes. Without that logic there is a double signing vulnerability as seen in #333.
Any time you run multiple concurrent validators against the same KMS, you will see errors like this, as the validators are uncoordinated and sending inconsistent signing requests to the KMS.
emmm...It seems that when there is only one Validator corresponding to a KMS, this error will also be reported.
If that were true, it would indicate bugs in both the KMS and Tendermint.
Can you paste some loglines where this is happening and confirm the requests originate from the same client?
softsign for test.
[[chain]]
id = "ld_test"
key_format = { type = "bech32", account_key_prefix = "cosmospub", consensus_key_prefix = "cosmosvalconspub" }
## Validator configuration
[[validator]]
addr = "tcp://127.0.0.1:16659"
chain_id = "ld_test"
reconnect = true
secret_key = "secret_connection.key"
[[providers.softsign]]
chain_ids = ["ld_test"]
key_format = "json"
path = "priv_validator_key.json"
06:15:37 [info] [ld_test@tcp://127.0.0.1:16659] signed Proposal:9FBB94F5F5 at h/r/s 268/0/3 (0 ms)
06:15:37 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:9FBB94F5F5 at h/r/s 268/0/6 (0 ms)
06:15:38 [error] [ld_test:tcp://127.0.0.1:16659] attempted double sign at h/r/s: 268/0/6 (9FBB94F5F5 != <nil>)
06:15:43 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:54C9F3D04E at h/r/s 269/0/6 (0 ms)
06:15:43 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:54C9F3D04E at h/r/s 269/0/6 (0 ms)
06:15:48 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:DAF65E42C9 at h/r/s 270/0/6 (0 ms)
06:15:48 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:DAF65E42C9 at h/r/s 270/0/6 (0 ms)
06:15:54 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:9CCA7DB717 at h/r/s 271/0/6 (0 ms)
06:15:54 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:9CCA7DB717 at h/r/s 271/0/6 (0 ms)
06:15:59 [info] [ld_test@tcp://127.0.0.1:16659] signed Proposal:97A771C714 at h/r/s 272/0/3 (0 ms)
06:15:59 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:97A771C714 at h/r/s 272/0/6 (0 ms)
06:15:59 [error] [ld_test:tcp://127.0.0.1:16659] attempted double sign at h/r/s: 272/0/6 (97A771C714 != <nil>)
gaia_3 | I[2019-09-24|06:15:54.546] Committed state module=state height=271 txs=0 appHash=5E92E038AB80CE113B1152ACD0DABCC6C5E3564694D8AA164B412667A6B3A4CC
gaia_2 | I[2019-09-24|06:15:54.548] Executed block module=state height=271 validTxs=0 invalidTxs=0
gaia_2 | I[2019-09-24|06:15:54.558] Committed state module=state height=271 txs=0 appHash=5E92E038AB80CE113B1152ACD0DABCC6C5E3564694D8AA164B412667A6B3A4CC
gaia_1 | I[2019-09-24|06:15:59.874] Executed block module=state height=272 validTxs=0 invalidTxs=0
gaia_1 | I[2019-09-24|06:15:59.886] Committed state module=state height=272 txs=0 appHash=D9F9C6F45B35EFA856DE018064E0F0F3E7BB358629C94614E36C741051F04542
gaia_kms_1 | E[2019-09-24|06:15:59.966] Error signing vote module=consensus height=272 round=0 vote="Vote{2:93BB1F0645CA 272/00/2(Precommit) 000000000000 000000000000 @ 2019-09-24T06:15:59.964628Z}" err="signerServiceEndpoint returned error #4: double signing requested at height: 272"
gaia_2 | I[2019-09-24|06:15:59.977] Executed block module=state height=272 validTxs=0 invalidTxs=0
gaia_3 | I[2019-09-24|06:15:59.983] Executed block module=state height=272 validTxs=0 invalidTxs=0
gaia_kms_1 | I[2019-09-24|06:15:59.985] Executed block module=state height=272 validTxs=0 invalidTxs=0
Are gaia_1
, gaia_2
, and gaia_3
not separate instances of gaia?
They are separate instances of gaia, genesis.json
is:
{
"genesis_time": "2019-09-22T18:04:18.0316648Z",
"chain_id": "ld_test",
"consensus_params": {
"block": {
"max_bytes": "22020096",
"max_gas": "-1",
"time_iota_ms": "1000"
},
"evidence": {
"max_age": "100000"
},
"validator": {
"pub_key_types": [
"ed25519"
]
}
},
"app_hash": "",
"app_state": {
"gov": {
"starting_proposal_id": "1",
"deposits": null,
"votes": null,
"proposals": null,
"deposit_params": {
"min_deposit": [
{
"denom": "stake",
"amount": "10000000"
}
],
"max_deposit_period": "172800000000000"
},
"voting_params": {
"voting_period": "172800000000000"
},
"tally_params": {
"quorum": "0.334000000000000000",
"threshold": "0.500000000000000000",
"veto": "0.334000000000000000"
}
},
"crisis": {
"constant_fee": {
"denom": "stake",
"amount": "1000"
}
},
"slashing": {
"params": {
"max_evidence_age": "120000000000",
"signed_blocks_window": "100",
"min_signed_per_window": "0.500000000000000000",
"downtime_jail_duration": "600000000000",
"slash_fraction_double_sign": "0.050000000000000000",
"slash_fraction_downtime": "0.010000000000000000"
},
"signing_infos": {},
"missed_blocks": {}
},
"accounts": [
{
"address": "cosmos1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfrg5ql8",
"coins": [
{
"denom": "stake",
"amount": "100000000"
},
{
"denom": "validatortoken",
"amount": "100000000"
}
],
"sequence_number": "0",
"account_number": "0",
"original_vesting": [],
"delegated_free": [],
"delegated_vesting": [],
"start_time": "0",
"end_time": "0",
"module_name": "",
"module_permissions": [
""
]
},
{
"address": "cosmos1gqzx83hluqxqj70gw33ufmcvmcugwh73k0ma5m",
"coins": [
{
"denom": "stake",
"amount": "100000000"
},
{
"denom": "validatortoken",
"amount": "100000000"
}
],
"sequence_number": "0",
"account_number": "0",
"original_vesting": [],
"delegated_free": [],
"delegated_vesting": [],
"start_time": "0",
"end_time": "0",
"module_name": "",
"module_permissions": [
""
]
},
{
"address": "cosmos1nzrrq282fauqc2xdp5tmt7mnsf9xzhllcp9lfq",
"coins": [
{
"denom": "stake",
"amount": "100000000"
},
{
"denom": "validatortoken",
"amount": "100000000"
}
],
"sequence_number": "0",
"account_number": "0",
"original_vesting": [],
"delegated_free": [],
"delegated_vesting": [],
"start_time": "0",
"end_time": "0",
"module_name": "",
"module_permissions": [
""
]
},
{
"address": "cosmos1q2tpku988upsevfmr0ckcvfnng47fxk5tp9kth",
"coins": [
{
"denom": "stake",
"amount": "100000000"
},
{
"denom": "validatortoken",
"amount": "100000000"
}
],
"sequence_number": "0",
"account_number": "0",
"original_vesting": [],
"delegated_free": [],
"delegated_vesting": [],
"start_time": "0",
"end_time": "0",
"module_name": "",
"module_permissions": [
""
]
}
],
"bank": {
"send_enabled": true
},
"supply": {
"supply": []
},
"genutil": {
"gentxs": [
{
"type": "cosmos-sdk/StdTx",
"value": {
"msg": [
{
"type": "cosmos-sdk/MsgCreateValidator",
"value": {
"description": {
"moniker": "testing",
"identity": "",
"website": "",
"details": ""
},
"commission": {
"rate": "0.100000000000000000",
"max_rate": "0.200000000000000000",
"max_change_rate": "0.010000000000000000"
},
"min_self_delegation": "1",
"delegator_address": "cosmos1gqzx83hluqxqj70gw33ufmcvmcugwh73k0ma5m",
"validator_address": "cosmosvaloper1gqzx83hluqxqj70gw33ufmcvmcugwh73nm0gcg",
"pubkey": "cosmosvalconspub1zcjduepq77h8lt3mzh7fdx6zupp6t5v0h7rhs8tsk0s24stauejjjg8s6hwqgymdrs",
"value": {
"denom": "stake",
"amount": "10000000"
}
}
}
],
"fee": {
"amount": [],
"gas": "200000"
},
"signatures": [
{
"pub_key": {
"type": "tendermint/PubKeySecp256k1",
"value": "A+2x1fROB4QInzo43WqL+Hvz//0CBupgRX/cQAdfV0mP"
},
"signature": "cHjvlpXt31w0+iR7dDjUIB7fyppZkDZhUsdrj8LE/RoOAp6gaJuSUlKONGhrZOzVFiE5qhXGjv/YJYukUhzHBw=="
}
],
"memo": "6eb1c4468e1e2b5a994cb44285d7de5345690ec8@172.17.0.4:26656"
}
},
{
"type": "cosmos-sdk/StdTx",
"value": {
"msg": [
{
"type": "cosmos-sdk/MsgCreateValidator",
"value": {
"description": {
"moniker": "testing",
"identity": "",
"website": "",
"details": ""
},
"commission": {
"rate": "0.100000000000000000",
"max_rate": "0.200000000000000000",
"max_change_rate": "0.010000000000000000"
},
"min_self_delegation": "1",
"delegator_address": "cosmos1q2tpku988upsevfmr0ckcvfnng47fxk5tp9kth",
"validator_address": "cosmosvaloper1q2tpku988upsevfmr0ckcvfnng47fxk5w43r8y",
"pubkey": "cosmosvalconspub1zcjduepqfcvduhnpavv253lgf9swu4ee5855lee3x6pxhsakleg7gr36x5gq0hnnf4",
"value": {
"denom": "stake",
"amount": "10000000"
}
}
}
],
"fee": {
"amount": [],
"gas": "200000"
},
"signatures": [
{
"pub_key": {
"type": "tendermint/PubKeySecp256k1",
"value": "A+zEasV3xRemKpsxELYm0O9U2g8puF9V1BfOUKJDa2iM"
},
"signature": "4tZHr8ZxQqX7w8MceMNhVTtH2jsakZB6ewmvlM3PhZB1XxZ9pGbiUm0mYLLIwHi064pthGt6LJvfDi6+m6zo0A=="
}
],
"memo": "b8e38f589cba4466a483d13b66c98b1bcd2a9e10@172.17.0.4:26656"
}
},
{
"type": "cosmos-sdk/StdTx",
"value": {
"msg": [
{
"type": "cosmos-sdk/MsgCreateValidator",
"value": {
"description": {
"moniker": "testing",
"identity": "",
"website": "",
"details": ""
},
"commission": {
"rate": "0.100000000000000000",
"max_rate": "0.200000000000000000",
"max_change_rate": "0.010000000000000000"
},
"min_self_delegation": "1",
"delegator_address": "cosmos1nzrrq282fauqc2xdp5tmt7mnsf9xzhllcp9lfq",
"validator_address": "cosmosvaloper1nzrrq282fauqc2xdp5tmt7mnsf9xzhlla4329n",
"pubkey": "cosmosvalconspub1zcjduepq9vxlrxqgwvzdk646ue6767n4f6yt60eucr2mhynx949yeh6ddurs3y46wy",
"value": {
"denom": "stake",
"amount": "10000000"
}
}
}
],
"fee": {
"amount": [],
"gas": "200000"
},
"signatures": [
{
"pub_key": {
"type": "tendermint/PubKeySecp256k1",
"value": "A7dYDToebAorbKhF24aq3H+vZUXaN2sPcRKKW7wIS0dc"
},
"signature": "xlsJvaHAYj3Nm6sOfd6WPwe+ge4CdPmUIb3FUFzzRuV6CsQKEFsj7psxo8r7nWqAzkSERTElcIxS99jpEqJ7Fw=="
}
],
"memo": "d522889513c78e4706bd42e840fd8755d7d28a84@172.17.0.4:26656"
}
},
{
"type": "cosmos-sdk/StdTx",
"value": {
"msg": [
{
"type": "cosmos-sdk/MsgCreateValidator",
"value": {
"description": {
"moniker": "testing",
"identity": "",
"website": "",
"details": ""
},
"commission": {
"rate": "0.100000000000000000",
"max_rate": "0.200000000000000000",
"max_change_rate": "0.010000000000000000"
},
"min_self_delegation": "1",
"delegator_address": "cosmos1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfrg5ql8",
"validator_address": "cosmosvaloper1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfxuq4n5",
"pubkey": "cosmosvalconspub1zcjduepqvjkq0x379vsjups9s338rv97f427atz3r4pu4uq0er87tpdnlf0smzftt9",
"value": {
"denom": "stake",
"amount": "10000000"
}
}
}
],
"fee": {
"amount": [],
"gas": "200000"
},
"signatures": [
{
"pub_key": {
"type": "tendermint/PubKeySecp256k1",
"value": "AgtOk6bwPkPOy968rT0gsjAnlFqP9ZIDbHmzlAMOuRTX"
},
"signature": "ArQPT7sJxTBWLe6QnXeBw6i36tS+FnxL2fEvxIXPNscEeaGkhS0TNjyb2QaEXX2alGxP7VUluMMxMfDTlXdPXA=="
}
],
"memo": "ea2f7725d2fea3d0e50ecf79232839058dab3016@172.17.0.4:26656"
}
}
]
},
"auth": {
"params": {
"max_memo_characters": "256",
"tx_sig_limit": "7",
"tx_size_cost_per_byte": "10",
"sig_verify_cost_ed25519": "590",
"sig_verify_cost_secp256k1": "1000"
}
},
"params": null,
"distribution": {
"fee_pool": {
"community_pool": []
},
"community_tax": "0.020000000000000000",
"base_proposer_reward": "0.010000000000000000",
"bonus_proposer_reward": "0.040000000000000000",
"withdraw_addr_enabled": true,
"delegator_withdraw_infos": [],
"previous_proposer": "",
"outstanding_rewards": [],
"validator_accumulated_commissions": [],
"validator_historical_rewards": [],
"validator_current_rewards": [],
"delegator_starting_infos": [],
"validator_slash_events": []
},
"staking": {
"params": {
"unbonding_time": "1814400000000000",
"max_validators": 100,
"max_entries": 7,
"bond_denom": "stake"
},
"last_total_power": "0",
"last_validator_powers": null,
"validators": null,
"delegations": null,
"unbonding_delegations": null,
"redelegations": null,
"exported": false
},
"mint": {
"minter": {
"inflation": "0.130000000000000000",
"annual_provisions": "0.000000000000000000"
},
"params": {
"mint_denom": "stake",
"inflation_rate_change": "0.130000000000000000",
"inflation_max": "0.200000000000000000",
"inflation_min": "0.070000000000000000",
"goal_bonded": "0.670000000000000000",
"blocks_per_year": "6311520"
}
}
}
}
Separate instances of gaia (regardless of where they are located, be it the same host or on separate hosts) act in an uncoordinated manner and can send conflicting signing requests to the KMS, so this is to be expected.
I don't understand, because of ONLY ONE validator named gaia_kms_1
connect to the KMS, four validators using different ips in different containers, docker-compose
like below:
services:
gaia_1:
container_name: gaia_1
networks:
kms_test:
ipv4_address: 172.19.0.2
command: gaiad start
gaia_2:
container_name: gaia_2
networks:
kms_test:
ipv4_address: 172.19.0.3
command: gaiad start
gaia_3:
container_name: gaia_3
networks:
kms_test:
ipv4_address: 172.19.0.4
command: gaiad start
gaia_kms_1:
container_name: gaia_kms_1
networks:
kms_test:
ipv4_address: 172.19.0.5
ports:
- 16659:26659
- 1317:1317
command: gaiad start
Maybe I need to learn and understand what you mean?
Does each validator have a separate consensus key, or are they sharing one?
If this is a network you are trying to spin up yourself, it seems there may not be enough validators to make progress?
Yes, each validator have a separate consensus key
Can you try installing tmkms
directly from GitHub:
cargo install tmkms --git https://github.com/tendermint/kms
It's possible #348 might contain a fix for this scenario.
It looks like it’s okay.
After reading the code of #348 code carefully. I found out that it looks like what i says before:
If I modify the code to ensure that I sign only once at the
same height and same round and same type(proposal/prevote/procommit)
. Then change code instate.rs
:/**** old code ****/ if self.consensus_state.block_id != new_state.block_id && // If we incremented rounds allow locking on a new block new_state.round == self.consensus_state.round && // if we precommitting after prevoting nil allow locking on new block !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none()) {...} /**** new code ****/ if self.consensus_state.block_id != new_state.block_id && // If we incremented rounds allow locking on a new block new_state.round == self.consensus_state.round && // if we precommitting after prevoting nil allow locking on new block !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none()) && new_state.block_id.is_some() && self.consensus_state.block_id.is_some() {...}
It seems to solve this problem. The test seems to avoid the missed signage caused by the error message like
attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>)
:P
Thanks for confirming. Looks like we need to get another release out soon.
Vestion
Description
We start a four-validator test network, one of which is use KMS(Called KmsVal below). Whenever it's KmsVal turn to proposal, will get error info. The odds are high, but not 100 percent.
the error info of gaia
the error info of kms
Question
When this error occurs, will it lead to omission or even double signature?