tendermint / tmkms

Key Management service for Tendermint Validator nodes
Apache License 2.0
142 stars 42 forks source link

KMS double sign false positive on small testnet #357

Closed yulidai closed 5 years ago

yulidai commented 5 years ago

Vestion

Description

We start a four-validator test network, one of which is use KMS(Called KmsVal below). Whenever it's KmsVal turn to proposal, will get error info. The odds are high, but not 100 percent.

Question

When this error occurs, will it lead to omission or even double signature?

yulidai commented 5 years ago

It seems when KmsVal proposal, there is a high chance that PreCommit will not be signed and get error info: (attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>)).

Edit: It should be that this situation will cause problems:

signed Proposal:40A4050BB4 at h/r/s 308/0/3 (0 ms)
signed PreCommit:<nil> at h/r/s 308/0/6 (0 ms)
signed PreVote:CEB28EBB90 at h/r/s 308/1/6 (0 ms)
signed PreCommit:CEB28EBB90 at h/r/s 308/1/6 (0 ms)
tarcieri commented 5 years ago

Any time you run multiple concurrent validators against the same KMS, you will see errors like this, as the validators are uncoordinated and sending inconsistent signing requests to the KMS. The KMS will do its best to prevent double signing in these cases. @mdyring and @liangping helped test it, but I wouldn't say it's rock solid yet.

For more information, check out this blog post:

https://iqlusion.blog/postmortem-2019-08-08-tendermint-kms-related-cosmos-hub-validator-incident#tendermint-consensus-and-double-signing_1

Note this in particular:

For this reason we recommend validators don’t run this sort of configuration in perpetuity, but use the functionality to failover between validators. Based on this incident, and others we’ll describe below, we in fact recommend you only run in this configuration on testnets for now because we think this operational mode needs a lot more testing to be safe.

yulidai commented 5 years ago

If I modify the code to ensure that I sign only once at the same height and same round and same type(proposal/prevote/procommit). Then change code in state.rs:

/**** old code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
    new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
     !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
{...}

/**** new code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
    new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
     !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
     && new_state.block_id.is_some() && self.consensus_state.block_id.is_some()
{...}

It seems to solve this problem. The test seems to avoid the missed signage caused by the error message like attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>)

What do you think of this way?

tarcieri commented 5 years ago

I believe that would reintroduce the double signing vulnerability from #333 which was fixed in #334.

Note that we've already made a considerable effort to allow signing (and re-signing) in all cases which are not considered double signing.

This is covered in this blog post: https://iqlusion.blog/postmortem-2019-08-08-tendermint-kms-related-cosmos-hub-validator-incident#tendermint-consensus-and-double-signing_1

yulidai commented 5 years ago

So sign only once at the same height and same round and same type(proposal/prevote/procommit) can't bring security guarantee, right?

tarcieri commented 5 years ago

Correct: there needs to be special logic for handling proof-of-lock-change (PoLC) which is expressed as the <nil> votes. Without that logic there is a double signing vulnerability as seen in #333.

yulidai commented 5 years ago

Any time you run multiple concurrent validators against the same KMS, you will see errors like this, as the validators are uncoordinated and sending inconsistent signing requests to the KMS.

emmm...It seems that when there is only one Validator corresponding to a KMS, this error will also be reported.

tarcieri commented 5 years ago

If that were true, it would indicate bugs in both the KMS and Tendermint.

Can you paste some loglines where this is happening and confirm the requests originate from the same client?

yulidai commented 5 years ago

using method

softsign for test.

tmkms.toml

[[chain]]
id = "ld_test"
key_format = { type = "bech32", account_key_prefix = "cosmospub", consensus_key_prefix = "cosmosvalconspub" }

## Validator configuration
[[validator]]
addr = "tcp://127.0.0.1:16659"
chain_id = "ld_test"
reconnect = true
secret_key = "secret_connection.key"

[[providers.softsign]]
chain_ids = ["ld_test"]
key_format = "json"
path = "priv_validator_key.json"
Logs of KMS
06:15:37 [info] [ld_test@tcp://127.0.0.1:16659] signed Proposal:9FBB94F5F5 at h/r/s 268/0/3 (0 ms)
06:15:37 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:9FBB94F5F5 at h/r/s 268/0/6 (0 ms)
06:15:38 [error] [ld_test:tcp://127.0.0.1:16659] attempted double sign at h/r/s: 268/0/6 (9FBB94F5F5 != <nil>)
06:15:43 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:54C9F3D04E at h/r/s 269/0/6 (0 ms)
06:15:43 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:54C9F3D04E at h/r/s 269/0/6 (0 ms)
06:15:48 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:DAF65E42C9 at h/r/s 270/0/6 (0 ms)
06:15:48 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:DAF65E42C9 at h/r/s 270/0/6 (0 ms)
06:15:54 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:9CCA7DB717 at h/r/s 271/0/6 (0 ms)
06:15:54 [info] [ld_test@tcp://127.0.0.1:16659] signed PreCommit:9CCA7DB717 at h/r/s 271/0/6 (0 ms)
06:15:59 [info] [ld_test@tcp://127.0.0.1:16659] signed Proposal:97A771C714 at h/r/s 272/0/3 (0 ms)
06:15:59 [info] [ld_test@tcp://127.0.0.1:16659] signed PreVote:97A771C714 at h/r/s 272/0/6 (0 ms)
06:15:59 [error] [ld_test:tcp://127.0.0.1:16659] attempted double sign at h/r/s: 272/0/6 (97A771C714 != <nil>)

Logs of Validators

gaia_3        | I[2019-09-24|06:15:54.546] Committed state                              module=state height=271 txs=0 appHash=5E92E038AB80CE113B1152ACD0DABCC6C5E3564694D8AA164B412667A6B3A4CC
gaia_2        | I[2019-09-24|06:15:54.548] Executed block                               module=state height=271 validTxs=0 invalidTxs=0
gaia_2        | I[2019-09-24|06:15:54.558] Committed state                              module=state height=271 txs=0 appHash=5E92E038AB80CE113B1152ACD0DABCC6C5E3564694D8AA164B412667A6B3A4CC
gaia_1        | I[2019-09-24|06:15:59.874] Executed block                               module=state height=272 validTxs=0 invalidTxs=0
gaia_1        | I[2019-09-24|06:15:59.886] Committed state                              module=state height=272 txs=0 appHash=D9F9C6F45B35EFA856DE018064E0F0F3E7BB358629C94614E36C741051F04542
gaia_kms_1    | E[2019-09-24|06:15:59.966] Error signing vote                           module=consensus height=272 round=0 vote="Vote{2:93BB1F0645CA 272/00/2(Precommit) 000000000000 000000000000 @ 2019-09-24T06:15:59.964628Z}" err="signerServiceEndpoint returned error #4: double signing requested at height: 272"
gaia_2        | I[2019-09-24|06:15:59.977] Executed block                               module=state height=272 validTxs=0 invalidTxs=0
gaia_3        | I[2019-09-24|06:15:59.983] Executed block                               module=state height=272 validTxs=0 invalidTxs=0
gaia_kms_1    | I[2019-09-24|06:15:59.985] Executed block                               module=state height=272 validTxs=0 invalidTxs=0
tarcieri commented 5 years ago

Are gaia_1, gaia_2, and gaia_3 not separate instances of gaia?

yulidai commented 5 years ago

They are separate instances of gaia, genesis.json is:

{
  "genesis_time": "2019-09-22T18:04:18.0316648Z",
  "chain_id": "ld_test",
  "consensus_params": {
    "block": {
      "max_bytes": "22020096",
      "max_gas": "-1",
      "time_iota_ms": "1000"
    },
    "evidence": {
      "max_age": "100000"
    },
    "validator": {
      "pub_key_types": [
        "ed25519"
      ]
    }
  },
  "app_hash": "",
  "app_state": {
    "gov": {
      "starting_proposal_id": "1",
      "deposits": null,
      "votes": null,
      "proposals": null,
      "deposit_params": {
        "min_deposit": [
          {
            "denom": "stake",
            "amount": "10000000"
          }
        ],
        "max_deposit_period": "172800000000000"
      },
      "voting_params": {
        "voting_period": "172800000000000"
      },
      "tally_params": {
        "quorum": "0.334000000000000000",
        "threshold": "0.500000000000000000",
        "veto": "0.334000000000000000"
      }
    },
    "crisis": {
      "constant_fee": {
        "denom": "stake",
        "amount": "1000"
      }
    },
    "slashing": {
      "params": {
        "max_evidence_age": "120000000000",
        "signed_blocks_window": "100",
        "min_signed_per_window": "0.500000000000000000",
        "downtime_jail_duration": "600000000000",
        "slash_fraction_double_sign": "0.050000000000000000",
        "slash_fraction_downtime": "0.010000000000000000"
      },
      "signing_infos": {},
      "missed_blocks": {}
    },
    "accounts": [
      {
        "address": "cosmos1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfrg5ql8",
        "coins": [
          {
            "denom": "stake",
            "amount": "100000000"
          },
          {
            "denom": "validatortoken",
            "amount": "100000000"
          }
        ],
        "sequence_number": "0",
        "account_number": "0",
        "original_vesting": [],
        "delegated_free": [],
        "delegated_vesting": [],
        "start_time": "0",
        "end_time": "0",
        "module_name": "",
        "module_permissions": [
          ""
        ]
      },
      {
        "address": "cosmos1gqzx83hluqxqj70gw33ufmcvmcugwh73k0ma5m",
        "coins": [
          {
            "denom": "stake",
            "amount": "100000000"
          },
          {
            "denom": "validatortoken",
            "amount": "100000000"
          }
        ],
        "sequence_number": "0",
        "account_number": "0",
        "original_vesting": [],
        "delegated_free": [],
        "delegated_vesting": [],
        "start_time": "0",
        "end_time": "0",
        "module_name": "",
        "module_permissions": [
          ""
        ]
      },
      {
        "address": "cosmos1nzrrq282fauqc2xdp5tmt7mnsf9xzhllcp9lfq",
        "coins": [
          {
            "denom": "stake",
            "amount": "100000000"
          },
          {
            "denom": "validatortoken",
            "amount": "100000000"
          }
        ],
        "sequence_number": "0",
        "account_number": "0",
        "original_vesting": [],
        "delegated_free": [],
        "delegated_vesting": [],
        "start_time": "0",
        "end_time": "0",
        "module_name": "",
        "module_permissions": [
          ""
        ]
      },
      {
        "address": "cosmos1q2tpku988upsevfmr0ckcvfnng47fxk5tp9kth",
        "coins": [
          {
            "denom": "stake",
            "amount": "100000000"
          },
          {
            "denom": "validatortoken",
            "amount": "100000000"
          }
        ],
        "sequence_number": "0",
        "account_number": "0",
        "original_vesting": [],
        "delegated_free": [],
        "delegated_vesting": [],
        "start_time": "0",
        "end_time": "0",
        "module_name": "",
        "module_permissions": [
          ""
        ]
      }
    ],
    "bank": {
      "send_enabled": true
    },
    "supply": {
      "supply": []
    },
    "genutil": {
      "gentxs": [
        {
          "type": "cosmos-sdk/StdTx",
          "value": {
            "msg": [
              {
                "type": "cosmos-sdk/MsgCreateValidator",
                "value": {
                  "description": {
                    "moniker": "testing",
                    "identity": "",
                    "website": "",
                    "details": ""
                  },
                  "commission": {
                    "rate": "0.100000000000000000",
                    "max_rate": "0.200000000000000000",
                    "max_change_rate": "0.010000000000000000"
                  },
                  "min_self_delegation": "1",
                  "delegator_address": "cosmos1gqzx83hluqxqj70gw33ufmcvmcugwh73k0ma5m",
                  "validator_address": "cosmosvaloper1gqzx83hluqxqj70gw33ufmcvmcugwh73nm0gcg",
                  "pubkey": "cosmosvalconspub1zcjduepq77h8lt3mzh7fdx6zupp6t5v0h7rhs8tsk0s24stauejjjg8s6hwqgymdrs",
                  "value": {
                    "denom": "stake",
                    "amount": "10000000"
                  }
                }
              }
            ],
            "fee": {
              "amount": [],
              "gas": "200000"
            },
            "signatures": [
              {
                "pub_key": {
                  "type": "tendermint/PubKeySecp256k1",
                  "value": "A+2x1fROB4QInzo43WqL+Hvz//0CBupgRX/cQAdfV0mP"
                },
                "signature": "cHjvlpXt31w0+iR7dDjUIB7fyppZkDZhUsdrj8LE/RoOAp6gaJuSUlKONGhrZOzVFiE5qhXGjv/YJYukUhzHBw=="
              }
            ],
            "memo": "6eb1c4468e1e2b5a994cb44285d7de5345690ec8@172.17.0.4:26656"
          }
        },
        {
          "type": "cosmos-sdk/StdTx",
          "value": {
            "msg": [
              {
                "type": "cosmos-sdk/MsgCreateValidator",
                "value": {
                  "description": {
                    "moniker": "testing",
                    "identity": "",
                    "website": "",
                    "details": ""
                  },
                  "commission": {
                    "rate": "0.100000000000000000",
                    "max_rate": "0.200000000000000000",
                    "max_change_rate": "0.010000000000000000"
                  },
                  "min_self_delegation": "1",
                  "delegator_address": "cosmos1q2tpku988upsevfmr0ckcvfnng47fxk5tp9kth",
                  "validator_address": "cosmosvaloper1q2tpku988upsevfmr0ckcvfnng47fxk5w43r8y",
                  "pubkey": "cosmosvalconspub1zcjduepqfcvduhnpavv253lgf9swu4ee5855lee3x6pxhsakleg7gr36x5gq0hnnf4",
                  "value": {
                    "denom": "stake",
                    "amount": "10000000"
                  }
                }
              }
            ],
            "fee": {
              "amount": [],
              "gas": "200000"
            },
            "signatures": [
              {
                "pub_key": {
                  "type": "tendermint/PubKeySecp256k1",
                  "value": "A+zEasV3xRemKpsxELYm0O9U2g8puF9V1BfOUKJDa2iM"
                },
                "signature": "4tZHr8ZxQqX7w8MceMNhVTtH2jsakZB6ewmvlM3PhZB1XxZ9pGbiUm0mYLLIwHi064pthGt6LJvfDi6+m6zo0A=="
              }
            ],
            "memo": "b8e38f589cba4466a483d13b66c98b1bcd2a9e10@172.17.0.4:26656"
          }
        },
        {
          "type": "cosmos-sdk/StdTx",
          "value": {
            "msg": [
              {
                "type": "cosmos-sdk/MsgCreateValidator",
                "value": {
                  "description": {
                    "moniker": "testing",
                    "identity": "",
                    "website": "",
                    "details": ""
                  },
                  "commission": {
                    "rate": "0.100000000000000000",
                    "max_rate": "0.200000000000000000",
                    "max_change_rate": "0.010000000000000000"
                  },
                  "min_self_delegation": "1",
                  "delegator_address": "cosmos1nzrrq282fauqc2xdp5tmt7mnsf9xzhllcp9lfq",
                  "validator_address": "cosmosvaloper1nzrrq282fauqc2xdp5tmt7mnsf9xzhlla4329n",
                  "pubkey": "cosmosvalconspub1zcjduepq9vxlrxqgwvzdk646ue6767n4f6yt60eucr2mhynx949yeh6ddurs3y46wy",
                  "value": {
                    "denom": "stake",
                    "amount": "10000000"
                  }
                }
              }
            ],
            "fee": {
              "amount": [],
              "gas": "200000"
            },
            "signatures": [
              {
                "pub_key": {
                  "type": "tendermint/PubKeySecp256k1",
                  "value": "A7dYDToebAorbKhF24aq3H+vZUXaN2sPcRKKW7wIS0dc"
                },
                "signature": "xlsJvaHAYj3Nm6sOfd6WPwe+ge4CdPmUIb3FUFzzRuV6CsQKEFsj7psxo8r7nWqAzkSERTElcIxS99jpEqJ7Fw=="
              }
            ],
            "memo": "d522889513c78e4706bd42e840fd8755d7d28a84@172.17.0.4:26656"
          }
        },
        {
          "type": "cosmos-sdk/StdTx",
          "value": {
            "msg": [
              {
                "type": "cosmos-sdk/MsgCreateValidator",
                "value": {
                  "description": {
                    "moniker": "testing",
                    "identity": "",
                    "website": "",
                    "details": ""
                  },
                  "commission": {
                    "rate": "0.100000000000000000",
                    "max_rate": "0.200000000000000000",
                    "max_change_rate": "0.010000000000000000"
                  },
                  "min_self_delegation": "1",
                  "delegator_address": "cosmos1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfrg5ql8",
                  "validator_address": "cosmosvaloper1u0037f0vm5ym9kp5y9jt4xeatxzm8lvfxuq4n5",
                  "pubkey": "cosmosvalconspub1zcjduepqvjkq0x379vsjups9s338rv97f427atz3r4pu4uq0er87tpdnlf0smzftt9",
                  "value": {
                    "denom": "stake",
                    "amount": "10000000"
                  }
                }
              }
            ],
            "fee": {
              "amount": [],
              "gas": "200000"
            },
            "signatures": [
              {
                "pub_key": {
                  "type": "tendermint/PubKeySecp256k1",
                  "value": "AgtOk6bwPkPOy968rT0gsjAnlFqP9ZIDbHmzlAMOuRTX"
                },
                "signature": "ArQPT7sJxTBWLe6QnXeBw6i36tS+FnxL2fEvxIXPNscEeaGkhS0TNjyb2QaEXX2alGxP7VUluMMxMfDTlXdPXA=="
              }
            ],
            "memo": "ea2f7725d2fea3d0e50ecf79232839058dab3016@172.17.0.4:26656"
          }
        }
      ]
    },
    "auth": {
      "params": {
        "max_memo_characters": "256",
        "tx_sig_limit": "7",
        "tx_size_cost_per_byte": "10",
        "sig_verify_cost_ed25519": "590",
        "sig_verify_cost_secp256k1": "1000"
      }
    },
    "params": null,
    "distribution": {
      "fee_pool": {
        "community_pool": []
      },
      "community_tax": "0.020000000000000000",
      "base_proposer_reward": "0.010000000000000000",
      "bonus_proposer_reward": "0.040000000000000000",
      "withdraw_addr_enabled": true,
      "delegator_withdraw_infos": [],
      "previous_proposer": "",
      "outstanding_rewards": [],
      "validator_accumulated_commissions": [],
      "validator_historical_rewards": [],
      "validator_current_rewards": [],
      "delegator_starting_infos": [],
      "validator_slash_events": []
    },
    "staking": {
      "params": {
        "unbonding_time": "1814400000000000",
        "max_validators": 100,
        "max_entries": 7,
        "bond_denom": "stake"
      },
      "last_total_power": "0",
      "last_validator_powers": null,
      "validators": null,
      "delegations": null,
      "unbonding_delegations": null,
      "redelegations": null,
      "exported": false
    },
    "mint": {
      "minter": {
        "inflation": "0.130000000000000000",
        "annual_provisions": "0.000000000000000000"
      },
      "params": {
        "mint_denom": "stake",
        "inflation_rate_change": "0.130000000000000000",
        "inflation_max": "0.200000000000000000",
        "inflation_min": "0.070000000000000000",
        "goal_bonded": "0.670000000000000000",
        "blocks_per_year": "6311520"
      }
    }
  }
}
tarcieri commented 5 years ago

Separate instances of gaia (regardless of where they are located, be it the same host or on separate hosts) act in an uncoordinated manner and can send conflicting signing requests to the KMS, so this is to be expected.

yulidai commented 5 years ago

I don't understand, because of ONLY ONE validator named gaia_kms_1 connect to the KMS, four validators using different ips in different containers, docker-compose like below:

services:
  gaia_1:
    container_name: gaia_1
    networks:
      kms_test:
        ipv4_address: 172.19.0.2
    command: gaiad start
  gaia_2:
    container_name: gaia_2
    networks:
      kms_test:
        ipv4_address: 172.19.0.3
    command: gaiad start
  gaia_3:
    container_name: gaia_3
    networks:
      kms_test:
        ipv4_address: 172.19.0.4
    command: gaiad start
  gaia_kms_1:
    container_name: gaia_kms_1
    networks:
      kms_test:
        ipv4_address: 172.19.0.5
    ports:
      - 16659:26659
      - 1317:1317
    command: gaiad start

Maybe I need to learn and understand what you mean?

tarcieri commented 5 years ago

Does each validator have a separate consensus key, or are they sharing one?

If this is a network you are trying to spin up yourself, it seems there may not be enough validators to make progress?

yulidai commented 5 years ago

Yes, each validator have a separate consensus key

tarcieri commented 5 years ago

Can you try installing tmkms directly from GitHub:

cargo install tmkms --git https://github.com/tendermint/kms

It's possible #348 might contain a fix for this scenario.

yulidai commented 5 years ago

It looks like it’s okay.

After reading the code of #348 code carefully. I found out that it looks like what i says before:

If I modify the code to ensure that I sign only once at the same height and same round and same type(proposal/prevote/procommit). Then change code in state.rs:

/**** old code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
    new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
     !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
{...}

/**** new code ****/
if self.consensus_state.block_id != new_state.block_id &&
// If we incremented rounds allow locking on a new block
    new_state.round == self.consensus_state.round &&
// if we precommitting after prevoting nil allow locking on new block
     !(new_state.step == 3 && self.consensus_state.step == 2 && self.consensus_state.block_id.is_none())
     && new_state.block_id.is_some() && self.consensus_state.block_id.is_some()
{...}

It seems to solve this problem. The test seems to avoid the missed signage caused by the error message like attempted double sign at h/r/s: 64/0/6 (F9794CEE05 != <nil>)

:P

tarcieri commented 5 years ago

Thanks for confirming. Looks like we need to get another release out soon.