near / mpc


Make sure that it's hard to lose node keys #435

Closed volovyks closed 6 months ago

volovyks commented 8 months ago
  1. We must explore best practices for saving node keys
  2. We must provide our partners with instructions on how they need to set their env to make it hard to lose the keys
ppca commented 8 months ago

Did some research into this. In short, given that our first batch of customers will also use GCP, I'd recommend that our partners follow an approach similar to ours. It should protect their keys unless their GCP account is stolen or GCP itself suffers a major security breach.

What we do now is save all the private keys (cipher sk, account sk, AWS access key, AWS secret key, and sk shares) in Google Secret Manager; when we start an instance, these secret values are passed in as environment variables. We store the cipher pk on the machine that we use to run the terraform apply command. With this approach no private keys are saved locally on any machine: not on the machine running terraform, and not on the signing nodes either.
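A minimal sketch of the node side of this setup, assuming the secrets arrive as environment variables. The variable names below are hypothetical placeholders (the thread does not list the real ones), and `load_secrets` is an illustrative helper, not code from the repository. It fails fast at startup if any secret is missing and reports only names, never values.

```python
import os

# Hypothetical variable names, chosen for illustration only.
# The real deployment may use different ones.
REQUIRED_SECRETS = (
    "MPC_CIPHER_SK",
    "MPC_ACCOUNT_SK",
    "MPC_AWS_ACCESS_KEY",
    "MPC_AWS_SECRET_KEY",
    "MPC_SK_SHARE",
)


def load_secrets(env=None):
    """Return the required secrets from the environment, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        # Report the names only; secret values must never reach logs.
        raise RuntimeError("missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SECRETS}
```

Calling `load_secrets()` once at startup keeps the secrets in process memory only, which matches the goal of never writing them to local disk.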

So if the keys were to be stolen, it could happen in the following ways:

  1. someone hacks the gcp account and gets all the keys.
  2. the keys are somehow stolen while in transit from secret manager to the container.
  3. someone hijacks the signing node, they can sign a partial signature, but they don't have access to all other keys.
  4. the keys are stolen from the gcp machines storing them.

For 1: Suppose we are t-out-of-n. As long as fewer than n-t nodes have their GCP account hacked, the multichain signing system won't be compromised. We will need to make sure we can kick the hacked nodes.

For 2: API calls to Secret Manager are all authenticated and go through a secure HTTPS connection.

For 3: We need to make sure we can kick that hijacked node. We can then create a new set of keys to start a new node, which goes through the process of started -> joining -> resharing -> running.

For 4: Secrets are always encrypted before being persisted to disk in Secret Manager.
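The lifecycle of a replacement node mentioned under point 3 can be sketched as a small state machine. The four state names come from the comment; the strictly linear ordering and the helpers `advance` and `path_to_running` are assumptions made for illustration, not the actual protocol code.

```python
from enum import Enum


class NodeState(Enum):
    STARTED = "started"
    JOINING = "joining"
    RESHARING = "resharing"
    RUNNING = "running"


# Assumed linear order, taken from the sequence named in the comment.
NEXT_STATE = {
    NodeState.STARTED: NodeState.JOINING,
    NodeState.JOINING: NodeState.RESHARING,
    NodeState.RESHARING: NodeState.RUNNING,
}


def advance(state):
    """Return the state that follows `state`; RUNNING has no successor."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is the final state")
    return NEXT_STATE[state]


def path_to_running(state=NodeState.STARTED):
    """List every state a replacement node passes through, ending at RUNNING."""
    path = [state]
    while path[-1] is not NodeState.RUNNING:
        path.append(advance(path[-1]))
    return path
```

A freshly started replacement node therefore has to complete joining and resharing before it can take part in signing.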

The keys won't be lost unless the user loses access to their GCP account, and even then they would typically be able to recover it.

The one option I see that provides greater security and richer features is Hashicorp Vault. It has more involved encryption and supports key rotation, in particular through a feature called dynamic secrets. Hashicorp Vault is also better as a universal solution that interfaces with instances from any cloud provider. But I don't see us needing it soon because:

  1. we are not concerned about key rotation at the moment. Hashicorp Vault rotates dynamic secrets automatically, which I think is not what we'd want, at least for the sk key shares. Secret Manager also makes it quite easy to create a new version of a secret, so we have options if we want to do key rotation.
  2. secret manager's encryption should be enough for us now.
  3. we are ok with first batch of partners onboarding on GCP

If there ever comes a day when we have a lot of partners and we need to scale to different cloud providers and step up on security, we could always use the enterprise Hashicorp Vault (we'd need to pay for the cost) and add the option to get secrets from that in our code. Our partners could also easily switch to Hashicorp Vault if they want to.

reference: https://scalesec.com/blog/a-comparison-of-secrets-managers-for-gcp/

volovyks commented 8 months ago
ppca commented 8 months ago
volovyks commented 8 months ago
  1. OK, makes sense.
  2. You are right, if somebody steals all the keys they can participate in resharing. My suggestion works only if they do not do that for a day or two. For example, because they want to steal keys from other nodes.
  3. In your initial message you said "But they don't have access to all other keys". It means that they will not be able to participate in the resharing process. But yeah, in reality, a bad actor will steal all the keys.

I realized that we could use one more strategy to increase security. In our protocol, we use an encryption key. It means that the bad actor must steal both of them to participate in the protocol in the current epoch. If we separate ownership of these two keys, we will significantly improve security: the attacker will need to get access to 2 Google accounts. We can even ask partners to give access to the NEAR key to a third person. There is a GCP admin who controls everything, but that is inevitable.
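A toy model of the ownership split, to make the argument concrete. The account names are hypothetical, and `second_key` is a placeholder because the comment does not name the key paired with the encryption key. The sketch only checks which sets of compromised accounts would yield both keys.

```python
# Hypothetical key labels; "second_key" stands in for the key that the
# comment pairs with the encryption key without naming it.
KEYS_NEEDED = {"encryption_key", "second_key"}


def can_join_protocol(compromised_accounts, ownership):
    """True if the compromised accounts together hold every key in KEYS_NEEDED."""
    stolen = set()
    for account in compromised_accounts:
        stolen |= ownership.get(account, set())
    return KEYS_NEEDED <= stolen


# Hypothetical account names, for illustration only.
single_owner = {"account_a": {"encryption_key", "second_key"}}
split_owners = {
    "account_a": {"encryption_key"},
    "account_b": {"second_key"},
}
```

With `single_owner`, compromising `account_a` alone is enough; with `split_owners`, the attacker needs both `account_a` and `account_b`.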

In general, let's summarize this discussion into a doc and close this issue.