Closed — volovyks closed this issue 6 months ago
Did some research into this. In short, given that our first batch of customers will also use GCP, I'd recommend our partners follow a similar approach to ours; it should protect their keys unless a GCP account is stolen or GCP suffers a major security breach.
What we do now is save all the private keys (cipher sk, account sk, AWS access key, AWS secret key, and sk shares) in Google Secret Manager, and when we start an instance, these secret values are passed in as environment variables. We store the cipher pk on the machine we use to run the `terraform apply` command. This approach saves no private keys locally on any machine: not on the machine running Terraform, and not on the signing nodes either.
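To make the "secrets arrive only as environment variables" idea concrete, here is a minimal sketch of how a node could load them at startup. The variable names (`MPC_CIPHER_SK`, `MPC_SK_SHARE`, etc.) are hypothetical placeholders, not the actual names our Terraform config uses:

```python
# Sketch: a node reads its secrets from env vars injected at instance start.
# Nothing is ever written to local disk; a missing variable fails fast.
import os
from dataclasses import dataclass

@dataclass
class NodeSecrets:
    cipher_sk: str
    account_sk: str
    aws_access_key: str
    aws_secret_key: str
    sk_share: str

    @classmethod
    def from_env(cls) -> "NodeSecrets":
        def required(name: str) -> str:
            value = os.environ.get(name)
            if value is None:
                raise RuntimeError(f"missing required secret env var: {name}")
            return value

        return cls(
            cipher_sk=required("MPC_CIPHER_SK"),
            account_sk=required("MPC_ACCOUNT_SK"),
            aws_access_key=required("AWS_ACCESS_KEY_ID"),
            aws_secret_key=required("AWS_SECRET_ACCESS_KEY"),
            sk_share=required("MPC_SK_SHARE"),
        )
```

Failing fast on a missing variable keeps a misconfigured node from silently starting without its key material.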
So if the keys were to be stolen, it could happen in the following ways: (1) a node operator's GCP account is compromised; (2) secrets are intercepted in transit between Secret Manager and the node; (3) a running node is hijacked; (4) secrets are read from Google's disks at rest.
For 1: Suppose we are t-out-of-n; as long as fewer than n−t nodes have their GCP accounts hacked, the multichain signing system won't be compromised. We will need to make sure we can kick the hacked nodes.
For 2: API calls to Secret Manager are all authenticated and go through a secure HTTPS connection.
For 3: We need to make sure we can kick the hijacked node. We can then create a new set of keys to start a new node, which goes through the process of started → joining → resharing → running.
For 4: Secrets are always encrypted before being persisted to disk in Secret Manager.
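To illustrate the threshold argument in point 1, here is a toy Shamir t-of-n secret sharing demo (illustration only, not our production MPC protocol; any t shares reconstruct the secret, fewer reveal nothing):

```python
# Toy Shamir secret sharing over a prime field. NOT production code: real
# threshold signing never reconstructs the key in one place at all.
import random

PRIME = 2**127 - 1  # a prime large enough for the demo

def make_shares(secret: int, t: int, n: int):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]

    def poly(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation of the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

secret = 123456789
shares = make_shares(secret, t=3, n=5)
assert reconstruct(shares[:3]) == secret   # any 3 of the 5 shares suffice
assert reconstruct(shares[2:]) == secret
assert reconstruct(shares[:2]) != secret   # 2 shares are useless to a thief
```

The last assertion is the whole point: compromising accounts below the threshold gives an attacker nothing about the key.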
The keys won't be lost unless the user loses access to their GCP account, and even then they can typically recover it.
The option out there that I see providing greater security and richer features is HashiCorp Vault. It has more involved encryption and supports key rotation, notably a feature called dynamic secrets. HashiCorp Vault is also better as a universal solution interfacing with instances from any cloud provider. But I don't see us needing it soon.
If there ever comes a day when we have many partners and need to scale to different cloud providers and step up security, we can always move to HashiCorp Vault Enterprise (we'd need to pay for it) and add the option to fetch secrets from it in our code. Our partners could also easily switch to HashiCorp Vault if they want to.
reference: https://scalesec.com/blog/a-comparison-of-secrets-managers-for-gcp/
Can you elaborate on 4? Where will we store the encryption key?
So by 4 I mean that Google encrypts the secrets for us when persisting them to Google's disks; we don't need to store that encryption key ourselves. It is possible to use customer-managed encryption keys if we want to, but I haven't looked into how that works yet.
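For intuition on what "Google encrypts it for us" means, here is a toy sketch of envelope encryption, the pattern Secret Manager uses at rest: each secret is encrypted with a fresh data key (DEK), and the DEK itself is stored wrapped by a key-encryption key (KEK) that Google (or the customer, with customer-managed keys) controls. The XOR "cipher" below is a stand-in for AES and is NOT secure; it only shows the key hierarchy:

```python
# Toy envelope encryption. keystream_xor is a placeholder cipher built from
# SHA-256 in counter mode purely for illustration — do not use it for real data.
import hashlib
import os

def keystream_xor(key: bytes, data: bytes) -> bytes:
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

def encrypt_secret(kek: bytes, plaintext: bytes):
    dek = os.urandom(32)                   # fresh data key per secret
    ciphertext = keystream_xor(dek, plaintext)
    wrapped_dek = keystream_xor(kek, dek)  # only the KEK holder can unwrap
    return wrapped_dek, ciphertext         # this pair is what hits disk

def decrypt_secret(kek: bytes, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    dek = keystream_xor(kek, wrapped_dek)
    return keystream_xor(dek, ciphertext)

kek = os.urandom(32)
wrapped, ct = encrypt_secret(kek, b"sk-share-bytes")
assert decrypt_secret(kek, wrapped, ct) == b"sk-share-bytes"
```

The point is that what sits on disk (wrapped DEK + ciphertext) is useless without the KEK, which is the key we'd take ownership of if we ever moved to customer-managed keys.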
I would add that for "1" we should reshare regularly (we have an issue tracking that).
Can you explain why resharing will help? If a GCP account is hacked, the hacker can start a node with those keys if they want to, and we would just be resharing with them.
Also, I think the biggest risk here is the human factor. Yes, partners should protect their GCP accounts, and they should also limit access to Google Secret Manager to a small number of people. Most developers should not have access.
--- agree +10000
For "3", if the node is back under our control, resharing should help. A "kick" mechanism is good, but we need to design it from the ground up and include all the other requirements.
The hijacked node will also be involved in resharing and will get a new key share too, so how would that help?
I realized we could use one more strategy to increase security. In our protocol, we use an encryption key in addition to the key share, which means a bad actor must steal both of them to participate in the protocol in the current epoch. If we separate ownership of these two keys, we significantly improve security: the attacker would need access to two Google accounts. We could even give a third person control of the NEAR key. There is still a GCP admin who controls everything, but that is inevitable.
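The separate-ownership idea can be sketched in a few lines. Assume the key share sits encrypted in one GCP project while the cipher key that decrypts it sits in another; XOR with a random pad stands in for a real cipher (and, as a one-time pad, makes the "one account alone learns nothing" claim literal):

```python
# Sketch: neither Google account alone holds enough to recover the key share.
import os

def encrypt(key: bytes, data: bytes) -> bytes:
    """One-time-pad XOR; key and data must be the same length."""
    assert len(key) == len(data)
    return bytes(a ^ b for a, b in zip(key, data))

sk_share = os.urandom(32)      # the node's key share
cipher_key = os.urandom(32)    # held in a *different* Google account

# Account A stores only the encrypted share; account B stores only cipher_key.
stored_in_account_a = encrypt(cipher_key, sk_share)

# Recovering the share requires material from both accounts:
assert encrypt(cipher_key, stored_in_account_a) == sk_share
# With only account A's ciphertext, every 32-byte value is an equally likely
# share (one-time-pad property), so compromising one account is not enough.
```

This is why requiring two compromised accounts is a genuine step up, not just defense-in-depth theater.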
In general, let's summarize this discussion into a doc and close this issue.