Goals
Background
After Step 1, our security release process will be the following:
1. We make a patch
2. We sparsely test it
3. We send it to the trusted validator group
4. Once enough of these validators have deployed it, we publish the patch
This proposal focuses on the “distributing the patch” part (steps 3-4) of this process. The “testing” part (step 2) is orthogonal to this proposal, as whatever we choose to do, we need to test with the patch still being secret.
Why should NEAR One work on this
When we need to make a protocol upgrade for a security fix, validators who are not in the trusted group will be kicked out for one epoch, as they cannot patch their binary ahead of time.
Also, if the security vulnerability can be used against validators themselves (e.g. DoS), then validators are exposed to the attack, especially between the time we publish the patch and the time they apply it.
What needs to be accomplished
We can publish binaries in a way that lets other validators confirm that we did not maliciously introduce vulnerabilities (and were not hacked), and we commit to those binaries. Non-trusted validators can then use these binaries to avoid being kicked out in the epoch following the security protocol upgrade, or to mitigate the attack while they do not yet know the patch.
In order to do that, we should be publishing reproducible builds.
The steps to do this would be:
1. Make sure our builds are statically linked and bit-for-bit reproducible (Nix could probably help here, though we’d probably also want to publish our GCP machine type and Linux distribution / kernel version, just in case)
2. On every release, build once in CI and once somewhere else (e.g. the release engineer’s laptop, or another CI system; I think having at least one non-automated place would be better), and verify that the built binaries are identical
3. When we publish a new security release, instead of the TLP:RED phase, we:
   - Publish a binary for non-trusted validators
   - Send the patch to trusted validators, as in Step 1
   - Ask trusted validators to redo our build with the patch that they know of, and confirm to non-trusted validators that we did not lie about the build
   - Trusted validators will run with the patch from Step 1; non-trusted validators can choose to use the binary that we provide to them
   - After the protocol version has been upgraded, we perform the rest of the Step 1 process, publishing the patch to everyone as TLP:GREEN then TLP:CLEAR
   - As soon as they get the patch, people can then check that our commitment was correct, i.e. that with the patch applied, they can actually build the binary that we published
For the commitment step, we might want to have a private key used to sign builds; but seeing that people currently trust that the source code on GitHub comes from us, we’re probably fine just having a git commit that adds the binary hash to the nearcore README. That’d be way easier for us to handle than a long-lived private key, but we do need to check with validators that they’d be OK with it.
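As an illustration of the commitment check, here is a minimal sketch of how a validator could compare a downloaded binary against a hash published in the repository. It assumes the commitment is a hex-encoded SHA-256 digest; the proposal only says "binary hash", so the hash function and the names `sha256_of_file` and `matches_commitment` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_commitment(binary: Path, published_hex: str) -> bool:
    """Check a binary against the published hash (assumed hex SHA-256)."""
    actual = sha256_of_file(binary)
    # Normalize whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

The same function would serve for the later check as well: once the patch is public, anyone can rebuild from source and confirm that their own build output matches the hash that was committed to.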
Main use case
With this change, we could publish binary-only security releases, without validators having to trust us. It means we may also be able to shrink the trusted validator group again, so that as few people as possible are aware of security vulnerabilities before the network is patched.
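To make the "without having to trust us" part concrete, a non-trusted validator could require that several trusted validators independently confirm the same binary hash before running the binary. The sketch below is only one way to express that rule: the attestation format (validator name mapped to the hex hash it rebuilt) and the `threshold` parameter are assumptions, not part of the proposal.

```python
from collections import Counter
from typing import Dict, Optional


def confirmed_hash(attestations: Dict[str, str], threshold: int) -> Optional[str]:
    """Return the hash confirmed by at least `threshold` validators, if any.

    `attestations` maps a trusted validator's name to the hex hash it
    obtained by rebuilding the release with the patch (hypothetical format).
    """
    if not attestations:
        return None
    # Count how many validators reported each hash, ignoring hex case.
    counts = Counter(h.strip().lower() for h in attestations.values())
    digest, votes = counts.most_common(1)[0]
    return digest if votes >= threshold else None
```

How the attestations are delivered and authenticated is left open here; that depends on the outcome of the discussion about signing keys versus hashes in the repository.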
Pre-requisites
Security Release Process Step 1, which is much easier to do, should be performed first, in order to list the validators that will receive the patch and not only the binary.
Out of scope
None
Task list
### Tasks
- [ ] make our release builds reproducible
- [ ] agree with the validator community whether they’re fine with just having hashes in GitHub, or whether we need a long-lived private key to sign the builds
- [ ] change our release process to also publish (reproducible) binaries and their hashes
- [ ] change our security release process to match the new process described here
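For the first task, the release-time check that independent builds are bit-for-bit identical could look roughly like the sketch below. It assumes each build location (for example CI and a release engineer's machine) produces a binary at a known path; the function names and the choice of SHA-256 are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, Tuple


def file_digest(path: Path) -> str:
    """Hex SHA-256 of a build artifact (hash choice is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def builds_agree(artifacts: Dict[str, Path]) -> Tuple[bool, Dict[str, str]]:
    """Compare artifacts built in different places.

    Returns whether at least two builds were given and all of them are
    bit-for-bit identical, plus the digest per builder for reporting.
    """
    digests = {builder: file_digest(path) for builder, path in artifacts.items()}
    identical = len(digests) >= 2 and len(set(digests.values())) == 1
    return identical, digests
```

Returning the per-builder digests alongside the verdict makes it easy to see which build diverged when the check fails.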
Links to external documentation and discussions
Discussion has already started on the full design document.
Estimated effort
The estimated effort is one person-quarter.
Assumptions
None