Problem to be solved
This is a proposed solution to #4.
Solution details
We set up an SSH Certificate Authority (CA) that signs system administrators' keys. Each certificate states which users the key may log in as and when it expires. The CA machine publishes the CA's public key and a list of revoked keys over HTTPS; the other VMs poll it periodically and use the result to update their local sshd config. It could also publish a list of valid sysadmins so that the other VMs can create local accounts automatically.
See this Facebook engineering post for some implementation details.
Pros and cons
Pros
This does not depend on being on a secure network, since the public keys are distributed over HTTPS.
There is no single point of failure, since each VM stores the CA's public key locally and can verify certificates without contacting the CA. There is, however, still a single point of trust.
To revoke keys (or users), we only have to do it on the CA's machine; the change is then picked up by all other servers.
Since certificates expire, keys have to be re-signed periodically. This lets us keep track of inactive sysadmins (who would also lose direct access), which improves security.
Host keys can be signed too.
Cons
This is only supported by OpenSSH; it is not supported by PuTTY or JuiceSSH.
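The polling step described under Solution details could look roughly like the sketch below on each VM. It is only an illustration under assumptions: the URL `https://ca.example.org`, the file names `ca.pub` and `revoked_keys`, and the destination paths are all hypothetical and not part of the proposal. The destination files are the ones sshd would be pointed at with the `TrustedUserCAKeys` and `RevokedKeys` options in sshd_config.

```python
import os
import tempfile
import urllib.request
from pathlib import Path

# Hypothetical endpoint and paths; none of these are specified in the proposal.
CA_BASE_URL = "https://ca.example.org"
FILES = {
    "ca.pub": Path("/etc/ssh/trusted_user_ca_keys"),   # for TrustedUserCAKeys
    "revoked_keys": Path("/etc/ssh/revoked_keys"),     # for RevokedKeys
}


def fetch_https(url: str) -> bytes:
    """Download a URL, refusing anything that is not HTTPS."""
    if not url.startswith("https://"):
        raise ValueError(f"refusing non-HTTPS URL: {url}")
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()


def write_atomically(path: Path, data: bytes) -> bool:
    """Replace `path` with `data`; return True if the content changed."""
    if path.exists() and path.read_bytes() == data:
        return False
    # Write to a temporary file in the same directory, then rename over the
    # target so sshd never sees a half-written file.
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp, path)
    return True


def sync_once(base_url=CA_BASE_URL, files=FILES, fetch=fetch_https) -> list:
    """Poll the CA once and return the names of the files that changed."""
    changed = []
    for name, dest in files.items():
        if write_atomically(dest, fetch(f"{base_url}/{name}")):
            changed.append(name)
    return changed
```

Running `sync_once()` from cron or a systemd timer would cover the "periodically poll" part. The atomic rename matters mostly for the revocation list, since sshd refuses public key authentication when the `RevokedKeys` file cannot be read.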
Unsolved questions
What is the workflow for signing sysadmins' keys?
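Whatever workflow is chosen, the signing step itself would probably come down to one `ssh-keygen` call on the CA machine. The sketch below only builds that command line; the helper name `sign_command`, the file names, the principals, and the 52-week validity are assumptions made for illustration, not decisions of this proposal.

```python
import shlex


def sign_command(ca_key, pubkey, identity, principals, validity="+52w", host=False):
    """Build the ssh-keygen argv that signs `pubkey` with the CA key `ca_key`."""
    if not principals:
        raise ValueError("at least one principal is required")
    cmd = [
        "ssh-keygen",
        "-s", ca_key,                # CA private key used for signing
        "-I", identity,              # key identity, shown in sshd logs
        "-n", ",".join(principals),  # users (or hostnames) the cert is valid for
        "-V", validity,              # validity interval, i.e. the expiry time
    ]
    if host:
        cmd.append("-h")             # produce a host certificate instead of a user one
    cmd.append(pubkey)
    return cmd


# Hypothetical example: let alice log in as "alice" or "root" for 52 weeks.
print(shlex.join(sign_command("ca_key", "alice.pub", "alice", ["alice", "root"])))
# prints: ssh-keygen -s ca_key -I alice -n alice,root -V +52w alice.pub
```

Running the resulting command writes the certificate next to the public key (here `alice-cert.pub`), which would then be handed back to the sysadmin. How the request reaches the CA and who approves it remains the open question.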