debops / ansible-pki

Bootstrap and manage internal PKI, Certificate Authorities and OpenSSL/GnuTLS certificates
GNU General Public License v3.0

Role breaks without warning (and continues on "normally") if SAN misconfigured #103

Open bryanbecker opened 7 years ago

bryanbecker commented 7 years ago

I had a typo in a script of mine that accidentally passed a CIDR IP range instead of a single IP address as a subject_alt_name.

Everything seemed to run normally, except the generated certs never appeared on the host. The request file was left as request.pem.tmp. The issue is that the role appears to complete successfully, influencing later roles, and making debugging quite difficult.

Result after completing a broken run:

.
├── [4.0K]  acme
├── [4.0K]  config
│   └── [2.8K]  realm.conf  
├── [4.0K]  external
├── [4.0K]  internal
│   ├── [ 745]  gnutls.conf 
│   └── [   0]  request.pem.tmp
├── [4.0K]  private
│   ├── [1.6K]  key.pem
│   └── [1.6K]  realm_key.pem
└── [4.0K]  public

I would guess the issue is in one of these lines, given that that's where request.pem.tmp is created.

Maybe there is a better way to get the exit status from certtool, or perhaps first check the validity of the generated file?
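Both ideas in the question above can be combined in one small wrapper. This is only a sketch: the helper name `write_atomically` and its interface are hypothetical and are not taken from the pki-realm script. It runs a command with its output going to a `.tmp` file, and renames that file only if the command exited with status zero and the file is not empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pki-realm): run a command that writes its
# result to stdout, and publish the result only if it succeeded and is not empty.
write_atomically() {
    local target="$1"; shift
    local tmp="${target}.tmp"

    # Explicit status check, so this does not rely on errexit being in effect.
    if ! "$@" > "${tmp}"; then
        echo "error: '$1' failed, ${target} was not written" >&2
        rm -f "${tmp}"
        return 1
    fi

    # A zero-byte result (like the request.pem.tmp above) also counts as a failure.
    if [ ! -s "${tmp}" ]; then
        echo "error: '$1' produced no output, ${target} was not written" >&2
        rm -f "${tmp}"
        return 1
    fi

    mv "${tmp}" "${target}"
}
```

A call might look like `write_atomically internal/request.pem certtool --generate-request --load-privkey private/key.pem --template internal/gnutls.conf`, assuming certtool writes the request to stdout when no `--outfile` is given. Whether certtool returns a non-zero status for an invalid ip_address is exactly what is unclear in this report; the size check is there for the case where it does not.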

The exact line in my gnutls.conf that caused the error was:

ip_address = "123.123.123.123/16" when it should have been ip_address = "123.123.123.123"

drybjed commented 7 years ago

I agree that the whole setup is very fragile and hard to debug. The pki-realm script has set -o errexit enabled, which should have stopped execution when certtool exited with an error; I wonder why it didn't work... Will investigate.

You cannot use CIDR prefixes in certificates that way, so at least you know where the error originates from.
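A typo like this one could be caught before certtool ever runs. The check below is a minimal sketch with a hypothetical function name (`is_plain_ipv4`); it only handles IPv4, and nothing like it is known to exist in the role. It accepts a bare dotted-quad address and rejects anything else, including a value with a CIDR suffix.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: succeed only for a bare dotted-quad IPv4 address.
is_plain_ipv4() {
    local old_ifs octet

    # Reject empty input, any character that is not a digit or a dot
    # (this is what rejects "/16"), and leading, trailing or doubled dots.
    case "$1" in
        "" | *[!0-9.]* | .* | *. | *..*) return 1 ;;
    esac

    # Split the value on dots into the positional parameters.
    old_ifs="${IFS}"; IFS=.
    set -- $1
    IFS="${old_ifs}"

    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        # At most three digits, and a value no larger than 255.
        [ "${#octet}" -le 3 ] && [ "${octet}" -le 255 ] || return 1
    done
    return 0
}
```

With this in place, `is_plain_ipv4 "123.123.123.123/16"` fails while `is_plain_ipv4 "123.123.123.123"` succeeds, so a caller could abort with a clear message instead of handing the value to certtool.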

bryanbecker commented 7 years ago

Your system is still much more robust than any others I've seen.

Does bash treat the whole if/then construct as a single command? As far as I know, the errexit option doesn't trigger inside such multi-line statements, since they only return the exit code of their last command.

Maybe set -o pipefail, which makes a pipeline return the last non-zero exit status of any command in it, would work?
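For reference, the bash manual states that errexit is ignored for a command that is part of the test in an if statement (and in a few similar contexts, such as every command of an `&&`/`||` list except the last), and that this also applies to every command inside a function called from such a context. Whether that is what happened in pki-realm is not established in this thread. The sketch below only demonstrates the rule; the function names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Each demo runs a child bash with errexit enabled, so the parent survives.

# step() is called as an `if` condition: errexit is ignored inside it,
# so the failing `false` does not stop the function or the script.
demo_if() {
    bash -c '
        set -o errexit
        step() { false; echo "step continued after failure"; }
        if step; then echo "if-branch taken"; fi
        echo "script reached the end"
    '
}

# step() is called as a plain command: errexit applies, and the child
# shell exits as soon as `false` fails.
demo_plain() {
    bash -c '
        set -o errexit
        step() { false; echo "step continued after failure"; }
        step
        echo "script reached the end"
    ' || echo "child shell exited with status $?"
}
```

Running `demo_if` prints all three messages, because the function's exit status is that of its final `echo`. Running `demo_plain` prints only `child shell exited with status 1`. Note that pipefail does not change this: it only affects how the status of a pipeline is computed.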

Also, is there an easy way to regenerate realms/certs that I'm missing? Currently, I'm manually deleting the realm on the host and in the secrets dir, and re-running. (For example changing the SAN doesn't trigger the role to regenerate the certs)

drybjed commented 7 years ago

The -o pipefail is set as well.

Maybe the whole setup could fare better as a separate project with a kind of client-server architecture... But there are a few of these already, with a focus on the server side, but not on the client side (PKI realms). And implementing this differently would require changes on the Ansible Controller that I'm trying to avoid. Oh well, maybe some other time.

Currently there's no other way to recreate the certificates than removing the PKI realm directory. It should be safe; everything that's inside is expected to be re-created by the role or sourced from the secret/ directory. If you only mess with PKI realms and not authorities, you shouldn't need to remove anything from the secret/pki/ directory; the role handles that on its own.

I wanted the pki-realm script to be both an interactive and non-interactive interface to the realms. Perhaps adding some code that would reset a specified PKI realm (remove the certs but not the keys, etc.) could be beneficial for this. But as you said, the scripts are massive - perhaps rewriting them in a better language and splitting the functions into smaller chunks with a git-like CLI interface might be a better idea. This could even help with implementation as a standalone project.
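A "reset" along those lines could look roughly like the sketch below. Everything here is an assumption for illustration: the function name `reset_realm` is hypothetical, and the set of subdirectories is taken only from the directory listing earlier in this issue. It empties the certificate directories of a realm while leaving private/ (the keys) and config/ untouched.

```shell
#!/usr/bin/env bash
# Hypothetical "reset" helper: empty the certificate directories of a realm
# while leaving private/ (keys) and config/ in place.
reset_realm() {
    local realm_dir="$1" sub

    # Refuse to run on anything that does not look like a realm directory.
    if [ -z "${realm_dir}" ] || [ ! -d "${realm_dir}/private" ]; then
        echo "error: '${realm_dir}' does not look like a PKI realm" >&2
        return 1
    fi

    # Directory names assumed from the listing in this issue.
    for sub in acme external internal public; do
        [ -d "${realm_dir}/${sub}" ] || continue
        # Delete the contents but keep the directory itself.
        find "${realm_dir}/${sub}" -mindepth 1 -delete
    done
}
```

After such a reset, the next run of the role would be expected to regenerate the request and certificates from the existing key, in the same way it does after the whole realm directory is removed. That expectation has not been tested here.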

bryanbecker commented 7 years ago

Keep an eye on https://github.com/kubernetes/contrib/tree/master/ansible (and the contrib repo there in general).

They have mentioned working on simple Go tools to help with cert generation, and I bet we'll see them sometime this year.

drybjed commented 7 years ago

Wow, their etcd Ansible code is... impressive. I was hoping for the etcd package to make it in Debian Stretch, but that sadly wasn't the case. As an alternative, I plan to write a role that properly installs it, similar to debops.hashicorp, perhaps this will give DebOps installations some boost in the form of distributed key/value database. More and more projects depend on etcd, so that will be useful as well.

To be honest, X.509 certificate generation is a solved problem; it can be done by one shell command at this point. The issue is sane management of key/certificate pairs in a filesystem, so that other software can get to them. Another anchor, certbot or cfssl won't help when their result is dumped on the user to deal with, each tool having its own idea of what the directory layout should look like.

bryanbecker commented 7 years ago

Do you know of any open source certificate management tools? It seems like there must be already a project for this.

Then you could have a debops role (or set of roles) to bootstrap a certificate server, and the rest of the certificates would be handled through an API that debops-pki-v2.0 can call.

bryanbecker commented 7 years ago

And yeah, I've been working on combining their ansible roles with yours for a nicer bootstrap from zero to kubernetes experience. Your ansible is much cleaner, but you also have the advantage of debian-only

drybjed commented 7 years ago

Sure: EJBCA, DogTag and Hashicorp Vault.

The last one is probably the best bet at the moment. DebOps can securely install it right now using the debops.hashicorp role; all it needs is a configuration role. Apart from PKI, Vault provides secrets support, and Ansible has a lookup plugin for it, so it might be even more useful in the future.

When Vault is available, debops.pki could implement support for it as another CA, besides acme, external, internal and selfsigned. At the moment you could use it after writing an external CA script.

AnBuKu commented 7 years ago

Small note on

Do you know of any open source certificate management tools? It seems like there must be already a project for this.

Besides EJBCA, DogTag and Hashicorp Vault, which @drybjed mentioned, there is

There are also some ACME "derivatives" in Debian.