debops / ansible-pki

Bootstrap and manage internal PKI, Certificate Authorities and OpenSSL/GnuTLS certificates
GNU General Public License v3.0

KeyError: 'newAccount' when attempting to generate ACME certificate #126

Open adituriya opened 6 years ago

adituriya commented 6 years ago

I love debops and I even saw ACME work once, but I have to say it is super hard to get it working. I've never had this kind of difficulty with other letsencrypt tools or ansible roles. I've tried to piece it together from the official docs, the debops-wordpress project and a few issues documented here, but I'm really not having much joy.

I'm currently stuck on this error, in /etc/pki/realms/mydomain.com/acme/error.log:

Parsing account key...
Parsing CSR...
Found domains: www.mydomain.com, mydomain.com
Getting directory...
Directory found!
Registering account...
Traceback (most recent call last):
  File "/usr/local/lib/pki/acme-tiny", line 197, in <module>
    main(sys.argv[1:])
  File "/usr/local/lib/pki/acme-tiny", line 193, in main
    signed_crt = get_crt(args.account_key, args.csr, args.acme_dir, log=LOGGER, CA=args.ca, disable_check=args.disable_check, directory_url=args.directory_url, contact=args.contact)
  File "/usr/local/lib/pki/acme-tiny", line 111, in get_crt
    account, code, acct_headers = _send_signed_request(directory['newAccount'], reg_payload, "Error registering")
KeyError: 'newAccount'

So apparently it is failing when it attempts to send the signed CSR request to letsencrypt. I can't tell much more than that by looking at the code.

Is there any way to get more verbose output to actually see what the problem is? I assume there is something wrong with my CSR request, but really not sure where to look next.

This is in my host inventory (though I've tried a bunch of other permutations, with and without www, with and without subdomain settings):

pki_host_realms:
  - name: www.mydomain.com
    enabled: True
    acme: True

In group_vars/all I have:

pki_ca_domain: mycompany.com
pki_ca_organization: My Company
pki_acme: True
pki_acme_install: True
pki_acme_default_subdomains: []

(Note this is a different domain than the website domain I'm trying to get working.)

Then I have an nginx site (which is working great otherwise):

nginx__servers:
  - name: ['www.mydomain.com', 'mydomain.com']
    redirect_from: True
    enabled: True
    ssl: True
    redirect_to_ssl: False
    ...etc

The DNS for the root and www domain resolve correctly to the IP address of the host. The fqdn of the host (gamma.mycompany.com) is not the same as the website (and is on a different domain, if that makes a difference).

The internal certificate is working correctly, and is used when viewing the site in https. Obviously this generates warnings due to the untrusted CA.

Basically my approach is to try something, delete /etc/pki/realms/mydomain.com on the managed host (and sometimes /ansible/secret/pki/realms/by-host/gamma.mycompany.com/mydomain.com on the controller host), then try something else and run again. Is there a more graceful way, or are there other things I need to delete before a new run?

The only time I've had ACME work was when I finally got the configuration right, and then started from scratch on a new server. So I suspect maybe I'm just not cleaning up enough debris from my trial and error.

At this point I'm thinking it may be best to disable debops.pki and try a different tool for managing letsencrypt certificates. But I would love to get this working. Apparently it is possible.

What am I doing wrong / or where should I be looking to get more insight?

drybjed commented 6 years ago

First of all, removing the whole PKI realm directory is a good way to reset the realm; the role will not modify an existing realm. You cannot modify a signed certificate anyway, so deleting the whole directory and letting debops.pki recreate it is fine.
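The reset described above can be sketched as a small helper. This is only an illustration: the function name `reset_realm` and the `dry_run` flag are hypothetical and not part of debops.pki, and the realm path is whatever directory exists on your host.

```python
import shutil
from pathlib import Path


def reset_realm(realm_dir, dry_run=True):
    """Remove a PKI realm directory so it can be recreated on the next run.

    Hypothetical helper, not part of debops.pki. Returns True if the
    directory existed (it is deleted only when dry_run is False), and
    False if there was nothing to remove.
    """
    path = Path(realm_dir)
    if not path.is_dir():
        return False
    if not dry_run:
        # Delete the whole realm, including keys, CSRs and the acme/ subdirectory.
        shutil.rmtree(path)
    return True
```

For example, `reset_realm("/etc/pki/realms/mydomain.com", dry_run=False)` on the managed host removes the realm from the report above; the default `dry_run=True` only reports whether the directory is present.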

The errors you see are due to debops.pki pointing the acme-tiny script at an old Let's Encrypt API endpoint. The acme-tiny script has recently been updated to work with the ACMEv2 API, and this particular repository of debops.pki hasn't been updated to use the script's new option. However, the debops.pki role in the DebOps monorepo has already been updated via https://github.com/debops/debops/commit/7792560e4a2f7d7d5e291665bda798581a34bfb0 and Let's Encrypt support works again.
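To see why the mismatch surfaces as KeyError: 'newAccount', here is a rough sketch that classifies an ACME directory document by its keys. It assumes the old endpoint returns an ACMEv1-style directory (keys such as "new-reg"), while an ACMEv2 directory (RFC 8555) provides "newAccount". The function names are hypothetical and not part of acme-tiny.

```python
import json
from urllib.request import urlopen

# Resource names used by the two API generations (subset, for detection only).
ACME_V1_KEYS = {"new-reg", "new-authz", "new-cert"}
ACME_V2_KEYS = {"newNonce", "newAccount", "newOrder"}


def fetch_directory(url):
    """Download and parse an ACME directory document (plain JSON over HTTPS)."""
    with urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def classify_directory(directory):
    """Return "v2", "v1" or "unknown" based on the keys in the directory."""
    keys = set(directory)
    if ACME_V2_KEYS <= keys:
        return "v2"
    if ACME_V1_KEYS & keys:
        return "v1"
    return "unknown"
```

An ACMEv2 client looks up `directory['newAccount']`; if `classify_directory` reports "v1" for the configured directory URL, that lookup fails in the same way as the traceback above.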

I would suggest that you switch to the monorepo version of DebOps. It already has numerous bugs fixed (check the Changelog) and should work mostly the same with DebOps for WordPress.

Switching to the monorepo should be easy: uninstall the current debops pip package and install the latest version of it, then run debops-update. The new debops script should use the monorepo automatically. You can also check out the detailed installation instructions.

adituriya commented 6 years ago

Brilliant, thanks. After updating the pip package (it was already up to date), running debops-update, and deleting the domain's /ansible/secret/pki/realms/... folder (otherwise I was getting the tree access violation error at first), it just works! I didn't think to update, as I had installed this controller host quite recently. But in the future I will update everything before opening an issue.

Thanks for your reply. FWIW I can confirm that this issue is fixed in the latest monorepo.

drybjed commented 6 years ago

Glad to know it works for you. :-)

DebOps is updated pretty frequently, so you might want to keep an eye on the repository. I plan to release a new version in a few days. It should finally have proper support for playbooks and roles in the Python package, so there will be a way to have a stable version.

xeroc commented 6 years ago

FYI, the content of this repo seems to be different from the content of the debops/debops repo.

drybjed commented 6 years ago

@xeroc Yes, the code from various DebOps repositories was merged some time ago to make development and project management easier. Since the Ansible Galaxy team is planning to introduce support for multiple roles in a single repository, I'm waiting for them to do so before touching the old role repositories, so that existing users can continue using the old versions. Switching to the monorepo is, however, preferred if you want to get the latest updates.