Open Ramblurr opened 7 years ago
Can you append the preceding log entries too?
Two quick fixes: there is an option during stack creation for using the ACME staging server, and you can also change your email address (Gmail addresses support + suffixes, like myemail+whateveryouwant@gmail.com) to start generating certs again.
This doesn't solve any problems though.
@Ramblurr seconding @Munsio's request - it would be useful to see the logs before the rate limits happen, when certificate requests are (likely) failing for other reasons.
Unfortunately I don't have the directly preceding logs, as I removed the config dir and booted a fresh container, but here is an example from further up in the logs.
Notably, again, it is trying to authorize FOO but somehow finds BAR?
edit: is it possible to trigger a renew from the command line?
DEBU[0000] Parsing: FOO-FOO-1
DEBU[0000] Running check command '[ -d /etc/nginx/certs/$(echo "FOO.mydomain.example" | cut -d"," -f 1) ] && exit 1 || exit 0'
INFO[0000] Executing notify command 'acmetool want $(echo "FOO.mydomain.example" | tr , " ")'
INFO[0007] [acmetool want $(echo "FOO.mydomain.example" | tr , " ")]: "20170720142252 [ERROR] acme.storageops: could not obtain authorization for BAR.mydomain.example: failed all combinations"
INFO[0007] [acmetool want $(echo "FOO.mydomain.example" | tr , " ")]: "20170720142252 [ERROR] acme.storageops: Target(BAR.mydomain.example;https://acme-v01.api.letsencrypt.org/directory;0): failed to request certificate: failed all combinations"
INFO[0007] [acmetool want $(echo "FOO.mydomain.example" | tr , " ")]: "20170720142252 [ERROR] acme.storageops: error while processing targets: the following errors occurred:"
INFO[0007] [acmetool want $(echo "FOO.mydomain.example" | tr , " ")]: "error satisfying Target(BAR.mydomain.example;https://acme-v01.api.letsencrypt.org/directory;0): failed all combinations"
INFO[0007] [acmetool want $(echo "FOO.mydomain.example" | tr , " ")]: "20170720142252 [ERROR] acme.storageops: failed to reconcile: the following errors occurred:"
INFO[0007] [acmetool want $(echo "FOO.mydomain.example" | tr , " ")]: "error satisfying Target(BAR.mydomain.example;https://acme-v01.api.letsencrypt.org/directory;0): failed all combinations"
INFO[0007] [acmetool want $(echo "FOO.mydomain.example" | tr , " ")]: "20170720142252 [CRITICAL] acmetool: fatal: reconcile: the following errors occurred:"
INFO[0007] [acmetool want $(echo "FOO.mydomain.example" | tr , " ")]: "error satisfying Target(BAR.mydomain.example;https://acme-v01.api.letsencrypt.org/directory;0): failed all combinations"
Here are some more logs. Right before this I restarted the rgon service:
INFO[1082] Exit requested by signal: terminated
/etc/nginx/certs/default/default.pass.key: No such file or directory
140621314571148:error:02001002:system library:fopen:No such file or directory:bss_file.c:402:fopen('/etc/nginx/certs/default/default.pass.key','w')
140621314571148:error:20074002:BIO routines:FILE_CTRL:system lib:bss_file.c:404:
Error opening Private Key /etc/nginx/certs/default/default.pass.key
140036572375948:error:02001002:system library:fopen:No such file or directory:bss_file.c:402:fopen('/etc/nginx/certs/default/default.pass.key','r')
140036572375948:error:20074002:BIO routines:FILE_CTRL:system lib:bss_file.c:404:
unable to load Private Key
rm: can't remove '/etc/nginx/certs/default/default.pass.key': No such file or directory
Error opening Private Key /etc/nginx/certs/default/default.key
140061420919692:error:02001002:system library:fopen:No such file or directory:bss_file.c:402:fopen('/etc/nginx/certs/default/default.key','r')
140061420919692:error:20074002:BIO routines:FILE_CTRL:system lib:bss_file.c:404:
unable to load Private Key
/etc/nginx/certs/default/default.csr: No such file or directory
100.00% 0s .00%
20170721085253 [WARN] acmetool: Don't know how to install a cron job on this system, please install the following job:
SHELL=/bin/sh
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin
MAILTO=root
17 12 * * * root /usr/local/bin/acmetool --batch reconcile
------------------------- Quickstart Complete ----------------------
The quickstart process is complete.
Ensure your chosen challenge conveyance method is configured properly
before attempting to request certificates. You can find more
information about how to configure your system for each method in the
acmetool documentation:
https://github.com/hlandau/acme/blob/master/_doc/WSCONFIG.md
To request a certificate, run:
$ sudo acmetool want example.com www.example.com
If the certificate is successfully obtained, it will be placed in
/var/lib/acme/live/example.com/{cert,chain,fullchain,privkey}.
[ENTRYPOINT]: Running Rancher-Gen first-run
INFO[0000] Starting rancher-gen v0.6.0 (ee2ce5c)
INFO[0000] Initializing Rancher Metadata client (version 2015-12-19)
INFO[0000] Processing all templates once.
DEBU[0000] Checking for metadata change
DEBU[0000] Old version: init, New Version: "17778-4f3c5c96fb170da7fa8781d4ac55192c"
DEBU[0000] Fetching Metadata
DEBU[0000] Processing template /etc/rancher-gen/default/nginx.tmpl for destination /etc/nginx/conf.d/nginx.conf
DEBU[0000] Checking whether content has changed
DEBU[0000] Checksum content: 36420ec4669aacfd38b19cc1ef23e2c9, checksum file:
DEBU[0000] Creating staging file
DEBU[0000] Created staging file /etc/nginx/conf.d/.nginx.conf-221671822
DEBU[0000] Copying file permissions and owner from destination
DEBU[0000] Writing destination
INFO[0000] Destination file has been updated: /etc/nginx/conf.d/nginx.conf
DEBU[0000] Notifying label 'rgon-proxy' with value 'nginx'
DEBU[0000] Fetching Metadata
DEBU[0000] NOTIFY: rgon-proxy-nginx-1 :: [rgon-proxy:nginx]
DEBU[0000] Parsing: rgon-proxy-nginx-1
INFO[0000] Executing notify command 'rgon-exec -name=rgon-proxy-nginx-1 -cmd="service nginx reload"'
INFO[0000] [rgon-exec -name=rgon-proxy-nginx-1 -cmd="service nginx reload"]: "Executing [service nginx reload] on container [rgon-proxy-nginx-1]"
INFO[0000] [rgon-exec -name=rgon-proxy-nginx-1 -cmd="service nginx reload"]: "[....] Reloading nginx: nginx\x1b[?25l\x1b7\x1b[1G[\x1b[32m ok \x1b[39;49m\x1b8\x1b[?12l\x1b[?25h.\r"
INFO[0000] [rgon-exec -name=rgon-proxy-nginx-1 -cmd="service nginx reload"]: "websocket: close 1000 (normal)"
DEBU[0000] Notify cmd output: "Executing [service nginx reload] on container [rgon-proxy-nginx-1]\n[....] Reloading nginx: nginx\x1b[?25l\x1b7\x1b[1G[\x1b[32m ok \x1b[39;49m\x1b8\x1b[?12l\x1b[?25h.\r\nwebsocket: close 1000 (normal)\n"
INFO[0000] All templates processed. Exiting.
[ENTRYPOINT]: Rancher-Gen first-run complete
INFO[0000] Starting rancher-gen v0.6.0 (ee2ce5c)
INFO[0000] Initializing Rancher Metadata client (version 2015-12-19)
INFO[0000] Polling Metadata with %d second interval30
DEBU[0000] Checking for metadata change
DEBU[0000] Old version: init, New Version: "17778-4f3c5c96fb170da7fa8781d4ac55192c"
DEBU[0000] Fetching Metadata
DEBU[0000] No template - processing commands
DEBU[0000] Notifying label 'rgon.ssl'
DEBU[0000] Fetching Metadata
DEBU[0000] NOTIFY: KLAM-KLAM3-1 :: [rgon.ssl:true]
DEBU[0000] NOTIFY: FOO-test-FOO-frontend-1 :: [rgon.ssl:true]
DEBU[0000] NOTIFY: FOO-FOO-frontend-1 :: [rgon.ssl:true]
DEBU[0000] NOTIFY: BAR-mydomain.example-BAR-1 :: [rgon.ssl:true]
DEBU[0000] NOTIFY: DRY-DRY-1 :: [rgon.ssl:true]
DEBU[0000] Parsing: KLAM-KLAM3-1
DEBU[0000] Running check command '[ -d /etc/nginx/certs/$(echo "KLAM.mydomain.example" | cut -d"," -f 1) ] && exit 1 || exit 0'
INFO[0000] Check failed, skipping notify-cmd
DEBU[0000] Parsing: FOO-test-FOO-frontend-1
DEBU[0000] Running check command '[ -d /etc/nginx/certs/$(echo "FOO-test.mydomain.example" | cut -d"," -f 1) ] && exit 1 || exit 0'
INFO[0000] Check failed, skipping notify-cmd
DEBU[0000] Parsing: FOO-FOO-frontend-1
DEBU[0000] Running check command '[ -d /etc/nginx/certs/$(echo "FOO.mydomain.example" | cut -d"," -f 1) ] && exit 1 || exit 0'
INFO[0000] Check failed, skipping notify-cmd
DEBU[0000] Parsing: BAR-mydomain.example-BAR-1
DEBU[0000] Running check command '[ -d /etc/nginx/certs/$(echo "BAR.mydomain.example" | cut -d"," -f 1) ] && exit 1 || exit 0'
INFO[0000] Check failed, skipping notify-cmd
DEBU[0000] Parsing: DRY-DRY-1
DEBU[0000] Running check command '[ -d /etc/nginx/certs/$(echo "DRY.mydomain.example" | cut -d"," -f 1) ] && exit 1 || exit 0'
INFO[0000] Check failed, skipping notify-cmd
DEBU[0000] Processing template /etc/rancher-gen/default/nginx.tmpl for destination /etc/nginx/conf.d/nginx.conf
DEBU[0000] Checking whether content has changed
DEBU[0000] Checksum content: 36420ec4669aacfd38b19cc1ef23e2c9, checksum file: 36420ec4669aacfd38b19cc1ef23e2c9
DEBU[0000] Destination /etc/nginx/conf.d/nginx.conf is up to date
INFO[0000] All templates processed. Waiting for changes in Metadata...
DEBU[0030] Checking for metadata change
DEBU[0030] No changes in Metadata
DEBU[0060] Checking for metadata change
DEBU[0060] No changes in Metadata
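A note for anyone reading the log above: each check command tests whether a certificate directory already exists for the first domain in the comma-separated list and exits non-zero if it does, which appears to be what produces "Check failed, skipping notify-cmd". Below is a minimal sketch of that logic. The function names and the CERT_ROOT override are hypothetical and only there for illustration; the commands in the log hard-code /etc/nginx/certs.

```shell
#!/bin/sh
# Sketch of the check/notify logic seen in the log.
# CERT_ROOT and all function names are hypothetical helpers.
CERT_ROOT="${CERT_ROOT:-/etc/nginx/certs}"

# Print the first domain of a comma-separated list, as `cut -d"," -f 1` does.
first_domain() {
    printf '%s\n' "$1" | cut -d"," -f 1
}

# Return 1 when a cert directory already exists (notify command is skipped),
# 0 when it is missing (notify command, i.e. `acmetool want ...`, would run).
needs_cert() {
    if [ -d "$CERT_ROOT/$(first_domain "$1")" ]; then
        return 1
    fi
    return 0
}

# Turn the comma-separated list into space-separated arguments,
# as the notify command does with `tr , " "`.
want_args() {
    printf '%s\n' "$1" | tr , " "
}
```

With this reading, a domain is only passed to acmetool want while its directory under the certs folder is missing, so a request that keeps failing would be retried on every metadata change.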
Sorry if this sounds silly, but did you obfuscate the logs by changing the real domains to mydomain.example?
Also, before you switched to the dev branch, did you remove the config folder, except for your customized one?
Next question: were there already functional Let's Encrypt certificates for the domains you tried to create certificates for after you switched to the dev branch?
It would also be helpful if you sent us the generated nginx.conf. We have a Discord server where you can send us logs/configs in private: https://discord.gg/EeBjSr5
Currently we also need to be able to expose port 402 for the acmetool webserver to verify the domains.
Something you could try: turn off SSL generation on the containers by setting rgon.ssl to off, and restart the rgon service before trying the steps below.
Exec into the container and run "acmetool cull --simulate"; if there is any output, post it here. If you are brave enough, you can also run it without --simulate to remove old/unused certificates.
Exec into the container and run "acmetool revoke cert-path". I haven't tried this myself, so I don't know what it needs as the path, but this revokes the "old" valid certificate, and you may then be able to generate a new one.
Turn the SSL labels back on and check whether acmetool is able to generate or regenerate the certificates.
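The steps above could be scripted roughly as follows, to be run inside the container after rgon.ssl has been switched off and the rgon service restarted. This is only a sketch: DRY_RUN, CULL_FOR_REAL, run and cleanup_certs are hypothetical names introduced here, and by default the script only prints the acmetool commands it would run. The certificate path for revoke is deliberately left to the caller, since the right value is not known in this thread.

```shell
#!/bin/sh
# Sketch of the clean-up steps. DRY_RUN, CULL_FOR_REAL, run() and
# cleanup_certs() are hypothetical helpers, not part of acmetool.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

cleanup_certs() {
    # Step 1: show what cull would remove, without deleting anything.
    run acmetool cull --simulate
    # Step 2 (optional): actually remove old/unused certificates.
    if [ "${CULL_FOR_REAL:-0}" = "1" ]; then
        run acmetool cull
    fi
    # Step 3 (optional): revoke a certificate. The path must be supplied
    # by the caller; its expected form is an open question in the thread.
    if [ -n "${1:-}" ]; then
        run acmetool revoke "$1"
    fi
}
```

Calling cleanup_certs with no arguments and the defaults only prints the simulate step; setting DRY_RUN=0 makes it execute the commands for real.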
Sorry if this sounds silly, but did you obfuscate the logs by changing the real domains to mydomain.example?
Yes I did :) on my actual system they are all actual, functioning domains.
Also, before you switched to the dev branch, did you remove the config folder, except for your customized one? Next question: were there already functional Let's Encrypt certificates for the domains you tried to create certificates for after you switched to the dev branch?
I deleted the configs, but left the certs.
Currently we also need to be able to expose port 402 for the acmetool webserver to verify the domains
Ah, this might be the problem. Does this port need to be exposed to the public internet? My Rancher server is behind a NAT, and only 80 and 443 are tunneled through.
About the NAT: that shouldn't be a problem. By exposing port 402 I only meant that there must be no conflicts with other services inside the Rancher environment. The nginx config works as a proxy for the Let's Encrypt authorization, so 80 and 443 are fine.
@Ramblurr - hey there, any news on this topic?
I did a completely clean reinstall, waited until the rate limit ban was over, and it seems to be working now. But it has only fetched new keys.
It hasn't attempted to renew yet though, which was what the problem was originally. Is there a way to force a renew to test if it works?
@Ramblurr please check whether your nginx.tmpl matches the one from the dev branch; we added an additional well-known directive under the SSL server part.
For some reason cert creation is failing, and the tool ends up in a loop where it spams authorization attempts and quickly gets locked out due to the rate limit:
I see this in the log file repeated hundreds of times:
Why exactly it is failing, I'm not sure. What's interesting is that it seems to confuse sub1.mydomain.example and sub2.mydomain.example.