openspace42 / aenigma

The | state-of-the-art | secure-by-default | one-touch-deployed | XMPP server for everyone.
https://aenigma.xyz

🔥LE certificate doesn't update itself and etcd fails to start after manual update 🔥 #85

Open solus-hq opened 5 years ago

solus-hq commented 5 years ago

On one of my servers, the Let's Encrypt (LE) certificate was not renewed in time, so clients connecting to the server got "certificate expired" errors.

I tried updating manually by running "aenigma-push-certs". Everything went fine and I got a new certificate, although I had to alter the TXT DNS records for LE verification once again (it's an old LE bug or something, as far as I remember).

Here are some log files

Sep 12 20:55:55 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:55:55,368 INFO: Selected new etcd server http://EDITED_PRIVACY:2379
Sep 12 20:55:55 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:55:55,370 WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=0, status=None)) after connection broke
Sep 12 20:55:55 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:55:55,370 WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=0, status=None)) after connection broke
Sep 12 20:55:55 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:55:55,371 ERROR: Failed to get list of machines from http://EDITED_PRIVACY:2379/v2: MaxRetryError("HTTPConnectionPool(h
Sep 12 20:55:55 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:55:55,371 INFO: waiting on etcd
Sep 12 20:56:00 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:56:00,376 INFO: Selected new etcd server http://EDITED_PRIVACY:2379
Sep 12 20:56:00 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:56:00,378 WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=0, status=None)) after connection broke
Sep 12 20:56:00 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:56:00,378 WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=0, status=None)) after connection broke
Sep 12 20:56:00 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:56:00,378 ERROR: Failed to get list of machines from http://EDITED_PRIVACY:2379/v2: MaxRetryError("HTTPConnectionPool(h
Sep 12 20:56:00 ae01.EDITED_PRIVACY patroni[492]: 2019-09-12 20:56:00,379 INFO: waiting on etcd

Digging further into etcd, I found this:

root@ae01:~# service etcd status
● etcd.service - etcd - highly-available key value store
  Loaded: loaded (/lib/systemd/system/etcd.service; enabled; vendor preset: enabled)
  Active: failed (Result: exit-code) since Thu 2019-09-12 20:57:16 UTC; 4min 53s ago
    Docs: https://github.com/coreos/etcd
          man:etcd
 Process: 480 ExecStart=/usr/bin/etcd $DAEMON_ARGS (code=exited, status=1/FAILURE)
Main PID: 480 (code=exited, status=1/FAILURE)

Sep 12 20:57:16 ae01.EDITED_PRIVACY etcd[480]: Git SHA: Not provided (use ./build instead of go build)
Sep 12 20:57:16 ae01.EDITED_PRIVACY etcd[480]: Go Version: go1.10
Sep 12 20:57:16 ae01.EDITED_PRIVACY etcd[480]: Go OS/Arch: linux/amd64
Sep 12 20:57:16 ae01.EDITED_PRIVACY etcd[480]: setting maximum number of CPUs to 1, total number of available CPUs is 1
Sep 12 20:57:16 ae01.EDITED_PRIVACY etcd[480]: the server is already initialized as member before, starting as etcd member...
Sep 12 20:57:16 ae01.EDITED_PRIVACY etcd[480]: peerTLS: cert = /etc/ssl/aenigma/EDITED_PRIVACY.d/fullchain.pem, key = /etc/ssl/aenigma/EDITED_PRIVACY.d/privkey.pem, ca = , trusted-ca = , clie
Sep 12 20:57:16 ae01.EDITED_PRIVACY etcd[480]: open /etc/ssl/aenigma/EDITED_PRIVACY.d/fullchain.pem: permission denied
Sep 12 20:57:16 ae01.EDITED_PRIVACY systemd[1]: etcd.service: Main process exited, code=exited, status=1/FAILURE
Sep 12 20:57:16 ae01.EDITED_PRIVACY systemd[1]: etcd.service: Failed with result 'exit-code'.
Sep 12 20:57:16 ae01.EDITED_PRIVACY systemd[1]: Failed to start etcd - highly-available key value store.

This makes me somewhat sure that the problem is:

open /etc/ssl/aenigma/EDITED_PRIVACY.d/fullchain.pem: permission denied

The /etc/ssl/aenigma/EDITED_PRIVACY.d/ directory had its permissions altered. I manually reset it to 740 and chowned the folder to ejabberd:aenigma.

Help is needed to determine whether this is a bug, because I still can't even run aenigma-upgrade.
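For anyone reading along, here is a rough sketch of what mode 740 means on a directory under the standard POSIX permission rules. It is not taken from aenigma. The numeric uids and gids are made up for illustration, and whether the etcd user is the owner or a member of the aenigma group is not stated in this report. The point it illustrates: opening a file inside a directory needs the search (execute) bit on that directory, and 740 grants that bit to the owner only.

```python
import stat


def can_open_file_in_dir(dir_mode, dir_uid, dir_gid, uid, gids):
    """Return True if a process (uid, gids) may reach files inside the directory.

    Only the directory's search (execute) bit is checked here; the mode of
    the file itself is a separate question. root (uid 0) bypasses the check.
    """
    if uid == 0:
        return True
    # POSIX picks exactly one permission class: owner, else group, else other.
    if uid == dir_uid:
        return bool(dir_mode & stat.S_IXUSR)
    if dir_gid in gids:
        return bool(dir_mode & stat.S_IXGRP)
    return bool(dir_mode & stat.S_IXOTH)


# Hypothetical ids: directory owner uid 100, directory group gid 200.
# A different user (uid 300) who is a member of group 200:
print(stat.filemode(stat.S_IFDIR | 0o740))                # drwxr-----
print(can_open_file_in_dir(0o740, 100, 200, 300, {200}))  # False
print(can_open_file_in_dir(0o750, 100, 200, 300, {200}))  # True
```

If etcd runs as a user other than the directory owner, a mode that includes group search permission (for example 750) may be what is needed, but that is a guess until the actual user and group memberships are checked.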

solus-hq commented 5 years ago

Another server, same issue

Initiating upgrade...

/usr/local/bin/aenigma-upgrade: line 114: custom_branch: unbound variable
root@ae01:~/.ssh#

solus-hq commented 5 years ago

Regarding the LE renewal issue on both servers:

root@ae01:/var/log/letsencrypt# tail letsencrypt.log
Traceback (most recent call last):
  File "/usr/bin/certbot", line 11, in <module>
    load_entry_point('certbot==0.31.0', 'console_scripts', 'certbot')()
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1365, in main
    return config.func(config, plugins)
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1272, in renew
    renewal.handle_renewal_request(config)
  File "/usr/lib/python3/dist-packages/certbot/renewal.py", line 477, in handle_renewal_request
    len(renew_failures), len(parse_failures)))
certbot.errors.Error: 1 renew failure(s), 0 parse failure(s)
root@ae01:/var/log/letsencrypt# tail -50 letsencrypt.log
certbot.errors.Error: 1 renew failure(s), 0 parse failure(s)
2019-09-16 00:42:43,432:DEBUG:certbot.main:certbot version: 0.31.0
2019-09-16 00:42:43,433:DEBUG:certbot.main:Arguments: ['-q']
2019-09-16 00:42:43,434:DEBUG:certbot.main:Discovered plugins: PluginsRegistry(PluginEntryPoint#manual,PluginEntryPoint#null,PluginEntryPoint#standalone,PluginEntryPoint#webroot)
2019-09-16 00:42:43,454:DEBUG:certbot.log:Root logging level set at 30
2019-09-16 00:42:43,455:INFO:certbot.log:Saving debug log to /var/log/letsencrypt/letsencrypt.log
2019-09-16 00:42:43,462:DEBUG:certbot.plugins.selection:Requested authenticator <certbot.cli._Default object at 0x7f7b2a3ed160> and installer <certbot.cli._Default object at 0x7f7b2a3ed160>
2019-09-16 00:42:43,470:DEBUG:certbot.storage:Should renew, less than 30 days before certificate expiry 2019-09-15 17:45:51 UTC.
2019-09-16 00:42:43,470:INFO:certbot.renewal:Cert is due for renewal, auto-renewing...
2019-09-16 00:42:43,470:INFO:certbot.renewal:Non-interactive renewal: random delay of 41 seconds
2019-09-16 00:43:24,508:DEBUG:certbot.plugins.selection:Requested authenticator manual and installer None
2019-09-16 00:43:24,509:DEBUG:certbot.plugins.disco:Other error:(PluginEntryPoint#manual): An authentication script must be provided with --manual-auth-hook when using the manual plugin non-interactively.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/plugins/disco.py", line 132, in prepare
    self._initialized.prepare()
  File "/usr/lib/python3/dist-packages/certbot/plugins/manual.py", line 133, in prepare
    self.option_name('auth-hook')))
certbot.errors.PluginError: An authentication script must be provided with --manual-auth-hook when using the manual plugin non-interactively.
2019-09-16 00:43:24,511:DEBUG:certbot.plugins.selection:No candidate plugin
2019-09-16 00:43:24,511:DEBUG:certbot.plugins.selection:Selected authenticator None and installer None
2019-09-16 00:43:24,511:INFO:certbot.main:Could not choose appropriate plugin: The manual plugin is not working; there may be problems with your existing configuration.
The error was: PluginError('An authentication script must be provided with --manual-auth-hook when using the manual plugin non-interactively.',)
2019-09-16 00:43:24,513:WARNING:certbot.renewal:Attempting to renew cert (PRIVACY.biz) from /etc/letsencrypt/renewal/PRIVACY.biz.conf produced an unexpected error: The manual plugin is not working; there may be problems with your existing configuration.
The error was: PluginError('An authentication script must be provided with --manual-auth-hook when using the manual plugin non-interactively.',). Skipping.
2019-09-16 00:43:24,520:DEBUG:certbot.renewal:Traceback was:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/renewal.py", line 452, in handle_renewal_request
    main.renew_cert(lineage_config, plugins, renewal_candidate)
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1187, in renew_cert
    installer, auth = plug_sel.choose_configurator_plugins(config, plugins, "certonly")
  File "/usr/lib/python3/dist-packages/certbot/plugins/selection.py", line 237, in choose_configurator_plugins
    diagnose_configurator_problem("authenticator", req_auth, plugins)
  File "/usr/lib/python3/dist-packages/certbot/plugins/selection.py", line 341, in diagnose_configurator_problem
    raise errors.PluginSelectionError(msg)
certbot.errors.PluginSelectionError: The manual plugin is not working; there may be problems with your existing configuration.
The error was: PluginError('An authentication script must be provided with --manual-auth-hook when using the manual plugin non-interactively.',)

2019-09-16 00:43:24,520:ERROR:certbot.renewal:All renewal attempts failed. The following certs could not be renewed:
2019-09-16 00:43:24,524:ERROR:certbot.renewal:  /etc/letsencrypt/live/PRIVACY.biz/fullchain.pem (failure)
2019-09-16 00:43:24,525:DEBUG:certbot.log:Exiting abnormally:
Traceback (most recent call last):
  File "/usr/bin/certbot", line 11, in <module>
    load_entry_point('certbot==0.31.0', 'console_scripts', 'certbot')()
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1365, in main
    return config.func(config, plugins)
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1272, in renew
    renewal.handle_renewal_request(config)
  File "/usr/lib/python3/dist-packages/certbot/renewal.py", line 477, in handle_renewal_request
    len(renew_failures), len(parse_failures)))
certbot.errors.Error: 1 renew failure(s), 0 parse failure(s)
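The log above shows the unattended renewal failing because the certificate was issued with the manual authenticator and no --manual-auth-hook was supplied. A small sketch of how one might detect that condition from the renewal configuration file follows. It assumes the file has a `[renewalparams]` section with `authenticator` and `manual_auth_hook` keys, which is how certbot normally records these options. The helper names and the sample contents are hypothetical, not copied from the affected servers.

```python
def parse_renewalparams(conf_text):
    """Collect key = value pairs from the [renewalparams] section only."""
    params = {}
    in_section = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            # Any other (sub)section header ends the section of interest.
            in_section = line == "[renewalparams]"
            continue
        if in_section and "=" in line:
            key, _, value = line.partition("=")
            params[key.strip()] = value.strip()
    return params


def missing_manual_auth_hook(conf_text):
    """True if the manual authenticator is configured without an auth hook."""
    params = parse_renewalparams(conf_text)
    return params.get("authenticator") == "manual" and not params.get("manual_auth_hook")


# Hypothetical file contents, for illustration only.
sample = """
version = 0.31.0
[renewalparams]
authenticator = manual
pref_challs = dns-01,
"""
print(missing_manual_auth_hook(sample))  # True
```

A certificate whose renewal file matches this check cannot be renewed by a quiet, non-interactive `certbot renew` run and has to be renewed by hand or given an authentication hook.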