smallstep / certificates

🛡️ A private certificate authority (X.509 & SSH) & ACME server for secure automated certificate management, so you can use TLS everywhere & SSO for SSH.
https://smallstep.com/certificates
Apache License 2.0

Handle CA server startup errors #1751

Closed. hslatman closed this 4 months ago

hslatman commented 4 months ago

Before this change, if one of the CA servers failed to start, the failure was only logged once the CA had fully stopped. A failed server did not trigger the CA to stop, so the CA kept running until it received a signal to restart or stop. Users would therefore not know that a server had failed to start unless they knew what to look for in the logs and deduced from the missing "Serving ..." line that their configuration might be wrong.

After this change, the CA is stopped when a server returns an error other than the expected http.ErrServerClosed. Effectively, the CA either runs all of its servers successfully or does not run at all.
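
For readers who want the general shape of this "all servers or none" behavior, here is a minimal sketch using only the Go standard library. It is not the code from this PR: `runServers`, `shutdownAll`, and `demo` are hypothetical names, and the five-second shutdown timeout is an arbitrary assumption. The `8080` address without a port mirrors the second log example in this description.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// runServers is a hypothetical helper, not the code from this PR. It starts
// every server in its own goroutine and waits until all of them have returned.
// If any server returns an error other than http.ErrServerClosed, the other
// servers are shut down and that first unexpected error is returned.
func runServers(servers ...*http.Server) error {
	errs := make(chan error, len(servers))
	for _, srv := range servers {
		go func(s *http.Server) {
			errs <- s.ListenAndServe()
		}(srv)
	}

	var first error
	for range servers {
		err := <-errs
		if err == nil || errors.Is(err, http.ErrServerClosed) {
			continue // expected result of an orderly shutdown
		}
		if first == nil {
			first = err
			shutdownAll(servers)
		}
	}
	return first
}

// shutdownAll asks every server to stop, giving each up to five seconds
// (an assumed timeout chosen for this sketch).
func shutdownAll(servers []*http.Server) {
	for _, s := range servers {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		_ = s.Shutdown(ctx)
		cancel()
	}
}

// demo runs one server with a valid address and one whose address lacks a
// port, so the second server fails during startup.
func demo() error {
	ok := &http.Server{Addr: "127.0.0.1:0"} // any free local port
	bad := &http.Server{Addr: "8080"}       // no ":" before the port
	return runServers(ok, bad)
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("stopped after startup error:", err)
	}
}
```

In this sketch a normal shutdown still has to be triggered elsewhere (for example by a signal handler calling `Shutdown`); in that case every `ListenAndServe` call returns `http.ErrServerClosed` and `runServers` returns nil.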

Examples of failed startups:

...
2024/03/05 11:05:51 X.509 Root Fingerprint: 5b66c191f67c0f5d6700d5ec99be65b75e78b2a9e7f14ab8a6cca66bec741d8d
2024/03/05 11:05:51 shutting down due to startup error ...
2024/03/05 11:05:51 Serving HTTPS on :8443 ...
badger 2024/03/05 11:05:51 INFO: Storing value log head: {Fid:0 Len:32 Offset:217901}
badger 2024/03/05 11:05:51 INFO: [Compactor: 173] Running compaction: {level:0 score:1.73 dropPrefixes:[]} for level: 0
badger 2024/03/05 11:05:51 INFO: LOG Compact 0->1, del 2 tables, add 1 tables, took 15.767209ms
badger 2024/03/05 11:05:51 INFO: [Compactor: 173] Compaction for level: 0 DONE
badger 2024/03/05 11:05:51 INFO: Force compaction on level 0 done
stopped CA after error occurred: listen tcp :9090: bind: address already in use
exit status 2
...
2024/03/05 11:00:36 Root certificates are available at https://127.0.0.1:8443/roots.pem
2024/03/05 11:00:36 X.509 Root Fingerprint: 5b66c191f67c0f5d6700d5ec99be65b75e78b2a9e7f14ab8a6cca66bec741d8d
2024/03/05 11:00:36 shutting down due to startup error ...
badger 2024/03/05 11:00:36 INFO: Storing value log head: {Fid:0 Len:31 Offset:214556}
2024/03/05 11:00:36 Serving HTTPS on :8443 ...
badger 2024/03/05 11:00:36 INFO: [Compactor: 173] Running compaction: {level:0 score:1.73 dropPrefixes:[]} for level: 0
badger 2024/03/05 11:00:36 INFO: LOG Compact 0->1, del 2 tables, add 1 tables, took 10.357708ms
badger 2024/03/05 11:00:36 INFO: [Compactor: 173] Compaction for level: 0 DONE
badger 2024/03/05 11:00:36 INFO: Force compaction on level 0 done
stopped server after error occurred: listen tcp: address 8080: missing port in address
exit status 2

Normal shutdown:

...
2024/03/05 11:06:25 Root certificates are available at https://127.0.0.1:8443/roots.pem
2024/03/05 11:06:25 X.509 Root Fingerprint: 5b66c191f67c0f5d6700d5ec99be65b75e78b2a9e7f14ab8a6cca66bec741d8d
2024/03/05 11:06:25 Serving HTTP on :8000 ...
2024/03/05 11:06:25 Serving HTTPS on :8443 ...
^C2024/03/05 11:06:32 shutting down ...
badger 2024/03/05 11:06:32 INFO: Storing value log head: {Fid:0 Len:32 Offset:219590}
badger 2024/03/05 11:06:32 INFO: [Compactor: 173] Running compaction: {level:0 score:1.73 dropPrefixes:[]} for level: 0
badger 2024/03/05 11:06:32 INFO: LOG Compact 0->1, del 2 tables, add 1 tables, took 11.123875ms
badger 2024/03/05 11:06:32 INFO: [Compactor: 173] Compaction for level: 0 DONE
badger 2024/03/05 11:06:32 INFO: Force compaction on level 0 done

This addresses #1750