fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Catch TLS certificate issues in fleetd at package generation time #20142

Open lucasmrod opened 6 days ago

lucasmrod commented 6 days ago

Problem

Some users realize there's a TLS certificate issue after the fleetd agent is installed on devices. We could perform a "TLS connection check" to find these issues during package "building time" (during fleetctl package) instead of finding these issues at "deploy time" (when installing the packages on the hosts).

Examples:

Potential solutions

A. Have fleetctl package --type=... --fleet-url=https://fleet.example.com perform a connection check to https://fleet.example.com using the certificate that will be bundled with the generated package, so that certificate issues are caught sooner rather than later. A flag such as --disable-tls-check would skip the check when the address is not yet up and running at the time the package is built.

B. A less disruptive alternative is to make the check opt-in through an additional fleetctl package flag, e.g. --enable-tls-check/--tls-check, which, if set, would perform a TLS connection to Fleet using the certificate that will be bundled with the generated package.

noahtalerman commented 5 days ago

Hey @lucasmrod! Thanks for tracking this. This is a great idea.

> Some users realize there's a TLS certificate issue

In what scenarios does this happen? Is it when self-hosted users deploy Fleet w/o a valid TLS certificate?

> perform a "TLS connection check" to find these issues

Also, I'm wondering...is there any way to do this today w/o making changes to Fleet? Put differently, is there a third-party tool we can document / recommend today that does this check?

lucasmrod commented 5 days ago

> In what scenarios does this happen? Is it when self-hosted users deploy Fleet w/o a valid TLS certificate?

I've documented the scenarios where this could happen here. So yes, basically self-hosted (hosted internally or with custom certificates) and in some cases with TLS certificates generated by Let's Encrypt (as shown in the linked issues).

> Also, I'm wondering...is there any way to do this today w/o making changes to Fleet? Put differently, is there a third-party tool we can document / recommend today that does this check?

Yes. I've added some documentation on how to troubleshoot this here and here. In short, it uses the existing fleetctl debug connection command.

Given that I've documented fleetctl debug connection in the linked PR, I'm OK with sticking to that for troubleshooting if this happens to any user again (and closing this issue as "won't do" for now). So feel free to close it, and we can re-open it if more customers/users are hit by this issue.