siderolabs / omni

SaaS-simple deployment of Kubernetes - on your own hardware.

[feature] Implement "TrustedRootsConfig" from version 1.8+ of Talos #678

Open Hacksawfred3232 opened 1 month ago

Hacksawfred3232 commented 1 month ago

Problem Description

This is most likely already a checklist item internally, but I'll go ahead and submit this anyway.

In version 1.8+ of Talos Linux, a new manifest called "TrustedRootsConfig" was added, which allows custom CA certificates to be trusted by a node. In an issue I opened in the Talos Linux repo, I assumed the implementation was broken, because I thought a plugin was overwriting the certificates. However, further investigation, using Talhelper instead of Omni to establish the cluster, revealed that Omni was actually at fault: it sends a completely new config to the target machines, overwriting everything that was already there. In hindsight I should have picked up on this, since I had checked the "STATE" and "META" partitions of one of the nodes in the cluster to see what was being written into the config, and should have noticed that other manifests, such as the Siderolink and log sink parameters, were there while the "TrustedRootsConfig" manifest was not.

Without the "TrustedRootsConfig" manifest on its nodes, a cluster managed by Omni will not work inside a network where an internal CA is required for internal services and using Let's Encrypt or similar internally is disallowed by either the head of IT or upper execs.

Solution

The most likely solution would be to inform the Omni server of the additional CA certificates used in the network, so that Omni can include those certificates in a "TrustedRootsConfig" manifest that gets sent to the nodes alongside the main config. This could be done either through a command-line parameter or, for feature parity between SaaS and self-hosted, through a config option on the Omni control panel.
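To illustrate what Omni would need to generate, here is a minimal sketch that renders a "TrustedRootsConfig" manifest from a PEM bundle and appends it to a base config as an extra YAML document. The field names (`apiVersion`, `kind`, `name`, `certificates`) are written as I understand them from the Talos 1.8 docs and should be checked against the reference; the helper names and the `internal-ca` name are made up for this example, and this is not how Omni builds configs internally.

```python
import textwrap


def render_trusted_roots(name: str, pem_bundle: str) -> str:
    """Render a TrustedRootsConfig document as YAML text.

    Assumption: the document layout below matches the Talos 1.8+
    TrustedRootsConfig reference; verify before relying on it.
    """
    pem = pem_bundle.strip()
    if "-----BEGIN CERTIFICATE-----" not in pem:
        raise ValueError("expected at least one PEM-encoded certificate")
    # Indent the PEM so it sits inside the YAML block scalar.
    body = textwrap.indent(pem, "    ")
    return (
        "apiVersion: v1alpha1\n"
        "kind: TrustedRootsConfig\n"
        f"name: {name}\n"
        "certificates: |-\n"
        f"{body}\n"
    )


def append_documents(machine_config: str, *documents: str) -> str:
    """Join a base config and extra documents into one multi-document YAML stream."""
    parts = [machine_config.strip()] + [d.strip() for d in documents]
    return "\n---\n".join(parts) + "\n"
```

For example, `append_documents(base_config, render_trusted_roots("internal-ca", pem))` yields a single stream in which the CA manifest travels with the main config, so a full config replacement would no longer drop it.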

This won't help with the initial bootstrapping of nodes to Omni (on self-hosted), but I've worked around that issue by using a separate HTTP server that supplies a boot-time config file containing the Siderolink parameters plus a "TrustedRootsConfig" manifest, which allows a node to connect to Omni for the first time. Omni could possibly fulfill that role as well, by letting the user choose between having the parameters supplied as boot parameters or having them served over plain HTTP using the "talos.config" boot parameter, though some ACLs would have to be used for the latter to prevent unauthorized access.
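A rough sketch of that workaround, with a simple source-address ACL, could look like the following. Everything specific here is hypothetical: the `10.0.0.0/24` provisioning subnet, the `bootstrap-config.yaml` file name, the `/config.yaml` path and port 8080 are placeholders for illustration, not values Omni or Talos prescribe.

```python
import ipaddress
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical values: adjust to the actual provisioning network and file.
ALLOWED_NETWORKS = [ipaddress.ip_network("10.0.0.0/24")]
CONFIG_PATH = "bootstrap-config.yaml"


def is_allowed(client_ip: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True if the client address falls inside an allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


class ConfigHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # ACL check first, so unknown hosts learn nothing about the config.
        if not is_allowed(self.client_address[0]):
            self.send_error(403, "Forbidden")
            return
        if self.path != "/config.yaml":
            self.send_error(404, "Not Found")
            return
        try:
            with open(CONFIG_PATH, "rb") as f:
                body = f.read()
        except OSError:
            self.send_error(500, "Config unavailable")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/yaml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "0.0.0.0", port: int = 8080) -> None:
    """Serve the boot-time config until interrupted."""
    HTTPServer((host, port), ConfigHandler).serve_forever()
```

After calling `serve()`, a node would be booted with something like `talos.config=http://<server>:8080/config.yaml`. A source-address check is only a weak ACL, since the config is still sent in plain text, so this is best limited to an isolated provisioning network.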

Alternative Solutions

No response

Notes

No response

smira commented 1 month ago

It looks like there are two parts here: