siderolabs / cluster-api-bootstrap-provider-talos

A cluster-api bootstrap provider for deploying Talos clusters.
https://www.talos-systems.com
Mozilla Public License 2.0

`aescbcEncryptionSecret` config is lost for new 1.3 machines #170

Closed (bzub closed this 1 year ago)

bzub commented 1 year ago

The machineconfig in the "bootstrap-data" secret for talosconfigs created with spec.talosVersion set to 1.3 doesn't appear to include the cluster.aescbcEncryptionSecret field. Before I set the version to 1.3, the generated machineconfig included cluster.aescbcEncryptionSecret.

The result I've seen is that the final machineconfig has neither aescbcEncryptionSecret nor secretboxEncryptionSecret, so etcd at-rest encryption is not enabled. This only happens for clusters that were created some time ago. Clusters newly created with 1.3 get a secretboxEncryptionSecret, so maybe this is an issue with secrets/resources created by a previous version of the cluster-api components.

This appears to break attempts to upgrade clusters from Talos version <1.3 to 1.3 if they previously used etcd at-rest encryption via aescbcEncryptionSecret, at least when using the machineconfig offered by Sidero.

bzub commented 1 year ago

I've confirmed the issue happens when the CLUSTER_NAME-talos secret contains secrets.aescbcencryptionsecret. With newly created clusters I see that secret contains secrets.secretboxencryptionsecret and the issue above does not occur.
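To make the check above repeatable, here is a minimal sketch that reports which etcd encryption secrets a decoded secret bundle carries. It assumes the bundle has already been read from the CLUSTER_NAME-talos secret and parsed into a dict with a top-level `secrets` mapping, as the key names in this comment suggest; the function name `etcd_encryption_keys` is hypothetical and not part of any Talos or CAPI tooling.

```python
def etcd_encryption_keys(bundle: dict) -> list[str]:
    """Return the etcd encryption secret keys present (and non-empty) in a bundle.

    Assumption: `bundle` is the already-parsed secret bundle, shaped like
    {"secrets": {"aescbcencryptionsecret": "..."}}; key names are taken
    from the issue text, not from a verified schema.
    """
    secrets = bundle.get("secrets") or {}
    known = ("aescbcencryptionsecret", "secretboxencryptionsecret")
    return [key for key in known if secrets.get(key)]


# Illustrative bundles with placeholder values, not real secrets.
old_cluster = {"secrets": {"aescbcencryptionsecret": "placeholder"}}
new_cluster = {"secrets": {"secretboxencryptionsecret": "placeholder"}}

print(etcd_encryption_keys(old_cluster))
print(etcd_encryption_keys(new_cluster))
```

A bundle that only reports `aescbcencryptionsecret` would match the older clusters described here, where setting talosVersion to 1.3 produced a machineconfig without either field.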

smira commented 1 year ago

You should never update the talosVersion: field when a cluster is being upgraded. If you update that field, new machine config generation features are enabled, but the secrets were created for a previous version of Talos.

If you really want to upgrade, you'd have to update the secrets to include a secretbox encryption secret, but even that is not recommended, as you'd need to keep the old AES-CBC secret around.

https://www.talos.dev/v1.3/introduction/what-is-new/#etcd-secrets-encryption-with-secretbox-algorithm

bzub commented 1 year ago

I'd like to continue using the siderolabs cluster-api components to generate machineconfigs over the course of a cluster's life. A machineconfig with both aescbcEncryptionSecret and secretboxEncryptionSecret sets things up nicely in encryptionconfig.yaml to enable users to migrate to the new algorithm. However, generating machineconfigs from a secret bundle that includes both of those secrets seems broken to me, as it results in no encryption settings at all, which then leads to the behavior in the first post of this issue.

I will see about opening an issue in the talos repository about this. Thanks.

smira commented 1 year ago

You can keep using CAPI for a cluster's lifetime, but please don't change talosVersion:. If you want to start using new features available in new Talos releases, please follow the upgrade notes and build appropriate config patches. Updating talosVersion: for already created clusters leads to unpredictable results.
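As a rough illustration of what "build appropriate config patches" could look like for the secretbox case discussed in this thread, the sketch below generates a key and wraps it in a JSON-patch style operation targeting cluster.secretboxEncryptionSecret. The helper names are hypothetical, and the key format (32 random bytes, base64-encoded, matching the NaCl secretbox key size) is an assumption that should be checked against the Talos upgrade notes before use. The existing AES-CBC secret would still need to stay in the config, as noted earlier in the thread.

```python
import base64
import os


def new_secretbox_key() -> str:
    """Return 32 random bytes, base64-encoded (assumed key format)."""
    return base64.b64encode(os.urandom(32)).decode("ascii")


def secretbox_patch(key_b64: str) -> list[dict]:
    """Build a JSON-patch list that adds cluster.secretboxEncryptionSecret.

    Hypothetical helper: it only constructs the patch data; applying it
    (for example through a config patch on the bootstrap resource) is
    left to the reader.
    """
    if len(base64.b64decode(key_b64, validate=True)) != 32:
        raise ValueError("expected a base64-encoded 32-byte key")
    return [
        {
            "op": "add",
            "path": "/cluster/secretboxEncryptionSecret",
            "value": key_b64,
        }
    ]


patch = secretbox_patch(new_secretbox_key())
print(patch[0]["op"], patch[0]["path"])
```

The length check is there so that a truncated or mistyped key fails early instead of ending up in a generated machineconfig.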