siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

Improve documentation of "persist" #5545

Closed haslersn closed 1 year ago

haslersn commented 2 years ago

Currently the config reference just says the following about persist:

Indicates whether to pull the machine config upon every boot.

After discussion in Slack, it seems that a more appropriate description would be that it indicates whether to persistently store the applied machine config, so that the machine config need not be pulled upon boot.

Also, the following question arises: When applying some config with persist: true and afterwards applying a config with persist: false, will the previous config stay persisted, or does applying with persist: false remove the persisted config (if any)? IMO it would make sense to remove the persisted config, because that makes the node less stateful and more declaratively managed.
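The two possible answers to this question can be sketched as a toy model. Everything in it (the `Node` class, `apply_config`, the `clear_on_non_persistent` switch) is hypothetical and only illustrates the question; it is not how Talos implements config persistence. The only detail taken from this thread is that persistence is the default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy model of a node; not Talos code."""
    # Config stored on disk, surviving reboots (None if nothing is stored).
    persisted: Optional[dict] = None
    # Config currently in effect.
    active: Optional[dict] = None


def apply_config(node: Node, config: dict, clear_on_non_persistent: bool) -> None:
    """Apply `config`; the flag selects which of the two semantics is modeled."""
    node.active = config
    if config.get("persist", True):
        # persist: true (the default): store the config for later boots.
        node.persisted = config
    elif clear_on_non_persistent:
        # Proposed behavior: persist: false removes any stored config.
        node.persisted = None
    # Otherwise the previously stored config is left in place.


first = {"persist": True, "name": "a"}
second = {"persist": False, "name": "b"}

# Semantics 1: the earlier persisted config stays on disk.
keeps = Node()
apply_config(keeps, first, clear_on_non_persistent=False)
apply_config(keeps, second, clear_on_non_persistent=False)

# Semantics 2: applying persist: false clears the stored config.
clears = Node()
apply_config(clears, first, clear_on_non_persistent=True)
apply_config(clears, second, clear_on_non_persistent=True)

print(keeps.persisted)   # the older config "a" is still stored
print(clears.persisted)  # nothing is stored
```

Under the first semantics the node keeps a stored config that no longer matches what was last applied, which is the statefulness concern raised above.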

smira commented 2 years ago

I think we should deprecate/remove this persist flag altogether, as it's confusing in the way it integrates with other features.

haslersn commented 2 years ago

Can you elaborate on what you mean? I'm not sure how it currently works (see my original question), so I also don't know/understand what's confusing about it, yet.

For virtualized environments it's useful to have the option to read the cloud-init data source on each boot. This way, one can use the data source as a single source of truth instead of having stateful nodes. A similar behaviour could be achieved by using a fresh image on each boot, but then the META partition would also be gone, so it would essentially be a new node and the old one would need to be manually removed from the cluster. Furthermore, always using a fresh image doesn't work for master nodes, since they need to keep at least the etcd state. Therefore, it would be nice to have the option to read the cloud-init data source on each subsequent boot.
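A rough sketch of the boot behavior requested here, under stated assumptions: `boot` is a hypothetical function and `fetch_from_data_source` stands in for reading the cloud-init data source. Neither is part of Talos.

```python
from typing import Callable, Optional


def boot(persisted: Optional[dict],
         fetch_from_data_source: Callable[[], dict]) -> dict:
    """Return the config a node would use on this boot (toy model)."""
    if persisted is not None and persisted.get("persist", True):
        # Stateful path: reuse the stored config; the data source is not read.
        return persisted
    # Stateless path: the data source is read on every boot.
    return fetch_from_data_source()


# Hypothetical data source whose content changes between boots.
source = {"persist": False, "revision": 1}


def fetch() -> dict:
    return dict(source)


first_boot = boot(None, fetch)
source["revision"] = 2
second_boot = boot(None, fetch)

# A node with a stored config ignores the updated data source.
stateful_boot = boot({"persist": True, "revision": 1}, fetch)

print(first_boot["revision"], second_boot["revision"], stateful_boot["revision"])
```

In this model the node without a stored config picks up the changed data source on its next boot, while other node state (such as the META partition or etcd data mentioned above) is untouched because only the config lookup differs.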

smira commented 2 years ago

This is something we should discuss as a team, but historically Talos never persisted machine configuration, so it worked exactly like you described. At some point config persistence was introduced, and it is the default behavior.

There are many cons to loading the config on every boot, though there are also some arguments for it.

Some features of Talos, like upgrading Kubernetes, won't work with persist: false at all.

As to the single source of truth for the machine configuration, we might have an answer for that soon which is not specific to any environment.

haslersn commented 2 years ago

upgrading Kubernetes won't work with persist: false

Why is that? With persist: false, patching is not supported, so talosctl upgrade-k8s doesn't work. But I assume one could still take the same approach as when upgrading Talos Linux itself. The corresponding docs state:

Note that unless the Kubernetes version has been specified in the machine config, an upgrade of the Talos Linux OS will also apply an upgrade of the Kubernetes version. Each release of Talos Linux includes the latest stable Kubernetes version by default.

smira commented 2 years ago

I just wanted to state my position on persist: false; it's not a decision we made as a team, but we'll consider this issue and make a decision.

smira commented 1 year ago

The flag is removed in Talos 1.6, as part of the transition to multi-doc.