Closed haslersn closed 1 year ago
I think we should deprecate/remove this persist flag altogether, as it's confusing in the way it integrates with other features.
Can you elaborate on what you mean? I'm not sure how it currently works (see my original question), so I also don't know/understand what's confusing about it, yet.
For virtualized environments it's useful to have the option to read the cloud-init data source on each boot. This way, one can use the data source as a single source of truth instead of having stateful nodes. A similar behaviour could be achieved by using a fresh image on each boot, but then the META partition would also be gone, so it would essentially be a new node and the old one would need to be manually removed from the cluster. Furthermore, always using a fresh image doesn't work for master nodes, since they need to keep at least the etcd state. Therefore, it would be nice to have the option to read the cloud-init data source on each subsequent boot.
This is something we should discuss as a team, but historically Talos never persisted machine configuration, so it worked exactly like you described. At some point config persistence was introduced, and it is the default behavior.
There are many cons to loading the config on every boot, though there are also some arguments for it.
Some features of Talos like upgrading Kubernetes won't work with persist: false at all.
As to the single source of truth for the machine configuration, we might have an answer for that soon which is not specific to any environment.
upgrading Kubernetes won't work with persist: false
Why is that? With persist: false, patching is not supported, so talosctl upgrade-k8s doesn't work. But I assume one could still take the same approach as when upgrading Talos Linux itself. The corresponding docs state:
Note that unless the Kubernetes version has been specified in the machine config, an upgrade of the Talos Linux OS will also apply an upgrade of the Kubernetes version. Each release of Talos Linux includes the latest stable Kubernetes version by default.
I just wanted to state my position on persist: false, it's not a decision we made as a team, but we'll consider this issue and make a decision.
The flag is removed in Talos 1.6, as part of the transition to multi-doc.
Currently the config reference just says that
persist
After discussion in Slack, it seems that a more appropriate description would be that it indicates whether to persistently store the applied machine config, so that the machine config need not be pulled upon boot.
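To make the proposed description concrete, here is a minimal Python sketch of the boot-time behaviour it implies. It is only a model, not Talos code: Node, boot, and fetch_from_source are hypothetical names, and it assumes that a stored config, when present, takes precedence over the data source.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    """Hypothetical model of a node's on-disk state (not a Talos type)."""
    persisted_config: Optional[dict] = None


def boot(node: Node, fetch_from_source: Callable[[], dict]) -> dict:
    """Return the machine config the node would use on this boot."""
    if node.persisted_config is not None:
        # A stored config exists, so the data source is not consulted.
        return node.persisted_config
    config = fetch_from_source()
    # Assumption: persist defaults to True, since persistence is
    # described in this thread as the default behaviour.
    if config.get("persist", True):
        node.persisted_config = config
    return config
```

Under this model, a node whose config says persist: false calls fetch_from_source on every boot, while a node with persist: true calls it only on the first boot.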
Also, the following question arises: When applying some config with persist: true and afterwards applying a config with persist: false, will the previous config stay persisted, or does applying with persist: false remove the persisted config (if any)? IMO it would make sense to remove the persisted config, because that makes the node less stateful and more declaratively managed.
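The two possible answers can be contrasted with a small Python sketch. This is a hypothetical model, not Talos's implementation: apply_config and the remove_on_non_persist switch are invented here purely to illustrate the question.

```python
from typing import Optional


def apply_config(
    persisted: Optional[dict],
    new_config: dict,
    remove_on_non_persist: bool,
) -> Optional[dict]:
    """Return what is stored on disk after applying new_config.

    persisted is the config currently stored (None if nothing is stored).
    """
    if new_config.get("persist", False):
        # persist: true always stores the newly applied config.
        return new_config
    if remove_on_non_persist:
        # Option suggested in this issue: drop any previously stored config.
        return None
    # Alternative: leave whatever was stored before untouched.
    return persisted


old = {"persist": True, "name": "old"}
new = {"persist": False, "name": "new"}
print(apply_config(old, new, remove_on_non_persist=True))   # None
print(apply_config(old, new, remove_on_non_persist=False))  # {'persist': True, 'name': 'old'}
```

With the first option the node falls back to its data source on the next boot; with the second it would keep booting from the old stored config, even though the most recently applied config asked not to be persisted.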