LittleFox94 closed this issue 7 months ago
Hi @LittleFox94, sorry for the late reply on this. This is just my personal view and does not necessarily represent Kubermatic's, but I think this approach might not be desired for machine-controller. In general, the idea behind Cluster API has always been "immutable infrastructure", so rotating machines frequently (e.g. to replace keys or to use a newer OS image) and normalising such frequent rotation is one of the core ideas behind machine-controller. The gist is, I believe the current approach to be "working as intended".
Since `usersshkeysagent` in KKP is pretty heavily tied to the platform, my personal recommendation would be to write a similar controller/agent running as a `DaemonSet` that doesn't need KKP. Implementing such an agent is pretty straightforward. The problem with a `Job`-based approach is that you might miss changes to the SSH `authorized_keys` file from outside; you'd want to watch that file continuously to ensure compliance with the list of approved SSH keys in your own controller/agent implementation. But such an implementation would best live separately from machine-controller so it can be applied to any type of Kubernetes cluster.
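To make the agent idea above concrete, here is a minimal, self-contained sketch of the reconcile step such a DaemonSet agent could perform. Everything here is a placeholder assumption, not machine-controller or KKP code: the key list, the file path, and the function names are invented for illustration, and a real agent would run this in a loop (or behind an inotify-style watch) and source the approved keys from the cluster rather than a hard-coded slice.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// desiredKeys stands in for the approved key list; a real agent would
// read this from a ConfigMap/Secret or a CRD (hypothetical source).
var desiredKeys = []string{
	"ssh-ed25519 AAAA...example-key-1",
	"ssh-ed25519 AAAA...example-key-2",
}

// reconcile rewrites path so it contains exactly the desired keys
// (sorted, one per line). It returns true if the file was changed,
// which lets the caller detect and log out-of-band edits.
func reconcile(path string, desired []string) (bool, error) {
	current, err := os.ReadFile(path)
	if err != nil && !os.IsNotExist(err) {
		return false, err
	}
	// Trailing "" entry yields a final newline after the last key.
	want := strings.Join(append(sortedCopy(desired), ""), "\n")
	if string(current) == want {
		return false, nil // already compliant, nothing to do
	}
	// 0600: sshd refuses group/world-writable authorized_keys files.
	return true, os.WriteFile(path, []byte(want), 0o600)
}

// sortedCopy sorts without mutating the caller's slice.
func sortedCopy(in []string) []string {
	out := append([]string(nil), in...)
	sort.Strings(out)
	return out
}

func main() {
	changed, err := reconcile("/tmp/authorized_keys_demo", desiredKeys)
	if err != nil {
		panic(err)
	}
	fmt.Println("changed:", changed)
}
```

In a DaemonSet, the host's real `authorized_keys` path would be mounted via a `hostPath` volume, and `reconcile` would run on a timer so drift introduced outside Kubernetes is repaired continuously.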
Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with `/remove-lifecycle stale`.
If this issue is safe to close now please do so with `/close`.
/lifecycle stale
Hi @LittleFox94, sorry for the late reply on this.
Hi, thanks for your detailed reply, and no worries about being late - I'm apparently just as late, time is flying by...
This is just my personal view and does not necessarily represent Kubermatic's, but I think this approach might not be desired for machine-controller. In general, the idea behind Cluster API has always been "immutable infrastructure", so rotating machines frequently (e.g. to replace keys or to use a newer OS image) and normalising such frequent rotation is one of the core ideas behind machine-controller. The gist is, I believe the current approach to be "working as intended".
I am familiar with that general idea and approach and, for the most part, I agree with it and think it's a good thing. However, I don't think it should be an argument against allowing any changes at all to already-deployed infrastructure. I also like choice.
For some setups, regularly rotating the machines is cumbersome, non-ideal performance-wise, or perhaps only feasible for recovering from a really broken state - maybe because manual work is required, or because customers yell at you when their apiserver terminates a watch due to being moved to another node (heavy sigh). `machine-controller` already helps a bit in such cases by allowing automated updates to be disabled for Flatcar (immutable infrastructure would imho also mean "no Flatcar updates", instead rotating machines with the current template, so there is already a precedent in `machine-controller` for updating infrastructure instead of rotating it).
Changing SSH keys is, imho, a lot less of a change to already-deployed infrastructure than updating the OS of a machine, and for some setups it might be a useful feature. It would be for us: we use `openebs-lvm` on machine-controller-managed machines, and while we only store data on nodes that is replicated across multiple nodes, have PodDisruptionBudgets to ensure nodes can only be drained (and thus rotated away/deleted) once the data has been moved, and have a controller to facilitate rotating nodes automatically, that process is slow and unnecessarily "scary" just for rotating some SSH keys. I agree, however, that updating SSH keys on deployed machines maybe should not be a default behavior and definitely not an always-on behavior.
Does that change your view a bit? Do you have a better place where other opinions on this could be collected, if anyone else is interested in this at all?
Since `usersshkeysagent` in KKP is pretty heavily tied to the platform, my personal recommendation would be to write a similar controller/agent running as a `DaemonSet` that doesn't need KKP. Implementing such an agent is pretty straightforward. [..] But such an implementation would best live separate from machine-controller so it can be applied on any type of Kubernetes cluster.
Sure, having this as an extra `DaemonSet` independent from KKP would work. If we were to split the existing `usersshkeysagent` out of KKP into something independent and change KKP to use that instead, would that be a change you'd (you personally, and you, Kubermatic) like?
The problem with a `Job`-based approach is that you might miss changes to the SSH `authorized_keys` file from outside, and you might want to watch that continuously to ensure compliance with the list of approved SSH keys in your own controller/agent implementation.
Fair point, I agree.
(I'll proof-read my reply in a bit, my computer decided to do funny things with my mouse input (not sure if Chromium, Sway, mouse, its USB dongle, ...) and I first want to ensure I'm not losing my text here ^^' huh, funny: https://github.com/swaywm/sway/issues/4261#issuecomment-1842485027)
Does that change your view a bit? Do you have a better place where other opinions on this could be collected, if anyone else is interested in this at all?
I totally agree that "runtime configuration" (like SSH keys) could and maybe should be considered exempt from immutable infrastructure. But I think that also strengthens the point of SSH key management happening in a separate component, since machine-controller also doesn't manage Flatcar updates (instead, you need to deploy Flatcar's update operator).
Sure, having this as an extra DaemonSet independent from KKP would work. If we were to split the existing usersshkeysagent out of KKP into something independent and changed KKP to use that instead, would that be a change you'd (you personally and you Kubermatic) like?
We for sure wouldn't mind a "fork" of the functionality based on the KKP code as long as its license is honoured, but at the moment our focus is elsewhere since the functionality is "good enough" for KKP, plus migration logic would be necessary etc. We can maybe revisit your solution at a later point (assuming it's going to be publicly available) and see if we can bring it under the Kubermatic umbrella if you are interested, but at the moment that would be unlikely to happen soon.
huh, funny: https://github.com/swaywm/sway/issues/4261#issuecomment-1842485027
Great choice of WM! Loved using sway back in the day 😄
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with `/close`.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with `/reopen`.
Mark the issue as fresh with `/remove-lifecycle rotten`.
/close
@kubermatic-bot: Closing this issue.
Hi,

it would be useful if `machine-controller` could update the SSH keys deployed to the machines instead of rotating them with new keys.

Right now, the SSH keys are given via `cloud-init` (yea, deprecated, but still in there)/`ignition`, with the source of that being changed from templates in `machine-controller` to templates in `operating-system-manager`. Due to the nature of when those are applied, updating the SSH keys via those mechanisms is impractical to impossible.

KKP has a component called `usersshkeysagent`, syncing SSH keys onto machines of user clusters. This can already update the keys; maybe it would be practical to move that component into machine-controller and have KKP sync the SSH keys into the `MachineDeployment` resources instead of directly to the machines?

I think instead of using a `DaemonSet` to have an agent on each node (what `usersshkeysagent` is doing now), `machine-controller` could instead spawn a `Job` for each node when SSH keys have to be updated.
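For illustration, a per-node Job of the kind proposed above could look roughly like the following sketch. This is purely hypothetical: the name, image, and mount paths are placeholder assumptions, not anything machine-controller actually generates.

```yaml
# Hypothetical per-node SSH key update Job; names and image are
# placeholders for illustration only.
apiVersion: batch/v1
kind: Job
metadata:
  name: update-ssh-keys-node-a   # one Job per target node
spec:
  template:
    spec:
      nodeName: node-a           # pin the pod to the node being updated
      restartPolicy: OnFailure
      containers:
        - name: update-keys
          image: example.com/ssh-key-updater:latest  # placeholder image
          volumeMounts:
            - name: ssh
              mountPath: /host-ssh
      volumes:
        - name: ssh
          hostPath:
            path: /root/.ssh     # host directory containing authorized_keys
            type: Directory
```

The trade-off raised earlier in the thread applies here: such a Job only runs at key-change time, so out-of-band edits to `authorized_keys` between runs would go unnoticed, whereas a DaemonSet agent can reconcile continuously.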