SovereignCloudStack / standards

SCS standards in a machine readable format
https://scs.community/
Creative Commons Attribution Share Alike 4.0 International

Create v2 of node distribution standard (issues/#494) #524

Closed · cah-hbaum closed this 2 weeks ago

cah-hbaum commented 3 months ago

Adds the new label topology.scs.openstack.org/host-id to the standard and extends the standard to require providers to set these labels on their managed K8s clusters.
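For illustration, a conformant managed cluster would then expose this label on its nodes; a minimal sketch (using the label key as given in this description, and kubectl's -L flag to add a label column to the node listing):

```bash
# Sketch: list nodes together with the host-id label discussed in this PR
# (label key as given in the PR description above).
kubectl get nodes -L topology.scs.openstack.org/host-id
```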

mbuechse commented 3 months ago

Let Team Container in its next meeting vote on whether this needs a v2; apart from that: good to go!

mbuechse commented 3 months ago

@cah-hbaum you are quick! Let's add a paragraph that explains the changes vs v1.

cah-hbaum commented 3 months ago

@mbuechse Sure thing. Gonna need to do something else first, but after that I'm on it.

martinmo commented 3 months ago

@cah-hbaum I forgot to mention that the title and description of this PR should be updated to reflect that we are creating a new version v2.

mbuechse commented 3 months ago

@cah-hbaum @martinmo @matofeder Please be careful before merging. This document will be merged in stable state, so let's be extra certain that it is good this time... I pushed the date to April 4, so there's no need to hurry.

cah-hbaum commented 3 months ago

This should be done now. After some more approvals fly in, I will merge this.

cah-hbaum commented 2 months ago

I changed the document state to Draft now like we discussed in the Standardization/Certification meeting.

I also fixed (probably) all problems that had remained, so it should be mergeable now.

cah-hbaum commented 2 months ago

@mbuechse Please check this again, since this is in Draft now.

cah-hbaum commented 1 month ago

Wait, why is this sentence still in here? Have you found out in the meantime (with the help of Team Container) what the situation is like with a shared pool of control-plane nodes, and how this would present itself to the test?

No, I haven't (because I had two other meetings conflicting with the last container meeting).

My problem with cutting this part would still be that smaller CSPs can't necessarily provide such a setup, especially for all of their clusters. And that probably isn't necessary either. IMO what would be needed here is a requirement defining when this standard has to be applied. Now, I wouldn't necessarily call such a setup "high availability", but something in that direction would be nice to have.

cah-hbaum commented 1 month ago

@mbuechse Please check this again and resolve the parts that are fine with you. I'm gonna coordinate the upstream work with @chess-knight now.

mbuechse commented 1 month ago

@cah-hbaum Looking good, thanks for all the work! Now, only the matter of the "optionality" of this standard needs some discussion IMO. The certificate scope YAML permits marking a standard optional, but then I don't know what good the standard is, because most people, when they look at the certificate, will never notice whether this particular standard was satisfied or not. And I still don't understand why this standard is so hard to satisfy after all, even for small providers. If it's indeed possible that multiple K8s clusters share the same hosts for their control plane, then each cluster can have three different hosts, and all clusters in total still only need three hosts, so where's the problem?
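To illustrate that argument (a hypothetical layout, not taken from the standard text): three clusters can place their control-plane nodes on the same three physical hosts, and each cluster on its own still spans three distinct host-ids:

```
# Hypothetical layout: 3 physical hosts, 3 clusters, 9 control-plane nodes
host-1:  cluster-a/cp-0   cluster-b/cp-0   cluster-c/cp-0
host-2:  cluster-a/cp-1   cluster-b/cp-1   cluster-c/cp-1
host-3:  cluster-a/cp-2   cluster-b/cp-2   cluster-c/cp-2
```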

cah-hbaum commented 1 month ago

@cah-hbaum Looking good, thanks for all the work! Now, only the matter of the "optionality" of this standard needs some discussion IMO. The certificate scope YAML permits marking a standard optional, but then I don't know what good the standard is, because most people, when they look at the certificate, will never notice whether this particular standard was satisfied or not. And I still don't understand why this standard is so hard to satisfy after all, even for small providers. If it's indeed possible that multiple K8s clusters share the same hosts for their control plane, then each cluster can have three different hosts, and all clusters in total still only need three hosts, so where's the problem?

I'm gonna update the standard to make it NOT optional and take it to a vote or request in the "SCS Container" Room.

cah-hbaum commented 1 month ago

Posted into the "SCS Container" Matrix room, just putting the whole thing in here as a paraphrase to make it persistent:

Hello again,

I got some standardization work for you again today and a request for some input on https://github.com/SovereignCloudStack/standards/pull/524. This PR updates the "Node Distribution" standard. Before the latest update, the standard was OPTIONAL for a CSP, since we also had small CSPs without much available infrastructure in mind. Sure, there are possible solutions for these scenarios (e.g. solutions with shared control planes that allegedly make the overall footprint of clusters smaller), but a change to these solutions isn't always possible.

Now, while we tried to figure these problems out, some more potential problem areas appeared: https://github.com/SovereignCloudStack/standards/issues/527. With these failsafe levels, some things come up that we probably need to talk about in the context of this standard, like (High) Availability, Failsafe, Failover, Redundancy and so on.

Personally, at this point in time, I would probably rework the standard in a broader fashion. I don't think that we can or should dictate this "Node Distribution" (which is kind of a "Redundancy"/"Availability" feature) for every K8s cluster that is offered through a KaaS offering. Instead, the standard should define the use case it wishes to accomplish/help with, which is probably some kind of "(High) Availability", and make this (at least a bit) dependent on the "Failsafe Taxonomy" that is now being developed (https://github.com/SovereignCloudStack/standards/issues/527). We could then define the requirements (in this case the "Node Distribution") for such an offering. The standard should probably be made broader in this case, since "Availability" in K8s requires more than just distribution of nodes. But the currently used document could become a part of this future document.

Now I would like to hear what your opinion on this is, or whether you would like this to go in another direction.

Thanks and best regards,
Hannes

cah-hbaum commented 1 month ago

Another thing we maybe also need to discuss came up while working on https://github.com/SovereignCloudStack/standards/issues/556. I installed a K8s cluster through yaook/k8s and ran into the problem that the labels required by this standard were not available. This is kind of obvious, since we don't use the preexisting solutions we expected when writing this standard. But I'm also a bit doubtful about the way we do the checking for this standard / test now. We can't really ensure that the necessary labels are available for every solution that could possibly be used for an SCS-compatible cluster.

In my case, is the cluster really "not distributed" just because the labels aren't there? The values were even available, just under different labels. TBF I don't have a good solution here, I just wanted to ask about our handling of this matter.

(The default labels in yaook/k8s are kubernetes.io/hostname=XXX and topology.cinder.csi.openstack.org/zone=YYY; a region label wasn't available.)

martinmo commented 1 month ago

IMHO it is ok to require conformant implementations to set some additional labels, even if some work needs to be done initially to support it (e.g., in yaook/k8s). And especially with topology.kubernetes.io/region and topology.kubernetes.io/zone, we have some well-known labels defined upstream by Kubernetes (see https://kubernetes.io/docs/reference/labels-annotations-taints) – basically, they are already standardized, but optional.

Standardized labels are not only useful for the conformance test. Another important use case is that the customers can use these labels to distribute their own applications over failure zones using Pod topology spread constraints.
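For example, a customer workload could spread its replicas across zones roughly like this (a minimal sketch using the well-known topology.kubernetes.io/zone label; the deployment name, replica count and image are made up):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app                # hypothetical workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone   # well-known zone label
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: demo-app
      containers:
        - name: app
          image: nginx:1.27     # placeholder image
```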

Also, in https://github.com/SovereignCloudStack/standards/pull/524#discussion_r1609751538, an important question about topology.scs.community/host-id was raised:

And what about bare-metal clusters? In that case, we do not need this specific label right?

Theoretically, in a bare-metal cluster, the kubernetes.io/hostname label would be enough. However, the conformance test would need to detect this situation and correctly decide which label (host-id or hostname) should be used. This is an additional source of error. I think even a bare metal K8s node should have a host-id label (every Linux machine should have it, accessible via hostid or in binary form at /etc/hostid).
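As a rough sketch of how such a label could be populated on a bare-metal node (the node name is made up, and the label key topology.scs.community/host-id is the one referenced in the linked review discussion):

```bash
# Read the machine's host id (the gethostid(3) value) ...
hostid
# ... or look at the raw binary form, where /etc/hostid exists
xxd /etc/hostid

# Hypothetical: attach it to the corresponding Kubernetes node object
kubectl label node my-baremetal-node \
  topology.scs.community/host-id="$(hostid)"
```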

mbuechse commented 1 month ago

I think even a bare metal K8s node should have a host-id label (every Linux machine should have it, accessible via hostid or in binary form at /etc/hostid).

Yes, that would be desirable from my POV. People shouldn't be expected to check whether the cluster is running on bare metal or not.

martinmo commented 1 month ago

(The default labels in yaook/k8s are kubernetes.io/hostname=XXX and topology.cinder.csi.openstack.org/zone=YYY; a region label wasn't available.)

I double checked this: yaook/k8s connects to the OpenStack layer via the OpenStack cloud controller manager (CCM) which sets the topology.kubernetes.io/region and topology.kubernetes.io/zone labels, amongst others. But it requires that the underlying OpenStack instance has the corresponding metadata set (in other words, it requires proper AZ and region configuration at the IaaS layer).
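A quick way to check whether this worked on a given cluster (assuming kubectl access; -L only adds label columns to the node listing):

```bash
# Show the region/zone labels the OpenStack CCM is expected to set per node
kubectl get nodes \
  -L topology.kubernetes.io/region \
  -L topology.kubernetes.io/zone
```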

cah-hbaum commented 1 month ago

Ok, makes sense to me, thank you for your answers. We're now just waiting for the discussion this Thursday in the "SCS Container Meeting" about keeping/removing/rewriting the part of this standard that makes it optional. After that is resolved, this will be merged along with https://github.com/SovereignCloudStack/standards/pull/558.

cah-hbaum commented 4 weeks ago

I took the question to the "SCS Container" Matrix room. Since nobody brought forward anything against it, I would say we can merge this PR.

cah-hbaum commented 3 weeks ago

@artificial-intelligence I added you here for review because we had the question whether this standard can be satisfied with a Gardener setup, and @mbuechse mentioned that you have some knowledge here. If that is not the case, you can "unrequest" yourself and maybe add someone else who has more knowledge about this.

cah-hbaum commented 3 weeks ago

Also, because we talked about this in the meeting: this standard is kind of "blocked" right now, since neither the reference implementation nor a few other K8s examples automatically provide the labels mentioned here. BUT the standard could be merged nonetheless, since it is only in Draft for now.

cah-hbaum commented 2 weeks ago

I am wondering if this standard is also valid for clusters with a managed control plane, e.g. Kamaji, where the CP runs inside pods in the management cluster and, when you do kubectl get nodes in the workload cluster, you only see worker nodes there.

That's also something we have already had some discussions about, and we decided that we want to discuss this in the follow-up issue that will make this standard Stable.

cah-hbaum commented 2 weeks ago

I'm gonna keep it like it is for now and will update the standard in the follow-up issue with regards to the things discussed here.