Closed · rifelpet closed this 8 months ago
Yes please! We have the exact same need.
Actually, I wonder if the zone label could change to using zone IDs. I think the transition should be fairly painless, as the new CCM leader would just patch all existing nodes.
If we reuse the zone topology label, then there may be strange behavior of pod topology spread constraints during the upgrade process, when nodes have a mismatch of zone IDs and zone names.
It's common for zonal deployments to use nodeAffinity on specific zone label values, so updating the label value in place would break their scheduling unless they were updated prior to the CCM upgrade:
```yaml
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - us-east-1a
        - use1-az1
```
Spread constraints are only evaluated during scheduling, so those shouldn't be affected.
But nodeAffinity on a given zone label value can cause problems, I suppose.
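For reference, a typical pod topology spread constraint keyed on the zone label looks something like this (illustrative fragment; the `app` label value is made up):

```yaml
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone   # groups nodes by zone label value
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: my-app   # hypothetical workload label
```

Because the constraint only compares label values between nodes at scheduling time, a cluster-wide rename from zone names to zone IDs would still spread pods correctly once all nodes agree, unlike affinity rules that hardcode specific values.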
An alternative AWS-specific label could be topology.k8s.aws/zone-id
sig-cloud-provider agreed to use an AWS-specific label and/or the ability to configure CCM to use the zone ID in the topology.kubernetes.io/zone label
My only caution about using the zone ID in topology.kubernetes.io/zone is that it could negatively impact portability of deployments across different AWS accounts.
E.g. EBS CSI drivers use topology keys, and changing the convention on a live cluster may cause problems with PVs. So I think the safer route is to add new labels for now. I'd really like zone to use zone IDs, but it seems like too many things may break over it.
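For context, EBS-backed PersistentVolumes are typically pinned to a zone through node affinity on the CSI driver's own topology key, along these lines (illustrative sketch; the exact key and shape may vary by driver version):

```yaml
# PersistentVolume fragment: the volume is only usable on nodes in us-east-1a
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.ebs.csi.aws.com/zone   # EBS CSI topology key
          operator: In
          values:
          - us-east-1a                          # zone name, not zone ID
```

Existing PVs carry zone *names* in these terms, which is why switching the meaning of zone labels on a live cluster could strand volumes.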
Reposting from https://github.com/aws/containers-roadmap/issues/1638#issuecomment-1049045011
topology.kubernetes.io/zone-id is not registered as a Kubernetes label key. To report an AWS-specific detail, I recommend minting one or two new label keys. For example:
```yaml
kubernetes.amazonaws.example/availability-zone: us-east-1a
kubernetes.amazonaws.example/availability-zone-id: use1-az1
```
topology.k8s.aws/zone-id, suggested in https://github.com/kubernetes/cloud-provider-aws/issues/300#issuecomment-1030199749, is also fine as a label key; I hadn't seen that before.
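To make the proposal concrete, a node's labels might look like this if an AWS-specific key were added alongside the standard topology labels (the zone-id key is the proposal under discussion, not a label the CCM applies today):

```yaml
apiVersion: v1
kind: Node
metadata:
  labels:
    topology.kubernetes.io/region: us-east-1   # standard, set by CCM today
    topology.kubernetes.io/zone: us-east-1a    # standard, set by CCM today
    topology.k8s.aws/zone-id: use1-az1         # proposed AWS-specific label
```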
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After a period of inactivity, lifecycle/stale is applied
- After further inactivity once lifecycle/stale was applied, lifecycle/rotten is applied
- After further inactivity once lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/close
@k8s-triage-robot: Closing this issue.
/reopen
@dims: Reopened this issue.
This issue is currently awaiting triage.
If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
/assign
/remove-lifecycle rotten
Still seems relevant and referenced.
> An alternative AWS-specific label could be topology.k8s.aws/zone-id

I'm in favor of this approach and possibly later adding the ability to configure CCM to populate topology.kubernetes.io/zone differently. I'll start looking into the first idea.
/assign
Is there any parallel on other cloud providers? Do we know if Azure/GCP do the same kind of zone remapping that AWS does?
I asked in a sig-cloud-provider meeting around when I opened this issue and was told there is no equivalent concept in GCP and Azure.
OCI does it:

> To help balance capacity in data centers, Oracle Cloud Infrastructure randomizes availability domains by tenancy. For example, the availability domain labeled PHX-AD-1 for tenancyA might be a different data center than the one labeled PHX-AD-1 for tenancyB.

(https://docs.oracle.com/en-us/iaas/Content/General/Concepts/regions.htm)
So if we want to implement this feature, it might make sense to have it consistent across the clouds that support it. To do that, we need to know that they expose a way for the CCM to set it. Do we know if Oracle has an API similar to AWS's?
What happens if the zone ID isn't needed on a platform? Let's take Azure for example: if they don't do this remapping, then zone-id isn't needed. We could set it to the zone and duplicate the data, or omit it.
If omitted, then manifest examples might not work on platforms that omit it, right?
@JoelSpeed You are talking specifically about topology.kubernetes.io/zone-id? https://github.com/kubernetes/cloud-provider-aws/issues/300#issuecomment-1568536231 proposed a way forward without setting this label.
@sayap Yes, I'm trying to work out whether there's merit to having this as a standard Kubernetes well-known label that would work across clouds. If there is precedent on multiple clouds, having a consistent label name would be preferable to each cloud minting its own, right?
Could this maybe be accomplished via opt-in, using a flag on the binary to change topology.kubernetes.io/zone to the ID instead of the name? That'd retain existing compatibility while allowing people to use the feature.
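As a sketch of what that opt-in could look like in a CCM deployment manifest (the flag name here is invented purely for illustration; no such flag exists in cloud-provider-aws today):

```yaml
# Fragment of a hypothetical CCM DaemonSet/Deployment spec
containers:
- name: aws-cloud-controller-manager
  args:
  - --cloud-provider=aws
  - --use-zone-id-for-zone-label=true   # hypothetical opt-in flag, not a real flag
```

Defaulting the flag to off would preserve current behavior for existing clusters.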
> I'm in favor of this approach and possibly later adding the ability to configure CCM to populate topology.kubernetes.io/zone differently. I'll start looking into the first idea.
@josh-ferrell Is this something you're working on? Sounds reasonable to me.
@mmerkes: GitHub didn't allow me to assign the following users: merkes.
Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide
/assign mmerkes
@rifelpet I think the consensus view is to use topology.k8s.aws/zone-id as the label key for the zone ID. If that looks right to you, would you be willing to update the issue description accordingly?
I've updated the issue description to match
What would you like to be added:
Currently AWS CCM populates the topology.kubernetes.io/region and topology.kubernetes.io/zone node labels with the node's region and AZ name respectively. It would be useful to include a topology label for AZ IDs: topology.kubernetes.io/zone-id or something along those lines. Only the original two labels are standard in the k/k API, so we'll need to decide on a new label to use.

Why is this needed:
For inter-cluster communication that spans AWS accounts, ensuring zonal isolation becomes challenging when the accounts do not have the same AZ mappings. By including AZ ID information in topology labels, we can more easily ensure that workloads are scheduled to consistent AZ IDs and communicate with other workloads in the same AZ ID, regardless of their AWS account or the account's AZ mapping.
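To illustrate the mismatch (fabricated mappings; the actual name-to-ID assignment is randomized per account by AWS), two nodes in different accounts can share a zone name while sitting in different physical zones:

```yaml
# Node label fragment as seen in account A
metadata:
  labels:
    topology.kubernetes.io/zone: us-east-1a   # zone name, account-relative
    topology.k8s.aws/zone-id: use1-az1        # proposed label, account-independent
---
# Node label fragment as seen in account B: same name, different physical zone
metadata:
  labels:
    topology.kubernetes.io/zone: us-east-1a
    topology.k8s.aws/zone-id: use1-az2
```

Selecting on the zone-id label would co-locate workloads in the same physical zone across both accounts, which the zone name alone cannot guarantee.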
/kind feature