kubernetes / cloud-provider-aws

Cloud provider for AWS
https://cloud-provider-aws.sigs.k8s.io/
Apache License 2.0
395 stars 304 forks source link

Support Availability Zone ID topology labels #300

Closed rifelpet closed 8 months ago

rifelpet commented 2 years ago

What would you like to be added: Currently AWS CCM populates the topology.kubernetes.io/region and topology.kubernetes.io/zone node labels with the node's region and AZ name respectively:

    topology.kubernetes.io/region: us-east-1
    topology.kubernetes.io/zone: us-east-1a

It would be useful to include a topology label for AZ IDs. topology.kubernetes.io/zone-id or something along those lines. Only the original two labels are standard in the k/k API so we'll need to decide on a new label to use.

    topology.k8s.aws/zone-id: use1-az1

Why is this needed:

For inter-cluster communication that spans AWS accounts, ensuring zonal isolation becomes challenging when the accounts do not have the same AZ mappings. By including AZ ID information in topologies we can more easily ensure that workloads are scheduled to consistent AZ IDs and communicate with other workloads in the same AZ ID regardless of their AWS account or the account's AZ mapping.

/kind feature

olemarkus commented 2 years ago

Yes please! We have the exact same need.

olemarkus commented 2 years ago

Actually, I wonder if not the zone label could change to using zone IDs. I think the transition should be fairly painless as the new CCM leader would just patch all existing nodes.

rifelpet commented 2 years ago

If we reuse the zone topology label then there may be strange behavior of pod topology spread constraints during the upgrade process when nodes have a mismatch of zone IDs and zone names.

Its common for zonal deployments to use nodeAffinity on specific zone label values, so updating the label value in place would break their scheduling unless they were updated prior to the CCM upgrade:

    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - us-east-1a
            - use1-az1
olemarkus commented 2 years ago

Spread constraints is only evaluated during scheduling, so this shouldn't be affected.

But nodeAffinity on a given zone label can cause problems, I suppose.

rifelpet commented 2 years ago

An alternative AWS-specific label could be topology.k8s.aws/zone-id

rifelpet commented 2 years ago

sig-cloud-provider agreed to use an AWS-specific label and/or the ability to configure CCM to use the zone ID in the topology.kubernetes.io/zone label

andrewsykim commented 2 years ago

configure CCM to use the zone ID in the topology.kubernetes.io/zone label

My only caution to using zone-id in topology.kubernetes.io/zone is that it could negatively impact portability of deployments across different AWS accounts.

olemarkus commented 2 years ago

E.g EBS CSI drivers use topology keys and changing the convention on a live cluster may cause problems with PVs. So I think the safer route is to add new labels for now. I'd really like zone to use zone-ids, but it seems like too many things may break over it.

sftim commented 2 years ago

Reposting from https://github.com/aws/containers-roadmap/issues/1638#issuecomment-1049045011

topology.kubernetes.io/zone-id is not registered as a Kubernetes label key. To report an AWS-specific detail, I recommend minting one or two new label keys. For example:

sftim commented 2 years ago

topology.k8s.aws/zone-id - suggested in https://github.com/kubernetes/cloud-provider-aws/issues/300#issuecomment-1030199749 - is also fine as a label key; I hadn't seen that before.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot commented 2 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes/cloud-provider-aws/issues/300#issuecomment-1193175823): >The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. > >This bot triages issues and PRs according to the following rules: >- After 90d of inactivity, `lifecycle/stale` is applied >- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied >- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed > >You can: >- Reopen this issue or PR with `/reopen` >- Mark this issue or PR as fresh with `/remove-lifecycle rotten` >- Offer to help out with [Issue Triage][1] > >Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community). > >/close > >[1]: https://www.kubernetes.dev/docs/guide/issue-triage/ Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
dims commented 1 year ago

/reopen

k8s-ci-robot commented 1 year ago

@dims: Reopened this issue.

In response to [this](https://github.com/kubernetes/cloud-provider-aws/issues/300#issuecomment-1543038758): >/reopen Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
josh-ferrell commented 1 year ago

/assign

sftim commented 1 year ago

/remove-lifecycle rotten

still seems relevant and referenced

josh-ferrell commented 1 year ago

An alternative AWS-specific label could be topology.k8s.aws/zone-id

I'm in favor of this approach and possibly later adding the ability to configure CCM to populate topology.kubernetes.io/zone differently. I'll start looking into the first idea.

josh-ferrell commented 1 year ago

/assign

JoelSpeed commented 1 year ago

Is there any parallel on other cloud providers? Do we know if Azure/GCP do the similar zone remapping that AWS does?

rifelpet commented 1 year ago

I asked in a sig-cloud-provider meeting around when I opened this issue and was told there is no equivalent concept in GCP and Azure.

sayap commented 1 year ago

OCI does it:

To help balance capacity in data centers, Oracle Cloud Infrastructure randomizes availability domains by tenancy . For example, the availability domain labeled PHX-AD-1 for tenancyA might be a different data center than the one labeled PHX-AD-1 for tenancyB.

(https://docs.oracle.com/en-us/iaas/Content/General/Concepts/regions.htm)

JoelSpeed commented 1 year ago

So if we want to implement this feature, it might make sense to have it consistent across clouds that support it, to do that, we need to know that they expose a way for the CCM to set it. Do we know if oracle has an API similar to AWS?

What happens if the zone-id isn't needed for a platform, lets take Azure for example, if they don't do this, then zone-id isn't needed, we could set it to zone and duplicate the data, or omit it?

If omitted, then manifest examples might not work on platforms that omit it right?

sayap commented 1 year ago

@JoelSpeed You are talking specifically about topology.kubernetes.io/zone-id? https://github.com/kubernetes/cloud-provider-aws/issues/300#issuecomment-1568536231 proposed a way forward without setting this label.

JoelSpeed commented 1 year ago

@sayap Yes, I'm trying to work out if there's merit to having this as a standard kube well known label that would work across clouds, seems like if there is precedent for multiple clouds, having a consistent label name would be preferential to specific cloud having their own right?

isugimpy commented 1 year ago

Could this maybe be accomplished via opt-in using a flag on the binary to change topology.kubernetes.io/zone to the ID instead of the name? That'd retain existing compatibility while allowing people to use the feature.

mmerkes commented 1 year ago

I'm in favor of this approach and possibly later adding the ability to configure CCM to populate topology.kubernetes.io/zone differently. I'll start looking into the first idea.

@josh-ferrell This something your working on? Sounds reasonable to me.

k8s-ci-robot commented 10 months ago

@mmerkes: GitHub didn't allow me to assign the following users: merkes.

Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to [this](https://github.com/kubernetes/cloud-provider-aws/issues/300#issuecomment-1874305157): >/assign merkes Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
mmerkes commented 10 months ago

/assign mmerkes

sftim commented 9 months ago

@rifelpet I think the consensus view is to use topology.k8s.aws/zone-id as the label key for the zone ID. If that looks right to you, would you be willing to update the issue description accordingly?

rifelpet commented 9 months ago

@rifelpet I think the consensus view is to use topology.k8s.aws/zone-id as the label key for the zone ID. If that looks right to you, would you be willing to update the issue description accordingly?

I've updated the issue description to match