kubernetes-sigs / cloud-provider-equinix-metal

Kubernetes Cloud Provider for Equinix Metal (formerly Packet Cloud Controller Manager)
https://deploy.equinix.com/labs/cloud-provider-equinix-metal
Apache License 2.0

Add option to use a specific range of IPs #44

Open · enkelprifti98 opened this issue 4 years ago

enkelprifti98 commented 4 years ago

This is a necessary use case for customers that bring their own IPs or use Packet global anycast IPs.

displague commented 3 years ago

When you know the specific range that you want to use, you can use the spec.loadBalancerIP field to request it.
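
For reference, this is the standard Kubernetes field; a minimal sketch (the name, selector, and address below are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: my-app                      # placeholder
spec:
  type: LoadBalancer
  loadBalancerIP: 147.75.40.1       # placeholder: a specific address from your reservation
  selector:
    app: my-app                     # placeholder
  ports:
    - port: 80
      protocol: TCP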

When you don't care about the specific range, but you know what tag it should be assigned (or draw from), or what reservation id it should come from, you need a way to identify that. This is like a match/selector for the reservations known in the Equinix Metal API.

In some instances, you may want to ensure that the same IP address is used between two separate services, say one representing the UDP ports of a set of pods and another representing the TCP ports of that same set of pods (TCP and UDP cannot be mixed in a single K8s Service resource).

MetalLB has some annotations that allow the same IP to be used between services; however, we've been using kube-vip in the latest integrations, and MetalLB is not used at all in these cases.
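
For reference, the MetalLB pattern being referred to uses its documented metallb.universe.tf/allow-shared-ip annotation; a sketch of the TCP/UDP split described above (service names, selector, and sharing key are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: dns-tcp                                      # illustrative
  annotations:
    metallb.universe.tf/allow-shared-ip: "dns-shared"
spec:
  type: LoadBalancer
  selector:
    app: dns
  ports:
    - port: 53
      protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: dns-udp                                      # illustrative
  annotations:
    metallb.universe.tf/allow-shared-ip: "dns-shared"
spec:
  type: LoadBalancer
  selector:
    app: dns
  ports:
    - port: 53
      protocol: UDP

MetalLB assigns both services the same address because they share a sharing key and point at the same set of pods.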

The CCM will provision an extra IP if needed and tell kube-vip about it through annotations, but there is no way to request that the CCM-created IP be tagged with some keyword, or to request an IP from reservations that carry a given keyword as a tag.

I think a CCM-recognized annotation that sets up this tag-dependent behavior would allow us to ensure that multiple services claim the same address without knowing the address in advance (which is what spec.loadBalancerIP allows for today).

IP reservations also support customdata descriptors, which can be any JSON. I think we would do best to avoid that, leaving the customdata format free of opinions imposed by official Equinix Metal tools. Tags are flexible enough to support this use case.

As far as I know, we don't have an annotation that the CCM can recognize to associate addresses with a label/annotation in this way.

deitch commented 3 years ago

We may actually be part of the way there. The following is paraphrased from the README, in both the kube-vip and metallb sections.

For each service of type=LoadBalancer currently in the cluster or added:

The question is, are those tags usable as is? Or do we need something else?

displague commented 3 years ago

Thanks for pointing this out, @deitch. This comment will include a little thinking-out-loud as I go through the README.

When starting the CCM, set the env var METAL_EIP_TAG=<tag>, where <tag> is whatever tag you set on the EIP

This tag becomes a CCM-scoped filter. I think we would benefit from an application-scoped setting, specified within each service (as an annotation or label).
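
For context, the CCM-scoped setting lives on the CCM's own deployment; a fragment of the container spec (container name and tag value are illustrative):

containers:
  - name: cloud-provider-equinix-metal   # illustrative
    env:
      - name: METAL_EIP_TAG
        value: "my-eip-tag"              # whatever tag you set on the EIP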

Let's say METAL_EIP_TAG was set. If two Service/LoadBalancer resources were created with the same tag annotation, and we need a new /32 Elastic IP reservation to fulfill the LoadBalancer request, I would imagine the CCM would request a new reservation, tagging that reservation with both the METAL_EIP_TAG and the tag identified in the annotation. The second LoadBalancer resource would, barring race conditions, find that same address.
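
A sketch of that scenario, using the ip-reservation-tag style annotation proposed further down this thread (hypothetical, not an implemented annotation; names are illustrative):

kind: Service
metadata:
  name: svc-a                                          # illustrative
  annotations:
    metal.equinix.com/ip-reservation-tag: shared-keyword
spec:
  type: LoadBalancer
---
kind: Service
metadata:
  name: svc-b                                          # illustrative
  annotations:
    metal.equinix.com/ip-reservation-tag: shared-keyword
spec:
  type: LoadBalancer

Whichever service is reconciled first would create the /32 reservation, tagged with both the METAL_EIP_TAG value and shared-keyword; the second would find and reuse the same address.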

If we have an existing reservation that matches both tags, but is larger than /32, we would need a way to assign the same /32 from that reservation to both services. Perhaps the application-scoped tag on the reservation looks like: {single-ccm-annotation-tag}:{/32 IP}
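
For example (hypothetical values), the application-scoped tags on a shared /29 reservation might look like:

shared-keyword:147.75.40.2
shared-keyword:147.75.40.3

each pinning one /32 from the block to the services using that keyword.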

Perhaps we already have this?

https://github.com/equinix/cloud-provider-equinix-metal#elastic-ip-configuration

The suggestion I am mulling over is whether we should introduce a tag such as:

Multiple of these could be added to an existing reservation or added to a new reservation that uses the annotation.

apiVersion: v1
kind: Service
metadata:
  annotations:
    metal.equinix.com/ip-reservation-tag-annotation: keyword
    metal.equinix.com/ip-reservation-cidr: "28" # quoted: annotation values must be strings
spec:
  type: LoadBalancer

If the CIDR is specified, we would create an IP reservation request of that size. If not specified, we would find any reservation with the keyword in the annotation tag, or create a /32 with that annotation tag.

displague commented 3 years ago

And perhaps, to draw a /32 from an existing reservation,

apiVersion: v1
kind: Service
metadata:
  annotations:
    metal.equinix.com/ip-reservation-id: {uuid}
    metal.equinix.com/ip-reservation-cidr: "28" # optional, used for creates; if set, must match on selects. Quoted: annotation values must be strings.
spec:
  type: LoadBalancer

or

apiVersion: v1
kind: Service
metadata:
  annotations:
    metal.equinix.com/ip-reservation-tag: {keyword} # if set, the IP reservation will be created with this tag added, or selected only if it carries this tag.
    metal.equinix.com/ip-reservation-cidr: "28" # optional, used for creates; if set, must match on selects. Quoted: annotation values must be strings.
spec:
  type: LoadBalancer

displague commented 3 years ago

My suggestions may not behave well in race conditions where two services are created at once with the same tags. I imagine multiple requests would be distributed to different CCM pods and issued concurrently.

displague commented 3 years ago

These suggestions also may not fit the alternate engines, MetalLB and kube-vip, which may offer duplicate or conflicting options.

displague commented 3 years ago

We have an existing prefix metal.equinix.com/src-. Anywhere I said metal.equinix.com/ip- above, pretend I said metal.equinix.com/src-.

deitch commented 3 years ago

METAL_EIP_TAG

This is from the control plane load balancing; it doesn't apply to Service type=LoadBalancer resources.

You raise a good point. While it does have solid tag logic, it doesn't provide a way for us to tell it what tags to use. You could replicate the logic, but that is messy. It is consistent, but internal.

The hashing really is also to avoid names that might not be accepted in tags, but primarily to avoid leakage. It isn't the business of the EM API or its backend database to know the names of services inside clusters. We had a customer who needed that extra privacy guarantee, and there was no reason not to provide it.

I think we would benefit from an application-scoped setting, specified within each service (as an annotation or label)

I can get behind that. So when you create a Service, your annotations can determine the CCM behaviour:

Did I miss any? We would need to know what to do if the EIP is already in use. Do we allow reuse? Or call it an error?

I like where this is going.

displague commented 3 years ago

We would need to know what to do if the EIP already is in use.

If the EIP is used by the requesting cluster, this is fine?

your annotations can determine the CCM behaviour ... did I miss any?

https://github.com/equinix/terraform-metal-anthos-on-baremetal/issues/51 would like the CCM to offer private addresses. Private address provisioning works the same way in the EM API; perhaps we just need another annotation to toggle private addresses (and we can use existing private IPs and pools just as we would public ones).
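
As a sketch only (this annotation name is invented here and not implemented), that toggle might look like:

kind: Service
metadata:
  annotations:
    metal.equinix.com/src-private: "true"   # hypothetical: request a private address instead of a public one
spec:
  type: LoadBalancer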

displague commented 3 years ago

We may be blocked by the Equinix Metal API here: IP reservation tags cannot be changed.

displague commented 1 year ago

Noting that tags can be changed now. It would still be helpful to have a way (well documented alongside the bring-your-own-IP pattern) to accomplish the following:

When the parent block reservation has no available /32s, an error saying as much would be surfaced.

displague commented 1 year ago

There would be a difference in how IP reservations flow between creating a new /32 reservation and reserving a /32 from an existing pool.

https://github.com/equinix/cloud-provider-equinix-metal/blob/main/metal/loadbalancers.go#L397-L407 currently creates a new IP reservation and returns only the IP address. Assignment of that IP to nodes happens later, by another process (kube-vip, MetalLB). To divide an existing reservation into a /32, you must assign it through the API. Doing so in CPEM as part of the reservation flow would have CPEM putting an initial assignment in place that downstream processes like kube-vip and MetalLB would need to reconcile. Since reconciliation for these processes involves aligning Service IPs with nodes (and their routing), I think this should "just work", even though the initial IP assignment happens earlier.

k8s-triage-robot commented 10 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 9 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 8 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 8 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/cloud-provider-equinix-metal/issues/44#issuecomment-2010729921):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

cprivitere commented 6 months ago

/reopen

k8s-ci-robot commented 6 months ago

@cprivitere: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/cloud-provider-equinix-metal/issues/44#issuecomment-2110640791):

> /reopen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

cprivitere commented 6 months ago

/remove-lifecycle rotten

cprivitere commented 6 months ago

/triage accepted