enkelprifti98 opened this issue 4 years ago
When you know the specific IP address that you want to use, you can request it with the spec.loadBalancerIP field.
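For example, a Service that pins an address it already knows about might look like the following (a minimal sketch; the name, selector, port, and address are illustrative placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                     # illustrative name
spec:
  type: LoadBalancer
  loadBalancerIP: 192.0.2.10       # documentation-range placeholder for an address you already reserved
  selector:
    app: my-app
  ports:
    - port: 80
```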
When you don't care about the specific range, but you know what tag it should be assigned (or draw from), or what reservation id it should come from, you need a way to identify that. This is like a match/selector for the reservations known in the Equinix Metal API.
In some instances, you may want to ensure that the same IP address is used by two separate services, say one exposing the UDP ports of a set of pods and another exposing the TCP ports of that same set of pods (TCP and UDP cannot be mixed in a Kubernetes Service resource).
MetalLB has annotations that allow the same IP to be shared between services; however, we've been using kube-vip in the latest integrations, and MetalLB is not used at all in these cases.
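For reference, this is the pattern MetalLB supports via its allow-shared-ip annotation. The sketch below uses illustrative names and ports, and it is not the mechanism being proposed here for the CCM:

```yaml
# Two Services with the same "sharing key"; MetalLB may colocate them on one IP.
apiVersion: v1
kind: Service
metadata:
  name: dns-tcp                              # illustrative
  annotations:
    metallb.universe.tf/allow-shared-ip: "dns"
spec:
  type: LoadBalancer
  selector:
    app: dns
  ports:
    - name: tcp
      protocol: TCP
      port: 53
---
apiVersion: v1
kind: Service
metadata:
  name: dns-udp                              # illustrative
  annotations:
    metallb.universe.tf/allow-shared-ip: "dns"
spec:
  type: LoadBalancer
  selector:
    app: dns
  ports:
    - name: udp
      protocol: UDP
      port: 53
```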
The CCM will provision an extra IP if needed and tell kube-vip about it through annotations, but there is no way to request that the CCM-created IP be tagged with some keyword, or to request an IP from reservations that carry a given keyword as a tag.
I think a CCM-recognized annotation that sets up this tag-dependent behavior would allow us to ensure that multiple services claim the same address without knowing the address in advance (which is what spec.loadBalancerIP allows for today).
IP Reservations also support customdata descriptors, which can be any JSON. I think we would do best to avoid that, leaving the format unopinionated by any official Equinix Metal tools. Tags are flexible enough to support this use-case.
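Roughly, the two reservation fields being compared look like this (a sketch with illustrative values; the customdata shape is deliberately left unopinionated, as noted above):

```yaml
# Sketch of the relevant IP reservation fields (illustrative values)
tags:                    # flat list of strings, enough to carry a match keyword
  - "keyword"
customdata:              # arbitrary JSON blob; intentionally not standardized here
  team: "example"
```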
As far as I know, we don't have an annotation that the CCM recognizes to associate addresses with a label/annotation in this way.
We may actually be part of the way there. The following is paraphrased from the README, in both the kube-vip and metallb sections:

> For each service of `type=LoadBalancer` currently in the cluster or added: [...] (`auto-assign: false`)

The question is: are those tags usable as is, or do we need something else?
Thanks for pointing this out, @deitch. This comment will include a little thinking-out-loud as I go through the README.
> When starting the CCM, set the env var `METAL_EIP_TAG=<tag>`, where `<tag>` is whatever tag you set on the EIP
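For context, that is just an environment variable on the CCM's own workload. A minimal sketch (the tag value and deployment layout are illustrative, not prescribed by the README):

```yaml
# Fragment of the cloud-provider-equinix-metal container spec (illustrative)
env:
  - name: METAL_EIP_TAG
    value: "my-control-plane-eip"   # whatever tag you set on the EIP
```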
This tag becomes a CCM-scoped filter. I think we would benefit from an application-scoped setting, specified within each service (as an annotation or label).
Let's say `METAL_EIP_TAG` was set. If two Service/LoadBalancer resources were created with the same tag annotation, and we need a new /32 Elastic IP reservation to fulfill the LoadBalancer request, I would imagine the CCM would request a new reservation, tagging that reservation with both the METAL_EIP_TAG and the tag identified in the annotation. The second LoadBalancer resource would, barring race conditions, find that same address.
If we have an existing reservation that matches both tags, but is larger than /32, we would need a way to assign the same /32 from that reservation to both services. Perhaps the application-scoped tag on the reservation looks like: {single-ccm-annotation-tag}:{/32 IP}
Perhaps we already have this?
https://github.com/equinix/cloud-provider-equinix-metal#elastic-ip-configuration
usage="cloud-provider-equinix-metal-auto"
Is this the default METAL_EIP_TAG
? service="<service-hash>"
This seems close to the tag
filter ideacluster=<clusterID>
This seems related to https://github.com/equinix/cloud-provider-equinix-metal/issues/154The suggestion I am mulling around is whether we should introduce a tag such as:
annotation="keyword{",/32 IP" pulled from the reservation if the reservation is larger than /32}"
Multiple of these could be added to an existing reservation or added to a new reservation that uses the annotation.
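As a purely hypothetical illustration, a /28 reservation that has handed out two /32s under the same keyword might end up carrying tags like these (values are placeholders, and the separator format is part of the open question):

```yaml
# Hypothetical tags on one existing /28 IP reservation
tags:
  - usage=cloud-provider-equinix-metal-auto
  - cluster=<clusterID>
  - annotation=keyword,<first /32 drawn from the block>
  - annotation=keyword,<second /32 drawn from the block>
```

A Service could then opt into that behavior with annotations along these lines: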
```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    metal.equinix.com/ip-reservation-tag-annotation: keyword
    metal.equinix.com/ip-reservation-cidr: "28"
spec:
  type: LoadBalancer
```
If the CIDR is specified, we would create an IP reservation request of that size. If not specified, we would find any reservation with the keyword in the annotation tag, or create a /32 with that annotation tag.
And perhaps, to draw a /32 from an existing reservation:

```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    metal.equinix.com/ip-reservation-id: "{uuid}"
    metal.equinix.com/ip-reservation-cidr: "28"   # optional, used for creates; if set, must match on selects
spec:
  type: LoadBalancer
```
or
```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    metal.equinix.com/ip-reservation-tag: "{keyword}"   # if set, the IP reservation will be created with this added tag, or be selected only with this tag
    metal.equinix.com/ip-reservation-cidr: "28"         # optional, used for creates; if set, must match on selects
spec:
  type: LoadBalancer
```
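To tie this back to the original use case, the TCP and UDP Services for the same set of pods could both carry the same proposed tag annotation (hypothetical; nothing below is implemented) and so resolve to the same /32:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-tcp                                          # illustrative
  annotations:
    metal.equinix.com/ip-reservation-tag: my-app-shared-ip  # proposed annotation, not implemented
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - name: tcp
      protocol: TCP
      port: 443
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-udp                                          # illustrative
  annotations:
    metal.equinix.com/ip-reservation-tag: my-app-shared-ip  # same keyword, so same address
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - name: udp
      protocol: UDP
      port: 443
```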
My suggestions may not behave well in race conditions where two services are created at once with the same tags. I imagine multiple requests would be distributed to different CCM pods and requested at once.
These suggestions also may not fit the alternate engines, MetalLB and kube-vip, which may offer duplicate or conflicting options.
We have an existing prefix `metal.equinix.com/src-`. Anywhere I said `metal.equinix.com/ip-` above, pretend I said `metal.equinix.com/src-`.
> METAL_EIP_TAG

This is from the control plane load balancing. It doesn't control behavior for Services of `type=LoadBalancer`.
You raise a good point. While it does have solid tag logic, it doesn't provide a way for us to tell it what tags to use. You could replicate the logic, but that is messy. It is consistent, but internal.
The hashing is partly there to avoid names that might not be accepted as tags, but primarily to avoid leakage. It isn't the business of the EM API or its backend database to know the names of services inside clusters. We had a customer who needed that extra privacy guarantee, and there was no reason not to provide it.
> I think we would benefit from an application-scoped setting, specified within each service (as an annotation or label)

I can get behind that. So when you create a Service, your annotations can determine the CCM behaviour:
Did I miss any? We would need to know what to do if the EIP already is in use. Do we allow reuse? Or call it an error?
I like where this is going.
> We would need to know what to do if the EIP already is in use.

If the EIP is used by the requesting cluster, is this fine?
> your annotations can determine the CCM behaviour ... did I miss any?
https://github.com/equinix/terraform-metal-anthos-on-baremetal/issues/51 would like the CCM to offer private addresses. Private address provisioning works in the same way on the EM API, perhaps we just need another annotation to toggle private addresses (and we can use existing private IPs and pools just as we would public ones).
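A purely hypothetical sketch of what such a toggle could look like; the annotation name below is invented for illustration and does not exist today:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: internal-only                                       # illustrative
  annotations:
    metal.equinix.com/ip-reservation-visibility: private    # hypothetical annotation to request a private IP
spec:
  type: LoadBalancer
  selector:
    app: internal-app
  ports:
    - port: 8080
```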
We may be blocked by the Equinix Metal API here: IP reservation tags cannot be changed.
Noting that tags can be changed now. It would still be helpful to have a way (well documented alongside the bring-your-own-IP pattern) to accomplish the following:
When the parent block reservation has no available /32s, an error saying as much would be surfaced.
There would be a difference in how the IP reservations flow between a new /32 reservation and a /32 drawn from an existing pool.
https://github.com/equinix/cloud-provider-equinix-metal/blob/main/metal/loadbalancers.go#L397-L407 currently creates a new IP reservation and returns only the IP address. Assignment of that IP to nodes happens later, by another process (kube-vip, MetalLB). To divide an existing reservation into a /32, you must assign it through the API. Doing so in CPEM as part of the reservation flow would have CPEM putting an initial assignment in place that downstream processes like kube-vip and MetalLB would need to reconcile. Since reconciliation for these processes involves aligning Service IPs with nodes (and their routing), I think this should "just work", even though the initial IP assignment happens earlier.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/reopen
@cprivitere: Reopened this issue.
/remove-lifecycle rotten
/triage accepted
This is a necessary use case for customers that bring their own IPs, or Packet global anycast IPs.