k8snetworkplumbingwg / whereabouts

A CNI IPAM plugin that assigns IP addresses cluster-wide
Apache License 2.0
282 stars 124 forks source link

Make podRef to be unique #433

Closed pliurh closed 6 months ago

pliurh commented 7 months ago

We used podRef with format \/ to identify the IP allocation. However, in scenarios like replicaSet, a different pod can reuse the pod name. This patch changes the format of podRef to \/: to ensure each ip allocation will have a unique identifier. The legacy podRef format will be updated automatically during an upgrade.

pliurh commented 7 months ago

fix #176

maiqueb commented 7 months ago

Why have you chosen that format - /<pod_name>:<pod_uid> ?

If you just need the pod ref to be unique, shouldn't we simply use the pod's UID ?

Is the name part meant to help debugging or something like that ?

pliurh commented 7 months ago

The new naming format shall be namespace/pod_name:pod_uid. And yes, it's meant for debugging.

maiqueb commented 7 months ago

@dougbtv please allow me some time to review this PR.

@pliurh you need to sort out the e2e and the unit tests.

I'll keep peeking to see once they're good.

pliurh commented 6 months ago

/hold

pliurh commented 6 months ago

/unhold @maiqueb @dougbtv I revised the PR. Please take another look.

andreaskaris commented 6 months ago

I've got a question ... will this actually work with Statefulsets and guarantee that the same IP address is always assigned to the same Statefulset pod? Because the uid between Statefulset pods changes:

 resourceVersion: ""
[akaris@workstation ~]$ oc get pods -o yaml | grep uid
      uid: b5a56769-43e0-4d8c-85bd-8a5f105986fb
    uid: 4ecd9b66-976e-4be3-84fa-cfe7939abfa9
[akaris@workstation ~]$ oc get pods -o yaml | grep uid
      uid: b5a56769-43e0-4d8c-85bd-8a5f105986fb
    uid: 4ecd9b66-976e-4be3-84fa-cfe7939abfa9
[akaris@workstation ~]$ oc delete pod tools-ipv4-0
pod "tools-ipv4-0" deleted
[akaris@workstation ~]$ 
[akaris@workstation ~]$ oc get pods
NAME           READY   STATUS              RESTARTS   AGE
tools-ipv4-0   0/1     ContainerCreating   0          3s
[akaris@workstation ~]$ oc get pods -o yaml | grep uid
      uid: b5a56769-43e0-4d8c-85bd-8a5f105986fb
    uid: f4dc518a-da66-4bad-832b-32394ea08299
[akaris@workstation ~]$ 
[akaris@workstation ~]$ 
[akaris@workstation ~]$ oc get pods -o yaml | grep uid
      uid: b5a56769-43e0-4d8c-85bd-8a5f105986fb
    uid: f4dc518a-da66-4bad-832b-32394ea08299
[akaris@workstation ~]$ 
[akaris@workstation ~]$ 
[akaris@workstation ~]$ oc get pods -o yaml | grep uid
      uid: b5a56769-43e0-4d8c-85bd-8a5f105986fb
    uid: f4dc518a-da66-4bad-832b-32394ea08299

I'm trying to understand when and why it would ever be a problem that an allocation is being reused (in the case of Statefulsets, isn't it even desired that an allocation is reused?) when the combination of namespace/name is actually a unique identifier for a pod https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

.. well, just looking at this right now in an SNO cluster that I have, the IP address of my single statefulset pod is actually changed after reboot ..


To clarify my point:

to ensure each ip allocation will have a unique identifier.

The commit's comment is misleading, because the identifier <namespace>/<name> is guaranteed to be unique, always, by kubernetes: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/. That's used for e.g. statefulsets, where the pod uid is not the same, but the pod after recreation is "the same" for a set of specific attributes. So yes, identity is not the same ... but as an admin, I personally would expect key attributes to be the same, and I'd say I'd expect whereabouts to assign the same IP address! So, if pod uid-a goes down, and pod uid-b comes up with the same <namespace>/<name>, ideally, whereabouts would assign the exact same IP address; at the very least, it shouldn't matter if it assigned the same IP.

In kubernetes, there will never ever be 2 pods with the same <namespace>/<name> at the same time. So I would argue (not looking at the current implementation) that the allocator should look at the current allocations, and if an allocation for <namespace>/<pod> already exists, it should return the IP address and maintain the allocation, instead of trying to de-allocate and then re-allocate a new IP; wouldn't that likewise fix https://issues.redhat.com/browse/OCPBUGS-29648 and be easier / cleaner?

pliurh commented 6 months ago

@andreaskaris I think you made a great point. Instead of addressing it from the IP reconciler's perspective, resolving it at the allocation stage would be more effective.

pliurh commented 6 months ago

replaced by https://github.com/k8snetworkplumbingwg/whereabouts/pull/446