canonical / namespace-node-affinity-operator

Juju Charm for the Namespace Node Affinity tool
Apache License 2.0

namespace-node-affinity is not working as expected with DaemonSet #35

Closed sagittariuslee closed 2 months ago

sagittariuslee commented 3 months ago

Bug Description

The nodeSelectorTerm injected into DaemonSet pods is ignored because, as the Kubernetes documentation notes:

if multiple nodeSelectorTerms are associated with nodeAffinity types, then the Pod can be scheduled onto a node if one of the specified nodeSelectorTerms can be satisfied.
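In other words, sibling nodeSelectorTerms are ORed together, while multiple matchExpressions (or matchFields) inside a single term are ANDed. A minimal illustration of the two semantics (the zone and disktype label keys here are hypothetical):

    # ORed: a node matching EITHER term can schedule the Pod
    nodeSelectorTerms:
    - matchExpressions:
      - key: zone
        operator: In
        values: ["a"]
    - matchExpressions:
      - key: disktype
        operator: In
        values: ["ssd"]

    # ANDed: a node must match BOTH expressions
    nodeSelectorTerms:
    - matchExpressions:
      - key: zone
        operator: In
        values: ["a"]
      - key: disktype
        operator: In
        values: ["ssd"]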

To Reproduce

  1. juju deploy metallb --channel 1.28/stable --trust --config namespace=metallb

  2. juju deploy namespace-node-affinity --trust

  3. kubectl label namespaces metallb namespace-node-affinity=enabled

  4. settings.yaml:

    ~$ cat settings.yaml
    metallb: |
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubeflowserver
          operator: In
          values:
          - true

    SETTINGS_YAML=$(cat settings.yaml)

  5. juju config namespace-node-affinity settings_yaml="$SETTINGS_YAML"

  6. kubectl delete pods -n metallb metallb-0 (the operator pod for the metallb app)

  7. kubectl get pods -n metallb metallb-0 -o yaml; in the YAML of the newly created pod we can see:

    metadata:
      ownerReferences:
      - apiVersion: apps/v1
        blockOwnerDeletion: true
        controller: true
        kind: StatefulSet
        name: metallb
        uid: 5214cbc3-20b5-41ba-b7f9-69f2c16da6ca
      resourceVersion: "84536"
      uid: c0bf5e77-8b82-43e8-8670-e8ee3ecd43f9
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubeflowserver
                operator: In
                values:
                - "true"
  8. Now pick another pod, speaker-ngkg8, owned by the speaker DaemonSet (created by the charm), as it was before the deletion:

    kubectl get pods -n metallb speaker-ngkg8 -o yaml

    metadata:
      ownerReferences:
      - apiVersion: apps/v1
        blockOwnerDeletion: true
        controller: true
        kind: DaemonSet
        name: speaker
        uid: 9332e100-c788-421d-b6a7-08751831a22d
      resourceVersion: "83722"
      uid: 1fffc42d-6d24-446f-bade-8ca1d7de6a15
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchFields:
              - key: metadata.name
                operator: In
                values:
                - vm-0
      containers:
  9. kubectl delete pods -n metallb speaker-ngkg8

  10. kubectl get pods -n metallb speaker-k59pk -o yaml; check this newly created pod:

    metadata:
      ownerReferences:
      - apiVersion: apps/v1
        blockOwnerDeletion: true
        controller: true
        kind: DaemonSet
        name: speaker
        uid: 9332e100-c788-421d-b6a7-08751831a22d
      resourceVersion: "84818"
      uid: d334c13c-fb14-487d-8cbd-b16352a42e21
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchFields:
              - key: metadata.name
                operator: In
                values:
                - vm-0
            - matchExpressions:
              - key: kubeflowserver
                operator: In
                values:
                - "true"
      containers:

    We can see the nodeSelectorTerm is injected. However, this newly created pod will still land on vm-0, because, as the Kubernetes documentation notes:

    If you specify multiple nodeSelectorTerms associated with nodeAffinity types, then the Pod can be scheduled onto a node if one of the specified nodeSelectorTerms can be satisfied. That is, multiple nodeSelectorTerms within nodeAffinity are evaluated using OR logic: if any one of the terms is satisfied, the Pod can be scheduled on that node.

In this case, the second term is effectively ignored: the DaemonSet controller's matchFields term already matches vm-0, so the injected matchExpressions term adds no constraint.
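For the injected expression to actually constrain such a pod, it would have to be merged into the existing term rather than appended as a sibling, since the fields within a single nodeSelectorTerm are ANDed. A sketch of what a merged term could look like (this is not what the charm currently produces):

    nodeSelectorTerms:
    - matchFields:
      - key: metadata.name
        operator: In
        values:
        - vm-0
      matchExpressions:
      - key: kubeflowserver
        operator: In
        values:
        - "true"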

Environment

juju version: 2.9.43
kubernetes: v1.24.17

Relevant Log Output

N/A

Additional Context

No response

syncronize-issues-to-jira[bot] commented 3 months ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5736.

This message was autogenerated

kimwnasptd commented 2 months ago

@sagittariuslee I understand that this is not an issue with how the node-affinity charm works, but rather with K8s mechanics when other Pods already have nodeSelectorTerms set.

My understanding is that the work needs to be done in the other charms that set nodeSelectorTerms, deciding whether they need to be modified.

So I'd like to close this issue for the reasons above, but before I do: am I missing something?

sagittariuslee commented 2 months ago

@kimwnasptd Yes, you are right. This has to be done in the other charms that manage the DaemonSet (in this case, the Metallb charm).
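For reference, a rough manual equivalent of what such a charm-side fix would do: DaemonSets are usually constrained through spec.template.spec.nodeSelector, which the DaemonSet controller honours when deciding which nodes get a pod. A minimal sketch, assuming the kubeflowserver=true node label from the reproduction steps above (a hypothetical workaround, not current charm behaviour):

    # Constrain the speaker DaemonSet to labelled nodes via nodeSelector;
    # the DaemonSet controller only creates pods on matching nodes.
    kubectl -n metallb patch daemonset speaker --type merge \
      -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubeflowserver":"true"}}}}}'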