devfile / devworkspace-operator

Apache License 2.0

Common PVC cleanup job can be assigned to incorrect node in multi-node cluster #1269

Open AObuchow opened 1 month ago

AObuchow commented 1 month ago

Description

In a multi-node cluster, deleting a devworkspace that uses the per-user/common PVC strategy can result in the PVC cleanup pod being scheduled on a different node than the one where the PVC is mounted. Since PVCs are created as ReadWriteOnce, only a single node can mount the PVC, so the cleanup pod fails to start with a PVC mount error. This causes the devworkspace to remain in a terminating state indefinitely.

Since the node a pod is scheduled on cannot be changed after the pod has been created, the workaround is to delete the cleanup pod and let it be automatically re-created until it is assigned to the node where the PVC is mounted. Only then can the workspace be deleted.
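
To make the failure mode and the workaround concrete, here is a minimal sketch of the check involved: compare the node the cleanup pod was assigned to with the node where the PVC is currently mounted. This is not the operator's code. The function name `cleanup_pod_needs_recreate` and its parameters are hypothetical, and it assumes a ReadWriteOnce PVC and ignores any storage topology constraints.

```python
from typing import Optional


def cleanup_pod_needs_recreate(pod_node: Optional[str],
                               pvc_mounted_on: Optional[str]) -> bool:
    """Return True if the cleanup pod should be deleted and re-created.

    pod_node: name of the node the cleanup pod was scheduled on,
        or None if it has not been scheduled yet (hypothetical input).
    pvc_mounted_on: name of the node that currently mounts the
        ReadWriteOnce PVC, or None if no node mounts it (hypothetical input).
    """
    if pvc_mounted_on is None:
        # Nothing mounts the PVC, so this sketch assumes any node may mount it.
        return False
    if pod_node is None:
        # The pod has not been scheduled yet, so there is nothing to decide.
        return False
    # With ReadWriteOnce, a pod on any other node cannot mount the volume.
    return pod_node != pvc_mounted_on
```

A caller would repeat the delete and re-create cycle for as long as this returns True, which is the manual workaround described above.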

What's odd is that we are already applying a node selector label to the cleanup pod. Perhaps there are cases where the namespace is missing the node selector annotation? CC: @musienko-maxim

How To Reproduce

Does not always occur; requires a multi-node cluster.

  1. Create a devworkspace using the per-user/common storage strategy
  2. Delete the devworkspace
  3. If the cleanup-workspace pod is scheduled on a different node than where the PVC is mounted, the pod will fail to start and the devworkspace will remain in the terminating state

Expected behavior

The cleanup-workspace pod is scheduled on the same node where the PVC is mounted and terminates successfully. The devworkspace is then deleted successfully.

Additional context

Encountered this while testing on @musienko-maxim's OCP 4.15 test cluster.

AObuchow commented 1 month ago

What's odd is that we are already applying a node selector label to the cleanup pod. Perhaps there are cases where the namespace is missing the node selector annotation?

It seems like this annotation is supposed to be applied from Che, as configured from this Che Cluster CR field.

However, if the node selector matches multiple nodes, it's possible that the cleanup job may be assigned to a different node than the node where the PVC is mounted.
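
A small sketch of why a node selector alone may not be enough: a node selector only requires that a node carries every listed label, so more than one node can qualify. The node names, the label `example.com/workspaces`, and the helper `nodes_matching_selector` below are all made up for illustration.

```python
from typing import Dict, List


def nodes_matching_selector(nodes: Dict[str, Dict[str, str]],
                            selector: Dict[str, str]) -> List[str]:
    """Return the sorted names of nodes whose labels satisfy the selector.

    A node matches when it has every key in the selector with the same value.
    """
    return sorted(
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in selector.items())
    )


# Hypothetical cluster: two of the three nodes carry the selected label.
nodes = {
    "node-a": {"example.com/workspaces": "true"},
    "node-b": {"example.com/workspaces": "true"},
    "node-c": {"example.com/workspaces": "false"},
}
selector = {"example.com/workspaces": "true"}

candidates = nodes_matching_selector(nodes, selector)
# candidates == ["node-a", "node-b"]: if the PVC is mounted on node-a,
# the cleanup pod can still be scheduled on node-b.
```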

dkwon17 commented 1 month ago

Can we use pod affinity [doc] for the cleanup pod so that it's scheduled on the same node as the workspace pod?

AObuchow commented 3 weeks ago

Can we use pod affinity [doc] for the cleanup pod so that it's scheduled on the same node as the workspace pod?

Yes! This would be a good use case for pod affinity. This doc seems to give an example of how this can be achieved.
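
As a rough sketch of what that could look like, the snippet below builds the `affinity` section of a pod spec as a plain Python dict. The fields `podAffinity`, `requiredDuringSchedulingIgnoredDuringExecution`, `labelSelector`, and `topologyKey`, and the node label `kubernetes.io/hostname`, are standard Kubernetes. The helper `build_pod_affinity` and the label `example.com/workspace` are placeholders, since the thread does not say which labels identify the pod that mounts the PVC.

```python
import json
from typing import Dict


def build_pod_affinity(match_labels: Dict[str, str]) -> dict:
    """Build an affinity section requiring co-location with matching pods.

    match_labels: labels of the pod(s) currently mounting the PVC
        (hypothetical; the real labels depend on the operator).
    """
    return {
        "podAffinity": {
            # Hard requirement: the pod is not scheduled unless it can be met.
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": dict(match_labels)},
                    # Same hostname means the same node as the matching pod.
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


if __name__ == "__main__":
    # Placeholder label; prints the fragment that would go under the pod spec.
    affinity = build_pod_affinity({"example.com/workspace": "my-workspace"})
    print(json.dumps({"affinity": affinity}, indent=2))
```

One caveat with a required rule: if no pod with the given labels is running, the cleanup pod would stay pending, so the preferred variant (`preferredDuringSchedulingIgnoredDuringExecution`) or a fallback may be worth considering.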