karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

Deleting PropagationPolicy and OverridePolicy at the same time causes the workload to revert to its original state #3873

Open wu0407 opened 1 year ago

wu0407 commented 1 year ago

What happened: I wanted to clean up the resources on the member cluster, so I deleted the OverridePolicy and PropagationPolicy at the same time. The OverridePolicy overrides the Deployment's spec.replicas to 0. After the delete action, many pods were created by the Deployment controller on the member cluster.

rule.yaml:

apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: workload-propagation-policy
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
    - apiVersion: apps/v1
      kind: StatefulSet
  propagateDeps: true
  placement:
    clusterAffinity:
      clusterNames:
        - sh5-online
---
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterOverridePolicy
metadata:
  name: workload-override-policy
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
    - apiVersion: apps/v1
      kind: StatefulSet
  overrideRules:
  - overriders:  
      plaintext:
      - path: /spec/replicas
        operator: replace
        value: 0
      - path: /spec/template/spec/tolerations
        operator: add
        value:
        - key: "eks.tke.cloud.tencent.com/eklet"
          operator: "Exists"
          effect: "NoSchedule"
      - path: /spec/template/spec/schedulerName
        operator: replace
        value: "tke-scheduler"

What you expected to happen: no pods are created, and the Deployment is deleted from the member cluster.

How to reproduce it (as minimally and precisely as possible):

  1. kubectl apply -f rule.yaml
  2. wait until the resources are synced
  3. kubectl delete -f rule.yaml
  4. kubectl get pod -w -A

Anything else we need to know?:

Environment:

kubectl karmada version: version.Info{GitVersion:"v1.6.1", GitCommit:"fdc7ac62c70b571d091a795cbe9b9fceac5f1c2c", GitTreeState:"clean", BuildDate:"2023-07-06T03:35:37Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}
wu0407 commented 1 year ago

Similar behavior also occurs when creating the OverridePolicy and PropagationPolicy at the same time.

XiShanYongYe-Chang commented 1 year ago

Hi @wu0407 Thank you for your feedback.

The deletion of the PropagationPolicy and the deletion of the OverridePolicy may need to be considered separately.

When users delete the PropagationPolicy, Karmada will disconnect the resource template from the pp, but the ResourceBinding will still exist. At that point, when users then delete the op, the ResourceBinding will still synchronize the template down to the member cluster. Therefore, the resources in the member cluster will remain consistent with the template.
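The retained binding described above can be observed directly. A sketch, assuming Karmada's usual ResourceBinding naming convention (`<resource-name>-<lowercase-kind>`); exact resource names and fields may differ by version, and `<name>`/`<namespace>` are placeholders:

```shell
# After deleting the ClusterPropagationPolicy, the resource template
# (the Deployment) still exists in the Karmada control plane, and so
# does its derived ResourceBinding:
kubectl --kubeconfig karmada-apiserver.config get resourcebindings -A

# Inspect one binding: its spec still carries the placement copied from
# the deleted policy, which is why it keeps syncing to the member cluster.
kubectl --kubeconfig karmada-apiserver.config \
  get resourcebinding <name>-deployment -n <namespace> -o yaml
```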

I would like to understand why you need to delete PropagationPolicy and OverridePolicy, but keep the resource template.

wu0407 commented 1 year ago

I don't want the Deployment running on that member cluster, but I want it to continue running on the Karmada cluster, because the Karmada apiserver reuses an existing cluster's apiserver.

Related issue: the Deployment is synced to the member cluster again after the cluster rejoins, even though no PropagationPolicy targets that cluster anymore. The reason is that the ResourceBinding is retained when the PropagationPolicy is deleted, even after the cluster is removed.

wu0407 commented 1 year ago

The ResourceBinding should be deleted, or the related info needs to be cleaned up.

spec:
  placement:
    clusterAffinity:
      clusterNames:
      - sh5-online
    clusterTolerations:
    - effect: NoExecute
      key: cluster.karmada.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: cluster.karmada.io/unreachable
      operator: Exists
      tolerationSeconds: 300
chaunceyjiang commented 1 year ago

The ResourceBinding should be deleted

ResourceBinding currently has the same lifecycle as the resource template. If the resource template is not deleted, then ResourceBinding will also not be deleted.

or the related info needs to be cleaned up.

I have created an issue to track this problem. https://github.com/karmada-io/karmada/issues/3886

XiShanYongYe-Chang commented 1 year ago

@wu0407 @chaunceyjiang Thanks~ Your discussion has provided me with a new perspective. In fact, this issue has been discussed before: #3681

One previous conclusion was:

I'm thinking if we can take a step forward and consider how to make this behavior more in line with user expectations. For example, can we give control of this behavior to users and let them decide whether rb needs to be synchronized and deleted?

I think we can now further refine this conclusion: the lifecycle of the rb and the resource template remains consistent, and users can decide whether to clean up the scheduling results and the pp information on the rb, so that it will no longer be synchronized. What do you think?

whitewindmills commented 1 year ago

I think we can now further refine this conclusion: the lifecycle of the rb and the resource template remains consistent, and users can decide whether to clean up the scheduling results and the pp information on the rb, so that it will no longer be synchronized. What do you think?

If users choose to clean up the previous scheduling results, should the resource template then be matched by other new appropriate policies? I think cleanup conflicts with rescheduling. It would be great if you could explain this to me.

chaunceyjiang commented 1 year ago

I think we can now further refine this conclusion: the lifecycle of the rb and the resource template remains consistent, and users can decide whether to clean up the scheduling results and the pp information on the rb, so that it will no longer be synchronized. What do you think?

spec:
  placement:
    clusterAffinity:
      clusterNames:
      - sh5-online
    clusterTolerations:
    - effect: NoExecute
      key: cluster.karmada.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: cluster.karmada.io/unreachable
      operator: Exists
      tolerationSeconds: 300

I don't think these are scheduling results. I believe they are scheduling information. Since the PP has already been deleted, the related information should also be cleaned up.

XiShanYongYe-Chang commented 1 year ago

If users choose to clean up the previous scheduling results, should the resource template then be matched by other new appropriate policies?

I think that regardless of what users choose, resource templates should still be allowed to match new PropagationPolicies, so these two points do not conflict.

I think cleanup conflicts with rescheduling.

When clearing the pp matching information from the resource template, it is also necessary to clean up the corresponding information on the rb.

XiShanYongYe-Chang commented 1 year ago

I don't think these are scheduling results. I believe they are scheduling information. Since the PP has already been deleted,

I agree with you. The statement about the scheduling results is incorrect.

the related information should also be cleaned up.

Do you mean to clean directly without requiring user selection?

chaunceyjiang commented 1 year ago

Do you mean to clean directly without requiring user selection?

Yes.

XiShanYongYe-Chang commented 1 year ago

I somewhat agree, but I'm not sure whether any users rely on the current behavior. It might be a good idea to hear what others think at the community meeting.

XiShanYongYe-Chang commented 1 year ago

/cc @RainbowMango

liangyuanpeng commented 1 year ago

I would recommend introducing a feature gate as a first step; the default behavior remains the same as it is now.

Users can turn the feature gate on if they want the info cleaned up directly.

Then someday in the future, when everything is stable, the default behavior can be changed and the feature gate cleaned up.
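The gating pattern suggested above can be sketched as follows. This is a hypothetical illustration, not Karmada's actual feature-gate registry: the gate name `CleanupPolicyInfoOnPPDeletion` is invented here, and real Kubernetes-style components use `k8s.io/component-base/featuregate` instead of this minimal stand-in. The point is that an alpha gate defaults to off with an explicit user override, so flipping `Default` to true later yields the beta behavior with no API change:

```go
package main

import "fmt"

// Feature names a toggleable behavior.
type Feature string

// CleanupPolicyInfoOnPPDeletion is a hypothetical gate for cleaning the
// scheduling info from a ResourceBinding when its PropagationPolicy is deleted.
const CleanupPolicyInfoOnPPDeletion Feature = "CleanupPolicyInfoOnPPDeletion"

// FeatureSpec records a gate's default and maturity level.
type FeatureSpec struct {
	Default    bool   // behavior when the user sets nothing
	PreRelease string // "Alpha", "Beta", or "GA"
}

// Gates holds the known gates plus any user overrides (e.g. from a
// --feature-gates flag).
type Gates struct {
	known   map[Feature]FeatureSpec
	enabled map[Feature]bool
}

func NewGates() *Gates {
	return &Gates{
		known: map[Feature]FeatureSpec{
			// Alpha in 1.7: off by default, opt-in only. Promoting to
			// beta later only flips Default to true.
			CleanupPolicyInfoOnPPDeletion: {Default: false, PreRelease: "Alpha"},
		},
		enabled: map[Feature]bool{},
	}
}

// Set records an explicit user override for a gate.
func (g *Gates) Set(f Feature, on bool) { g.enabled[f] = on }

// Enabled returns the user override if present, else the gate's default.
func (g *Gates) Enabled(f Feature) bool {
	if v, ok := g.enabled[f]; ok {
		return v
	}
	return g.known[f].Default
}

func main() {
	g := NewGates()
	fmt.Println(g.Enabled(CleanupPolicyInfoOnPPDeletion)) // alpha default: false

	g.Set(CleanupPolicyInfoOnPPDeletion, true) // user opts in
	fmt.Println(g.Enabled(CleanupPolicyInfoOnPPDeletion)) // true
}
```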

liangyuanpeng commented 1 year ago

Then someday in the future, when everything is stable, the default behavior can be changed and the feature gate cleaned up.

A simple version evolution example: 1.7 alpha; 1.8 beta, with the feature gate on by default; 1.9 GA.

In that case, just leave the feature gate on by default; there is no need to clean it up.

If a direct cleanup solution is adopted in the end, I suggest notifying users in advance in 1.7 and executing it in 1.8.

weilaaa commented 1 year ago

In another issue, #3681, we've already discussed this, e.g.: "I prefer to add a new gate in pp to determine this behavior, and the default is consistent with the current one to be compatible with the existing data. Users can open the new gate in pp to change this behavior if they know what they are doing."

A simple version evolution example: 1.7 alpha; 1.8 beta, with the feature gate on by default; 1.9 GA.

For compatibility reasons, I recommend keeping the current behavior as the default. But if we can do some research, and the results show that users prefer to clean up the rb when deleting the pp, then I suggest we adopt the method @liangyuanpeng described, just like Kubernetes' strategy for new feature gates.

weilaaa commented 1 year ago

By the way, a new gate in the pp seems better than a global feature-gate switch, because users can control the action for each pp.

I'll just leave open the question of what the default action should be.

cc @olderTaoist

RainbowMango commented 1 year ago

I'm looking and thinking about the use case:

I don't want the Deployment running on that member cluster, but I want it to continue running on the Karmada cluster, because the Karmada apiserver reuses an existing cluster's apiserver.

A further question: why can't it be removed even in the case of reusing the API server?

I feel it'd be better to have a chat about this at the community meeting.

XiShanYongYe-Chang commented 1 year ago

Hello everybody, @wu0407 @chaunceyjiang @liangyuanpeng @whitewindmills @weilaaa @olderTaoist

Can we discuss this issue at the community meeting next Tuesday (2023-08-15)? I hope that we can reach some consensus and make progress towards resolving this issue during the meeting.