karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

feat: Support modification synchronization of custom resources as dependency #3614

Closed chaunceyjiang closed 9 months ago

chaunceyjiang commented 10 months ago

What type of PR is this? /kind feature

What this PR does / why we need it: Support modification synchronization of custom resources as dependency

Which issue(s) this PR fixes: Fixes #3605

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

`karmada-controller-manager`: support modification synchronization of custom resources as dependency.

chaunceyjiang commented 10 months ago

/cc @whitewindmills @XiShanYongYe-Chang Please take a look.

whitewindmills commented 10 months ago

Glad to see your quick response. I'll check it when I'm free.

chaunceyjiang commented 10 months ago

/cc @whitewindmills Done.

whitewindmills commented 10 months ago

@chaunceyjiang Could you add an e2e test case to cover this?

chaunceyjiang commented 10 months ago

> @chaunceyjiang Could you add an e2e test case to cover this?

https://github.com/karmada-io/karmada/blob/master/test/e2e/dependenciesdistributor_test.go#L426-L445

There is already a case covering this scenario. @whitewindmills

whitewindmills commented 10 months ago

@chaunceyjiang This PR is for custom resources that are interpreted by InterpretDependency instead of built-in resources. So I think it is necessary to add a case to cover this scenario.

chaunceyjiang commented 10 months ago

> This PR is for custom resources that are interpreted by InterpretDependency instead of built-in resources. So I think it is necessary to add a case to cover this scenario.

I don't think so. This patch is a universal approach: it doesn't care whether it's a built-in resource or a custom resource, only whether the InterpretDependency interface has been implemented.

whitewindmills commented 10 months ago

> I don't think so. This patch is a universal approach: it doesn't care whether it's a built-in resource or a custom resource, only whether the InterpretDependency interface has been implemented.

As shown in the issue, built-in resources can be modified and synchronized normally, but custom resources cannot be synchronized after being modified. How can you guarantee that this problem is resolved once this PR is merged?

chaunceyjiang commented 10 months ago

> As shown in https://github.com/karmada-io/karmada/issues/3605, built-in resources can be modified and synchronized normally, but custom resources cannot be synchronized after being modified.

The following link is the root cause of this issue. https://github.com/karmada-io/karmada/blob/master/pkg/dependenciesdistributor/dependencies_distributor.go#L50-L55

whitewindmills commented 10 months ago

I know the root cause of the problem, but we need pipelines to ensure that this problem scenario is covered.

chaunceyjiang commented 10 months ago

Please set aside the distinction between built-in resources and custom resources. After this PR, that difference no longer exists.

The patch only cares whether the InterpretDependency interface has been implemented.
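To make the point above concrete, here is a minimal Go sketch of a lookup that depends only on whether dependency interpretation exists for a kind. It is an illustration under assumptions, not Karmada's code: `ObjectRef`, `DependencyFunc`, `mapInterpreter`, and `resolveDependencies` are hypothetical names, and the contrast with a fixed list of kinds is an assumption about the earlier behavior, since the thread only links to the root cause.

```go
package main

import "fmt"

// ObjectRef identifies an object. Hypothetical type for this sketch.
type ObjectRef struct {
	APIVersion, Kind, Namespace, Name string
}

// DependencyFunc returns the objects that obj depends on.
type DependencyFunc func(obj ObjectRef) []ObjectRef

// mapInterpreter is a hypothetical stand-in for a resource interpreter:
// a kind supports dependency interpretation only if it has an entry here.
type mapInterpreter map[string]DependencyFunc

// resolveDependencies asks the interpreter instead of consulting a fixed
// list of kinds. The kinds of the returned dependencies are not filtered,
// so built-in and custom resources are treated the same way.
func resolveDependencies(in mapInterpreter, obj ObjectRef) []ObjectRef {
	interpret, ok := in[obj.Kind]
	if !ok {
		return nil // no dependency interpretation registered for this kind
	}
	return interpret(obj)
}

func main() {
	in := mapInterpreter{
		"Workload": func(obj ObjectRef) []ObjectRef {
			return []ObjectRef{{
				APIVersion: "example.io/v1alpha1",
				Kind:       "Foo",
				Namespace:  obj.Namespace,
				Name:       "cr",
			}}
		},
	}
	owner := ObjectRef{Kind: "Workload", Namespace: "default", Name: "demo"}
	for _, dep := range resolveDependencies(in, owner) {
		fmt.Printf("%s %s/%s\n", dep.Kind, dep.Namespace, dep.Name)
	}
}
```

In this sketch a custom kind such as `Foo` needs no special handling: it is tracked simply because the interpreter returned it.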

chaunceyjiang commented 10 months ago

/cc @whitewindmills please take a look.

chaunceyjiang commented 10 months ago

> For example, if I create some resources that are very unlikely to be dependencies, such as Deployments, they will also be enqueued, which would be very expensive.

The issue has been fixed, please take a look again. /cc @whitewindmills
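The thread does not say how the enqueue cost was addressed. One possible way to bound it, sketched below in Go, is to keep an index from each declared dependency to its owners and to drop events for objects nobody depends on. The type `dependencyIndex`, its methods, and the string keys are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// dependencyIndex maps a dependency key to the owners that declared it.
// Hypothetical helper, not part of Karmada.
type dependencyIndex struct {
	owners map[string]map[string]struct{}
}

func newDependencyIndex() *dependencyIndex {
	return &dependencyIndex{owners: make(map[string]map[string]struct{})}
}

// Add records that owner depends on the object identified by depKey.
func (i *dependencyIndex) Add(depKey, owner string) {
	if i.owners[depKey] == nil {
		i.owners[depKey] = make(map[string]struct{})
	}
	i.owners[depKey][owner] = struct{}{}
}

// OwnersOf returns the owners of depKey in sorted order. An empty result
// means no one depends on the object, so its event can be dropped.
func (i *dependencyIndex) OwnersOf(depKey string) []string {
	out := make([]string, 0, len(i.owners[depKey]))
	for owner := range i.owners[depKey] {
		out = append(out, owner)
	}
	sort.Strings(out)
	return out
}

func main() {
	idx := newDependencyIndex()
	idx.Add("Foo/default/cr", "binding-a")

	for _, key := range []string{"Foo/default/cr", "Deployment/default/nginx"} {
		owners := idx.OwnersOf(key)
		fmt.Printf("%s: enqueue=%v owners=%v\n", key, len(owners) > 0, owners)
	}
}
```

With such an index, an update to an unrelated Deployment costs one map lookup instead of a full reconcile.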

chaunceyjiang commented 10 months ago

/cc @whitewindmills please take a look.

whitewindmills commented 10 months ago

@chaunceyjiang Thanks for your hard work. It looks good to me. I'll leave approval to @RainbowMango.
/lgtm
/assign @RainbowMango

chaunceyjiang commented 10 months ago

> This implies that all dependencies are namespace-scoped resources. How should cluster-scoped resources parsed from the resource interpreter be handled?

I think this is a new feature that goes beyond the scope of the current PR discussion. Prior to this PR, cluster-scoped resources were also not supported.

chaunceyjiang commented 10 months ago

/cc @RainbowMango @XiShanYongYe-Chang @whitewindmills PTAL.

chaunceyjiang commented 9 months ago

/cc @RainbowMango PTAL.

XiShanYongYe-Chang commented 9 months ago

/lgtm

whitewindmills commented 9 months ago

LGTM

RainbowMango commented 9 months ago

Please @chaosi-zju take a look:

make: Leaving directory '/home/runner/work/karmada/karmada'
Waiting for the host clusters to be ready...
Waiting for kubeconfig file /home/runner/.kube/karmada.config and clusters karmada-host to be ready...

Error:  Timeout waiting for file exist /home/runner/.kube/karmada.config
Error: Process completed with exit code 1.

This issue should be fixed by #3667.

karmada-bot commented 9 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/karmada-io/karmada/blob/master/OWNERS)~~ [RainbowMango]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

chaosi-zju commented 9 months ago

> Please @chaosi-zju take a look:
>
> make: Leaving directory '/home/runner/work/karmada/karmada'
> Waiting for the host clusters to be ready...
> Waiting for kubeconfig file /home/runner/.kube/karmada.config and clusters karmada-host to be ready...
>
> Error:  Timeout waiting for file exist /home/runner/.kube/karmada.config
> Error: Process completed with exit code 1.
>
> This issue should be fixed by #3667.

Log of Kind:

• Starting control-plane 🕹️ ...
✗ Starting control-plane 🕹️
Deleted nodes: ["karmada-host-control-plane"]
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged karmada-host-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
...
I0613 08:20:07.763691 250 round_trippers.go:553] GET https://karmada-host-control-plane:6443/healthz?timeout=10s in 10003 milliseconds
I0613 08:20:17.866527 250 round_trippers.go:553] GET https://karmada-host-control-plane:6443/healthz?timeout=10s in 10041 milliseconds
...
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:

  • The kubelet is not running
  • The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
...
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:234

Log of kubelet:

Jun 25 04:27:37 m01-control-plane kubelet[172]: I0625 04:27:37.959936 172 server.go:412] "Kubelet version" kubeletVersion="v1.26.0"
Jun 25 04:27:37 m01-control-plane kubelet[172]: I0625 04:27:37.960093 172 server.go:414] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Jun 25 04:27:37 m01-control-plane kubelet[172]: I0625 04:27:37.960506 172 server.go:836] "Client rotation is on, will bootstrap in background"
Jun 25 04:27:37 m01-control-plane kubelet[172]: E0625 04:27:37.965279 172 certificate_manager.go:471] kubernetes.io/kube-apiserver-client-kubelet: Failed while requesting a signed certificate from the control plane: cannot create certificate signing request: Post "https://m01-control-plane:6443/apis/certificates.k8s.io/v1/certificatesigningrequests": dial tcp [fc00:f853:ccd:e793::2]:6443: connect: connection refused
Jun 25 04:27:37 m01-control-plane kubelet[172]: I0625 04:27:37.965423 172 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Jun 25 04:27:37 m01-control-plane kubelet[172]: E0625 04:27:37.967479 172 run.go:74] "command failed" err="failed to run Kubelet: invalid configuration: cgroup [\"kubelet\"] has some missing paths: /sys/fs/cgroup/cpuset/kubelet.slice, /sys/fs/cgroup/hugetlb/kubelet.slice"
Jun 25 04:27:37 m01-control-plane systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Jun 25 04:27:37 m01-control-plane systemd[1]: kubelet.service: Failed with result 'exit-code'.


The key clue is this line from the kubelet log above: `failed to run Kubelet: invalid configuration: cgroup [\"kubelet\"] has some missing paths: /sys/fs/cgroup/cpuset/kubelet.slice, /sys/fs/cgroup/hugetlb/kubelet.slice`

I have seen this error message before in issue #3667, but it occurred rarely at the time and I overlooked it. I will keep posting my troubleshooting progress in #3667.

chaunceyjiang commented 9 months ago

It seems that e2e testing can no longer be performed properly since the merge of this pr.

RainbowMango commented 9 months ago

Posting the second failure here in case you need it: https://github.com/karmada-io/karmada/actions/runs/5367211464/jobs/9739097410?pr=3614

chaosi-zju commented 9 months ago

> It seems that e2e testing can no longer be performed properly since the merge of this pr.

The problem is not related to this PR. I just ran `kind create cluster ..` in the CI runner today, and it directly reports this failure.

RainbowMango commented 9 months ago

> The problem is not related to this PR. I just ran `kind create cluster ..` in the CI runner today, and it directly reports this failure.

Can you help try it again with ubuntu-22.04? Note that currently we are using ubuntu-20.04. https://github.com/actions/runner-images#available-images

RainbowMango commented 9 months ago

> It seems that e2e testing can no longer be performed properly since the merge of https://github.com/karmada-io/karmada/pull/3682.

The tests on release-1.6 are failing too, and we didn't cherry-pick #3682 to the release branches.

RainbowMango commented 9 months ago

@chaunceyjiang Please rebase this PR after #3699 gets merged.

RainbowMango commented 9 months ago

This test is failing: https://github.com/karmada-io/karmada/actions/runs/5369361839/jobs/9740779864?pr=3614

 [FAILED] [320.471 seconds]
Resource interpreter webhook testing CustomResource InterpreterOperation InterpretDependency testing [It] dependency cr propagation testing
/home/runner/work/karmada/karmada/test/e2e/resourceinterpreter_test.go:446

  Captured StdOut/StdErr Output >>
  dependency cr apiVersion: example-qhkhb.karmada.io/v1alpha1 kind: Foo582g8 name: cr-l2qmt namespace: karmadatest-nblt6
  W0625 11:19:47.964028   44476 warnings.go:70] unknown field "spec.template.metadata.creationTimestamp"
  W0625 11:19:47.964052   44476 warnings.go:70] unknown field "spec.template.metadata.labels"
  I0625 11:19:47.973474   44476 customresourcedefine.go:56] Waiting for crd present on cluster(member1)
  I0625 11:19:58.006801   44476 customresourcedefine.go:56] Waiting for crd present on cluster(member2)
  I0625 11:19:58.012850   44476 customresourcedefine.go:56] Waiting for crd present on cluster(member3)
  I0625 11:19:58.023026   44476 customresourcedefine.go:56] Waiting for crd present on cluster(member1)
  I0625 11:19:58.032882   44476 customresourcedefine.go:56] Waiting for crd present on cluster(member2)
  I0625 11:19:58.039265   44476 customresourcedefine.go:56] Waiting for crd present on cluster(member3)
  I0625 11:19:58.065874   44476 resourceinterpreter_test.go:473] Waiting for dependency cr(karmadatest-nblt6/cr-l2qmt) present on cluster(member1)
  I0625 11:20:08.078325   44476 resourceinterpreter_test.go:473] Waiting for dependency cr(karmadatest-nblt6/cr-l2qmt) present on cluster(member2)
  I0625 11:20:08.085189   44476 resourceinterpreter_test.go:473] Waiting for dependency cr(karmadatest-nblt6/cr-l2qmt) present on cluster(member3)
  I0625 11:20:08.096171   44476 resourceinterpreter_test.go:508] Waiting for cr(karmadatest-nblt6/cr-l2qmt) synced on cluster(member1)
  << Captured StdOut/StdErr Output

  Timeline >>
  STEP: Creating ResourceInterpreterCustomization(interpreter-customizationtnzs6) @ 06/25/23 11:19:47.771
  STEP: Creating ClusterPropagationPolicy(foozrz4gs.example-jzdwr.karmada.io) @ 06/25/23 11:19:47.895
  STEP: Creating ClusterPropagationPolicy(foo582g8s.example-qhkhb.karmada.io) @ 06/25/23 11:19:47.904
  STEP: Creating crd(foozrz4gs.example-jzdwr.karmada.io) @ 06/25/23 11:19:47.917
  STEP: Creating crd(foo582g8s.example-qhkhb.karmada.io) @ 06/25/23 11:19:47.922
  STEP: Creating PropagationPolicy(karmadatest-nblt6/workload-6fnhr) @ 06/25/23 11:19:47.928
  STEP: Creating workload(karmadatest-nblt6/workload-6fnhr) @ 06/25/23 11:19:47.954
  STEP: Get crd(foozrz4gs.example-jzdwr.karmada.io) @ 06/25/23 11:19:47.964
  STEP: Check if crd(example-jzdwr.karmada.io/v1alpha1/Foozrz4g) present on member clusters @ 06/25/23 11:19:47.973
  STEP: Get crd(foo582g8s.example-qhkhb.karmada.io) @ 06/25/23 11:19:58.018
  STEP: Check if crd(example-qhkhb.karmada.io/v1alpha1/Foo582g8) present on member clusters @ 06/25/23 11:19:58.023
  STEP: Creating PropagationPolicy(karmadatest-nblt6/cr-p9w96) @ 06/25/23 11:19:58.045
  STEP: creating cr(karmadatest-nblt6/cr-p9w96) @ 06/25/23 11:19:58.053
  STEP: creating dependency cr(karmadatest-nblt6/cr-l2qmt) @ 06/25/23 11:19:58.059
  STEP: check if dependency cr present on member clusters @ 06/25/23 11:19:58.065
  STEP: updating dependency cr @ 06/25/23 11:20:08.089
  STEP: check if update has been synced to member clusters @ 06/25/23 11:20:08.096
  [FAILED] in [It] - /home/runner/work/karmada/karmada/test/e2e/resourceinterpreter_test.go:525 @ 06/25/23 11:25:08.115
  STEP: Removing PropagationPolicy(karmadatest-nblt6/cr-p9w96) @ 06/25/23 11:25:08.115
  STEP: Remove workload(karmadatest-nblt6/workload-6fnhr) @ 06/25/23 11:25:08.151
  STEP: Removing PropagationPolicy(karmadatest-nblt6/workload-6fnhr) @ 06/25/23 11:25:08.161
  STEP: Removing ClusterPropagationPolicy(foozrz4gs.example-jzdwr.karmada.io) @ 06/25/23 11:25:08.166
  STEP: Removing crd(foozrz4gs.example-jzdwr.karmada.io) @ 06/25/23 11:25:08.179
  STEP: Removing ClusterPropagationPolicy(foo582g8s.example-qhkhb.karmada.io) @ 06/25/23 11:25:08.191
  STEP: Removing crd(foo582g8s.example-qhkhb.karmada.io) @ 06/25/23 11:25:08.199
  STEP: Deleting ResourceInterpreterCustomization(interpreter-customizationtnzs6) @ 06/25/23 11:25:08.222
  << Timeline

  [FAILED] Unexpected error:
      <*errors.errorString | 0xc00031b0f0>: 
      timed out waiting for the condition
      {
          s: "timed out waiting for the condition",
      }
  occurred
  In [It] at: /home/runner/work/karmada/karmada/test/e2e/resourceinterpreter_test.go:525 @ 06/25/23 11:25:08.115

XiShanYongYe-Chang commented 9 months ago

/lgtm