Open tedli opened 1 year ago
Could you please tell me the details about how to reproduce this issue? Did you edit the `ResourceBinding` directly?
Hi @Poor12 ,
Thanks for the reply. Yes, editing the resource binding directly reproduces this.
But I did it in a controller (an out-of-tree, third-party controller); it's mentioned in #3540.
The controller watches `Cluster` objects, checks for cluster label changes, and evicts resources that no longer match the placement's cluster selector.
I have already changed `Patch` to `Put` in my environment and run it for a week; the `gracefulEvictionTasks` field is updated correctly by `Put`.
From the comments, `MergeFrom` is supposed to be a replace behavior. I wonder why it does a merge rather than a replace.
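For context on why a merge can leave items behind, here is a toy, self-contained illustration of JSON merge patch semantics (RFC 7386). It is an assumption here that this is the patch type `client.MergeFrom` produces for custom resources; the key behaviors are that maps are merged, keys omitted from the patch are left untouched, and a list that does appear in the patch replaces the target list wholesale:

```go
package main

import "fmt"

// mergePatch applies an RFC 7386 JSON merge patch: maps are merged
// recursively, a null value deletes the key, any other value (including
// a list) replaces the target wholesale, and keys absent from the patch
// are left untouched.
func mergePatch(target, patch map[string]any) map[string]any {
	out := map[string]any{}
	for k, v := range target {
		out[k] = v
	}
	for k, v := range patch {
		pm, pOK := v.(map[string]any)
		tm, tOK := out[k].(map[string]any)
		switch {
		case pOK && tOK:
			out[k] = mergePatch(tm, pm)
		case v == nil:
			delete(out, k) // null deletes the key in RFC 7386
		default:
			out[k] = v
		}
	}
	return out
}

func main() {
	server := map[string]any{
		"spec": map[string]any{
			"clusters":              []any{"member1"},
			"gracefulEvictionTasks": []any{"task-a", "task-b"},
		},
	}
	// A patch that only touches spec.clusters leaves the stale
	// gracefulEvictionTasks list exactly as it was.
	patched := mergePatch(server, map[string]any{
		"spec": map[string]any{"clusters": []any{"member2"}},
	})
	fmt.Println(patched["spec"].(map[string]any)["gracefulEvictionTasks"]) // [task-a task-b]

	// A patch that does include the list replaces it wholesale.
	patched = mergePatch(server, map[string]any{
		"spec": map[string]any{"gracefulEvictionTasks": []any{"task-b"}},
	})
	fmt.Println(patched["spec"].(map[string]any)["gracefulEvictionTasks"]) // [task-b]
}
```

So, per the spec, a merge patch that contains the shortened list should replace it; tasks can only linger if the list never makes it into the patch body, or if another writer puts them back.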
@tedli
I can check it if you can share some code to reproduce it.
Hi @liangyuanpeng ,
Just edit the `gracefulEvictionTasks` field of the resource binding, either with `kubectl` or through the API, as I described in my previous comment.
I found this issue when I inspected a resource binding with `kubectl -o yaml`: it output a really huge manifest containing hundreds of eviction task items. After changing the patch to a put, the problem was fixed.
All my environments have since been updated to a modified scheduler that replaces the patch with a put; with put, this issue is fixed. I don't currently have time to set up a new environment to reproduce it.
Feel free to close this issue if you can't reproduce it.
+1. Hey, we had the same issue: a resource was not propagating because a `gracefulEvictionTask` was not being removed, even though the cluster in question reports healthy. I manually deleted the `gracefulEvictionTasks` entries and that fixed it. Is there a better way of cleaning out stale tasks?
Yes, I guess we can try to reproduce it and figure out the root cause. Like @Poor12, I'm just curious why `MergeFrom` does not replace the whole list.
@SerenaTiede-Zen would you like to have a try?
@liangyuanpeng are you still interested in this issue?
also cc the author here @XiShanYongYe-Chang
There is a similar issue: #4951
> if the task already finished, task should removed from `gracefulEvictionTasks`

Actually, once a task is finished, it is removed from `gracefulEvictionTasks`.
But the `gracefulEvictionTasks` of non-workload resources do not finish so quickly, as https://github.com/karmada-io/karmada/issues/4951#issuecomment-2116568837 describes: we don't have a default `InterpretHealth` resource-interpretation behavior for ClusterRole/ConfigMap resources, so the cluster entry in `gracefulEvictionTasks` will wait for the timeout.
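A declarative health interpretation can avoid that timeout wait. Below is a sketch of a `ResourceInterpreterCustomization` for ConfigMap; the API group, field names, and the `InterpretHealth` Lua hook are taken from the Karmada resource-interpreter docs as I recall them, so verify them against your Karmada version:

```yaml
apiVersion: config.karmada.io/v1alpha1
kind: ResourceInterpreterCustomization
metadata:
  name: configmap-health
spec:
  target:
    apiVersion: v1
    kind: ConfigMap
  customizations:
    healthInterpretation:
      luaScript: |
        function InterpretHealth(observedObj)
          -- A ConfigMap has no meaningful status; treat its mere
          -- presence on the member cluster as healthy so the eviction
          -- task can finish without waiting for the timeout.
          return true
        end
```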
> Hey we had the same issue, where a resource was not propagating due to gracefulEvictionTask not being removed and the cluster in question reports healthy.
Hi @SerenaTiede-Zen, what type of resource did you use in this issue? Normally the `gracefulEvictionTask` of a Deployment would not give you this trouble, while a non-workload resource may indeed.
When the cluster becomes healthy, the `gracefulEvictionTask` of a Deployment should finish and be removed, while the `gracefulEvictionTask` of a non-workload resource will remain until the task times out.
**What happened**:

The `gracefulEvictionTasks` field in the resource binding's spec grows and is never cleaned.

**What you expected to happen**:

If a task has already finished, it should be removed from `gracefulEvictionTasks`.

**How to reproduce it (as minimally and precisely as possible)**:

Patch `gracefulEvictionTasks` to add an item and trigger an eviction, make sure the task finishes, then check the graceful eviction tasks field: the task still remains.

**Anything else we need to know?**:

It may be because of line 73 below: the `Patch` acts as a merge, which won't remove tasks that are not kept: https://github.com/karmada-io/karmada/blob/b01cf50caee8c895c808c2ba7d7dbb75eff2a5b8/pkg/controllers/gracefuleviction/rb_graceful_eviction_controller.go#L64-L76
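One speculative way stale tasks could survive even though a merge patch replaces lists wholesale: a JSON merge patch carries no resourceVersion precondition, so a second writer patching from a stale base can silently resurrect tasks the controller already removed, whereas a put (`Update`) is rejected on conflict. This is a toy simulation of that race, not a confirmed root cause:

```go
package main

import "fmt"

// server is a toy API server holding just the gracefulEvictionTasks
// list and a resourceVersion, to contrast merge-patch and put writes.
type server struct {
	tasks []string
	rv    int
}

// applyMergePatch replaces the list unconditionally: a JSON merge
// patch carries no resourceVersion precondition.
func (s *server) applyMergePatch(tasks []string) {
	s.tasks = tasks
	s.rv++
}

// applyPut replaces the list only if the writer's resourceVersion is
// current, mimicking the optimistic lock of an Update (PUT) request.
func (s *server) applyPut(tasks []string, rv int) error {
	if rv != s.rv {
		return fmt.Errorf("conflict: stale resourceVersion %d (current %d)", rv, s.rv)
	}
	s.tasks = tasks
	s.rv++
	return nil
}

func main() {
	s := &server{tasks: []string{"finished-task"}, rv: 1}

	// Writer B reads the object (rv=1) before writer A cleans it.
	staleBase := append([]string{}, s.tasks...)
	staleRV := s.rv

	s.applyMergePatch([]string{}) // A: task finished, list cleaned

	// B patches from its stale base, appending a new task: the merge
	// patch silently resurrects the finished task.
	s.applyMergePatch(append(staleBase, "new-task"))
	fmt.Println("after stale merge patch:", s.tasks)

	// With put, the same stale write is rejected and B must re-read.
	s2 := &server{tasks: []string{"finished-task"}, rv: 1}
	_ = s2.applyPut([]string{}, 1) // A: cleanup succeeds, rv becomes 2
	if err := s2.applyPut(append(staleBase, "new-task"), staleRV); err != nil {
		fmt.Println("put rejected:", err)
	}
}
```

This would be consistent with the report that switching the third-party controller from `Patch` to `Put` made the problem disappear, but confirming it would require tracing the actual writers of this field.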
**Environment**:

- Karmada version (use `kubectl-karmada version` or `karmadactl version`):