Hi @kutsyk, our current policy is that when a resource in a member cluster is modified, it is overwritten by the Karmada controller.
What kind of behavior do you expect?
Hi @XiShanYongYe-Chang ,
A rollback of the change by Karmada makes sense, but we need to identify the change that happened (if possible), with details: which object, in which namespace, and by whom.
As we have various security requirements, keeping a trail of changes is important.
Additionally, is it possible to do a manual update of an object and "pause" the Karmada rollback on that object?
but we need to identify the change that happened (if possible), with details: which object, in which namespace, and by whom.
Is it possible to use an event to indicate that a resource has been modified?
However, who caused the change may not be known to the Karmada control plane; I understand the member cluster needs to handle that part. For example, if a resource is controlled by Karmada (we can tell from the labels owned by Karmada), changes to that resource need to be reported as events.
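For illustration only, here is a rough sketch of what such a report could look like as a standard Kubernetes Event; the reason and message strings are hypothetical, not something Karmada emits today:
# Hypothetical sketch: the reason/message below are NOT existing Karmada
# values; they only illustrate the shape of an Event that a member-cluster
# component could emit when a Karmada-managed resource is modified locally.
apiVersion: v1
kind: Event
metadata:
  name: example-configmap.local-modification   # arbitrary example name
  namespace: default                            # arbitrary example namespace
type: Warning
reason: ResourceModifiedInMemberCluster         # hypothetical reason string
message: ConfigMap example-configmap no longer matches the template propagated by Karmada
involvedObject:
  apiVersion: v1
  kind: ConfigMap
  name: example-configmap
  namespace: default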
Additionally, is it possible to do a manual update of an object and "pause" the Karmada rollback on that object?
Maybe you can prevent the "karmada rollback" by the retain operation.
Hi @XiShanYongYe-Chang , thanks for prompt response.
Yeah, events can work, but they are not as straightforward as metrics would be, for example.
Since Karmada already knows that the propagated resource state has been modified, I would expect it to record that in metrics. I've enabled monitoring in my setup and would like to explore what metrics Karmada exposes; is there a comprehensive list of the metrics Karmada exposes?
The retain operation seems like something we need, but in my understanding it is currently implemented only for replicas, right?
Meaning I can't use it for other object types, such as ConfigMaps, for example.
Hi @kutsyk. What do you want the metrics to look like? Would you mind giving us an example? By the way, do you have a workable solution? If so, you can make a proposal so we can discuss it in a PR faster.
The retain operation seems like something we need, but in my understanding it is currently implemented only for replicas, right?
The retain action works with any resource, including configmaps. Maybe you can try it out and report back if you have problems. I'll see how I can help you.
Hi @XiShanYongYe-Chang ,
I don't have an example or workable solution with Karmada, but I can describe the use case:
Note: I do understand that Karmada will revert the change (propagate the version that is configured on the Karmada cluster), but if Karmada fails to propagate the change, or the cluster "refuses" to apply it, we need to be notified about this.
Q: How is it possible to implement this with Karmada?
Currently we are using the deprecated kubefed, and our implementation is, in brief, quite inefficient: based on the kubefed resource conditions, we trigger an alert.
I've reached the retain action section in the documentation; I'm going through it and testing it, and will give my feedback on whether it does what we need.
Thanks for your help, Vasyl
I have a few additional questions, maybe you can point me to proper doc or resource, please.
Same setup, a Karmada cluster and N managed clusters, but let me try to describe our use case better.
In the Karmada cluster I have:
All these resources are propagated to cluster n1.
When I deleted example-args-override in the Karmada cluster, I can see that configmap-logger in the managed n1 cluster has not been updated and still contains the override from example-args-override.
Here is what I see in Karmada:
kak -n karmada get overridepolicies
NAME AGE
example-configmap-override 6d
IMPORTANT: This should not be there, as it comes from an OverridePolicy that has been deleted:
- ' cat /etc/config/example2;'
How can I know that deployment.apps/configmap-logger in namespace karmada is not corresponding to what it should be?
Regarding the retain operation and manual changes: I've experimented with a ConfigMap. Basically, I manually modified the label karmada.io/managed: "true" –> karmada.io/managed: "false", and I was able to modify its content without it being overridden. This is the behaviour I would like (meaning being able to have full control over a managed object).
But we get back to question #3 from the list above: is there a way to get notified that the ConfigMap object in namespace karmada is not propagated/managed to what it should be?
One of my ideas is to listen for changes in resources of member clusters in karmada-controller-manager. When a resource change is detected, an event or metric will be generated. But I don't know what this metric looks like. Do you have any ideas?
Let me ask those guys for some ideas. /cc @RainbowMango @chaunceyjiang @whitewindmills
I have a few additional questions, maybe you can point me to proper doc or resource, please.
Let me take a look later.
karmada.io/managed: "true" –> karmada.io/managed: "false"
@kutsyk I think you should not modify this label but implement this via the retain operation, because this label is a built-in label of Karmada.
And for your questions 1 & 2: for now, you need to make some change to the resource deployment.apps/configmap-logger to trigger an update after you delete your OverridePolicy. For example, just add a label. It's not very elegant anyway, so I think this is a problem that needs to be solved. cc @RainbowMango @XiShanYongYe-Chang @chaunceyjiang
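To make that concrete, here is a minimal sketch (the label name is arbitrary and hypothetical) of a merge patch that could be applied to the Deployment template on the Karmada control plane, e.g. with kubectl patch --type merge --patch-file, purely to force a re-sync:
# Sketch only: adding or changing any label on the control-plane resource
# template counts as a modification and makes Karmada push the current
# desired state to the member cluster again.
metadata:
  labels:
    resync-trigger: "1"   # arbitrary, hypothetical label used as a no-op change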
When a resource change is detected, an event or metric will be generated.
@XiShanYongYe-Chang Yes, it's a good idea; the work-status-controller can do that.
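Purely as an illustration of the alerting end of this idea: if such a signal were exposed as a Prometheus counter by karmada-controller-manager (the metric name karmada_resource_spec_drift_total below is hypothetical and does not exist today), an alert rule could look roughly like this:
# Hypothetical sketch: karmada_resource_spec_drift_total is an imagined
# counter that a future work-status-controller change could expose; it is
# NOT an existing Karmada metric.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: karmada-drift-alerts
  namespace: karmada-system
spec:
  groups:
    - name: karmada-drift
      rules:
        - alert: PropagatedResourceModifiedInMemberCluster
          # Fires when any propagated object drifted from its resource
          # template in a member cluster during the last 10 minutes.
          expr: increase(karmada_resource_spec_drift_total[10m]) > 0
          labels:
            severity: warning
          annotations:
            summary: Propagated resource was modified in a member cluster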
Hi @kutsyk~ As @whitewindmills said, if you change the label karmada.io/managed: "true" –> karmada.io/managed: "false", the resource becomes completely controlled by the member cluster itself, and modifications of the resource on Karmada will no longer be synchronized to the cluster.
In other words, this change amounts to giving up any further synchronization from the Karmada control plane.
However, if you still want modifications of the resource on the Karmada control plane to be synchronized to the cluster, this approach is not applicable.
The retain operation allows users to customize the behavior: which fields of resources in the member cluster are not overwritten by the resource template from the Karmada control plane after being modified.
You can see some examples of this in the default code implementation of Karmada: https://github.com/karmada-io/karmada/blob/master/pkg/resourceinterpreter/default/native/retain.go
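As a minimal sketch of that mechanism (the object name and the choice of fields to keep are illustrative, and it assumes the Lua-based customization hook described in the resource interpreter documentation), a retention rule that keeps the data of a ConfigMap as it currently is in the member cluster could look roughly like this:
# Sketch: keep whatever .data a ConfigMap currently has in the member cluster
# instead of overwriting it with the template from the Karmada control plane.
apiVersion: config.karmada.io/v1alpha1
kind: ResourceInterpreterCustomization
metadata:
  name: retain-configmap-data            # illustrative name
spec:
  target:
    apiVersion: v1
    kind: ConfigMap
  customizations:
    retention:
      luaScript: |
        function Retain(desiredObj, observedObj)
          -- preserve the member cluster's current data
          desiredObj.data = observedObj.data
          return desiredObj
        end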
- Is there a way to get notified that the ConfigMap object in namespace karmada is not propagated/managed to what it should be?
There is no existing feature that directly provides this capability. I think we can discuss designing and implementing it.
@whitewindmills, updating deployment.apps/configmap-logger didn't help bring the object back to the correct state.
I had to manually restart the karmada-apiserver pod for this to work, which is really strange. So if I delete an override policy and it doesn't take effect, how can I be sure that my target objects are in the desired state?
How can I use override policies if I'm not sure whether they have been updated/removed?
Let me try repeating these steps:
The expected outcome is that my deployment after step 3 will be the same as before step 1.
I'll try it and give you feedback.
Okay, this is just strange as it doesn't do what it should at all.
Here is my propagation policy:
(⎈ |minikube:karmada-system) ~/projects/ kak -n karmada get propagationpolicy aws-ec2-propagation-policy -o yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"policy.karmada.io/v1alpha1","kind":"PropagationPolicy","metadata":{"annotations":{},"name":"aws-ec2-propagation-policy","namespace":"karmada"},"spec":{"placement":{"clusterAffinity":{"labelSelector":{"matchLabels":{"provider":"aws-ec2"}}}},"propagateDeps":true,"resourceSelectors":[{"apiVersion":"apps/v1","kind":"Deployment","name":"configmap-logger"},{"apiVersion":"v1","kind":"ConfigMap","name":"example-configmap"}]}}
creationTimestamp: "2024-04-19T10:00:57Z"
generation: 4
labels:
propagationpolicy.karmada.io/permanent-id: 8e01b556-b96f-4458-852d-7c8a2d2f5d33
name: aws-ec2-propagation-policy
namespace: karmada
resourceVersion: "629947"
uid: 2dfd4d45-f774-4be2-bb35-38cf39d0c9cd
spec:
conflictResolution: Abort
placement:
clusterAffinity:
labelSelector:
matchLabels:
provider: aws-eks
clusterTolerations:
- effect: NoExecute
key: cluster.karmada.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: cluster.karmada.io/unreachable
operator: Exists
tolerationSeconds: 300
preemption: Never
priority: 0
propagateDeps: true
resourceSelectors:
- apiVersion: apps/v1
kind: Deployment
name: configmap-logger
namespace: karmada
- apiVersion: v1
kind: ConfigMap
name: example-configmap
namespace: karmada
schedulerName: default-scheduler
It clearly states that the Cluster should have the label provider: aws-eks.
Here is my Cluster object:
(⎈ |minikube:karmada-system) ~/projects/ kak -n karmada get cluster my_cluster_name -o yaml
apiVersion: cluster.karmada.io/v1alpha1
kind: Cluster
metadata:
creationTimestamp: "2024-04-17T11:43:07Z"
finalizers:
- karmada.io/cluster-controller
generation: 41
labels:
env: prod
network_environment: prod
provider: aws-ec2
region: bk-eu-west6
type: testing
name: my_cluster_name
resourceVersion: "630235"
uid: 89fbdb5b-e644-48ad-92c5-fc2775f8f4e7
spec:
...
As you can see, the label value on it is provider: aws-ec2, so the propagation policy should not create any object.
Here is what I see in cluster:
(⎈ |minikube:karmada-system) ~/projects/ kt1 -n karmada get all
NAME READY STATUS RESTARTS AGE
pod/configmap-logger-78c87f6785-pvwh9 0/1 CrashLoopBackOff 5 (64s ago) 4m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/configmap-logger 0/1 1 0 4m1s
NAME DESIRED CURRENT READY AGE
replicaset.apps/configmap-logger-78c87f6785 1 1 0 4m1s
(⎈ |minikube:karmada-system) ~/projects/ kt1 -n karmada get deployment.apps/configmap-logger -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "1"
propagationpolicy.karmada.io/name: aws-ec2-propagation-policy
propagationpolicy.karmada.io/namespace: karmada
resourcebinding.karmada.io/name: configmap-logger-deployment
resourcebinding.karmada.io/namespace: karmada
resourcetemplate.karmada.io/managed-annotations: kubectl.kubernetes.io/last-applied-configuration,propagationpolicy.karmada.io/name,propagationpolicy.karmada.io/namespace,resourcebinding.karmada.io/name,resourcebinding.karmada.io/namespace,resourcetemplate.karmada.io/managed-annotations,resourcetemplate.karmada.io/managed-labels,resourcetemplate.karmada.io/uid,work.karmada.io/conflict-resolution,work.karmada.io/name,work.karmada.io/namespace
resourcetemplate.karmada.io/managed-labels: karmada.io/managed,propagationpolicy.karmada.io/name,propagationpolicy.karmada.io/namespace,propagationpolicy.karmada.io/permanent-id,resourcebinding.karmada.io/permanent-id
resourcetemplate.karmada.io/uid: 7be5c835-2f15-447a-89df-6291a09d425c
work.karmada.io/conflict-resolution: abort
work.karmada.io/name: configmap-logger-5dbf5577d8
work.karmada.io/namespace: karmada-es-bplatform-t1-app-az1-bk-eu-west6-prod
creationTimestamp: "2024-04-30T14:36:35Z"
generation: 1
labels:
karmada.io/managed: "true"
propagationpolicy.karmada.io/name: aws-ec2-propagation-policy
propagationpolicy.karmada.io/namespace: karmada
propagationpolicy.karmada.io/permanent-id: 8e01b556-b96f-4458-852d-7c8a2d2f5d33
resourcebinding.karmada.io/permanent-id: f188a07c-f483-49be-87ca-206c3d858d0b
Here is what karmadactl shows:
(⎈ |minikube:karmada-system) ~/projects/ kactl get all
NAME CLUSTER READY STATUS RESTARTS AGE
pod/configmap-logger-78c87f6785-pvwh9 bplatform-t1-app-az1-bk-eu-west6-prod 0/1 CrashLoopBackOff 5 (2m12s ago) 5m8s
NAME CLUSTER READY UP-TO-DATE AVAILABLE AGE
deployment.apps/configmap-logger bplatform-t1-app-az1-bk-eu-west6-prod 0/1 1 0 5m8s
NAME CLUSTER DESIRED CURRENT READY AGE
replicaset.apps/configmap-logger-78c87f6785 bplatform-t1-app-az1-bk-eu-west6-prod 1 1 0 5m8s
Also, somehow the deployment manifest in the target cluster is wrong: it still contains the override that I already deleted.
(⎈ |minikube:karmada-system) ~/projects/ kactl describe deployment.apps/configmap-logger --cluster bplatform-t1-app-az1-bk-eu-west6-prod
Name: configmap-logger
Namespace: karmada
CreationTimestamp: Tue, 30 Apr 2024 16:36:35 +0200
Labels: karmada.io/managed=true
propagationpolicy.karmada.io/name=aws-ec2-propagation-policy
propagationpolicy.karmada.io/namespace=karmada
propagationpolicy.karmada.io/permanent-id=8e01b556-b96f-4458-852d-7c8a2d2f5d33
resourcebinding.karmada.io/permanent-id=f188a07c-f483-49be-87ca-206c3d858d0b
service-directory.installation=kubernetes-dev-vkutsyk-cfe1c3a6
service-directory.persona=b-karmada-karmada
service-directory.project=karmada
service-directory.rollout=random-string
service-directory.service=karmada
Annotations: deployment.kubernetes.io/revision: 1
propagationpolicy.karmada.io/name: aws-ec2-propagation-policy
propagationpolicy.karmada.io/namespace: karmada
resourcebinding.karmada.io/name: configmap-logger-deployment
resourcebinding.karmada.io/namespace: karmada
resourcetemplate.karmada.io/managed-annotations:
kubectl.kubernetes.io/last-applied-configuration,propagationpolicy.karmada.io/name,propagationpolicy.karmada.io/namespace,resourcebinding....
resourcetemplate.karmada.io/managed-labels:
karmada.io/managed,propagationpolicy.karmada.io/name,propagationpolicy.karmada.io/namespace,propagationpolicy.karmada.io/permanent-id,reso...
resourcetemplate.karmada.io/uid: 7be5c835-2f15-447a-89df-6291a09d425c
work.karmada.io/conflict-resolution: abort
work.karmada.io/name: configmap-logger-5dbf5577d8
work.karmada.io/namespace: karmada-es-bplatform-t1-app-az1-bk-eu-west6-prod
Selector: app=configmap-logger
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=configmap-logger
Containers:
logger:
Image: busybox
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
cat /etc/config/example2;
Args:
while true; do
sleep 10;
cat /etc/config/config.json;
cat /etc/config/example1;
done
Environment: <none>
Mounts:
/etc/config from config-volume (rw)
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: example-configmap
Optional: false
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing True ReplicaSetUpdated
OldReplicaSets: <none>
NewReplicaSet: configmap-logger-78c87f6785 (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 6m19s deployment-controller Scaled up replica set configmap-logger-78c87f6785 to 1
What is going on, and how do I debug what is happening?
I don't want to delete everything and recreate it, as that's not the most efficient thing to do.
Deletion and recreation from scratch resolved all the issues and fixed propagations
Closing this issue, as I have no further questions about this from my side. I moved the monitoring topic into a separate issue to keep a clear trail of the issue and thoughts: https://github.com/karmada-io/karmada/issues/4895
@kutsyk thanks. We've just been on holiday.
Deletion and recreation from scratch resolved all the issues and fixed propagations
This was probably the effect of residual resources left over from your earlier operations.
Hi,
I would like to get some help/understanding of the best way to solve our case.
We have a set of resources propagated into N clusters.
What would be the best way to implement an alert if one object in one of the clusters was modified?
So that we are alerted that this specific object, let's say it's a ConfigMap, was modified and no longer corresponds to the main configuration.
Thanks, Vasyl