mkloveyy opened this issue 1 year ago
/cc @RainbowMango
For example, the above rollout needs to be updated in two clusters: cluster-x and cluster-z. The differences between the two clusters are as follows:
```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: OverridePolicy
spec:
  overrideRules:
    - targetCluster:
        clusterNames:
          - cluster-x
      overriders:
        plaintext:
          - path: /spec/strategy/canary/steps/0/pull
            operator: replace
            value: "false"
          - path: /spec/strategy/canary/steps/1/pull
            operator: replace
            value: "true"
    - targetCluster:
        clusterNames:
          - cluster-z
      overriders:
        plaintext:
          - path: /spec/strategy/canary/steps/0/pull
            operator: replace
            value: nil
          - path: /spec/strategy/canary/steps/1/pull
            operator: replace
            value: nil
```
Can this help you?
Thanks for your reply!
Currently we do not use OverridePolicy to put the canary.steps config in; instead we implement the generating code in the reviseReplicas hook, and it is directly applied on Works.
1. If replicas is 0, the reviseReplicas hook will not be executed.
2. We cannot get the cluster_name in the reviseReplicas hook, so we cannot judge whether we need to put the fort pod.
And due to the issues above, we still cannot decide which cluster to put the fort pod in.
Currently we do not use OverridePolicy to put canary.steps config in but implement the generating code in reviseReplicas hook and it will be directly applied on works.
I'm sorry, I didn't understand why you don't use OverridePolicy and instead want to use reviseReplicas?
I'm sorry, I didn't understand why you don't use OverridePolicy and instead want to use reviseReplicas?

Mainly for two reasons:
- We need to regenerate /spec/strategy/canary/steps every time a user creates a release. If we put this dynamic config in op, we need to ensure the op is updated before the user updates the global resource, and I think we don't have this option now.
- We consider OverridePolicy not very suitable for putting dynamic configs. Now we put some static config in OverridePolicy:
```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: OverridePolicy
metadata:
  annotations:
    versionid: "47677"
  name:
  namespace:
spec:
  resourceSelectors:
    - apiVersion:
      kind: Rollout
      name:
      namespace:
  overrideRules:
    - targetCluster:
        clusterNames:
          - cluster-x
      overriders:
        plaintext:
          - path: /metadata/annotations/cdos~1az
            operator: replace
            value: X
          - path: /spec/template/spec/containers/0/env/11
            operator: replace
            value:
              name: CDOS_K8S_CLUSTER
              value: cluster-x
    - targetCluster:
        clusterNames:
          - cluster-z
      overriders:
        plaintext:
          - path: /metadata/annotations/cdos~1az
            operator: replace
            value: Z
          - path: /spec/template/spec/containers/0/env/11
            operator: replace
            value:
              name: CDOS_K8S_CLUSTER
              value: cluster-z
```
ReviseReplicas is basically for dividing replicas when the replica scheduling policy is Divided. I don't think your scenario is suitable for it, because you do not change the replicas in the resource template but change its strategy in different clusters.
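For context, a minimal sketch of what a ReviseReplica implementation typically does, using the unstructured helpers. This is an editorial illustration only; the spec.replicas path and the Rollout group/version are assumptions, not the actual CRD discussed in this thread.

```go
// Editorial sketch: a ReviseReplica-style revision only rewrites the replica
// count that the scheduler assigned to one member cluster; it does not know
// which cluster that is.
package main

import (
    "fmt"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// reviseReplica sets the per-cluster desired replicas on a copy of the workload.
func reviseReplica(obj *unstructured.Unstructured, desiredReplicas int64) (*unstructured.Unstructured, error) {
    revised := obj.DeepCopy()
    if err := unstructured.SetNestedField(revised.Object, desiredReplicas, "spec", "replicas"); err != nil {
        return nil, err
    }
    return revised, nil
}

func main() {
    rollout := &unstructured.Unstructured{Object: map[string]interface{}{
        "apiVersion": "rollouts.example.io/v1alpha1", // illustrative group/version
        "kind":       "Rollout",
        "spec":       map[string]interface{}{"replicas": int64(6)},
    }}
    revised, _ := reviseReplica(rollout, 2) // e.g. 2 of 6 replicas scheduled to this cluster
    n, _, _ := unstructured.NestedInt64(revised.Object, "spec", "replicas")
    fmt.Println("replicas for this cluster:", n) // 2
}
```

The point is that nothing in this hook tells it which member cluster the revised object is destined for, which is exactly the limitation discussed below.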
ReviseReplicas is basically for dividing replicas when the replica scheduling policy is Divided. I don't think your scenario is suitable for it, because you do not change the replicas in the resource template but change its strategy in different clusters.
Thanks for your reply!
Where do you think would be the appropriate place to put this piece of logic? Or, to expand the scope a bit, how could we better implement such scenarios?
We cannot get cluster_name in reviseReplicas hook, so we cannot judge if we need to put fort pod.
Similar problem. https://github.com/karmada-io/karmada/issues/3159
I'm thinking about whether we should introduce cluster info in the ReviseReplica step.
We cannot get cluster_name in reviseReplicas hook, so we cannot judge if we need to put fort pod.
Similar problem. #3159
I'm thinking about whether we should introduce cluster info in the ReviseReplica step.
Agreed. If we do so, we can implement some custom logic based on cluster info.
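To make that concrete, here is a hypothetical sketch of what a cluster-aware ReviseReplica could enable. The clusterName and fortCluster parameters and the annotation key are assumptions for illustration; they are not part of Karmada's current API.

```go
// Hypothetical sketch: if the ReviseReplica step also carried the target
// cluster name, the hook could make per-cluster decisions, such as marking
// whether this cluster hosts the fort pod.
package main

import (
    "fmt"
    "strconv"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// reviseReplicaWithCluster revises replicas and records a cluster-specific decision.
func reviseReplicaWithCluster(obj *unstructured.Unstructured, desiredReplicas int64, clusterName, fortCluster string) (*unstructured.Unstructured, error) {
    revised := obj.DeepCopy()
    if err := unstructured.SetNestedField(revised.Object, desiredReplicas, "spec", "replicas"); err != nil {
        return nil, err
    }
    // Only possible because the cluster name is available inside the hook.
    annotations := revised.GetAnnotations()
    if annotations == nil {
        annotations = map[string]string{}
    }
    annotations["example.io/fort-pod"] = strconv.FormatBool(clusterName == fortCluster)
    revised.SetAnnotations(annotations)
    return revised, nil
}

func main() {
    rollout := &unstructured.Unstructured{Object: map[string]interface{}{
        "apiVersion": "rollouts.example.io/v1alpha1", // illustrative group/version
        "kind":       "Rollout",
        "spec":       map[string]interface{}{"replicas": int64(6)},
    }}
    revised, _ := reviseReplicaWithCluster(rollout, 3, "cluster-x", "cluster-x")
    fmt.Println(revised.GetAnnotations()["example.io/fort-pod"]) // true
}
```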
If replicas is 0, reviseReplicas hook will not be executed.
Do you mean you want to revise the Rollout by the reviseReplicas hook point? Have you implemented InterpretReplica?
We need to regenerate /spec/strategy/canary/steps every time when user creates a release.
Does creates a release mean create a Rollout?
I talked to @snowplayfire yesterday about this issue. Technically, we can do that by extending the resource interpreter, but it seems it essentially belongs to the override area. Not sure yet, maybe we can have a chat at the community meeting.
Do you mean you want to revise the Rollout by the reviseReplicas hook point? Have you implemented InterpretReplica?
Yes, we have implemented InterpretReplica. However, as we need the newest ResourceBinding, we need to revise the Rollout between BuildResourceBinding and ensureWork.
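For reference, a minimal sketch of what an InterpretReplica implementation for a custom Rollout-like CRD usually amounts to: it only reports the desired replica count so the scheduler can divide it. The spec.replicas path and the group/version are assumptions about the CRD's layout.

```go
// Editorial sketch of an InterpretReplica-style implementation: extract the
// desired replica count from the resource template so it can be scheduled.
package main

import (
    "fmt"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// interpretReplica reports the desired replica count of the resource template.
func interpretReplica(obj *unstructured.Unstructured) (int32, error) {
    replicas, found, err := unstructured.NestedInt64(obj.Object, "spec", "replicas")
    if err != nil {
        return 0, err
    }
    if !found {
        return 0, nil // treat a missing field as zero replicas
    }
    return int32(replicas), nil
}

func main() {
    rollout := &unstructured.Unstructured{Object: map[string]interface{}{
        "apiVersion": "rollouts.example.io/v1alpha1", // illustrative group/version
        "kind":       "Rollout",
        "spec":       map[string]interface{}{"replicas": int64(4)},
    }}
    n, _ := interpretReplica(rollout)
    fmt.Println("desired replicas:", n) // 4
}
```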
We need to regenerate /spec/strategy/canary/steps every time when user creates a release.
Does creates a release mean create a Rollout?
Roughly correct. A release means a deploy action which creates or updates a Rollout.
I talked to @snowplayfire yesterday about this issue. Technically, we can do that by extending the resource interpreter, but it seems it essentially belongs to the override area.
My pleasure. Today we had a meeting with @snowplayfire and I heard about this from her. We've also communicated with @XiShanYongYe-Chang and @chaunceyjiang. Currently I think both solutions (expanding reviseReplica by adding cluster_name, and giving a new hook point) have more universal value and can solve our problems.
Not sure yet, maybe we can have a chat at community meeting.
Agreed. If possible, we hope to have some concrete discussions and reach some conclusions at next Tuesday's meeting. Anyway, thank you very much for your assistance!😊
Does this issue meet the requirements? https://github.com/karmada-io/karmada/issues/1567
Or add a pause field for PP, like this issue: https://github.com/karmada-io/karmada/issues/517
Does this issue meet the requirements? #1567
Maybe this plan is more suitable for us. However, as OverridePolicy seems to be meant for storing static parameters, I wonder whether dynamic parameters can be applied on the Work directly.
@RainbowMango
I think maybe there is another option: we can introduce a new hook point, maybe called reviseObject. This hook appears after the op has been applied on Works.
```go
cops, ops, err := overrideManager.ApplyOverridePolicies(clonedWorkload, targetCluster.Name)
if err != nil {
    klog.Errorf("Failed to apply overrides for %s/%s/%s, err is: %v", clonedWorkload.GetKind(), clonedWorkload.GetNamespace(), clonedWorkload.GetName(), err)
    return err
}

if resourceInterpreter.HookEnabled(clonedWorkload.GroupVersionKind(), configv1alpha1.InterpreterOperationReviseObject) {
    clonedWorkload, err = resourceInterpreter.ReviseObject(clonedWorkload, targetCluster.Name, int64(targetCluster.Replicas))
    if err != nil {
        klog.Errorf("Failed to revise object for %s/%s/%s in cluster %s, err is: %v", workload.GetKind(), workload.GetNamespace(), workload.GetName(), targetCluster.Name, err)
        return err
    }
}
```
The definition of this hook is to allow users to directly modify the work.
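To connect this proposal back to the fort-pod use case, here is an illustrative sketch of what an implementation behind such a reviseObject hook could do: given the override-applied work object and the target cluster name, it fills in the per-cluster canary step values. The field path, the pull values, and the fortCluster parameter are assumptions for illustration only.

```go
// Illustrative sketch of user-side logic behind a reviseObject-style hook:
// rewrite cluster-specific fields on the work object.
package main

import (
    "fmt"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// reviseObject fills in per-cluster canary steps on a copy of the work object.
func reviseObject(workload *unstructured.Unstructured, clusterName, fortCluster string) (*unstructured.Unstructured, error) {
    revised := workload.DeepCopy()

    // Fort cluster: set explicit pull flags on the canary steps.
    // Non-fort clusters: leave the pull fields unset.
    if clusterName == fortCluster {
        steps := []interface{}{
            map[string]interface{}{"pull": false},
            map[string]interface{}{"pull": true},
        }
        if err := unstructured.SetNestedSlice(revised.Object, steps, "spec", "strategy", "canary", "steps"); err != nil {
            return nil, err
        }
    }
    return revised, nil
}

func main() {
    rollout := &unstructured.Unstructured{Object: map[string]interface{}{
        "apiVersion": "rollouts.example.io/v1alpha1", // illustrative group/version
        "kind":       "Rollout",
        "spec":       map[string]interface{}{},
    }}
    revised, _ := reviseObject(rollout, "cluster-x", "cluster-x")
    steps, _, _ := unstructured.NestedSlice(revised.Object, "spec", "strategy", "canary", "steps")
    fmt.Println(steps) // [map[pull:false] map[pull:true]]
}
```

Because the hook receives targetCluster.Name, the fort/non-fort decision can be made per Work instead of being baked into the resource template.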
Maybe there is another option: we can introduce a new hook point, maybe called reviseObject. This hook appears after the op has been applied on Works.
Ask some guys to help take a look @RainbowMango @chaunceyjiang @CharlesQQ @Poor12
@shiyan2016 @snowplayfire @lxtywypc We can have a look together.
```go
cops, ops, err := overrideManager.ApplyOverridePolicies(clonedWorkload, targetCluster.Name)
if err != nil {
    klog.Errorf("Failed to apply overrides for %s/%s/%s, err is: %v", clonedWorkload.GetKind(), clonedWorkload.GetNamespace(), clonedWorkload.GetName(), err)
    return err
}

if resourceInterpreter.HookEnabled(clonedWorkload.GroupVersionKind(), configv1alpha1.InterpreterOperationReviseObject) {
    clonedWorkload, err = resourceInterpreter.ReviseObject(workload, clonedWorkload, targetCluster.Name)
    if err != nil {
        klog.Errorf("Failed to revise object for %s/%s/%s in cluster %s, err is: %v", workload.GetKind(), workload.GetNamespace(), workload.GetName(), targetCluster.Name, err)
        return err
    }
}
```
My suggestion is resourceInterpreter.ReviseObject(workload, clonedWorkload, targetCluster.Name)
I think maybe there is another option that we can introduce a new hook point maybe called reviseObject. This hook appears after op has been applied on works.
Might reviseObject lead to users not knowing which cluster's work object is revised?
Might reviseObject lead to users not knowing which cluster's work object is revised?
That's why we need to introduce targetCluster.Name to it.
Below is a basic scenario in our release process:
A fort pod means the first machine deployed in a cluster. During the bastion deployment phase, no traffic is routed to it, and in the canary deployment phase, traffic is routed to it.
We've implemented a deploy CRD called Rollout based on OpenKruise to handle the whole process. Part of the Rollout's definition for the above case is as follows:
So in the multi-cluster case, every time a user creates a release, we need to:
1. Regenerate canary.steps for all member clusters.
2. Put a fort pod to one of the member clusters (values in steps are different between fort and non-fort clusters).
For example, the above Rollout needs to be updated in two clusters: cluster-x and cluster-z. The differences between the two clusters are as follows:
The fort pod is allocated by the rules below (a selection sketch follows this list):
1. Allocate it to the cluster with the higher weight in the ResourceBinding.
2. If weights are equal, allocate it to the first cluster in the ResourceBinding clusters by default.
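A small sketch of this selection rule, with simplified stand-in types for the ResourceBinding targets (the type and field names here are illustrative, not the real Karmada API types):

```go
// Editorial sketch of the fort-cluster selection rule: pick the target with
// the highest weight/replica share, and fall back to the first listed cluster
// on a tie.
package main

import "fmt"

// targetCluster is a simplified stand-in for a ResourceBinding target.
type targetCluster struct {
    Name     string
    Replicas int32
}

// selectFortCluster returns the cluster that should host the fort pod.
func selectFortCluster(targets []targetCluster) string {
    if len(targets) == 0 {
        return ""
    }
    fort := targets[0] // default: first cluster listed in the ResourceBinding
    for _, t := range targets[1:] {
        if t.Replicas > fort.Replicas { // strictly greater, so ties keep the earlier cluster
            fort = t
        }
    }
    return fort.Name
}

func main() {
    targets := []targetCluster{{Name: "cluster-x", Replicas: 3}, {Name: "cluster-z", Replicas: 3}}
    fmt.Println(selectFortCluster(targets)) // "cluster-x": equal weights, first cluster wins
}
```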
Currently, we are trying to put the code in the reviseReplicas hook (#3375), but we encountered some issues during development, mostly related to the fort pod:
1. If replicas is 0, the reviseReplicas hook will not be executed. We allow users to start a release when replicas is 0, but we cannot update in this case as the reviseReplicas hook won't be reached.
2. We cannot get the cluster_name in the reviseReplicas hook, so we cannot judge whether we need to put the fort pod. Usually we put the fort pod to the cluster with the higher weight. If the weights are evenly distributed, we select the first cluster in the ResourceBinding by default. But we can't know the cluster in this hook as it was not passed.
So, is there any suitable way to implement such differentiated configuration? And where is the appropriate place to put the code? Maybe expanding reviseReplica by adding cluster_name and removing the replicas > 0 judgement, and giving a new hook point?
Environment: