Closed MolisXYliu closed 5 months ago
Could you please provide the steps you used to install the operator and apply the Karmada CR? @MolisXYliu any yaml would be great.
The operator version is latest. The Karmada yaml is:
apiVersion: operator.karmada.io/v1alpha1
kind: Karmada
metadata:
  name: karmada
  namespace: karmada-system
spec:
  components:
    etcd:
      local:
        imageRepository: xxx/etcd
        imageTag: 3.5.9-0
    karmadaAPIServer:
      imageRepository: xxx/kube-apiserver
      imageTag: v1.25.4
      serviceType: NodePort
    karmadaAggregatedAPIServer:
      imageRepository: xxx/karmada-aggregated-apiserver
      imageTag: v1.7.0
    karmadaControllerManager:
      imageRepository: xxx/karmada-controller-manager
      imageTag: v1.7.0
    karmadaScheduler:
      imageRepository: xxx/karmada-scheduler
      imageTag: v1.7.0
    karmadaWebhook:
      imageRepository: xxx/karmada-webhook
      imageTag: v1.7.0
    kubeControllerManager:
      imageRepository: xxx/kube-controller-manager
      imageTag: v1.25.4
The environment is offline, so I copied the CRDs into /var/lib/karmada/1.6.0:
I1012 07:47:31.688186 1 crd.go:126] "[unpack] These crds yaml files have been decompressed in the path" path="/var/lib/karmada/1.6.0/crds" karmada="karmada-system/karmada"
It doesn't seem right for the operator@v1.7 to read and use CRDs of v1.6.
I can see the default Karmada version is still v1.6.0 on the master. https://github.com/karmada-io/karmada/blob/master/operator/pkg/constants/constants.go#L18
cc @calvin0327 I think we should update it to v1.7.0 now. Can you confirm if I missed anything?
@MolisXYliu Would you like to send a PR for it? You can find an example from https://github.com/karmada-io/karmada/pull/3718.
I sent the PR https://github.com/karmada-io/karmada/pull/4127. Is this the reason why the operator cannot deploy Karmada v1.7.0?
I'm not sure if this is the only reason. Can you try again after #4127? (Note that you might need to update the operator version to latest instead of v1.7.0.)
I don't have a clue about another suspicious log yet:
E1012 07:47:21.428138 1 planner.go:93] "failed to executed the workflow" err="failed to install etcd component, err: error when creating etcd client service, err: Service \"karmada-etcd\" is invalid: metadata.resourceVersion: Invalid value: \"\": must be specified for an update" workflow=init karmada="karmada-system/karmada"
is the reason why operator can not deploy karmada v1.7.0?
I tested this yesterday and it works for Karmada v1.7.0. The difference is that I am not in an offline environment.
i cp the crds in /var/lib/karmada/1.6.0
Try using the CRDs of 1.7.0.
I think the above guess may not be the reason: in the karmada.yaml that @MolisXYliu provided, the image version is already v1.7.0.
I have tested in my local env, and it installed OK.
Can you try this:
helm install karmada-operator -n karmada-system --create-namespace --dependency-update ./charts/karmada-operator --debug
kubectl apply -f https://raw.githubusercontent.com/karmada-io/karmada/release-1.7/operator/config/crds/operator.karmada.io_karmadas.yaml
kubectl apply -f karmada.yaml
where karmada.yaml looks like this:
apiVersion: operator.karmada.io/v1alpha1
kind: Karmada
metadata:
  name: karmada
  namespace: karmada-system
spec:
  components:
    etcd:
      local:
        imageRepository: registry.k8s.io/etcd
        imageTag: 3.5.9-0
    karmadaAPIServer:
      imageRepository: registry.k8s.io/kube-apiserver
      imageTag: v1.25.4
      serviceType: NodePort
    karmadaAggregatedAPIServer:
      imageRepository: docker.io/karmada/karmada-aggregated-apiserver
      imageTag: v1.7.0
    karmadaControllerManager:
      imageRepository: docker.io/karmada/karmada-controller-manager
      imageTag: v1.7.0
    karmadaScheduler:
      imageRepository: docker.io/karmada/karmada-scheduler
      imageTag: v1.7.0
    karmadaWebhook:
      imageRepository: docker.io/karmada/karmada-webhook
      imageTag: v1.7.0
    kubeControllerManager:
      imageRepository: registry.k8s.io/kube-controller-manager
      imageTag: v1.25.4
  hostCluster:
    networking:
      dnsDomain: cluster.local
i cp the crds in /var/lib/karmada/1.6.0
Try to using the crd of 1.7.0
Yes, I think the cause is the 1.6.0 CRD version. +1 to "try using the crd of 1.7.0".
The CRD version is 1.7.0. The operator reads CRDs from /var/lib/karmada/1.6.0, so I put the 1.7.0 CRDs in that path:
I1012 07:47:31.687866 1 crd.go:49] "[prepare-crds] Using crd folder" folder="/var/lib/karmada/1.6.0" karmada="karmada-system/karmada"
I1012 07:47:31.688168 1 crd.go:69] "[download-crds] Skip download crd yaml files, the crd tar exists on disk" karmada="karmada-system/karmada"
I have tested the steps described in the comments above twice, and they work. Can you give them a try?
@MolisXYliu It looks like an error was thrown when creating the etcd service.
If I put the CRDs in /var/lib/karmada/1.7.0, the operator log is:
I1013 06:16:12.393055 1 planner.go:87] "Start execute the workflow" workflow=init karmada="karmada-system/karmada"
I1013 06:16:12.505169 1 crd.go:48] "[prepare-crds] Running prepare-crds task" karmada="karmada-system/karmada"
I1013 06:16:12.505192 1 crd.go:49] "[prepare-crds] Using crd folder" folder="/var/lib/karmada/1.6.0" karmada="karmada-system/karmada"
E1013 06:16:14.514554 1 planner.go:93] "failed to executed the workflow" err="failed to download crd tar, err: Get \"https://github.com/karmada-io/karmada/releases/download/v1.6.0/crds.tar.gz\": dial tcp: lookup github.com on 172.16.0.3:53: server misbehaving" workflow=init karmada="karmada-system/karmada"
When I put the CRDs in /var/lib/karmada/1.6.0, the operator log is:
Finished syncing karmada" karmada="karmada-system/karmada" duration="70.10357ms"
E1013 06:20:46.289053 1 controller.go:324] "Reconciler error" err="failed to install etcd component, err: error when creating etcd client service, err: Service \"karmada-etcd\" is invalid: metadata.resourceVersion: Invalid value: \"\": must be specified for an update" controller="karmada" controllerGroup="operator.karmada.io" controllerKind="Karmada" Karmada="karmada-system/karmada" namespace="karmada-system" name="karmada" reconcileID=b8f099e9-3d27-4177-9a71-7442376a179d
I1013 06:20:48.849268 1 controller.go:49] "Started syncing karmada" karmada="karmada-system/karmada" startTime="2023-10-13 06:20:48.849238012 +0000 UTC m=+403.966856303"
I1013 06:20:48.849391 1 controller.go:84] "Reconciling karmada" name="karmada"
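The two locations involved in the logs above can be written down as a small sketch. Both patterns are inferred from the log lines in this thread (the `folder=` value and the download URL), not from the operator source, and the helper names `crd_folder` and `crd_download_url` are hypothetical. It may help explain why files placed under a 1.7.0 directory are ignored while the operator's default version is still 1.6.0.

```shell
#!/usr/bin/env bash
# Sketch only: both patterns below are inferred from the operator logs in this
# thread, not taken from the operator source code.

# Print the folder the operator reports using for a given default version.
crd_folder() {
  local version="${1#v}"   # accept "v1.6.0" or "1.6.0"
  printf '/var/lib/karmada/%s\n' "$version"
}

# Print the release URL the operator tried to download the CRD tarball from.
crd_download_url() {
  local version="${1#v}"
  printf 'https://github.com/karmada-io/karmada/releases/download/v%s/crds.tar.gz\n' "$version"
}

# Show both locations for the default version seen in the logs.
crd_folder v1.6.0
crd_download_url v1.6.0
```

Under these assumptions, an operator whose default version is 1.6.0 looks in the 1.6.0 folder first and falls back to the v1.6.0 download URL, which is unreachable in an offline environment.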
@MolisXYliu It looks like an error was thrown when creating the etcd service.
Yes, but why does it happen? Is there a problem with my environment?
Also, if I use the old version of the operator and CRDs, the errors disappear.
E1013 06:16:14.514554 1 planner.go:93] "failed to executed the workflow" err="failed to download crd tar, err: Get \"https://github.com/karmada-io/karmada/releases/download/v1.6.0/crds.tar.gz\": dial tcp: lookup github.com on 172.16.0.3:53: server misbehaving" workflow=init karmada="karmada-system/karmada"
This log clearly shows that the karmada-operator cannot download the CRDs. Does the operator support running in an offline environment?
Yes, it is an offline environment, so I copied the CRDs into this path, and the error is:
I1013 06:41:40.613207 1 etcd.go:39] "[etcd] Running etcd task" karmada="karmada-system/karmada"
E1013 06:41:40.628679 1 planner.go:93] "failed to executed the workflow" err="failed to install etcd component, err: error when creating etcd client service, err: Service \"karmada-etcd\" is invalid: metadata.resourceVersion: Invalid value: \"\": must be specified for an update" workflow=init karmada="karmada-system/karmada"
The k8s version is v1.20.7.
Could you please try another k8s version, such as v1.26.0, v1.27.0, or v1.28.0? @MolisXYliu
and if i use old version operator and crds the errors disapper
Which old operator version is that?
The 1.6.0 operator does not have the etcd service error.
yes offline environment so i copy the crds in this path and the error is
Hi @MolisXYliu, I'm curious about one thing, though it may not be the root cause: the operator downloads the crds to a path inside its own container (and skips the download if they already exist). That is, /var/lib/karmada/1.6.0 is a path inside the operator pod. Did you do your "replace crds" operation inside the pod?
yes offline environment so i copy the crds in this path and the error is
Please stop doing that until the author @calvin0327 confirms if it is the right way to do so.
Yes, I copied the CRDs inside the pod.
Hi @MolisXYliu, I found that the v1.7.0 karmada-operator may have some bugs: it cannot correctly apply the v1.7.0 CRDs, because the CRD path in v1.7.0 is different. I am working on it right now.
If all goes smoothly, once my PR is merged you will no longer need this hacky way of replacing the CRDs. I am hurrying on it.
I find that v1.7.0 karmada-operator now may have some bugs, we can not correctly apply v1.7.0 crds
@chaosi-zju I have a PR that addresses this: https://github.com/karmada-io/karmada/pull/4130. However, it is not the cause of this issue.
More info:
Just use k8s v1.23.0 or higher and the latest operator, and it works, @MolisXYliu.
The operator logs:
E1012 07:47:21.428138 1 planner.go:93] "failed to executed the workflow" err="failed to install etcd component, err: error when creating etcd client service, err: Service "karmada-etcd" is invalid: metadata.resourceVersion: Invalid value: "": must be specified for an update" workflow=init karmada="karmada-system/karmada"
and it happens at https://github.com/karmada-io/karmada/blob/f1f1a82dc73ae3828971fbdaf763c087f87f1291/operator/pkg/util/apiclient/idempotency.go#L78-L79
I have not yet checked the reason for this change.
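The k8s v1.23.0 threshold mentioned in this comment can be checked before installing. The following is a minimal sketch, assuming `sort -V` (version sort) is available. The helper name `version_ge` is hypothetical, and the commented usage additionally assumes `kubectl` is configured for the host cluster and `jq` is installed.

```shell
#!/usr/bin/env bash
# Return success (0) when version $1 is greater than or equal to version $2.
# A leading "v" is stripped from both versions before comparing.
version_ge() {
  local have="${1#v}" want="${2#v}"
  # sort -V puts the smaller version first; if that is $want, then have >= want.
  [ "$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)" = "$want" ]
}

# Example use against a live cluster (assumption, not run here):
#   server="$(kubectl version -o json | jq -r '.serverVersion.gitVersion')"
#   version_ge "$server" v1.23.0 || echo "cluster $server is older than v1.23.0"
```

With the version reported in this thread, `version_ge v1.20.7 v1.23.0` fails, which matches the environment where the etcd service error appears.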
Hi @liangyuanpeng, you did a great job!
Besides, I have looked at your PR #4130. I see you changed the default version constant to 1.7.0; however, when we upgrade to 1.8.0 or higher, we would have to submit this kind of PR manually again, so I have another way to fix this problem in #4133.
I think we can bind the default Karmada image version to the operator's own version. That is, if you want to install Karmada v1.7.0, use the v1.7.0 karmada-operator.
Also, we discovered the crds path issue at the same time; my PR includes that fix too.
@liangyuanpeng @chaosi-zju Thanks a lot for finding the useful error message. Based on it, should we specify resourceVersion when updating the etcd service?
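One way to see what the operator would be updating is to check whether the etcd client Service already exists and what resourceVersion it carries. This is only a diagnostic sketch: the namespace and Service name come from the logs in this thread, the function name `etcd_service_resource_version` is hypothetical, and it assumes `kubectl` is configured for the host cluster.

```shell
#!/usr/bin/env bash
# Print the resourceVersion of the etcd client Service, or nothing if the
# Service does not exist. The namespace defaults to the one seen in the logs.
etcd_service_resource_version() {
  local namespace="${1:-karmada-system}"
  kubectl --namespace "$namespace" get service karmada-etcd \
    --ignore-not-found \
    --output jsonpath='{.metadata.resourceVersion}'
}
```

A non-empty result means the Service is already present, so a second apply would go through an update rather than a create. It does not by itself settle whether the operator should set resourceVersion.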
Creating Karmada 1.7.0 with the operator in an offline environment fails; the operator log is