karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

failed to install etcd component using operator model #4332

Open chaunceyjiang opened 7 months ago

chaunceyjiang commented 7 months ago

What happened:

Installing the etcd component with the Karmada operator fails with the following error:

failed to executed the workflow" err="failed to install etcd component, err: error when decoding Etcd StatefulSet: json: cannot unmarshal number into Go struct field ObjectMeta.metadata.namespace of type string
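The `json:` prefix suggests the manifest is converted to JSON before being decoded into the typed object. A minimal sketch of that decode step using only the standard library follows. The `object` and `objectMeta` types are hypothetical stand-ins, not the real Kubernetes types, and the two JSON documents are assumed to be what a YAML-to-JSON conversion would produce for an unquoted and a quoted `111`:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// objectMeta and object are simplified, hypothetical stand-ins for the
// Kubernetes ObjectMeta and StatefulSet types; only namespace matters here.
type objectMeta struct {
	Namespace string `json:"namespace"`
}

type object struct {
	Metadata objectMeta `json:"metadata"`
}

// decodeNamespace decodes a JSON document and returns its namespace,
// or the error reported by encoding/json.
func decodeNamespace(doc string) (string, error) {
	var obj object
	if err := json.Unmarshal([]byte(doc), &obj); err != nil {
		return "", err
	}
	return obj.Metadata.Namespace, nil
}

// isTypeMismatch reports whether err is a JSON type mismatch, i.e. the
// class of error quoted in this issue.
func isTypeMismatch(err error) bool {
	var typeErr *json.UnmarshalTypeError
	return errors.As(err, &typeErr)
}

func main() {
	docs := []string{
		`{"metadata":{"namespace":111}}`,   // number: cannot go into a string field
		`{"metadata":{"namespace":"111"}}`, // string: decodes fine
	}
	for _, doc := range docs {
		ns, err := decodeNamespace(doc)
		fmt.Printf("%s -> namespace=%q typeMismatch=%v err=%v\n",
			doc, ns, isTypeMismatch(err), err)
	}
}
```

Only the first document is rejected, which matches the symptom above: the namespace reaches the decoder as a number rather than a string.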

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

apiVersion: operator.karmada.io/v1alpha1
kind: Karmada
metadata:
  annotations:
    alias: jxy5
  labels:
    kairship.io/instance-name: "111"
    kairship.io/karmada-hosted-cluster-name: kpanda-global-cluster
  name: "111"
  namespace: "111"
spec:
  components:
    etcd:
      local:
        imageRepository: k8s.m.daocloud.io/etcd
        imageTag: 3.5.9-0
        replicas: 1
        resources: {}
        volumeData:
          volumeClaim:
            metadata: {}
            spec:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 8Gi
    karmadaAPIServer:
      imageRepository: k8s.m.daocloud.io/kube-apiserver
      imageTag: v1.25.4
      replicas: 1
      resources: {}
      serviceSubnet: 10.96.0.0/12
      serviceType: NodePort
    karmadaAggregatedAPIServer:
      imageRepository: docker.m.daocloud.io/karmada/karmada-aggregated-apiserver
      imageTag: v1.7.0
      replicas: 1
      resources: {}
    karmadaControllerManager:
      imageRepository: docker.m.daocloud.io/karmada/karmada-controller-manager
      imageTag: v1.7.0
      replicas: 1
      resources: {}
    karmadaDescheduler:
      imageRepository: docker.m.daocloud.io/karmada/karmada-descheduler
      imageTag: v1.7.0
      replicas: 1
      resources: {}
    karmadaMetricsAdapter:
      imageRepository: docker.m.daocloud.io/karmada/karmada-metrics-adapter
      imageTag: v1.7.0
      replicas: 0
      resources: {}
    karmadaScheduler:
      imageRepository: docker.m.daocloud.io/karmada/karmada-scheduler
      imageTag: v1.7.0
      replicas: 1
      resources: {}
    karmadaWebhook:
      imageRepository: docker.m.daocloud.io/karmada/karmada-webhook
      imageTag: v1.7.0
      replicas: 1
      resources: {}
    kubeControllerManager:
      imageRepository: k8s.m.daocloud.io/kube-controller-manager
      imageTag: v1.25.4
      replicas: 1
      resources: {}
  hostCluster:
    networking:
      dnsDomain: cluster.loca

Anything else we need to know?:

Environment:

chaunceyjiang commented 7 months ago

/assign

zhzhuang-zju commented 7 months ago

Hi @chaunceyjiang, judging from your description, this problem has already been reported in #4128. Please check whether #4315 solves it.

zhzhuang-zju commented 6 months ago

@chaunceyjiang Sorry, I confused your problem with the one I fixed in PR #4315. The problem is actually caused by the following code: https://github.com/karmada-io/karmada/blob/08651832925a3eb238b6d56770400943d1307a2b/operator/pkg/controlplane/etcd/etcd.go#L87-L90 By printing it to the log, I found that the value of etcdServicePeerBytes is:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    karmada-app: etcd
    app.kubernetes.io/managed-by: karmada-operator
  namespace: 111
...

Because the namespace is rendered without quotes, it is parsed as a number, so decoding fails at line 88 of that file. When I add quotes ("") around the namespace in etcd's manifest so that the value is correctly recognized as a string,

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    karmada-app: etcd
    app.kubernetes.io/managed-by: karmada-operator
  namespace: "{{ .Namespace }}"

it does fix the error reported above, but other errors still occur. It looks like the operator installation needs some tweaking to be compatible with all-numeric namespaces!
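To illustrate the quoting difference, here is a minimal sketch that assumes the manifests are rendered with Go's `text/template` (the `{{ .Namespace }}` syntax suggests this, but it is an assumption). The two manifest fragments are shortened, hypothetical examples, not the operator's real templates:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Hypothetical, shortened manifest fragments for illustration only.
const unquotedManifest = `metadata:
  namespace: {{ .Namespace }}
`

const quotedManifest = `metadata:
  namespace: "{{ .Namespace }}"
`

// render executes a manifest template with the given namespace and
// returns the resulting YAML text.
func render(manifest, namespace string) (string, error) {
	tmpl, err := template.New("manifest").Parse(manifest)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	data := struct{ Namespace string }{Namespace: namespace}
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	for _, manifest := range []string{unquotedManifest, quotedManifest} {
		out, err := render(manifest, "111")
		if err != nil {
			panic(err)
		}
		// The first result contains `namespace: 111` (a YAML integer),
		// the second `namespace: "111"` (a YAML string).
		fmt.Print(out)
	}
}
```

Even though the template receives a Go string, the unquoted form loses that type information once it becomes YAML text, so every templated field that may hold an all-numeric value would need the same treatment.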

chaunceyjiang commented 6 months ago

> It looks like the operator installation needs some tweaking to be compatible with all-numeric namespaces!

Yes.

This is also why I asked you to test the 123 scenario.

zhzhuang-zju commented 6 months ago

> > It looks like the operator installation needs some tweaking to be compatible with all-numeric namespaces!
>
> Yes.
>
> This is also why I asked you to test the 123 scenario.

Yes. Are you interested in fixing that?

chaunceyjiang commented 6 months ago

> Yes. Are you interested in fixing that?

Of course.

> Hi~, @chaunceyjiang, judging from the issue you described, this problem has been reflected in https://github.com/karmada-io/karmada/issues/4128. You can see if https://github.com/karmada-io/karmada/pull/4315 has solved your problem.

If you can confirm that your PR #4315 did not solve this issue, I will continue to fix this problem.

zhzhuang-zju commented 6 months ago

> If you can confirm that your PR #4315 did not solve this issue, I will continue to fix this problem.

My PR solves a different problem, so please go ahead~