karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0
4.11k stars 805 forks source link

karmadactl init with external etcd #3898

Closed tedli closed 7 months ago

tedli commented 8 months ago

What type of PR is this?

/kind feature

What this PR does / why we need it:

Users may have existing etcd cluster which is under well maintenance by a team, so it's better to use such etcd cluster, rather than setup new one. (a poor one, maybe)

Which issue(s) this PR fixes: Fixes #3897

Special notes for your reviewer:

NOT tested yet, for review only, if this pr is worth to go on finishing, I will continue.

Does this PR introduce a user-facing change?:

`karmadactl`: The `init` now can setup production ready HA control plane by using an external HA etcd cluster.
codecov-commenter commented 8 months ago

Codecov Report

Patch coverage: 50.40% and project coverage change: -0.29% :warning:

Comparison is base (e5277b6) 54.83% compared to head (1ad8fd3) 54.55%. Report is 113 commits behind head on master.

:exclamation: Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the GitHub App Integration for your organization. Read more.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #3898 +/- ## ========================================== - Coverage 54.83% 54.55% -0.29% ========================================== Files 228 232 +4 Lines 21848 22999 +1151 ========================================== + Hits 11981 12547 +566 - Misses 9227 9766 +539 - Partials 640 686 +46 ``` | [Flag](https://app.codecov.io/gh/karmada-io/karmada/pull/3898/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/karmada-io/karmada/pull/3898/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | `54.55% <50.40%> (-0.29%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#carryforward-flags-in-the-pull-request-comment) to find out more. | [Files Changed](https://app.codecov.io/gh/karmada-io/karmada/pull/3898?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | Coverage Δ | | |---|---|---| | [pkg/karmadactl/cmdinit/cert/cert.go](https://app.codecov.io/gh/karmada-io/karmada/pull/3898?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#diff-cGtnL2thcm1hZGFjdGwvY21kaW5pdC9jZXJ0L2NlcnQuZ28=) | `47.08% <0.00%> (-0.94%)` | :arrow_down: | | [pkg/karmadactl/cmdinit/kubernetes/deploy.go](https://app.codecov.io/gh/karmada-io/karmada/pull/3898?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#diff-cGtnL2thcm1hZGFjdGwvY21kaW5pdC9rdWJlcm5ldGVzL2RlcGxveS5nbw==) | `37.69% <35.00%> (-2.61%)` | :arrow_down: | | [pkg/karmadactl/cmdinit/kubernetes/deployments.go](https://app.codecov.io/gh/karmada-io/karmada/pull/3898?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#diff-cGtnL2thcm1hZGFjdGwvY21kaW5pdC9rdWJlcm5ldGVzL2RlcGxveW1lbnRzLmdv) | `99.27% <83.33%> (-0.73%)` | :arrow_down: | | [pkg/karmadactl/cmdinit/cmdinit.go](https://app.codecov.io/gh/karmada-io/karmada/pull/3898?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#diff-cGtnL2thcm1hZGFjdGwvY21kaW5pdC9jbWRpbml0Lmdv) | `75.00% <100.00%> (+1.58%)` | :arrow_up: | ... and [33 files with indirect coverage changes](https://app.codecov.io/gh/karmada-io/karmada/pull/3898/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

RainbowMango commented 8 months ago

NOT tested yet, for review only, if this pr is worth to go on finishing, I will continue.

Definitely worth doing. Thanks @tedli

chaosi-zju commented 8 months ago

/lgtm

I really appreciate this meaningful PR, and I have tried a simple test, it seems to run as expected~

Anyone interested in this PR can do more validation tests on top of my following steps (we use hack/create-cluster.sh script to simply create a Kind cluster, and use its self-contained etcd-karmada-host-control-plane as external etcd) :

## step 1: install a Kind cluster and copy certificates from docker-node to local file.

hack/create-cluster.sh karmada-host ~/.kube/karmada.config

mkdir -p /tmp/etcd

docker exec -it karmada-host-control-plane cat /etc/kubernetes/pki/etcd/ca.crt > /tmp/etcd/ca.crt

docker exec -it karmada-host-control-plane cat /etc/kubernetes/pki/etcd/peer.crt > /tmp/etcd/peer.crt

docker exec -it karmada-host-control-plane cat /etc/kubernetes/pki/etcd/peer.key > /tmp/etcd/peer.key

## step 2: build karmadactl from source and install karmada by  built karmadactl.

make karmadactl

export KUBECONFIG=~/.kube/karmada.config

# tip: you will get `advertise-client-urls` by this cmd, and put as the `--external-etcd-servers` parameter into the next cmd
kubectl describe po etcd-karmada-host-control-plane -n kube-system | grep 'advertise-client-urls='

_output/bin/linux/amd64/karmadactl init --crds https://github.com/karmada-io/karmada/releases/download/v1.7.0-alpha.3/crds.tar.gz --external-etcd-ca-cert-path /tmp/etcd/ca.crt --external-etcd-client-cert-path /tmp/etcd/peer.crt --external-etcd-client-key-path /tmp/etcd/peer.key --external-etcd-servers https://172.18.0.2:2379 --external-etcd-key-prefix /exetcd

## step 3: validation, we create a secret and then check whether the key wrote to external etcd

kubectl --kubeconfig /etc/karmada/karmada-apiserver.config create secret generic secret-xxxx -n default --from-literal=mykey=mydata

# tip: you should install etcdctl on your local
etcdctl --cacert=/tmp/etcd/ca.crt --cert=/tmp/etcd/peer.crt --key=/tmp/etcd/peer.key --endpoints=172.18.0.2:2379 get /exetcd --prefix --keys-only | grep secret-xxxx

image

image

karmada-bot commented 8 months ago

@chaosi-zju: changing LGTM is restricted to collaborators

In response to [this](https://github.com/karmada-io/karmada/pull/3898#issuecomment-1685815399): >/lgtm > >I really appreciate this meaningful PR, and I have tried a simple test, it seems to run as expected~ > >Anyone interested in this PR can do more validation tests on top of my following steps (we use `hack/create-cluster.sh` script to simply create a `Kind` cluster, and use its self-contained `etcd-karmada-host-control-plane` as external etcd) : > >```shell >## step 1: install a Kind cluster and copy certificates from docker-node to local file. > >hack/create-cluster.sh karmada-host ~/.kube/karmada.config > >docker exec -it karmada-host-control-plane cat /etc/kubernetes/pki/etcd/ca.crt > /tmp/etcd/ca.crt > >docker exec -it karmada-host-control-plane cat /etc/kubernetes/pki/etcd/peer.crt > /tmp/etcd/peer.crt > >docker exec -it karmada-host-control-plane cat /etc/kubernetes/pki/etcd/peer.key > /tmp/etcd/peer.key > >## step 2: build karmadactl from source and install karmada by built karmadactl. > >make karmadactl > ># tip: you will get `advertise-client-urls` by this cmd, and put as the `--external-etcd-servers` parameter into the next cmd >kubectl --kubeconfig ~/.kube/karmada.config describe po etcd-karmada-host-control-plane -n kube-system | grep 'advertise-client-urls=' > >_output/bin/linux/amd64/karmadactl init --crds https://github.com/karmada-io/karmada/releases/download/v1.6.0/crds.tar.gz --external-etcd-ca-cert-path /tmp/etcd/ca.crt --external-etcd-client-cert-path /tmp/etcd/peer.crt --external-etcd-client-key-path /tmp/etcd/peer.key --external-etcd-servers https://172.18.0.2:2379 --external-etcd-key-prefix /exetcd > >## step 3: validation, we create a secret and then check whether the key wrote to external etcd > >kubectl --kubeconfig /etc/karmada/karmada-apiserver.config create secret generic secret-xxxx -n default --from-literal=mykey=mydata > ># tip: you should install etcdctl on your local >etcdctl --cacert=/tmp/etcd/ca.crt --cert=/tmp/etcd/peer.crt --key=/tmp/etcd/peer.key --endpoints=172.18.0.2:2379 get /exetcd --prefix --keys-only | grep secret-xxxx >``` > >![image](https://github.com/karmada-io/karmada/assets/30589999/93581587-644f-4d11-8cee-f6447a306a46) > > > > > Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
chaosi-zju commented 8 months ago

it occurred some problems, I find the karmada-controller-manager restart many times, now not sure whether related to this PR, to be continue...

image

I0821 08:37:26.982589       1 recorder.go:104] "events: No policy match for resource" type="Warning" object={Kind:CustomResourceDefinition Namespace: Name:federatedhpas.autoscaling.karmada.io UID:0affe531-2cc3-4421-b922-bf54fdb8f427 APIVersion:apiextensions.k8s.io/v1 ResourceVersion:853 FieldPath:} reason="ApplyPolicyFailed"
[controller-runtime] log.SetLogger(...) was never called, logs will not be displayed:
goroutine 375 [running]:
runtime/debug.Stack()
        /opt/hostedtoolcache/go/1.20.6/x64/src/runtime/debug/stack.go:24 +0x65
sigs.k8s.io/controller-runtime/pkg/log.eventuallyFulfillRoot()
        /home/runner/work/karmada/karmada/vendor/sigs.k8s.io/controller-runtime/pkg/log/log.go:59 +0xbd
sigs.k8s.io/controller-runtime/pkg/log.(*delegatingLogSink).Error(0xc0004a2180, {0x29ace80, 0xc0010369c0}, {0x262bcaa, 0x3d}, {0xc000ed8540, 0x2, 0x2})
        /home/runner/work/karmada/karmada/vendor/sigs.k8s.io/controller-runtime/pkg/log/deleg.go:139 +0x68
github.com/go-logr/logr.Logger.Error({{0x29d53c0?, 0xc0004a2180?}, 0xc001bab240?}, {0x29ace80, 0xc0010369c0}, {0x262bcaa, 0x3d}, {0xc000ed8540, 0x2, 0x2})
        /home/runner/work/karmada/karmada/vendor/github.com/go-logr/logr/logr.go:299 +0xda
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1({0x29ce620?, 0xc0007cd860?})
        /home/runner/work/karmada/karmada/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:63 +0x265
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1(0xc0019fde00?, {0x29ce620?, 0xc0007cd860?})
        /home/runner/work/karmada/karmada/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:62 +0x5d
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext({0x29ce620, 0xc0007cd860}, {0x29cbb50?, 0xc0008fd4e0}, 0x1, 0x0, 0x0?)
        /home/runner/work/karmada/karmada/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:63 +0x205
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel({0x29ce620, 0xc0007cd860}, 0x0?, 0xb0?, 0xc000094a18?)
        /home/runner/work/karmada/karmada/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33 +0x5c
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1()
        /home/runner/work/karmada/karmada/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:56 +0xfa
created by sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start
        /home/runner/work/karmada/karmada/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:48 +0x1e5
E0821 08:39:17.830174       1 controller.go:202] "Could not wait for Cache to sync" err="failed to wait for cronfederatedhpa caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.CronFederatedHPA" controller="cronfederatedhpa" controllerGroup="autoscaling.karmada.io" controllerKind="CronFederatedHPA"
I0821 08:39:17.830233       1 internal.go:581] "Stopping and waiting for non leader election runnables"
I0821 08:39:17.830266       1 server.go:43] "shutting down server" path="/metrics" kind="metrics" addr="[::]:8080"
tedli commented 8 months ago

Hi @chaosi-zju ,

it occurred some problems, I find the karmada-controller-manager restart many times, now not sure whether related to this PR

Can you take a look at apiserver log? It may caused by apiserver cannot connect to etcd.

chaosi-zju commented 8 months ago

Hi @chaosi-zju ,

it occurred some problems, I find the karmada-controller-manager restart many times, now not sure whether related to this PR

Can you take a look at apiserver log? It may caused by apiserver cannot connect to etcd.

ok, the environment is uninstalled temporarily, I will give you feedback tomorrow~

chaosi-zju commented 8 months ago

Can you take a look at apiserver log? It may caused by apiserver cannot connect to etcd.

hello @tedli , there are several messages:

  1. you can download all logs from here
  2. I invite @zishen to test by following my steps, the problem is inevitable.
  3. karmadactl v1.6.2 is ok, while built by the source code of this PR is not ok.
  4. From my test, apiserver has no problem connect to etcd, the restart of controller might be wrong with its panic.
tedli commented 8 months ago

Hi @chaosi-zju ,

My production karmada control plane was setup by this pr. It runs as expect until now.

I took a look at your log. It seemed your karmadactl was built with master branch.

controllermanager.go:118] karmada-controller-manager version: version.Info{GitVersion:"v1.7.0-alpha.3", GitCommit:"09259b1f10cadb612115a5e2769760790dde6647", GitTreeState:"clean", BuildDate:"2023-08-01T07:39:27Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"linux/amd64"}

My thought is, why karmada controller manager unable to start, is irrelevant with this pr.

~I noticed that, we've update sigs.k8s.io/controller-runtime to v0.15.0, 3 weeks ago, and v0.15.0 introduced a using pattern break, that we need to explicit call log.SetLogger (the same thing happend to other projects)~ <- no

This log also appeard in the log you provided.

https://github.com/karmada-io/karmada/blame/19fbab565aeaf747a506ccc00bb3363790ac673f/go.mod#L51

To confirm this, I think you can try with --karmada-controller-manager-image flag, to specify a older version controller manager image.

chaosi-zju commented 8 months ago

Sorry, the above mentioned is not a problem with this PR, I rectified two things: rebase this PR to master branch、update --crds https://github.com/karmada-io/karmada/releases/download/v1.6.0/crds.tar.gz to --crds https://github.com/karmada-io/karmada/releases/download/v1.7.0-alpha.3/crds.tar.gz, then problem gone.

chaosi-zju commented 8 months ago

So, I think your PR is generally ok to merge. You can mark related comments resolved.

ps: I didn't test for karmada-search while you modified it, if you have time, you can try to add a test~

wuyingjun-lucky commented 8 months ago

@tedli Can you post your test including addons command ?

tedli commented 8 months ago

Hi @wuyingjun-lucky ,

Did you mean, the command args that I init karmada and enable addons?

If so, then:

export PATH=/home/lizhen6/.local/bin:$PATH
export KUBECONFIG=/home/lizhen6/.kube/config

kubectl karmada init --private-image-registry=r.addops.soft.360.cn/karmada \
--kube-image-registry=r.addops.soft.360.cn/k8s-io --kube-image-tag=v1.24.13 \
--cert-external-ip=10.217.117.89,10.203.43.210,10.203.162.245,10.203.163.45 \
--cert-external-dns=pub-shbt.karmada.qihoo.net,stats1.safe.shbt.qihoo.net,v5eng18.safe.shbt.qihoo.net,v5eng21.safe.shbt.qihoo.net \
--crds=/home/lizhen6/crds.tar.gz --host-cluster-domain=shbt.karmada \
--karmada-scheduler-image=r.addops.soft.360.cn/karmada/karmada-scheduler:v1.6.0-6-gfc6b52ff \
--karmada-controller-manager-image=r.addops.soft.360.cn/karmada/karmada-controller-manager:v1.6.0-6-gfc6b52ff-dirty \
--external-etcd-ca-cert-path=/etc/kubernetes/ssl/kube-ca.pem \
--external-etcd-client-cert-path=/home/lizhen6/karmada.cert.pem \
--external-etcd-client-key-path=/home/lizhen6/karmada.key.pem \
--external-etcd-servers=https://10.203.43.210:2379,https://10.203.162.245:2379,https://10.203.163.45:2379 \
--external-etcd-key-prefix=/shbt-karmada

kubectl karmada addons enable all --karmada-search-image=r.addops.soft.360.cn/karmada/karmada-search:v1.6.0-6-gfc6b52ff --karmada-descheduler-image=r.addops.soft.360.cn/karmada/karmada-descheduler:v1.6.0 --karmada-scheduler-estimator-image=r.addops.soft.360.cn/karmada/karmada-scheduler-estimator:v1.6.0 --member-kubeconfig=/home/lizhen6/.kube/config --member-context="local" --host-cluster-domain=shbt.karmada
RainbowMango commented 8 months ago

Kindly ping @wuyingjun-lucky

wuyingjun-lucky commented 8 months ago

Hi @wuyingjun-lucky ,

Did you mean, the command args that I init karmada and enable addons?

If so, then:

export PATH=/home/lizhen6/.local/bin:$PATH
export KUBECONFIG=/home/lizhen6/.kube/config

kubectl karmada init --private-image-registry=r.addops.soft.360.cn/karmada \
--kube-image-registry=r.addops.soft.360.cn/k8s-io --kube-image-tag=v1.24.13 \
--cert-external-ip=10.217.117.89,10.203.43.210,10.203.162.245,10.203.163.45 \
--cert-external-dns=pub-shbt.karmada.qihoo.net,stats1.safe.shbt.qihoo.net,v5eng18.safe.shbt.qihoo.net,v5eng21.safe.shbt.qihoo.net \
--crds=/home/lizhen6/crds.tar.gz --host-cluster-domain=shbt.karmada \
--karmada-scheduler-image=r.addops.soft.360.cn/karmada/karmada-scheduler:v1.6.0-6-gfc6b52ff \
--karmada-controller-manager-image=r.addops.soft.360.cn/karmada/karmada-controller-manager:v1.6.0-6-gfc6b52ff-dirty \
--external-etcd-ca-cert-path=/etc/kubernetes/ssl/kube-ca.pem \
--external-etcd-client-cert-path=/home/lizhen6/karmada.cert.pem \
--external-etcd-client-key-path=/home/lizhen6/karmada.key.pem \
--external-etcd-servers=https://10.203.43.210:2379,https://10.203.162.245:2379,https://10.203.163.45:2379 \
--external-etcd-key-prefix=/shbt-karmada

kubectl karmada addons enable all --karmada-search-image=r.addops.soft.360.cn/karmada/karmada-search:v1.6.0-6-gfc6b52ff --karmada-descheduler-image=r.addops.soft.360.cn/karmada/karmada-descheduler:v1.6.0 --karmada-scheduler-estimator-image=r.addops.soft.360.cn/karmada/karmada-scheduler-estimator:v1.6.0 --member-kubeconfig=/home/lizhen6/.kube/config --member-context="local" --host-cluster-domain=shbt.karmada

Can you post the test result ?

tedli commented 8 months ago

Hi @wuyingjun-lucky ,

Can you post the test result ?

How do you define 'result'? Is that something like a screen capture, showing control plane pods running? Or run all test cases agnist the setuped control plane, to make sure all tests pass?

I didn't run any test cases. But my current production karmada control plane was setup by this pr.

wuyingjun-lucky commented 8 months ago

Hi @wuyingjun-lucky ,

Can you post the test result ?

How do you define 'result'? Is that something like a screen capture, showing control plane pods running? Or run all test cases agnist the setuped control plane, to make sure all tests pass?

You can post the setup output、component status and logs. I didn't run any test cases. But my current production karmada control plane was setup by this pr.

Does your environment include search and is it built by the latest version ?

wuyingjun-lucky commented 8 months ago

/lgtm

tedli commented 8 months ago

Hi @wuyingjun-lucky ,

Does your environment include search and is it built by the latest version ?

No, my search component was not built with latest version, it's a private mod.

RainbowMango commented 8 months ago

@tedli Can we do this after v1.7.0 which is planned today? I just got the release stuff(including documents release notes, and so on) ready.

karmada-bot commented 7 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liangyuanpeng, RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[pkg/karmadactl/addons/OWNERS](https://github.com/karmada-io/karmada/blob/master/pkg/karmadactl/addons/OWNERS)~~ [RainbowMango] - ~~[pkg/karmadactl/cmdinit/OWNERS](https://github.com/karmada-io/karmada/blob/master/pkg/karmadactl/cmdinit/OWNERS)~~ [RainbowMango] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
RainbowMango commented 7 months ago

/hold

RainbowMango commented 7 months ago

I was going to ask a squash commit, but the bot just merged it. Sorry, my bad.