MonaxGT closed this issue 3 years ago.
Maybe the commands ip, iptables, ebtables, ethtool, tc are not found in your system path. You can check them first.
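A quick way to act on this advice is to loop over the binaries that kubeadm's preflight checks look for; a minimal sketch (the command list is taken from the preflight warnings later in this thread):

```shell
# Check that each binary kubeadm's preflight looks for is resolvable.
for cmd in ip iptables ebtables ethtool tc; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: $(command -v "$cmd")"
  else
    echo "$cmd: NOT FOUND"
  fi
done
```

Running this once as your normal user and once under `sudo env PATH=$PATH /bin/sh` shows whether sudo sees a different PATH.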
Hi! I have all of these commands, and I checked them first. I really don't know why the script reports this:
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "iptables"
iptables v1.4.21: no command specified
Try `iptables -h' or 'iptables --help' for more information.
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "ebtables"
ebtables v2.0.10-4 (December 2011)
Usage:
ebtables -[ADI] chain rule-specification [options]
ebtables -P chain target
ebtables -[LFZ] [chain]
ebtables -[NX] [chain]
ebtables -E old-chain-name new-chain-name
Commands:
--append -A chain : append to chain
--delete -D chain : delete matching rule from chain
--delete -D chain rulenum : delete rule at position rulenum from chain
--change-counters -C chain
[rulenum] pcnt bcnt : change counters of existing rule
--insert -I chain rulenum : insert rule at position rulenum in chain
--list -L [chain] : list the rules in a chain or in all chains
--flush -F [chain] : delete all rules in chain or in all chains
--init-table : replace the kernel table with the initial table
--zero -Z [chain] : put counters on zero in chain or in all chains
--policy -P chain target : change policy on chain to target
--new-chain -N chain : create a user defined chain
--rename-chain -E old new : rename a chain
--delete-chain -X [chain] : delete a user defined chain
--atomic-commit : update the kernel w/t table contained in <FILE>
--atomic-init : put the initial kernel table into <FILE>
--atomic-save : put the current kernel table into <FILE>
--atomic-file file : set <FILE> to file
Options:
--proto -p [!] proto : protocol hexadecimal, by name or LENGTH
--src -s [!] address[/mask]: source mac address
--dst -d [!] address[/mask]: destination mac address
--in-if -i [!] name[+] : network input interface name
--out-if -o [!] name[+] : network output interface name
--logical-in [!] name[+] : logical bridge input interface name
--logical-out [!] name[+] : logical bridge output interface name
--set-counters -c chain
pcnt bcnt : set the counters of the to be added rule
--modprobe -M program : try to insert modules using this program
--concurrent : use a file lock to support concurrent scripts
--version -V : print package version
Environment variable:
EBTABLES_ATOMIC_FILE : if set <FILE> (see above) will equal its value
Standard targets: DROP, ACCEPT, RETURN or CONTINUE;
The target can also be a user defined chain.
Supported chains for the filter table:
INPUT FORWARD OUTPUT
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "tc"
Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }
tc [-force] [-OK] -batch filename
where OBJECT := { qdisc | class | filter | action | monitor | exec }
OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] | -p[retty] | -b[atch] [filename] | -n[etns] name |
-nm | -nam[es] | { -cf | -conf } path }
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "ethtool"
ethtool: bad command line argument(s)
For more information run ethtool -h
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "ip"
Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }
ip [ -force ] -batch filename
where OBJECT := { link | address | addrlabel | route | rule | neigh | ntable |
tunnel | tuntap | maddress | mroute | mrule | monitor | xfrm |
netns | l2tp | fou | macsec | tcp_metrics | token | netconf | ila |
vrf }
OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |
-h[uman-readable] | -iec |
-f[amily] { inet | inet6 | ipx | dnet | mpls | bridge | link } |
-4 | -6 | -I | -D | -B | -0 |
-l[oops] { maximum-addr-flush-attempts } | -br[ief] |
-o[neline] | -t[imestamp] | -ts[hort] | -b[atch] [filename] |
-rc[vbuf] [size] | -n[etns] name | -a[ll] | -c[olor]}
I found out that the problem is with the PATH environment variable.
I ran PATH=$PATH sudo ./kk create cluster -f config-sample.yaml
and that helped, but now I get the next error, with a cert:
WARN[23:45:26 MSK] Task failed ...
WARN[23:45:26 MSK] error: Failed to patch kubeadm secret: Failed to exec command: sudo -E /bin/sh -c "/usr/local/bin/kubectl patch -n kube-system secret kubeadm-certs -p '{\"data\": {\"external-etcd-ca.crt\": \"\"}}'"
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes"): Process exited with status 1
Error: Failed to get cluster status: Failed to patch kubeadm secret: Failed to exec command: sudo -E /bin/sh -c "/usr/local/bin/kubectl patch -n kube-system secret kubeadm-certs -p '{\"data\": {\"external-etcd-ca.crt\": \"\"}}'"
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes"): Process exited with status 1
You can try to delete the cluster (./kk delete cluster -f config-sample.yaml) and recreate it.
I've tried that, but it doesn't help.
Check whether $HOME/.kube/config is correct. You can use base64 -d to decode it.
If it is wrong, you can run cp /etc/kubernetes/admin.conf $HOME/.kube/config to replace it.
Or recreate it:
rm $HOME/.kube/config
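The base64 -d check above can be scripted; a minimal sketch, where the helper name and default path are illustrative (not from this thread):

```shell
# Decode the base64 CA certificate embedded in a kubeconfig and print its
# subject and expiry, to compare against /etc/kubernetes/pki/ca.crt.
inspect_kubeconfig_ca() {
  cfg="${1:-$HOME/.kube/config}"
  grep 'certificate-authority-data' "$cfg" \
    | awk '{print $2}' \
    | base64 -d \
    | openssl x509 -noout -subject -enddate
}
# usage: inspect_kubeconfig_ca /etc/kubernetes/admin.conf
```

If the printed subject does not match the cluster CA, the "certificate signed by unknown authority" error above is the expected symptom.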
I think there may be two solutions:
1. Use root to perform the installation.
2. Modify secure_path in /etc/sudoers:
Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
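To see why plain sudo behaves differently from your shell: when secure_path is set, sudo discards the caller's PATH and substitutes its own list. A sketch that simulates that reset with env -i (no sudo call is made; the directory list mirrors the secure_path above):

```shell
# Simulate sudo's PATH reset: env -i starts from a scrubbed environment, the
# way secure_path replaces whatever PATH the caller had.
env -i PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  /bin/sh -c 'echo "effective PATH: $PATH"'
```

If a binary lives outside these directories, it will be "not found" under sudo even though your interactive shell finds it.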
Thanks, but that doesn't help either... Maybe something useful is in this part of the log:
Push /opt/kubersphere/kubekey/v1.20.4/amd64/helm to 10.10.10.101:/tmp/kubekey/helm Done
Push /opt/kubersphere/kubekey/v1.20.4/amd64/cni-plugins-linux-amd64-v0.8.6.tgz to 10.10.10.101:/tmp/kubekey/cni-plugins-linux-amd64-v0.8.6.tgz Done
INFO[14:34:57 MSK] Initializing kubernetes cluster
[k8s-master 10.10.10.101] MSG:
[preflight] Running pre-flight checks
W0707 14:34:59.015477 93472 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
W0707 14:34:59.018992 93472 cleanupnode.go:99] [reset] Failed to evaluate the "/var/lib/kubelet" directory. Skipping its unmount and cleanup: lstat /var/lib/kubelet: no such file or directory
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
[k8s-master 10.10.10.101] MSG:
[preflight] Running pre-flight checks
W0707 14:34:59.941574 93723 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
W0707 14:34:59.944887 93723 cleanupnode.go:99] [reset] Failed to evaluate the "/var/lib/kubelet" directory. Skipping its unmount and cleanup: lstat /var/lib/kubelet: no such file or directory
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
ERRO[14:35:00 MSK] Failed to init kubernetes cluster: Failed to exec command: sudo env PATH=$PATH /bin/sh -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crictl"
W0707 14:35:00.189390 93769 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.20.4
[preflight] Running pre-flight checks
[WARNING FileExisting-ebtables]: ebtables not found in system path
[WARNING FileExisting-ethtool]: ethtool not found in system path
[WARNING FileExisting-tc]: tc not found in system path
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileExisting-ip]: ip not found in system path
[ERROR FileExisting-iptables]: iptables not found in system path
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1 node=10.10.10.101
WARN[14:35:00 MSK] Task failed ...
WARN[14:35:00 MSK] error: interrupted by error
BUT if I run this command manually, it finishes successfully... I don't know why...
[root@k8s-master kubersphere]# sudo env PATH=$PATH /bin/sh -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crictl"
W0707 14:46:57.724253 106978 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.20.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master k8s-master.local k8s-node1 k8s-node1.local k8s-node2 k8s-node2.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.local lb.kubesphere.local localhost] and IPs [10.233.0.1 10.10.10.101 127.0.0.1 10.10.10.102 10.10.10.103]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 53.501607 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: tototнt
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join lb.kubesphere.local:6443 --token tototнt \
--discovery-token-ca-cert-hash sha256:3333 \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join lb.kubesphere.local:6443 --token tototнt \
--discovery-token-ca-cert-hash sha256:3333
It's really strange... as if kk doesn't read some paths. Could it be because my Kubernetes user has a "-" in the name? Maybe some logic in kk is broken, or something else?
I tried switching to a user without "-" and gave that user sudoers permissions, but I got the same error.
It should have nothing to do with the user name.
Is it the same error when installing directly as root?
Yes, I used sudo su and ran it that way.
Can you run which ip and echo $PATH on your master01 node? Do you mind if we take a look at the output?
[kuberman@master kubersphere]$ which ip
/usr/sbin/ip
[kuberman@master kubersphere]$ echo $PATH
/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/monaxgt/.local/bin:/home/monaxgt/bin
I don't know how my base user got in there... I created the user kuberman with the useradd command.
Run visudo and make sure it includes your kuberman:
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
kuberman ALL=(ALL) ALL
--//--
#Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin
Defaults secure_path = /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
## Next comes the main part: which users can run what software on
## which machines (the sudoers file can be shared between multiple
## systems).
## Syntax:
##
## user MACHINE=COMMANDS
##
## The COMMANDS section may have other options added to it.
##
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
kuberman ALL=(ALL) ALL
--//--
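After editing, it is worth validating the sudoers syntax before logging out; a sketch using visudo -cf on a scratch copy (the fragment mirrors the lines above, and the check is skipped if visudo is unavailable):

```shell
# Validate a sudoers fragment without touching /etc/sudoers.
cat > /tmp/sudoers.check <<'EOF'
Defaults secure_path = /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
kuberman ALL=(ALL) ALL
EOF
if command -v visudo >/dev/null 2>&1; then
  visudo -cf /tmp/sudoers.check   # reports "parsed OK" on success
else
  echo "visudo not available; skipping syntax check"
fi
```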
Maybe the file ~/.bashrc doesn't load the $PATH correctly on the master nodes. I'm not sure... Try adding a new line export PATH=$PATH at the end of ~/.bashrc.
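A sketch of that suggestion which stays idempotent across runs (the extra sbin directories are an assumption; the advice above only adds export PATH=$PATH):

```shell
# Append a PATH export to a profile file only if it is not already present.
append_path_line() {
  profile="$1"
  line='export PATH=$PATH:/usr/local/sbin:/usr/sbin:/sbin'
  grep -qxF "$line" "$profile" 2>/dev/null || echo "$line" >> "$profile"
}
# usage: append_path_line "$HOME/.bashrc"
```

The grep -qxF guard prevents the same line from piling up if the command is run repeatedly.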
No, it still shows the same error...
I continued deploying the cluster after the error. I recently wrote that my cluster had status NotReady, and I tried to understand why. I found that I had to initialize cluster networking. I ran kubectl -n kube-system apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml
and my cluster reached Ready status.
But that doesn't help to deploy KubeSphere.
If you want to install KubeSphere on an existing k8s cluster, you can use another tool: ks-installer.
I just want to deploy KubeSphere.
I tried to run "create cluster" after a successful k8s cluster deployment and finally saw the KubeSphere web UI, but I got another error.
Your link to ks-installer goes to the https://github.com/wenyan-lang/wenyan project; maybe a misclick?
Sorry, my mistake. I've already edited it.
Ok, thanks. I will try.
Any idea how to get past this error? I found only one related issue, https://github.com/kubesphere/ks-installer/issues/33. I tried to translate it, but not all of the words were translated, or translated correctly.
Is there any unavailable DNS in your K8s cluster?
You can also verify the Pod status in the namespace kubesphere-system to check whether all Pods are running.
Sorry, I'm not a professional DevOps, so I don't know... I think Kubernetes deploys the coredns service, and that service acts as the internal DNS, like in Docker. Or do you mean another DNS?
My /etc/resolv.conf
[root@master kubersphere]# cat /etc/resolv.conf
# Generated by NetworkManager
search net.home.ru
nameserver 172.16.0.202
nameserver 172.16.0.203
[root@master kubersphere]# kubectl get svc/ks-console -n kubesphere-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ks-console NodePort 10.233.21.108 <none> 80:30880/TCP 11h
If I'm not mistaken, that means everything works fine.
Exec the command kubectl get svc -n kube-system to check that there is only one DNS svc in your cluster.
[kuberman@master monaxgt]$ kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-controller-manager-svc ClusterIP None <none> 10257/TCP 12h
kube-dns ClusterIP 10.233.0.10 <none> 53/UDP,53/TCP,9153/TCP 12h
kube-scheduler-svc ClusterIP None <none> 10259/TCP 12h
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 12h
I went into the sh shell of container k8s_ks-controller-manager_ks-controller-manager-d84f68f46-r4gjx_kubesphere-system_583b98e5-7221-4d2f-8ab5-9aff9f16bdc9_0
and tried to look up this DNS name:
/ # nslookup ks-apiserver.kubesphere-system.svc
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find ks-apiserver.kubesphere-system.svc: NXDOMAIN
** server can't find ks-apiserver.kubesphere-system.svc: NXDOMAIN
In /etc/kubernetes/kubeadm-config.yaml I find:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
- 169.254.25.10
But in the output above we can see: kube-dns ClusterIP 10.233.0.10
This is the result when a cluster is created successfully by kk:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 4h33m
kube-controller-manager-svc ClusterIP None <none> 10257/TCP 4h25m
kube-scheduler-svc ClusterIP None <none> 10259/TCP 4h25m
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 4h24m
kk deletes kube-dns and creates coredns. So maybe you can try that manually: delete kube-dns and apply this coredns-svc.yaml file:
---
apiVersion: v1
kind: Service
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "coredns"
    addonmanager.kubernetes.io/mode: Reconcile
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.233.0.3
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
Thanks! I applied this conf and now see coredns, but kube-dns is still there:
[root@master kubersphere]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 8m38s
kube-controller-manager-svc ClusterIP None <none> 10257/TCP 13h
kube-dns ClusterIP 10.233.0.10 <none> 53/UDP,53/TCP,9153/TCP 13h
kube-scheduler-svc ClusterIP None <none> 10259/TCP 13h
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 13h
[root@master kubersphere]#
I tried to remove kube-dns:
[root@master kubersphere]# kubectl delete --namespace=kube-system deployment kube-dns
Error from server (NotFound): deployments.apps "kube-dns" not found
The KubeSphere UI returns the same error: request to http://ks-apiserver.kubesphere-system.svc/oauth/token failed, reason: getaddrinfo ENOTFOUND ks-apiserver.kubesphere-system.svc
I got into the pod and tried to resolve:
/ # nslookup ks-apiserver.kubesphere-system.svc 10.233.0.10
nslookup: write to '10.233.0.10': Connection refused
;; connection timed out; no servers could be reached
/ # nslookup ks-apiserver.kubesphere-system.svc
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find ks-apiserver.kubesphere-system.svc: NXDOMAIN
** server can't find ks-apiserver.kubesphere-system.svc: NXDOMAIN
/ # cat /etc/resolv.conf
nameserver 169.254.25.10
search kubesphere-system.svc.local svc.local local home.ru
options ndots:5
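The resolv.conf above also explains the NXDOMAIN: with ndots:5, any name with fewer than 5 dots is first expanded through the search list. A sketch of the candidate names the resolver tries for the failing lookup (pure illustration, no DNS queries are made; the search domains are copied from the output above):

```shell
# Enumerate the names the resolver tries for a short name under ndots:5.
name="ks-apiserver.kubesphere-system.svc"   # only 2 dots, so < ndots
for dom in kubesphere-system.svc.local svc.local local home.ru; do
  echo "try: $name.$dom"
done
echo "try: $name."   # the literal name is tried last
```

None of the expanded candidates is a real service record here, which matches the NXDOMAIN seen in the pod.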
There are 2 DNS svcs in your cluster. Try to delete the svc kube-dns.
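The duplicate-DNS state is easy to spot mechanically; a minimal sketch (the helper name is illustrative, and the two service names come from the listings in this thread):

```shell
# Count cluster DNS services in a `kubectl get svc` listing read from stdin.
# A healthy kk cluster should show exactly one of kube-dns / coredns.
count_dns_svcs() {
  awk '$1 == "kube-dns" || $1 == "coredns"' | wc -l
}
# usage: kubectl get svc -n kube-system --no-headers | count_dns_svcs
```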
Hi! I decided to recreate my VMs and set up the nodes with CentOS. It didn't help; I got the same error. My approach was to run kk create cluster, and whenever I hit an error, copy the failing command, run it manually, and then run create cluster again. In the end I got the KubeSphere UI as before, and now I can log in, but I cannot change the password.
Now I can't change the password or create a new user because: Internal error occurred: failed calling webhook "users.iam.kubesphere.io": Post "https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2?timeout=30s": x509: certificate signed by unknown authority
I tried the workaround and patches from this issue, but they didn't help either.
[kuber@k8s-master04 kubernetes]$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-8f59968d4-9qtwn 1/1 Running 0 77m
calico-node-5vbgh 1/1 Running 0 77m
calico-node-qpwb4 1/1 Running 0 77m
calico-node-qxzh7 1/1 Running 0 77m
coredns-86cfc99d74-5grtd 1/1 Running 0 87m
coredns-86cfc99d74-r9856 1/1 Running 0 87m
kube-apiserver-master 1/1 Running 0 87m
kube-controller-manager-master 1/1 Running 0 87m
kube-proxy-28t7w 1/1 Running 0 78m
kube-proxy-gt7pb 1/1 Running 0 87m
kube-proxy-nsndz 1/1 Running 0 78m
kube-scheduler-master 1/1 Running 0 87m
openebs-localpv-provisioner-7cfc686bc5-s8r6x 1/1 Running 0 77m
snapshot-controller-0 1/1 Running 0 62m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.10 <none> 53/UDP,53/TCP,9153/TCP 48m
kube-controller-manager-svc ClusterIP None <none> 10257/TCP 61m
kube-scheduler-svc ClusterIP None <none> 10259/TCP 61m
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 61m
@wansir @RolandMa1986 Could you please help to take a look at this issue?
Hi @MonaxGT, the certificate issue should already be fixed in the v3.1.0 release. Please check your installer's version with the following command first. The image output should be "kubesphere/ks-installer:v3.1.0" if you are trying to deploy KubeSphere 3.1.0.
kubectl -n kubesphere-system get deployments.apps ks-installer -o=jsonpath='{$.spec.template.spec.containers[:1].image}'
For the KubeSphere 3.1.0 release, there are 2 possible issues:
1. Run nslookup ks-controller-manager.kubesphere-system.svc in a ks-apiserver shell session. The IP should match ks-controller-manager's cluster IP from kubectl -n kubesphere-system get service ks-controller-manager.
2. Re-apply the webhook secret and configuration:
kubectl apply -f https://raw.githubusercontent.com/kubesphere/ks-installer/release-3.1/roles/ks-core/prepare/files/ks-init/webhook-secret.yaml
kubectl apply -f https://raw.githubusercontent.com/kubesphere/ks-installer/release-3.1/roles/ks-core/prepare/files/ks-init/iam.kubesphere.io.yaml
Hi @RolandMa1986!
kubectl -n kubesphere-system get deployments.apps ks-installer -o=jsonpath='{$.spec.template.spec.containers[:1].image}'
kubesphere/ks-installer:v3.1.0
[kuber@master kubernetes]$ kubectl exec --stdin --tty -n kube-system kube-apiserver-master -- /bin/sh
OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "/bin/sh": stat /bin/sh: no such file or directory: unknown
command terminated with exit code 126
I don't know how to connect to these pods, but it works with basic pods.
[kuber@master kubernetes]$ kubectl -n kubesphere-system get service ks-controller-manager
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ks-controller-manager ClusterIP 10.233.5.229 <none> 443/TCP 23h
[root@master kubernetes]# kubectl apply -f webhook.yaml
secret/ks-controller-manager-webhook-cert unchanged
[root@master kubernetes]# kubectl apply -f iam.yaml
Warning: admissionregistration.k8s.io/v1beta1 ValidatingWebhookConfiguration is deprecated in v1.16+, unavailable in v1.22+; use admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration
validatingwebhookconfiguration.admissionregistration.k8s.io/users.iam.kubesphere.io unchanged
I think it means that nothing changed about the certs.
The main problem is with the DNS server.
Maybe it will be useful: I used @24sama's advice and applied the config from that post.
You can verify the DNS service with the following command on your host:
nslookup ks-controller-manager.kubesphere-system.cluster.local 10.233.0.10
The "10.233.0.10" IP is the coredns service IP from kubectl -n kube-system get service coredns.
Or get a pod shell session:
kubectl -n kubesphere-system exec -it ks-apiserver-<tab> sh
I notice that you have a search domain "search net.home.ru" in your /etc/resolv.conf. Maybe you can try to delete it. If you are using a DHCP config, please try to set a fixed DNS like 8.8.8.8.
On my host:
[root@master kubernetes]# nslookup ks-controller-manager.kubesphere-system.cluster.local 10.233.0.10
Server: 10.233.0.10
Address: 10.233.0.10#53
** server can't find ks-controller-manager.kubesphere-system.cluster.local: NXDOMAIN
[kuber@master kubernetes]$ kubectl -n kube-system get service coredns
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.10 <none> 53/UDP,53/TCP,9153/TCP 29h
Get pod names in namespace kubesphere-system:
[kuber@master kubernetes]$ kubectl -n kubesphere-system get pods
NAME READY STATUS RESTARTS AGE
ks-apiserver-66db8995-z6ttm 1/1 Running 0 28h
ks-console-67f59b8664-7lhpv 1/1 Running 0 27h
ks-controller-manager-bd5fb4db4-ztw8d 1/1 Running 0 28h
ks-installer-5d65c99d54-fr97x 1/1 Running 0 29h
[kuber@master kubernetes]$ kubectl -n kubesphere-system exec -it ks-apiserver-66db8995-z6ttm sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ #
/ #
/ #
/ # nslookup ks-controller-manager.kubesphere-system.svc
Server: 10.233.0.10
Address: 10.233.0.10:53
** server can't find ks-controller-manager.kubesphere-system.svc: NXDOMAIN
** server can't find ks-controller-manager.kubesphere-system.svc: NXDOMAIN
/ #
I tried to restart the 2 deployments, got a shell again, and ran nslookup after that:
[kuber@master kubernetes]$ kubectl -n kubesphere-system get pods
NAME READY STATUS RESTARTS AGE
ks-apiserver-66db8995-z6ttm 1/1 Running 0 28h
ks-console-67f59b8664-7lhpv 1/1 Running 0 27h
ks-controller-manager-bd5fb4db4-ztw8d 1/1 Running 0 28h
ks-installer-5d65c99d54-fr97x 1/1 Running 0 29h
[kuber@master kubernetes]$ kubectl -n kubesphere-system rollout restart deploy ks-controller-manager
deployment.apps/ks-controller-manager restarted
[kuber@master kubernetes]$ kubectl -n kubesphere-system rollout restart deploy ks-apiserver
deployment.apps/ks-apiserver restarted
[kuber@master kubernetes]$ kubectl -n kubesphere-system get pods
NAME READY STATUS RESTARTS AGE
ks-apiserver-5949c88fb6-nl7sg 1/1 Running 0 5s
ks-console-67f59b8664-7lhpv 1/1 Running 0 27h
ks-controller-manager-6cb59f9fd9-c7q8c 1/1 Running 0 14s
ks-installer-5d65c99d54-fr97x 1/1 Running 0 29h
[kuber@master kubernetes]$ kubectl -n kubesphere-system exec -it ks-apiserver-5949c88fb6-nl7sg sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # nslookup ks-controller-manager.kubesphere-system.cluster.local 10.233.0.10
Server: 10.233.0.10
Address: 10.233.0.10:53
** server can't find ks-controller-manager.kubesphere-system.cluster.local: NXDOMAIN
** server can't find ks-controller-manager.kubesphere-system.cluster.local: NXDOMAIN
/ #
It seems that CoreDNS isn't working at all :(
I forgot to mention: I changed /etc/resolv.conf on my host, but it didn't change anything.
My mistake, the DNS name should be "ks-controller-manager.kubesphere-system.svc.cluster.local". You can use the full DNS name both in the host shell and in a pod shell:
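In-cluster service names follow the pattern <service>.<namespace>.svc.<cluster-domain>; short forms like "ks-controller-manager.kubesphere-system.svc" only resolve inside a pod via the search domains kubelet writes into /etc/resolv.conf, while a query sent directly to the DNS server needs the full name. A minimal sketch composing the name (cluster.local is the kubeadm default; check your kubelet clusterDomain if you changed it):

```shell
#!/bin/sh
# Compose the fully qualified in-cluster DNS name of a Service.
svc=ks-controller-manager
ns=kubesphere-system
domain=cluster.local   # kubeadm default; an assumption for other setups
fqdn="${svc}.${ns}.svc.${domain}"
echo "$fqdn"
# Query it against the cluster DNS service IP seen in this thread:
# nslookup "$fqdn" 10.233.0.10
```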
[kuber@master kubernetes]$ nslookup ks-controller-manager.kubesphere-system.svc.cluster.local 10.233.0.10
Server: 10.233.0.10
Address: 10.233.0.10#53
Name: ks-controller-manager.kubesphere-system.svc.cluster.local
Address: 10.233.5.229
[kuber@master kubernetes]$ kubectl -n kubesphere-system exec -it ks-apiserver-5949c88fb6-nl7sg sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # nslookup ks-controller-manager.kubesphere-system.svc.cluster.local 10.233.0.10
Server: 10.233.0.10
Address: 10.233.0.10:53
Name: ks-controller-manager.kubesphere-system.svc.cluster.local
Address: 10.233.5.229
That looks good. Can you check /etc/resolv.conf in the "ks-apiserver-5949c88fb6-nl7sg" pod?
You can also try to issue an HTTPS request, like "curl -k https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2", or with the full DNS name. It should return something like {"code":400}. You may need to install curl first with "apk add curl".
/ # curl -k https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2
{"response":{"uid":"","allowed":false,"status":{"metadata":{},"message":"contentType=, expected application/json","code":400}}}
Yep, it returns exactly 400.
All the services seem OK, so logging in as admin and creating users should work... But OK, maybe restarting the services could fix the issue:
kubectl -n kubesphere-system rollout restart deployment ks-apiserver
kubectl -n kubesphere-system rollout restart deployment ks-controller-manager
kubectl -n kube-system delete pod kube-apiserver-node1
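The two rollout restarts above can also be combined into a small script that waits for each rollout to complete before moving on, so you know the new pods actually came up (the function name and the --timeout value are my own choices, not from the thread):

```shell
#!/bin/sh
# Restart each named deployment in a namespace and block until the
# new replica set is ready, instead of restarting blindly.
restart_and_wait() {
  ns="$1"; shift
  for d in "$@"; do
    kubectl -n "$ns" rollout restart "deployment/$d" || return 1
    kubectl -n "$ns" rollout status "deployment/$d" --timeout=120s || return 1
  done
}

# Typical use for the deployments in this thread:
# restart_and_wait kubesphere-system ks-apiserver ks-controller-manager
```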
I ran
kubectl -n kubesphere-system rollout restart deployment ks-apiserver
kubectl -n kubesphere-system rollout restart deployment ks-controller-manager
but I can't run "kubectl -n kube-system delete pod kube-apiserver-node1":
[kuber@p7701v02k8s-master04 kubernetes]$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-8f59968d4-9qtwn 1/1 Running 0 47h
calico-node-5vbgh 1/1 Running 0 47h
calico-node-qpwb4 1/1 Running 0 47h
calico-node-qxzh7 1/1 Running 0 47h
coredns-86cfc99d74-5grtd 1/1 Running 0 47h
coredns-86cfc99d74-r9856 1/1 Running 0 47h
kube-apiserver-master 1/1 Running 0 47h
kube-controller-manager-master 1/1 Running 0 47h
kube-proxy-28t7w 1/1 Running 0 47h
kube-proxy-gt7pb 1/1 Running 0 47h
kube-proxy-nsndz 1/1 Running 0 47h
kube-scheduler-master 1/1 Running 0 47h
openebs-localpv-provisioner-7cfc686bc5-s8r6x 1/1 Running 0 47h
snapshot-controller-0 1/1 Running 0 47h
[kuber@p7701v02k8s-master04 kubernetes]$ kubectl -n kube-system delete pod kube-apiserver-node1
Error from server (NotFound): pods "kube-apiserver-node1" not found
Should I delete kube-apiserver-master?
I tried to log on and got the same x509 certificate error :(
I don't know... maybe it's important, but I access the internet through a proxy server; that shouldn't affect this, though.
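On the question of deleting kube-apiserver-master: it is a static pod, managed by the kubelet from the manifests in /etc/kubernetes/manifests, so "kubectl delete pod" only removes the mirror pod object and the kubelet recreates it right away. If a hard restart of a static pod is really wanted, the usual trick is to move its manifest out of the directory and back; a sketch (the helper name is made up, the path is the kubeadm default, and the 20 s wait matches the kubelet's default file-check interval):

```shell
#!/bin/sh
# Hypothetical helper: force-restart a static pod by moving its manifest
# out of the kubelet manifest directory and back. The kubelet tears the
# pod down when the file disappears and recreates it when it returns.
bounce_static_pod() {
  manifest="$1"
  wait="${2:-20}"   # seconds to leave the manifest out
  tmp="$(dirname "$manifest")/../$(basename "$manifest").off"
  mv "$manifest" "$tmp" || return 1
  sleep "$wait"
  mv "$tmp" "$manifest"
}

# Typical use on a kubeadm control-plane node:
# bounce_static_pod /etc/kubernetes/manifests/kube-apiserver.yaml
```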
Hi KubeSphere team!
I followed the manual to deploy KubeSphere and installed all of the dependencies. I created the config with
./kk create config --with-kubesphere v3.1.0
with Kubernetes version v1.20.4 (I also tried v1.19.8). Then I ran
sudo ./kk create cluster -f config-sample.yaml
and received an error. But if I just copy
sudo env PATH=$PATH /bin/sh -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crict
and run it from the terminal, I don't see any problem. I tried to delete and recreate the cluster several times, but received the same error again and again...
After copying the config file to $HOME/.kube, joining the worker nodes with kubeadm join, and running
kubectl get nodes
I see only nodes in NotReady status... and nothing changes. OS: Red Hat Enterprise Linux Server 7.9 (Maipo)
Can you help me with this error?
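For the preflight "command not found" error discussed earlier in this thread, it may help to check the tools the same way kk invokes them, i.e. with root's PATH via "sudo env PATH=$PATH /bin/sh -c ...". A small sketch (the function name is my own) that prints any required tool not resolvable on the current PATH:

```shell
#!/bin/sh
# Reproduce kk's preflight lookup: report which of the given
# commands are not found on PATH.
check_tools() {
  missing=""
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  [ -n "$missing" ] && echo "missing:$missing"
  return 0
}

# The tools kk's preflight checks complained about in this thread.
# Run the whole script under "sudo env PATH=$PATH /bin/sh" so that
# root's PATH is the one being tested.
check_tools ip iptables ebtables ethtool tc
```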