MonaxGT closed this issue 3 years ago.
Maybe the commands ip, iptables, ebtables, ethtool, tc are not found in your system path. You can check them first.
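A quick way to act on this advice is to loop over the binaries that kubeadm's preflight checks look for; a minimal sketch (the command list is taken from the preflight warnings later in this thread):

```shell
# Check that each binary kubeadm's preflight looks for is resolvable.
for cmd in ip iptables ebtables ethtool tc; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: $(command -v "$cmd")"
  else
    echo "$cmd: NOT FOUND"
  fi
done
```

Running this once as your normal user and once under `sudo env PATH=$PATH /bin/sh` shows whether sudo sees a different PATH.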
Hi! I have all of these commands, and I checked them first. I really don't know why the script reports this:
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "iptables"
iptables v1.4.21: no command specified
Try `iptables -h' or 'iptables --help' for more information.
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "ebtables"
ebtables v2.0.10-4 (December 2011)
Usage:
ebtables -[ADI] chain rule-specification [options]
ebtables -P chain target
ebtables -[LFZ] [chain]
ebtables -[NX] [chain]
ebtables -E old-chain-name new-chain-name
Commands:
--append -A chain : append to chain
--delete -D chain : delete matching rule from chain
--delete -D chain rulenum : delete rule at position rulenum from chain
--change-counters -C chain
[rulenum] pcnt bcnt : change counters of existing rule
--insert -I chain rulenum : insert rule at position rulenum in chain
--list -L [chain] : list the rules in a chain or in all chains
--flush -F [chain] : delete all rules in chain or in all chains
--init-table : replace the kernel table with the initial table
--zero -Z [chain] : put counters on zero in chain or in all chains
--policy -P chain target : change policy on chain to target
--new-chain -N chain : create a user defined chain
--rename-chain -E old new : rename a chain
--delete-chain -X [chain] : delete a user defined chain
--atomic-commit : update the kernel w/t table contained in <FILE>
--atomic-init : put the initial kernel table into <FILE>
--atomic-save : put the current kernel table into <FILE>
--atomic-file file : set <FILE> to file
Options:
--proto -p [!] proto : protocol hexadecimal, by name or LENGTH
--src -s [!] address[/mask]: source mac address
--dst -d [!] address[/mask]: destination mac address
--in-if -i [!] name[+] : network input interface name
--out-if -o [!] name[+] : network output interface name
--logical-in [!] name[+] : logical bridge input interface name
--logical-out [!] name[+] : logical bridge output interface name
--set-counters -c chain
pcnt bcnt : set the counters of the to be added rule
--modprobe -M program : try to insert modules using this program
--concurrent : use a file lock to support concurrent scripts
--version -V : print package version
Environment variable:
EBTABLES_ATOMIC_FILE : if set <FILE> (see above) will equal its value
Standard targets: DROP, ACCEPT, RETURN or CONTINUE;
The target can also be a user defined chain.
Supported chains for the filter table:
INPUT FORWARD OUTPUT
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "tc"
Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }
tc [-force] [-OK] -batch filename
where OBJECT := { qdisc | class | filter | action | monitor | exec }
OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] | -p[retty] | -b[atch] [filename] | -n[etns] name |
-nm | -nam[es] | { -cf | -conf } path }
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "ethtool"
ethtool: bad command line argument(s)
For more information run ethtool -h
[k8s-user@k8s-master kubersphere]$ sudo env PATH=$PATH /bin/sh -c "ip"
Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }
ip [ -force ] -batch filename
where OBJECT := { link | address | addrlabel | route | rule | neigh | ntable |
tunnel | tuntap | maddress | mroute | mrule | monitor | xfrm |
netns | l2tp | fou | macsec | tcp_metrics | token | netconf | ila |
vrf }
OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |
-h[uman-readable] | -iec |
-f[amily] { inet | inet6 | ipx | dnet | mpls | bridge | link } |
-4 | -6 | -I | -D | -B | -0 |
-l[oops] { maximum-addr-flush-attempts } | -br[ief] |
-o[neline] | -t[imestamp] | -ts[hort] | -b[atch] [filename] |
-rc[vbuf] [size] | -n[etns] name | -a[ll] | -c[olor]}
I found out that the problem is with the PATH environment variable.
I ran PATH=$PATH sudo ./kk create cluster -f config-sample.yaml
and that helped, but now I get the next error, with a cert:
WARN[23:45:26 MSK] Task failed ...
WARN[23:45:26 MSK] error: Failed to patch kubeadm secret: Failed to exec command: sudo -E /bin/sh -c "/usr/local/bin/kubectl patch -n kube-system secret kubeadm-certs -p '{\"data\": {\"external-etcd-ca.crt\": \"\"}}'"
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes"): Process exited with status 1
Error: Failed to get cluster status: Failed to patch kubeadm secret: Failed to exec command: sudo -E /bin/sh -c "/usr/local/bin/kubectl patch -n kube-system secret kubeadm-certs -p '{\"data\": {\"external-etcd-ca.crt\": \"\"}}'"
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes"): Process exited with status 1
You can try to delete the cluster (./kk delete cluster -f config-sample.yaml) and recreate it.
I've tried that, but it doesn't help.
Check whether $HOME/.kube/config is correct. You can use base64 -d to decode it.
If it is wrong, you can run cp /etc/kubernetes/admin.conf $HOME/.kube/config to replace it.
Or recreate it:
rm $HOME/.kube/config
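The base64 -d check above can be scripted; a minimal sketch, where the helper name and default path are illustrative (not from this thread):

```shell
# Decode the base64 CA certificate embedded in a kubeconfig and print its
# subject and expiry, to compare against /etc/kubernetes/pki/ca.crt.
inspect_kubeconfig_ca() {
  cfg="${1:-$HOME/.kube/config}"
  grep 'certificate-authority-data' "$cfg" \
    | awk '{print $2}' \
    | base64 -d \
    | openssl x509 -noout -subject -enddate
}
# usage: inspect_kubeconfig_ca /etc/kubernetes/admin.conf
```

If the printed subject does not match the cluster CA, the "certificate signed by unknown authority" error above is the expected symptom.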
I think there may be two solutions:
1. Use root to perform the installation.
2. Modify secure_path in /etc/sudoers:
Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
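To see why plain sudo behaves differently from your shell: when secure_path is set, sudo discards the caller's PATH and substitutes its own list. A sketch that simulates that reset with env -i (no sudo call is made; the directory list mirrors the secure_path above):

```shell
# Simulate sudo's PATH reset: env -i starts from a scrubbed environment, the
# way secure_path replaces whatever PATH the caller had.
env -i PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  /bin/sh -c 'echo "effective PATH: $PATH"'
```

If a binary lives outside these directories, it will be "not found" under sudo even though your interactive shell finds it.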
Thanks, but that doesn't help either... Maybe something useful is in this part of the log:
Push /opt/kubersphere/kubekey/v1.20.4/amd64/helm to 10.10.10.101:/tmp/kubekey/helm Done
Push /opt/kubersphere/kubekey/v1.20.4/amd64/cni-plugins-linux-amd64-v0.8.6.tgz to 10.10.10.101:/tmp/kubekey/cni-plugins-linux-amd64-v0.8.6.tgz Done
INFO[14:34:57 MSK] Initializing kubernetes cluster
[k8s-master 10.10.10.101] MSG:
[preflight] Running pre-flight checks
W0707 14:34:59.015477 93472 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
W0707 14:34:59.018992 93472 cleanupnode.go:99] [reset] Failed to evaluate the "/var/lib/kubelet" directory. Skipping its unmount and cleanup: lstat /var/lib/kubelet: no such file or directory
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
[k8s-master 10.10.10.101] MSG:
[preflight] Running pre-flight checks
W0707 14:34:59.941574 93723 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
W0707 14:34:59.944887 93723 cleanupnode.go:99] [reset] Failed to evaluate the "/var/lib/kubelet" directory. Skipping its unmount and cleanup: lstat /var/lib/kubelet: no such file or directory
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
ERRO[14:35:00 MSK] Failed to init kubernetes cluster: Failed to exec command: sudo env PATH=$PATH /bin/sh -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crictl"
W0707 14:35:00.189390 93769 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.20.4
[preflight] Running pre-flight checks
[WARNING FileExisting-ebtables]: ebtables not found in system path
[WARNING FileExisting-ethtool]: ethtool not found in system path
[WARNING FileExisting-tc]: tc not found in system path
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileExisting-ip]: ip not found in system path
[ERROR FileExisting-iptables]: iptables not found in system path
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1 node=10.10.10.101
WARN[14:35:00 MSK] Task failed ...
WARN[14:35:00 MSK] error: interrupted by error
BUT if I run this command manually, it finishes successfully... I don't know why...
[root@k8s-master kubersphere]# sudo env PATH=$PATH /bin/sh -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crictl"
W0707 14:46:57.724253 106978 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.20.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master k8s-master.local k8s-node1 k8s-node1.local k8s-node2 k8s-node2.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.local lb.kubesphere.local localhost] and IPs [10.233.0.1 10.10.10.101 127.0.0.1 10.10.10.102 10.10.10.103]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 53.501607 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: tototнt
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join lb.kubesphere.local:6443 --token tototнt \
--discovery-token-ca-cert-hash sha256:3333 \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join lb.kubesphere.local:6443 --token tototнt \
--discovery-token-ca-cert-hash sha256:3333
It's really strange... as if kk doesn't read some paths. Could it be because my Kubernetes user has a "-" in the name? Maybe some logic in kk is broken, or something else?
I tried switching to a user without "-" and gave that user sudoers permissions, but I got the same error.
It should have nothing to do with the user name.
Is it the same error when installing directly as root?
Yes, I used sudo su and ran it that way.
Can you run which ip and echo $PATH on your master01 node? Do you mind if we take a look at the output?
[kuberman@master kubersphere]$ which ip
/usr/sbin/ip
[kuberman@master kubersphere]$ echo $PATH
/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/monaxgt/.local/bin:/home/monaxgt/bin
I don't know how my base user got in there... I created the user kuberman with the useradd command.
Run visudo and make sure it includes your kuberman:
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
kuberman ALL=(ALL) ALL
--//--
#Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin
Defaults secure_path = /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
## Next comes the main part: which users can run what software on
## which machines (the sudoers file can be shared between multiple
## systems).
## Syntax:
##
## user MACHINE=COMMANDS
##
## The COMMANDS section may have other options added to it.
##
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
kuberman ALL=(ALL) ALL
--//--
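After editing, it is worth validating the sudoers syntax before logging out; a sketch using visudo -cf on a scratch copy (the fragment mirrors the lines above, and the check is skipped if visudo is unavailable):

```shell
# Validate a sudoers fragment without touching /etc/sudoers.
cat > /tmp/sudoers.check <<'EOF'
Defaults secure_path = /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
kuberman ALL=(ALL) ALL
EOF
if command -v visudo >/dev/null 2>&1; then
  visudo -cf /tmp/sudoers.check   # reports "parsed OK" on success
else
  echo "visudo not available; skipping syntax check"
fi
```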
Maybe the file ~/.bashrc doesn't load the $PATH correctly on the master nodes. I'm not sure... Try adding a new line export PATH=$PATH at the end of ~/.bashrc.
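A sketch of that suggestion which stays idempotent across runs (the extra sbin directories are an assumption; the advice above only adds export PATH=$PATH):

```shell
# Append a PATH export to a profile file only if it is not already present.
append_path_line() {
  profile="$1"
  line='export PATH=$PATH:/usr/local/sbin:/usr/sbin:/sbin'
  grep -qxF "$line" "$profile" 2>/dev/null || echo "$line" >> "$profile"
}
# usage: append_path_line "$HOME/.bashrc"
```

The grep -qxF guard prevents the same line from piling up if the command is run repeatedly.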
No, it still shows the same error...
I continued deploying the cluster after the error. I recently wrote that my cluster had status NotReady, and I tried to understand why. I found that I had to initialize cluster networking. I ran kubectl -n kube-system apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml
and my cluster reached Ready status.
But that doesn't help to deploy KubeSphere.
If you want to install KubeSphere on an existing k8s cluster, you can use another tool: ks-installer.
I just want to deploy KubeSphere.
I tried to run "create cluster" after a successful k8s cluster deployment and finally saw the KubeSphere web UI, but I got another error.
Your link to ks-installer goes to the https://github.com/wenyan-lang/wenyan project; maybe a misclick?
Sorry, my mistake. I've already edited it.
Ok, thanks. I will try.
Any idea how to get past this error? I found only one related issue, https://github.com/kubesphere/ks-installer/issues/33. I tried to translate it, but not all of the words were translated, or translated correctly.
Is there any unavailable DNS in your K8s cluster?
You can also verify the Pod status in the namespace kubesphere-system to check whether all Pods are running.
Sorry, I'm not a professional DevOps, so I don't know... I think Kubernetes deploys the coredns service, and that service acts as the internal DNS, like in Docker. Or do you mean another DNS?
My /etc/resolv.conf
[root@master kubersphere]# cat /etc/resolv.conf
# Generated by NetworkManager
search net.home.ru
nameserver 172.16.0.202
nameserver 172.16.0.203
[root@master kubersphere]# kubectl get svc/ks-console -n kubesphere-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ks-console NodePort 10.233.21.108 <none> 80:30880/TCP 11h
If I'm not mistaken, that means everything works fine.
Exec the command kubectl get svc -n kube-system to check that there is only one DNS svc in your cluster.
[kuberman@master monaxgt]$ kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-controller-manager-svc ClusterIP None <none> 10257/TCP 12h
kube-dns ClusterIP 10.233.0.10 <none> 53/UDP,53/TCP,9153/TCP 12h
kube-scheduler-svc ClusterIP None <none> 10259/TCP 12h
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 12h
I went into the sh shell of container k8s_ks-controller-manager_ks-controller-manager-d84f68f46-r4gjx_kubesphere-system_583b98e5-7221-4d2f-8ab5-9aff9f16bdc9_0
and tried to look up this DNS name:
/ # nslookup ks-apiserver.kubesphere-system.svc
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find ks-apiserver.kubesphere-system.svc: NXDOMAIN
** server can't find ks-apiserver.kubesphere-system.svc: NXDOMAIN
In /etc/kubernetes/kubeadm-config.yaml I find:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
- 169.254.25.10
But in the output above we can see: kube-dns ClusterIP 10.233.0.10
This is the result when a cluster is created successfully by kk:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 4h33m
kube-controller-manager-svc ClusterIP None <none> 10257/TCP 4h25m
kube-scheduler-svc ClusterIP None <none> 10259/TCP 4h25m
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 4h24m
kk deletes kube-dns and creates coredns. So maybe you can try that manually: delete kube-dns and apply this coredns-svc.yaml file:
---
apiVersion: v1
kind: Service
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "coredns"
    addonmanager.kubernetes.io/mode: Reconcile
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.233.0.3
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
Thanks! I applied this conf and now see coredns, but kube-dns is still there:
[root@master kubersphere]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 8m38s
kube-controller-manager-svc ClusterIP None <none> 10257/TCP 13h
kube-dns ClusterIP 10.233.0.10 <none> 53/UDP,53/TCP,9153/TCP 13h
kube-scheduler-svc ClusterIP None <none> 10259/TCP 13h
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 13h
[root@master kubersphere]#
I tried to remove kube-dns:
[root@master kubersphere]# kubectl delete --namespace=kube-system deployment kube-dns
Error from server (NotFound): deployments.apps "kube-dns" not found
The KubeSphere UI returns the same error: request to http://ks-apiserver.kubesphere-system.svc/oauth/token failed, reason: getaddrinfo ENOTFOUND ks-apiserver.kubesphere-system.svc
I got into the pod and tried to resolve:
/ # nslookup ks-apiserver.kubesphere-system.svc 10.233.0.10
nslookup: write to '10.233.0.10': Connection refused
;; connection timed out; no servers could be reached
/ # nslookup ks-apiserver.kubesphere-system.svc
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find ks-apiserver.kubesphere-system.svc: NXDOMAIN
** server can't find ks-apiserver.kubesphere-system.svc: NXDOMAIN
/ # cat /etc/resolv.conf
nameserver 169.254.25.10
search kubesphere-system.svc.local svc.local local home.ru
options ndots:5
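The resolv.conf above also explains the NXDOMAIN: with ndots:5, any name with fewer than 5 dots is first expanded through the search list. A sketch of the candidate names the resolver tries for the failing lookup (pure illustration, no DNS queries are made; the search domains are copied from the output above):

```shell
# Enumerate the names the resolver tries for a short name under ndots:5.
name="ks-apiserver.kubesphere-system.svc"   # only 2 dots, so < ndots
for dom in kubesphere-system.svc.local svc.local local home.ru; do
  echo "try: $name.$dom"
done
echo "try: $name."   # the literal name is tried last
```

None of the expanded candidates is a real service record here, which matches the NXDOMAIN seen in the pod.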
There are 2 DNS svcs in your cluster. Try to delete the svc kube-dns.
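The duplicate-DNS state is easy to spot mechanically; a minimal sketch (the helper name is illustrative, and the two service names come from the listings in this thread):

```shell
# Count cluster DNS services in a `kubectl get svc` listing read from stdin.
# A healthy kk cluster should show exactly one of kube-dns / coredns.
count_dns_svcs() {
  awk '$1 == "kube-dns" || $1 == "coredns"' | wc -l
}
# usage: kubectl get svc -n kube-system --no-headers | count_dns_svcs
```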
Hi! I decided to recreate my VMs and set up the nodes with CentOS. It didn't help; I got the same error. My approach was to run kk create cluster, and whenever I hit an error, copy the failing command, run it manually, and then run create cluster again. In the end I got the KubeSphere UI as before, and now I can log in, but I cannot change the password.
Now I can't change the password or create a new user because: Internal error occurred: failed calling webhook "users.iam.kubesphere.io": Post "https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2?timeout=30s": x509: certificate signed by unknown authority
I tried the workaround and patches from this issue, but they didn't help either.
[kuber@k8s-master04 kubernetes]$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-8f59968d4-9qtwn 1/1 Running 0 77m
calico-node-5vbgh 1/1 Running 0 77m
calico-node-qpwb4 1/1 Running 0 77m
calico-node-qxzh7 1/1 Running 0 77m
coredns-86cfc99d74-5grtd 1/1 Running 0 87m
coredns-86cfc99d74-r9856 1/1 Running 0 87m
kube-apiserver-master 1/1 Running 0 87m
kube-controller-manager-master 1/1 Running 0 87m
kube-proxy-28t7w 1/1 Running 0 78m
kube-proxy-gt7pb 1/1 Running 0 87m
kube-proxy-nsndz 1/1 Running 0 78m
kube-scheduler-master 1/1 Running 0 87m
openebs-localpv-provisioner-7cfc686bc5-s8r6x 1/1 Running 0 77m
snapshot-controller-0 1/1 Running 0 62m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.10 <none> 53/UDP,53/TCP,9153/TCP 48m
kube-controller-manager-svc ClusterIP None <none> 10257/TCP 61m
kube-scheduler-svc ClusterIP None <none> 10259/TCP 61m
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 61m
@wansir @RolandMa1986 Could you please help to take a look at this issue?
Hi @MonaxGT, the certificate issue should already be fixed in the v3.1.0 release. Please check your installer's version with the following command first. The image output should be "kubesphere/ks-installer:v3.1.0" if you are trying to deploy KubeSphere 3.1.0.
kubectl -n kubesphere-system get deployments.apps ks-installer -o=jsonpath='{$.spec.template.spec.containers[:1].image}'
For the KubeSphere 3.1.0 release, there are 2 possible issues:
1. Run nslookup ks-controller-manager.kubesphere-system.svc in a ks-apiserver shell session. The IP should match ks-controller-manager's cluster IP from kubectl -n kubesphere-system get service ks-controller-manager.
2. Re-apply the webhook secret and configuration:
kubectl apply -f https://raw.githubusercontent.com/kubesphere/ks-installer/release-3.1/roles/ks-core/prepare/files/ks-init/webhook-secret.yaml
kubectl apply -f https://raw.githubusercontent.com/kubesphere/ks-installer/release-3.1/roles/ks-core/prepare/files/ks-init/iam.kubesphere.io.yaml
Hi @RolandMa1986!
kubectl -n kubesphere-system get deployments.apps ks-installer -o=jsonpath='{$.spec.template.spec.containers[:1].image}'
kubesphere/ks-installer:v3.1.0
[kuber@master kubernetes]$ kubectl exec --stdin --tty -n kube-system kube-apiserver-master -- /bin/sh
OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "/bin/sh": stat /bin/sh: no such file or directory: unknown
command terminated with exit code 126
I don't know how to connect to these pods, but it works with basic pods.
[kuber@master kubernetes]$ kubectl -n kubesphere-system get service ks-controller-manager
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ks-controller-manager ClusterIP 10.233.5.229 <none> 443/TCP 23h
[root@master kubernetes]# kubectl apply -f webhook.yaml
secret/ks-controller-manager-webhook-cert unchanged
[root@master kubernetes]# kubectl apply -f iam.yaml
Warning: admissionregistration.k8s.io/v1beta1 ValidatingWebhookConfiguration is deprecated in v1.16+, unavailable in v1.22+; use admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration
validatingwebhookconfiguration.admissionregistration.k8s.io/users.iam.kubesphere.io unchanged
I think it means that nothing changed about the certs.
The main problem is with the DNS server.
Maybe it will be useful: I used @24sama's advice and applied the config from that post.
You can verify the DNS service with the following command on your host:
nslookup ks-controller-manager.kubesphere-system.cluster.local 10.233.0.10
The "10.233.0.10" IP is the coredns service IP from kubectl -n kube-system get service coredns.
Or get a pod shell session:
kubectl -n kubesphere-system exec -it ks-apiserver-<tab> sh
I notice that you have a search domain "search net.home.ru" in your /etc/resolv.conf. Maybe you can try to delete it. If you are using a DHCP config, please try to set a fixed DNS like 8.8.8.8.
On my host:
[root@master kubernetes]# nslookup ks-controller-manager.kubesphere-system.cluster.local 10.233.0.10
Server: 10.233.0.10
Address: 10.233.0.10#53
** server can't find ks-controller-manager.kubesphere-system.cluster.local: NXDOMAIN
[kuber@master kubernetes]$ kubectl -n kube-system get service coredns
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.10 <none> 53/UDP,53/TCP,9153/TCP 29h
Get pod names in namespace kubesphere-system:
[kuber@master kubernetes]$ kubectl -n kubesphere-system get pods
NAME READY STATUS RESTARTS AGE
ks-apiserver-66db8995-z6ttm 1/1 Running 0 28h
ks-console-67f59b8664-7lhpv 1/1 Running 0 27h
ks-controller-manager-bd5fb4db4-ztw8d 1/1 Running 0 28h
ks-installer-5d65c99d54-fr97x 1/1 Running 0 29h
[kuber@master kubernetes]$ kubectl -n kubesphere-system exec -it ks-apiserver-66db8995-z6ttm sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ #
/ #
/ #
/ # nslookup ks-controller-manager.kubesphere-system.svc
Server: 10.233.0.10
Address: 10.233.0.10:53
** server can't find ks-controller-manager.kubesphere-system.svc: NXDOMAIN
** server can't find ks-controller-manager.kubesphere-system.svc: NXDOMAIN
/ #
I tried to restart the 2 deployments, got a shell again, and ran nslookup after that:
[kuber@master kubernetes]$ kubectl -n kubesphere-system get pods
NAME READY STATUS RESTARTS AGE
ks-apiserver-66db8995-z6ttm 1/1 Running 0 28h
ks-console-67f59b8664-7lhpv 1/1 Running 0 27h
ks-controller-manager-bd5fb4db4-ztw8d 1/1 Running 0 28h
ks-installer-5d65c99d54-fr97x 1/1 Running 0 29h
[kuber@master kubernetes]$ kubectl -n kubesphere-system rollout restart deploy ks-controller-manager
deployment.apps/ks-controller-manager restarted
[kuber@master kubernetes]$ kubectl -n kubesphere-system rollout restart deploy ks-apiserver
deployment.apps/ks-apiserver restarted
[kuber@master kubernetes]$ kubectl -n kubesphere-system get pods
NAME READY STATUS RESTARTS AGE
ks-apiserver-5949c88fb6-nl7sg 1/1 Running 0 5s
ks-console-67f59b8664-7lhpv 1/1 Running 0 27h
ks-controller-manager-6cb59f9fd9-c7q8c 1/1 Running 0 14s
ks-installer-5d65c99d54-fr97x 1/1 Running 0 29h
[kuber@master kubernetes]$ kubectl -n kubesphere-system exec -it ks-apiserver-5949c88fb6-nl7sg sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # nslookup ks-controller-manager.kubesphere-system.cluster.local 10.233.0.10
Server: 10.233.0.10
Address: 10.233.0.10:53
** server can't find ks-controller-manager.kubesphere-system.cluster.local: NXDOMAIN
** server can't find ks-controller-manager.kubesphere-system.cluster.local: NXDOMAIN
/ #
It seems that CoreDNS isn't working at all :(
I forgot to mention: I changed /etc/resolv.conf on my host, but it didn't change anything.
My mistake, the DNS name should be "ks-controller-manager.kubesphere-system.svc.cluster.local". You can use the full DNS name both in the host shell and in a pod shell:
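In-cluster service names follow the pattern <service>.<namespace>.svc.<cluster-domain>; short forms like "ks-controller-manager.kubesphere-system.svc" only resolve inside a pod via the search domains kubelet writes into /etc/resolv.conf, while a query sent directly to the DNS server needs the full name. A minimal sketch composing the name (cluster.local is the kubeadm default; check your kubelet clusterDomain if you changed it):

```shell
#!/bin/sh
# Compose the fully qualified in-cluster DNS name of a Service.
svc=ks-controller-manager
ns=kubesphere-system
domain=cluster.local   # kubeadm default; an assumption for other setups
fqdn="${svc}.${ns}.svc.${domain}"
echo "$fqdn"
# Query it against the cluster DNS service IP seen in this thread:
# nslookup "$fqdn" 10.233.0.10
```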
[kuber@master kubernetes]$ nslookup ks-controller-manager.kubesphere-system.svc.cluster.local 10.233.0.10
Server: 10.233.0.10
Address: 10.233.0.10#53
Name: ks-controller-manager.kubesphere-system.svc.cluster.local
Address: 10.233.5.229
[kuber@master kubernetes]$ kubectl -n kubesphere-system exec -it ks-apiserver-5949c88fb6-nl7sg sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # nslookup ks-controller-manager.kubesphere-system.svc.cluster.local 10.233.0.10
Server: 10.233.0.10
Address: 10.233.0.10:53
Name: ks-controller-manager.kubesphere-system.svc.cluster.local
Address: 10.233.5.229
That looks good. Can you check /etc/resolv.conf in the "ks-apiserver-5949c88fb6-nl7sg" pod?
You can also try to issue an HTTPS request, like "curl -k https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2", or with the full DNS name. It should return something like {"code":400}. You may need to install curl first with "apk add curl".
/ # curl -k https://ks-controller-manager.kubesphere-system.svc:443/validate-email-iam-kubesphere-io-v1alpha2
{"response":{"uid":"","allowed":false,"status":{"metadata":{},"message":"contentType=, expected application/json","code":400}}}
Yep, it returns exactly 400.
All the services seem OK, so logging in as admin and creating users should work... But OK, maybe restarting the services could fix the issue:
kubectl -n kubesphere-system rollout restart deployment ks-apiserver
kubectl -n kubesphere-system rollout restart deployment ks-controller-manager
kubectl -n kube-system delete pod kube-apiserver-node1
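The two rollout restarts above can also be combined into a small script that waits for each rollout to complete before moving on, so you know the new pods actually came up (the function name and the --timeout value are my own choices, not from the thread):

```shell
#!/bin/sh
# Restart each named deployment in a namespace and block until the
# new replica set is ready, instead of restarting blindly.
restart_and_wait() {
  ns="$1"; shift
  for d in "$@"; do
    kubectl -n "$ns" rollout restart "deployment/$d" || return 1
    kubectl -n "$ns" rollout status "deployment/$d" --timeout=120s || return 1
  done
}

# Typical use for the deployments in this thread:
# restart_and_wait kubesphere-system ks-apiserver ks-controller-manager
```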
I ran
kubectl -n kubesphere-system rollout restart deployment ks-apiserver
kubectl -n kubesphere-system rollout restart deployment ks-controller-manager
but I can't run "kubectl -n kube-system delete pod kube-apiserver-node1":
[kuber@p7701v02k8s-master04 kubernetes]$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-8f59968d4-9qtwn 1/1 Running 0 47h
calico-node-5vbgh 1/1 Running 0 47h
calico-node-qpwb4 1/1 Running 0 47h
calico-node-qxzh7 1/1 Running 0 47h
coredns-86cfc99d74-5grtd 1/1 Running 0 47h
coredns-86cfc99d74-r9856 1/1 Running 0 47h
kube-apiserver-master 1/1 Running 0 47h
kube-controller-manager-master 1/1 Running 0 47h
kube-proxy-28t7w 1/1 Running 0 47h
kube-proxy-gt7pb 1/1 Running 0 47h
kube-proxy-nsndz 1/1 Running 0 47h
kube-scheduler-master 1/1 Running 0 47h
openebs-localpv-provisioner-7cfc686bc5-s8r6x 1/1 Running 0 47h
snapshot-controller-0 1/1 Running 0 47h
[kuber@p7701v02k8s-master04 kubernetes]$ kubectl -n kube-system delete pod kube-apiserver-node1
Error from server (NotFound): pods "kube-apiserver-node1" not found
Should I delete kube-apiserver-master?
I tried to log on and got the same x509 certificate error :(
I don't know... maybe it's important, but I access the internet through a proxy server; that shouldn't affect this, though.
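On the question of deleting kube-apiserver-master: it is a static pod, managed by the kubelet from the manifests in /etc/kubernetes/manifests, so "kubectl delete pod" only removes the mirror pod object and the kubelet recreates it right away. If a hard restart of a static pod is really wanted, the usual trick is to move its manifest out of the directory and back; a sketch (the helper name is made up, the path is the kubeadm default, and the 20 s wait matches the kubelet's default file-check interval):

```shell
#!/bin/sh
# Hypothetical helper: force-restart a static pod by moving its manifest
# out of the kubelet manifest directory and back. The kubelet tears the
# pod down when the file disappears and recreates it when it returns.
bounce_static_pod() {
  manifest="$1"
  wait="${2:-20}"   # seconds to leave the manifest out
  tmp="$(dirname "$manifest")/../$(basename "$manifest").off"
  mv "$manifest" "$tmp" || return 1
  sleep "$wait"
  mv "$tmp" "$manifest"
}

# Typical use on a kubeadm control-plane node:
# bounce_static_pod /etc/kubernetes/manifests/kube-apiserver.yaml
```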
Hi KubeSphere team!
I followed the manual to deploy KubeSphere and installed all of the dependencies. I created the config with
./kk create config --with-kubesphere v3.1.0
with Kubernetes version v1.20.4 (I also tried v1.19.8). Then I ran
sudo ./kk create cluster -f config-sample.yaml
and received an error. But if I just copy
sudo env PATH=$PATH /bin/sh -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crict
and run it from the terminal, I don't see any problem. I tried to delete and recreate the cluster several times, but received the same error again and again...
After copying the config file to $HOME/.kube, joining the worker nodes with kubeadm join, and running
kubectl get nodes
I see only nodes in NotReady status... and nothing changes. OS: Red Hat Enterprise Linux Server 7.9 (Maipo)
Can you help me with this error?
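For the preflight "command not found" error discussed earlier in this thread, it may help to check the tools the same way kk invokes them, i.e. with root's PATH via "sudo env PATH=$PATH /bin/sh -c ...". A small sketch (the function name is my own) that prints any required tool not resolvable on the current PATH:

```shell
#!/bin/sh
# Reproduce kk's preflight lookup: report which of the given
# commands are not found on PATH.
check_tools() {
  missing=""
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  [ -n "$missing" ] && echo "missing:$missing"
  return 0
}

# The tools kk's preflight checks complained about in this thread.
# Run the whole script under "sudo env PATH=$PATH /bin/sh" so that
# root's PATH is the one being tested.
check_tools ip iptables ebtables ethtool tc
```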