canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io
Apache License 2.0

Cannot connect worker node to master #3611

Closed MonkzCode closed 1 year ago

MonkzCode commented 1 year ago

Summary

I have a single-node MicroK8s instance. Its status:

microk8s is running
high-availability: no
  datastore master nodes: 10.10.20.20:19001
  datastore standby nodes: none
addons:
  enabled:
    dashboard          # (core) The Kubernetes dashboard
    dns                # (core) CoreDNS
    ha-cluster         # (core) Configure high availability on the current node
    helm3              # (core) Helm 3 - Kubernetes package manager
    ingress            # (core) Ingress controller for external access
    metrics-server     # (core) K8s Metrics Server for API access to service metrics
    rbac               # (core) Role-Based Access Control for authorisation
  disabled:
    ....

Firewall disabled.

I also installed a new worker node. Its status:

microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    dns                # (core) CoreDNS
    ha-cluster         # (core) Configure high availability on the current node
    helm3              # (core) Helm 3 - Kubernetes package manager
    rbac               # (core) Role-Based Access Control for authorisation
  disabled:
    ....

When I run the microk8s join 10.10.20.20:25000/<redacted> --worker command on the worker node, I get:

Contacting cluster at 10.10.20.20
Traceback (most recent call last):
  File "/snap/microk8s/4175/scripts/cluster/join.py", line 993, in <module>
    join(prog_name="microk8s join")
  File "/snap/microk8s/4175/usr/lib/python3/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/snap/microk8s/4175/usr/lib/python3/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/snap/microk8s/4175/usr/lib/python3/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/snap/microk8s/4175/usr/lib/python3/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/snap/microk8s/4175/scripts/cluster/join.py", line 986, in join
    join_dqlite(connection_parts, verify, worker)
  File "/snap/microk8s/4175/scripts/cluster/join.py", line 762, in join_dqlite
    join_dqlite_worker_node(info, master_ip, master_port, token)
  File "/snap/microk8s/4175/scripts/cluster/join.py", line 846, in join_dqlite_worker_node
    update_cert_auth_kubeproxy(token, info["ca"], master_ip, master_port, hostname_override)
  File "/snap/microk8s/4175/scripts/cluster/join.py", line 423, in update_cert_auth_kubeproxy
    cert = get_client_cert(master_ip, master_port, "kube-proxy", proxy_token, "system:kube-proxy")
  File "/snap/microk8s/4175/scripts/cluster/join.py", line 277, in get_client_cert
    subprocess.check_call(cmd_cert.split(), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
  File "/snap/microk8s/4175/usr/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/snap/microk8s/4175/usr/bin/openssl', 'req', '-new', '-sha256', '-key', '/var/snap/microk8s/current/certs/kube-proxy.key', '-out', '/var/snap/microk8s/current/certs/kube-proxy.csr', '-subj', '/CN=system:kube-proxy']' returned non-zero exit status 1.

In the journal on the master node I see:

microk8s.daemon-cluster-agent[189902]: 2022/12/05 07:50:37 POST 200 "/cluster/api/v2.0/join" 2112 bytes in 833.686178ms

Both servers run CentOS 7 with OpenSSL 1.0.2k-fips.

How can I connect my worker node to the master node? Please help.

neoaggelos commented 1 year ago

Hi, it appears that the openssl commands that generate the kube-proxy certificate (and most likely the second one, for kubelet) fail, and I would assume that OpenSSL 1.0.2k-fips is closely related to this.

Can you please include more details by spawning a shell inside the snap environment and running the command manually? I.e.:

sudo snap run --shell microk8s

# inside the new shell, run the openssl command directly
/snap/microk8s/current/usr/bin/openssl req -new -sha256 -key /var/snap/microk8s/current/certs/kube-proxy.key -out /var/snap/microk8s/current/certs/kube-proxy.csr -subj /CN=system:kube-proxy

The openssl command should fail, but ideally stderr will contain some useful bits to keep debugging things.

MonkzCode commented 1 year ago

Hi! Thanks for the reply, @neoaggelos! The command output is:

Can't open /usr/lib/ssl/openssl.cnf for reading, No such file or directory
139971289392128:error:02001002:system library:fopen:No such file or directory:../crypto/bio/bss_file.c:72:fopen('/usr/lib/ssl/openssl.cnf','r')
139971289392128:error:2006D080:BIO routines:BIO_new_file:no such file:../crypto/bio/bss_file.c:79:
unable to find 'distinguished_name' in config
problems making Certificate Request
139971289392128:error:0E06D06A:configuration file routines:NCONF_get_string:no conf or environment variable:../crypto/conf/conf_lib.c:270:

neoaggelos commented 1 year ago

OK, the missing /usr/lib/ssl/openssl.cnf is some progress on the issue.

Do you know how to locate the default openssl.cnf file on your system? On a deb-based system that would be dpkg -S openssl.cnf; I'm not sure what the equivalent is for yum/dnf.
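
On an RPM-based system such as CentOS 7, the equivalent lookup is presumably rpm -qf (to ask which package owns a path) or yum provides (to search packages for a file), something like:

# ask which package owns the distro's default config path
rpm -qf /etc/pki/tls/openssl.cnf

# or search all packages for any file named openssl.cnf
yum provides '*/openssl.cnf'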

On a related note, can you also check whether the following helps?

sudo snap run --shell microk8s

export OPENSSL_CONF=$SNAP/usr/lib/ssl/openssl.cnf
/snap/microk8s/current/usr/bin/openssl req -new -sha256 -key /var/snap/microk8s/current/certs/kube-proxy.key -out /var/snap/microk8s/current/certs/kube-proxy.csr -subj /CN=system:kube-proxy
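
The idea is that OpenSSL reads its config path from the OPENSSL_CONF environment variable before falling back to the compiled-in default (/usr/lib/ssl/openssl.cnf for this build), so pointing it at the copy shipped inside the snap should sidestep the missing file. Assuming the snap actually ships a config at that path, you can confirm from inside the same shell with:

# check that the snap ships its own openssl.cnf ($SNAP is set by snap run --shell)
ls -l "$SNAP/usr/lib/ssl/openssl.cnf"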

MonkzCode commented 1 year ago

@neoaggelos, I searched for the default and found it only at /etc/pki/tls/openssl.cnf (apart from the /var/lib/snap/ directories, where there are several dirs with an openssl.cnf). I tried your suggestion with no luck, even with export OPENSSL_CONF=/etc/pki/tls/openssl.cnf. The error is:

Can't load /root/.rnd into RNG
140480045253632:error:2406F079:random number generator:RAND_load_file:Cannot open file:../crypto/rand/randfile.c:88:Filename=/root/.rnd

MonkzCode commented 1 year ago

Figured it out: I had to run openssl rand -out .rnd 16, and then I used your suggestion, @neoaggelos, with my path to openssl.cnf. After that the worker node successfully connected to the master. HUGE thanks for your support!
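
For anyone hitting the same thing, the full workaround would look roughly like this. It is a sketch assembled from the steps above, not a verified recipe: the /root/.rnd path comes from the error message (the reporter used a relative .rnd, presumably from root's home directory), and the openssl.cnf path is the CentOS 7 location found earlier.

sudo snap run --shell microk8s

# seed the RNG state file that RAND_load_file complained about
openssl rand -out /root/.rnd 16

# use the distro's config instead of the missing /usr/lib/ssl/openssl.cnf
export OPENSSL_CONF=/etc/pki/tls/openssl.cnf

# the previously failing CSR generation should now succeed
/snap/microk8s/current/usr/bin/openssl req -new -sha256 \
    -key /var/snap/microk8s/current/certs/kube-proxy.key \
    -out /var/snap/microk8s/current/certs/kube-proxy.csr \
    -subj /CN=system:kube-proxy

With the CSR generation working, re-running the microk8s join command in the same environment completed for the reporter.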