Closed RaMaTHA closed 8 months ago
I'm trying to set up kubespray on bare metal. I have two servers to add to my cluster (one master, one client).
Your inventory shows a single host. Where is your second one?
Now I keep getting an error, which is different not only on the latest release (release-2.24), but also on earlier builds (e.g. release-2.20 or the current master branch).
What do you mean? Which versions have the problem, and which versions don't?
/triage needs-information
Your inventory shows a single host. Where is your second one?
Currently, and in my report, I have only one node enabled at a time (here, node1). If I disable node1 and enable node2 in my inventory.ini, I can run the script without any errors.
For example, right now it is (simplified, short view):
[all]
node1 ansible_host=<host-ip>
# node2 ansible_host=<client-ip>
[kube_control_plane]
node1
[etcd]
node1
[kube_node]
node1
This example doesn't work. But when I run the script from my host with only my client enabled in the inventory, it works.
[all]
# node1 ansible_host=<host-ip>
node2 ansible_host=<client-ip>
[kube_control_plane]
node2
[etcd]
node2
[kube_node]
node2
However, when I add both servers (host-node and client) to the setup, I also get the error as shown in the error description.
My inventory with both nodes enabled.
[all]
node1 ansible_host=<host-ip>
node2 ansible_host=<client-ip>
[kube_control_plane]
node1
[etcd]
node1
[kube_node]
node2
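Since the three inventories above differ only in which lines are commented out, it can help to print the hosts that are actually active before each run. The following is a minimal sketch, assuming an INI-style inventory like the ones shown here; the function name `list_inventory_hosts` is hypothetical and not part of Kubespray.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kubespray): print "group host" for every
# uncommented host line of an INI-style inventory file.
list_inventory_hosts() {
  awk '
    /^[ \t]*[#;]/ { next }   # skip commented-out hosts
    /^[ \t]*$/    { next }   # skip blank lines
    /^\[/ {                  # a [group] header starts a new section
      group = $1
      sub(/^\[/, "", group)
      sub(/\].*$/, "", group)
      next
    }
    group != "" { print group, $1 }   # first field is the host name
  ' "$1"
}
```

Calling `list_inventory_hosts inventory/mycluster/inventory.ini` (the path is an assumption) would print one line per group membership, so a commented-out node simply does not appear. Sections such as `[k8s_cluster:children]` would be printed with their child group names in the host column.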
So the question here is, why did it work perfectly the first time, but not the second time?
What do you mean? Which versions have the problem, and which versions don't?
I could not find a version that does not have the problem. The error persists from release-2.20 to master.
If you're running with these 3 inventories in order, you'll be building two separate 1-node clusters, then trying to build one 2-node cluster. This probably does not work at all.
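One way to avoid stacking deployments like this is to tear the previous one down before deploying with a changed inventory. Kubespray ships a `reset.yml` playbook for that purpose. The sketch below wraps both steps; the function name `redeploy_cluster` and the `DRY_RUN` switch are hypothetical, and whether a reset alone would have resolved this particular report is not confirmed in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: reset the old deployment, then deploy again.
# With DRY_RUN=1 the commands are only printed, not executed.
redeploy_cluster() {
  local inventory="$1"
  local run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi

  # reset.yml tears down what an earlier cluster.yml run installed.
  $run ansible-playbook -i "$inventory" --become --become-user=root reset.yml || return 1
  # cluster.yml then builds the cluster described by the current inventory.
  $run ansible-playbook -i "$inventory" --become --become-user=root cluster.yml
}
```

Running `DRY_RUN=1 redeploy_cluster inventory/mycluster/hosts.yaml` first shows the two commands without touching the hosts. The inventory passed to `reset.yml` should still list the nodes of the old deployment, otherwise they are not cleaned.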
For example, right now it is (simplified, short view):
I think you simplified too much. Could you give the full list of commands you ran to get the error, starting from a blank state, and all the files involved (including your inventory variables)?
I reset the server and removed the repo, so I have a clean new system. But I am still struggling with the same error. Just to be clear, I didn't misconfigure anything. These are the exact steps I did:
git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray
git checkout release-2.24
VENVDIR=kubespray-venv
KUBESPRAYDIR=kubespray
python3 -m venv $VENVDIR
source $VENVDIR/bin/activate
pip install -U -r requirements.txt
cp -rfp inventory/sample inventory/mycluster
declare -a IPS=(<host ip-address>)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
echo "$(cat ~/.ssh/id_rsa.pub)" >> ~/.ssh/authorized_keys
[all]
node1 ansible_host=<host-ip>
# node2 ansible_host=<client-ip>
[kube_control_plane]
node1
[etcd]
node1
[kube_node]
node1
- `ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml -kK`
And nearly at the end of the script, I ran into the openssl error.
My openssl version:
- `openssl version`: `OpenSSL 3.0.2 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)`
I figured out what caused the error.
After removing all Docker containers (running and stopped) with `sudo docker rm -f $(sudo docker ps -qa)`, it worked again.
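The one-liner above fails with a usage error when there are no containers at all, and it removes every container on the host, not only those left over from an earlier Kubespray run. A slightly more defensive sketch follows; `DOCKER_CMD`, `count_containers` and `remove_all_containers` are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers around the cleanup that resolved this issue.
# DOCKER_CMD can be overridden, e.g. DOCKER_CMD=docker when sudo is not needed.
DOCKER_CMD="${DOCKER_CMD:-sudo docker}"

# Number of containers on the host, running or stopped.
count_containers() {
  $DOCKER_CMD ps -qa | wc -l | tr -d ' '
}

# Force-remove every container on the host. Destructive: this also removes
# containers that have nothing to do with Kubernetes.
remove_all_containers() {
  local ids
  ids="$($DOCKER_CMD ps -qa)"
  if [ -z "$ids" ]; then
    echo "no containers to remove"
    return 0
  fi
  # $ids is left unquoted on purpose so each ID becomes its own argument.
  $DOCKER_CMD rm -f $ids
}
```

Checking `count_containers` before re-running `cluster.yml` on a host that was used before gives a quick hint whether leftovers are present; `remove_all_containers` then does the same job as the command above.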
@VannTen Thanks for your help.
What happened?
I'm trying to set up kubespray on bare metal. I have two servers to add to my cluster (one master, one client). The setup went smoothly the first time I ran the script (last week). Unfortunately, I had to run the setup a second time on the same server.
Now I keep getting an error, which is different not only on the latest release (release-2.24), but also on earlier builds (e.g. release-2.20 or the current master branch). The error I get is related to openssl, where it tries to generate an x509 certificate for the API server (see error description).
What did you expect to happen?
I would have expected the script to run normally as it did before.
How can we reproduce it (as minimally and precisely as possible)?
I just followed the instructions in the readme (README.md).
OS
Linux 5.15.0-94-generic x86_64
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Version of Ansible
ansible [core 2.15.9]
  config file = /home/kubespray/kubespray/ansible.cfg
  configured module search path = ['/home/kubespray/kubespray/library']
  ansible python module location = /home/kubespray/kubespray/kubespray-venv/lib/python3.10/site-packages/ansible
  ansible collection location = /home/kubespray/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/kubespray/kubespray/kubespray-venv/bin/ansible
  python version = 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (/home/kubespray/kubespray/kubespray-venv/bin/python3)
  jinja version = 3.1.2
  libyaml = True
Version of Python
Python 3.10.12
Version of Kubespray (commit)
aeaa04ca8
Network plugin used
calico
Full inventory with variables
My inventory file:
Command used to invoke ansible
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
Output of ansible run
Anything else we need to know
I have found that the error only occurs when I add my master to the cluster (where I run the script). If I only add my second server (client node) to the cluster, it still works. Does this have anything to do with my setup?