Closed fhemberger closed 4 years ago
/sig openstack
Can confirm the same behavior with latest Debian Stretch and Fedora Cloud 29. Any idea what might cause it or how to get to the bottom of this?
Any further logs, configs, etc. that might be helpful for debugging?
Can you check logs from controller manager pod, if it's up and running at that stage? Is it your own OpenStack cloud or are you using a public provider?
kubelet fails to start on the master node (see attached logs), so there are no pods running. Only the external etcd Docker container is up and running at this point.
OpenStack ('Queens' AFAIK) is running on-premises.
The problem happens to me too on Ubuntu 18.04.
kubespray: tag: v2.10.0, origin/release-2.10
I solved it by upgrading the installed Python libraries and running the playbook again:
pip freeze --local | grep -v '^\-e' | cut -d = -f 1 | xargs -n1 sudo pip install -U
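For context, the one-liner above extracts package names from `pip freeze` output (skipping editable installs) and reinstalls each at its latest version. The text-processing part can be seen in isolation; the package list below is sample input for illustration, not taken from the issue:

```shell
# Drop editable ('-e') entries, then keep only the name before '=='.
# (Sample input; in the one-liner this comes from `pip freeze --local`.)
printf 'ansible==2.7.6\n-e git+https://example.org/repo#egg=dev\njinja2==2.9.6\n' \
    | grep -v '^\-e' | cut -d = -f 1
# prints:
# ansible
# jinja2
```

Each extracted name is then passed one at a time to `sudo pip install -U` by `xargs -n1`.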
Unfortunately, this didn't solve it for me. To make sure there are no other side effects, I'm running the entire setup in a Docker container:
FROM ubuntu:18.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
    && apt-get install -y \
        python \
        openssh-client \
        iputils-ping \
        python-pip \
        software-properties-common \
    && rm -rf /var/lib/apt/lists/*
RUN pip install \
    "ansible>=2.7.6" \
    "jinja2>=2.9.6" \
    netaddr \
    "pbr>=1.6" \
    hvac \
    jmespath \
    ruamel.yaml \
    python-openstackclient
RUN mkdir -p /root/ansible
WORKDIR /root/ansible
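One pitfall worth noting in a Dockerfile like this: each RUN instruction executes in a fresh shell, so an `export` inside one RUN does not survive into later instructions; `ENV DEBIAN_FRONTEND=noninteractive` is the persistent form. The same effect can be shown in plain shell (the variable name is made up for the demo):

```shell
# Each `sh -c` is a separate process, just like separate RUN instructions:
sh -c 'export MY_FLAG=noninteractive'   # subshell exits; MY_FLAG dies with it
sh -c 'echo "${MY_FLAG:-unset}"'        # prints: unset
```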
docker build -t ansible-kubespray .
docker run --rm -ti \
    -v $(pwd):/root/ansible \
    -v ~/.ssh/id_rsa:/root/.ssh/id_rsa:ro \
    -v ~/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub:ro \
    ansible-kubespray \
    bash
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
@fhemberger did you find any solution?
@ChoppinBlockParty Kind of. We stopped using kubespray for our setup. 🤷
@fhemberger What did you decide to use instead?
@trydalch Went with RKE (Rancher Kubernetes Engine), but that was over two years ago. There may be other viable solutions as well by now.
We ended up writing our own scripts instead. Our setup is not too complicated and does not change often.
Environment:
Cloud provider or hardware configuration: OpenStack Queens (via Kolla)
OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"): Ubuntu 18.04 LTS, Kernel 4.15.0-23-generic
Version of Ansible (ansible --version): 2.7.9
Kubespray version (commit) (git rev-parse --short HEAD): v2.9.0
Network plugin used:
Copy of your inventory file: https://gist.github.com/fhemberger/15de65d6ba3e1322616f974d7e145917#file-hosts-json (generated from Terraform)
Command used to invoke ansible: From inventory directory:
ansible-playbook -i hosts --become ../../kubespray/cluster.yml
Output of ansible run: https://gist.github.com/fhemberger/15de65d6ba3e1322616f974d7e145917
Anything else we need to know:
kubelet logs:
Terraform config to create resources on OpenStack:
Also happened with Kubespray v2.8.4/Kubernetes v1.12.5, see: https://github.com/kubernetes/kubeadm/issues/1497