Good news. I think I figured it out.
I tried a lot of things. I tested Kubernetes 1.13 and 1.12, different versions of Flannel, and so on. I even tried the Amazon VPC CNI. It all worked fine until I terminated my instance and tried to restore from a backup.
I eventually found out that Docker uses 172.17.0.0/16 for its internal bridge network, which coincidentally overlapped with my VPC CIDR range. I immediately thought this was the problem, so I recreated my VPC with a 10.0.0.0/16 range. However, I still had the same issue. Bummer.
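In case it helps anyone else with an overlap like this, here is roughly how you can check which subnet Docker's default bridge is on and move it somewhere else. The `192.168.5.1/24` value below is just an example, not something this project requires:

```sh
# Check which subnet Docker's default bridge network is using
docker network inspect bridge --format '{{(index .IPAM.Config 0).Subnet}}'

# If it overlaps with your VPC, move the bridge by setting "bip" in
# /etc/docker/daemon.json (the subnet here is only an example value)
echo '{ "bip": "192.168.5.1/24" }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
```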
Eventually I figured out that kube-proxy is responsible for setting up the iptables rules, and it was having issues communicating with the apiserver. It looked like the apiserver was not recognizing kube-proxy's token.
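If you want to see this for yourself, the errors show up in the kube-proxy logs. Something along these lines should surface them (the `k8s-app=kube-proxy` label is what a standard kubeadm install uses):

```sh
# List the kube-proxy pods and pull their recent logs,
# looking for authentication errors against the apiserver
kubectl -n kube-system get pods -l k8s-app=kube-proxy
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=50
```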
There are a lot of keys and certs present in `/etc/kubernetes/pki/`, so I decided to try backing all of them up, not only `ca.crt` and `ca.key`. And incredibly enough, this seems to have worked!
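For reference, backing up the whole directory is simple; something like this sketch (the S3 bucket name is made up, substitute your own backup destination):

```sh
# Archive the entire PKI directory instead of just ca.crt and ca.key
sudo tar -czf /tmp/pki.tar.gz -C /etc/kubernetes pki

# Copy it to wherever your backups live (bucket name is hypothetical)
aws s3 cp /tmp/pki.tar.gz s3://my-backup-bucket/pki.tar.gz
```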
My question now is: how did this ever work? Was the restore path tested thoroughly? I can't see how it could have worked with only the CA backed up.
Hi!
I'm testing out this project now, and it's great! But I have an issue when trying to restore my etcd snapshot after manually killing the instance. I am testing without worker nodes for now.
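For context, the restore step itself looks roughly like this with `etcdctl` (both paths are examples, not necessarily the ones this project uses):

```sh
# Restore an etcd v3 snapshot into a fresh data directory
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir /var/lib/etcd
```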
kubeadm runs successfully, but flannel is crash looping and my other pods are not coming up.
I have modified things a bit, but I think the only significant difference is that I am running Kubernetes 1.14.0. I made sure that my new instance has the same private IP as the old instance.
I have a feeling that this is something iptables-related, but I don't know how to figure this one out. Has anyone else seen this?
And then after a moment, the state changes from `Error` to `CrashLoopBackOff`.
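In case it's useful, these are roughly the checks I've been running so far (the `app=flannel` label matches the stock kube-flannel manifest; adjust if yours differs):

```sh
# Watch the pod states in kube-system
kubectl -n kube-system get pods -w

# Pull flannel's logs to see why it is crash looping
kubectl -n kube-system logs -l app=flannel --tail=30

# Check whether kube-proxy managed to program any iptables rules
sudo iptables-save | grep KUBE- | head
```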