Closed mbolek closed 4 years ago
@mbolek could you try manually setting the version to 3.3.10 for both etcd clusters (cluster.spec.etcdClusters[*].version) and see if that helps?

spec:
  etcdClusters:
  - cpuRequest: 200m
    ...
    version: 3.3.10
  - cpuRequest: 100m
    ...
    version: 3.3.10
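A quick way to sanity-check the edited manifest before rolling the masters (a sketch: `cluster.yaml` here stands in for the output of `kops get cluster -o yaml`, and the two-entry layout mirrors the spec above):

```shell
# Write a stand-in manifest (in practice: kops get cluster -o yaml > cluster.yaml)
cat > cluster.yaml <<'EOF'
spec:
  etcdClusters:
  - name: main
    cpuRequest: 200m
    version: 3.3.10
  - name: events
    cpuRequest: 100m
    version: 3.3.10
EOF

# Both etcd clusters (main and events) should pin the version explicitly,
# so we expect exactly two matching lines.
if [ "$(grep -c 'version: 3.3.10' cluster.yaml)" -eq 2 ]; then
  echo "both etcd clusters pinned"
fi
```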
I think I've got it (and maybe it's related to the general etcd issue with certs?) etcd tries to run with a cert from the EBS volume which has expired :(
[root@ip-172-20-33-210 ~]# openssl x509 -in /mnt/master-vol-0affaaafe8ae78f4d/pki/MbgaZ62t6d-2KJFas1RofQ/peers/me.crt -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 6456588394770812402 (0x599a68efc3394df2)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=etcd-peers-ca-main
        Validity
            Not Before: May 26 06:53:50 2019 GMT
            Not After : May 27 07:44:51 2020 GMT
        Subject: CN=etcd-d
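An expiry like the one above can be caught mechanically with openssl's -checkend flag, which exits non-zero when the cert expires within N seconds (0 = already expired). A sketch, demonstrated on a throwaway self-signed cert since the real path only exists on the master; there you would point CERT at the peer cert on the mounted volume instead:

```shell
# Generate a short-lived demo cert to run the check against
# (on the master, set CERT to the me.crt path shown above instead).
CERT=demo.crt
openssl req -x509 -newkey rsa:2048 -nodes -keyout demo.key -out "$CERT" \
  -days 1 -subj "/CN=etcd-demo" 2>/dev/null

# Show the expiry date, then test whether the cert is already expired.
openssl x509 -in "$CERT" -noout -enddate
if openssl x509 -in "$CERT" -noout -checkend 0 >/dev/null; then
  echo "certificate still valid"
else
  echo "certificate has expired"
fi
```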
@hakman I think I had it set to 3.3.10 in the etcd settings etc. Can't really tell right now as I've tried to roll back but it seems the issue was possibly the certs all along
Great. One more thing: you should probably use kops 1.17.1 to manage your cluster even if you run an older version of k8s. There have been many bug fixes since 1.14 and, if you get into trouble, you may have to switch to a newer version of kops anyway. (Not to be confused with migrating directly to k8s 1.17.)
Yup... it was the certs all along. I got sidetracked by not knowing it would use the cert from the EBS volume :/ I've recreated the cert as described in the current etcd advisory and 1.13.12 stood up. Updating to 1.14.10 now, and I expect it to work. As for 1.17, I planned to do so but understood there were some major changes in kops 1.14 -> 1.15, so I wanted to build up gradually. Will move ASAP. Thanks for a super quick reply @hakman :+1:
1. What kops version are you running? The command kops version will display this information.
   Version 1.14.1 (git-b7c25f9a9)
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
   1.14.10
3. What cloud provider are you using?
   AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
   Upgrading k8s 1.13.12 to 1.14.10 with kops 1.13.2 -> 1.14.1 and terraform. Ran:
5. What happened after the commands executed?
   Cluster failed validation after master rollout.
6. What did you expect to happen?
   Cluster to validate.
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else do we need to know?
   apiserver doesn't start, it fails with:
even though cert seems ok
and etcd boots with:
when I think it should be running 3.3.10 (it's 3.3.10 in the Launch Configuration)
The etcd mounted volumes (main and events) have 3.2.24 in the state file, so maybe that's the reason? I expected it to simply update in place to 3.3.10.
Should I have updated etcd to 3.3.10 manually and then started the k8s upgrade?