Kept the newest playbook code, but went back to an all.yml with:
```yaml
k3s_version: v1.23.4+k3s1
ansible_user: ----
systemd_dir: /etc/systemd/system
system_timezone: "America/New_York"
flannel_iface: "eth0"
apiserver_endpoint: "192.168.1.220"
k3s_token: "some-SUPER-DEDEUPER-secret-password"
extra_server_args: "--no-deploy servicelb --no-deploy traefik"
extra_agent_args: ""
kube_vip_tag_version: "v0.4.4"
metal_lb_speaker_tag_version: "v0.12.1"
metal_lb_controller_tag_version: "v0.12.1"
metal_lb_ip_range: "192.168.1.221-192.168.1.239"
```
The API IP and `kubectl get nodes` are now stable, and Rancher will deploy. However, after
```bash
kubectl expose deployment rancher -n cattle-system --type=LoadBalancer --name=rancher-lb --port=443
```
the external IP is pending forever (see the debugging sketch after the output below).
```
kubectl get all -A
NAMESPACE                   NAME                                          READY   STATUS      RESTARTS      AGE
cattle-fleet-local-system   pod/fleet-agent-699b5fb945-nsxnx              1/1     Running     0             13m
cattle-fleet-system         pod/fleet-controller-784d6fbcd8-hngpn         1/1     Running     0             14m
cattle-fleet-system         pod/gitjob-6b977748fc-7rsh8                   1/1     Running     0             14m
cattle-system               pod/helm-operation-7vpxj                      0/2     Completed   0             14m
cattle-system               pod/helm-operation-8m924                      0/2     Completed   0             13m
cattle-system               pod/helm-operation-fcx2g                      0/2     Completed   0             15m
cattle-system               pod/helm-operation-v89kg                      0/2     Completed   0             14m
cattle-system               pod/rancher-7fd65d9cd6-2f5vv                  1/1     Running     0             17m
cattle-system               pod/rancher-7fd65d9cd6-qlqp7                  1/1     Running     0             17m
cattle-system               pod/rancher-7fd65d9cd6-slnqr                  1/1     Running     0             17m
cattle-system               pod/rancher-webhook-5b65595df9-l5b7x          1/1     Running     0             13m
cert-manager                pod/cert-manager-76d44b459c-kdzqv             1/1     Running     0             18m
cert-manager                pod/cert-manager-cainjector-9b679cc6-wp959    1/1     Running     0             18m
cert-manager                pod/cert-manager-webhook-57c994b6b9-tqgnv     1/1     Running     0             18m
kube-system                 pod/coredns-5789895cd-cbzzm                   1/1     Running     0             56m
kube-system                 pod/kube-vip-ds-4lntv                         1/1     Running     0             56m
kube-system                 pod/kube-vip-ds-j65z7                         1/1     Running     2 (16m ago)   56m
kube-system                 pod/kube-vip-ds-m8vcq                         1/1     Running     0             56m
kube-system                 pod/local-path-provisioner-6c79684f77-zk8jj   1/1     Running     0             56m
kube-system                 pod/metrics-server-7cd5fcb6b7-czzjp           1/1     Running     0             56m
metallb-system              pod/controller-74df79bb55-qvldk               1/1     Running     0             56m
metallb-system              pod/speaker-28fk6                             1/1     Running     0             53m
metallb-system              pod/speaker-2mhzf                             1/1     Running     0             56m
metallb-system              pod/speaker-8zwrg                             1/1     Running     0             53m
metallb-system              pod/speaker-96mb5                             1/1     Running     0             53m
metallb-system              pod/speaker-bmhpn                             1/1     Running     0             56m
metallb-system              pod/speaker-jggcr                             1/1     Running     0             53m
metallb-system              pod/speaker-mr7mc                             1/1     Running     0             53m
metallb-system              pod/speaker-rb8dp                             1/1     Running     0             53m
metallb-system              pod/speaker-rkktx                             1/1     Running     0             53m
metallb-system              pod/speaker-t89s6                             1/1     Running     0             53m
metallb-system              pod/speaker-v7vss                             1/1     Running     0             56m

NAMESPACE             NAME             TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
cattle-fleet-system   service/gitjob   ClusterIP   10.43.59.181

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kube-system   daemonset.apps/kube-vip-ds   3         3         3       3            3

NAMESPACE                   NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
cattle-fleet-local-system   deployment.apps/fleet-agent               1/1     1            1           13m
cattle-fleet-system         deployment.apps/fleet-controller          1/1     1            1           14m
cattle-fleet-system         deployment.apps/gitjob                    1/1     1            1           14m
cattle-system               deployment.apps/rancher                   3/3     3            3           17m
cattle-system               deployment.apps/rancher-webhook           1/1     1            1           13m
cert-manager                deployment.apps/cert-manager              1/1     1            1           18m
cert-manager                deployment.apps/cert-manager-cainjector   1/1     1            1           18m
cert-manager                deployment.apps/cert-manager-webhook      1/1     1            1           18m
kube-system                 deployment.apps/coredns                   1/1     1            1           56m
kube-system                 deployment.apps/local-path-provisioner    1/1     1            1           56m
kube-system                 deployment.apps/metrics-server            1/1     1            1           56m
metallb-system              deployment.apps/controller                1/1     1            1           56m

NAMESPACE                   NAME                                                DESIRED   CURRENT   READY   AGE
cattle-fleet-local-system   replicaset.apps/fleet-agent-699b5fb945              1         1         1       13m
cattle-fleet-local-system   replicaset.apps/fleet-agent-86b78d86bf              0         0         0       13m
cattle-fleet-system         replicaset.apps/fleet-controller-784d6fbcd8         1         1         1       14m
cattle-fleet-system         replicaset.apps/gitjob-6b977748fc                   1         1         1       14m
cattle-system               replicaset.apps/rancher-7fd65d9cd6                  3         3         3       17m
cattle-system               replicaset.apps/rancher-webhook-5b65595df9          1         1         1       13m
cert-manager                replicaset.apps/cert-manager-76d44b459c             1         1         1       18m
cert-manager                replicaset.apps/cert-manager-cainjector-9b679cc6    1         1         1       18m
cert-manager                replicaset.apps/cert-manager-webhook-57c994b6b9     1         1         1       18m
kube-system                 replicaset.apps/coredns-5789895cd                   1         1         1       56m
kube-system                 replicaset.apps/local-path-provisioner-6c79684f77   1         1         1       56m
kube-system                 replicaset.apps/metrics-server-7cd5fcb6b7           1         1         1       56m
metallb-system              replicaset.apps/controller-74df79bb55               1         1         1       56m
```
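A Service stuck in `<pending>` usually means the LoadBalancer controller never assigned it an address. A quick way to check (standard kubectl, nothing playbook-specific):
```bash
# Events on the Service show whether MetalLB tried (and failed) to allocate an IP
kubectl describe svc rancher-lb -n cattle-system
# The MetalLB controller is what does the address assignment
kubectl logs -n metallb-system deploy/controller --tail=50
```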
The API endpoint IP pings stable with everything below. I will try again with a newer k3s, but for now I'm on 1.23.
```bash
helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v1.7.1
```
gets stuck, even though:
```
kubectl get pods --namespace cert-manager
NAME                                     READY   STATUS    RESTARTS      AGE
cert-manager-76d44b459c-wr4bp            1/1     Running   0             3m3s
cert-manager-cainjector-9b679cc6-nnx9m   1/1     Running   0             3m3s
cert-manager-startupapicheck-nrrkb       1/1     Running   2 (58s ago)   3m2s
cert-manager-webhook-57c994b6b9-7w959    1/1     Running   0             3m3s
```
```yaml
k3s_version: v1.23.4+k3s1
ansible_user: ----
systemd_dir: /etc/systemd/system
system_timezone: "America/New_York"
flannel_iface: "eth0"
apiserver_endpoint: "192.168.1.220"
k3s_token: "some-SUPER-DEDEUPER-secret-password"
extra_server_args: "--no-deploy servicelb --no-deploy traefik"
extra_agent_args: ""
kube_vip_tag_version: "v0.5.0"
metal_lb_speaker_tag_version: "v0.13.4"
metal_lb_controller_tag_version: "v0.13.4"
metal_lb_ip_range: "192.168.1.221-192.168.1.239"
```
```
kubectl get pods --namespace cert-manager
NAME                                     READY   STATUS             RESTARTS      AGE
cert-manager-76d44b459c-wr4bp            1/1     Running            0             5m6s
cert-manager-cainjector-9b679cc6-nnx9m   1/1     Running            0             5m6s
cert-manager-startupapicheck-nrrkb       0/1     CrashLoopBackOff   3 (18s ago)   5m5s
cert-manager-webhook-57c994b6b9-7w959    1/1     Running            0             5m6s
```
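To see why the startup check is crash-looping (pod name taken from the output above):
```bash
kubectl logs -n cert-manager cert-manager-startupapicheck-nrrkb
kubectl describe pod -n cert-manager cert-manager-startupapicheck-nrrkb
```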
https://github.com/cert-manager/cert-manager/issues/2773 helps. It turns out things work if I don't skip a line in the docs...
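Presumably the line in question is the CRD install step that the cert-manager docs list right before the Helm install, something like:
```bash
# Install the cert-manager CRDs first (from the v1.7.1 release assets);
# alternatively, pass --set installCRDs=true to helm install
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.7.1/cert-manager.crds.yaml
```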
OK, `kubectl get nodes` and pings to the API are stable using the newest all.yml, and I was able to install cert-manager:
```yaml
k3s_version: v1.24.3+k3s1
ansible_user: ---
systemd_dir: /etc/systemd/system
system_timezone: "America/New_York"
flannel_iface: "eth0"
apiserver_endpoint: "192.168.1.220"
k3s_token: "some-SUPER-DEDEUPER-secret-password"
extra_server_args: "--no-deploy servicelb --no-deploy traefik"
extra_agent_args: ""
kube_vip_tag_version: "v0.5.0"
metal_lb_speaker_tag_version: "v0.13.4"
metal_lb_controller_tag_version: "v0.13.4"
metal_lb_ip_range: "192.168.1.221-192.168.1.239"
```
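One side note on the extra_server_args above: k3s deprecates --no-deploy in favor of --disable, so on newer releases the equivalent would be (worth checking against the k3s docs for your version):
```yaml
extra_server_args: "--disable servicelb --disable traefik"
```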
```
helm install rancher rancher-stable/rancher \
  --namespace cattle-system ....

Error: INSTALLATION FAILED: chart requires kubeVersion: < 1.24.0-0 which is incompatible with Kubernetes v1.24.3+k3s1
```
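Not from the thread, but the constraint is easy to confirm straight from the chart metadata:
```bash
# kubeVersion comes from the chart's Chart.yaml
helm show chart rancher-stable/rancher | grep kubeVersion
```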
Looks like there is a 1.24 Rancher out there: https://github.com/rancher/client-go/releases/tag/v1.24.0-rancher1
but it's not fully ready yet: https://github.com/rancher/rancher/issues/37711
Trying to figure out how to point to it, but I guess I just need to stick with k3s_version: v1.23.4+k3s1 for now. At least I can use the newer kube-vip and MetalLB; not sure what is "different".
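One likely candidate for what's "different": MetalLB v0.13 dropped the ConfigMap-based configuration in favor of CRDs, so the metal_lb_ip_range above ends up expressed as resources roughly like these (resource names here are illustrative):
```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool          # illustrative name
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.221-192.168.1.239
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2            # illustrative name; empty spec advertises all pools
  namespace: metallb-system
```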
Yeah, that did it:
```
NAME: rancher
LAST DEPLOYED: Mon Aug 1 19:10:43 2022
NAMESPACE: cattle-system
STATUS: deployed
```
I suggest setting all.yml back to v1.23.4+k3s1 until a Rancher that supports 1.24 is ready, at least in your main branch.
Rancher is not yet compatible with k3s 1.24. It may be soon, but that is really going to be up to Rancher to make it compatible.
I think support is coming in Rancher 2.6.7; I would check their release notes.
Also, Rancher isn't compatible with the latest cert-manager.
Also, I've been pinging my VIP for over an hour now and it's stable.
After I did a pull I had an unstable ping for two re-deployment rounds, with no errors in the output, before I put up this post. I think something minor was off in my all.yml copy, as later deployments were very stable. I mentioned in a later comment that I ended up getting a stable ping; sorry you wasted time pinging. I can no longer reproduce it.
Re: support coming in 2.6.7: thank you, yes, I found that as well. https://github.com/rancher/rancher/issues/37711
Re: 1.24, I'm just trying to point out that the source copy of all.yml in your repo has a line that installs 1.24. I know a k3s deployment and Rancher are separate, and I don't know the k3s deployment's target, but if people follow your guidance docs & videos, pull/clone the all.yml that points to 1.24, and try to deploy Rancher, they'll run into it. Not a big deal.
I'm just suggesting a known Rancher-compatible version stack/branch for your k3s deployment, as well as a latest-k3s stack. In this case it's only one line, but I think Rancher will always lag behind k3s, and you also mentioned cert-manager versions; there could be other things down the line. I could try to expand an Ansible script for a Rancher-on-top stack and post it, if there's any interest/value.
Thank you for bringing this up, and for all the details.
This link gets a 404, but I did go through the k3s troubleshooting checklist: https://github.com/techno-tim/k3s-ansible/discussions/20
I was able to get this working with an older release; with the most current I get an unstable ping to the VIP IP, and k3s needs to be < v1.24 for Rancher.
Expected Behavior
The VIP endpoint ping should be stable, and Helm should be able to deploy Rancher with the documented commands.
Current Behavior
You can only ping the VIP/API IP intermittently, so `kubectl get nodes` and Helm deployments are hit or miss. The Rancher deployment will fail because k3s is >= 1.24.
Steps to Reproduce
Context (variables)
Operating system: Ubuntu 22
Hardware: 2x dual-Xeon nodes (48 threads, 256 GB RAM), 1x one-liter node with a 10th-gen i5 (64 GB RAM)
Variables Used:
I didn't alter these, save for adding my own token and IPs; they are what is listed in the repo.
all.yml
Hosts
host.ini
Possible Solution
It seems to deploy OK; something just isn't quite the same when it comes to VIP/API IP access. From what I can tell, the version changes are very particular.
I tried taking various nodes offline one at a time; this didn't really help in any repeatable way, so I don't think it's any one node/VM or the physical networking.
I tried just going to an older k3s, which might have helped Rancher, but it didn't help kube-vip/MetalLB. I'll try to see if I can find a kube-vip or MetalLB log (but I don't really know kube-vip or MetalLB).
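Both run as pods, so their logs live in kubectl rather than a file on disk; something like this should work (DaemonSet/Deployment names taken from the pod list above):
```bash
# kube-vip runs as a DaemonSet in kube-system
kubectl logs -n kube-system daemonset/kube-vip-ds --tail=100
# MetalLB has a speaker DaemonSet and a controller Deployment
kubectl logs -n metallb-system daemonset/speaker --tail=100
kubectl logs -n metallb-system deploy/controller --tail=100
```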
For now, I'm going to try reverting the pull and re-deploying with the older stack.