techno-tim / k3s-ansible

The easiest way to bootstrap a self-hosted High Availability Kubernetes cluster. A fully automated HA k3s etcd install with kube-vip, MetalLB, and more. Build. Destroy. Repeat.
https://technotim.live/posts/k3s-etcd-ansible/
Apache License 2.0
2.41k stars 1.05k forks

added fix for metallb version upgrades #394

Closed egandro closed 9 months ago

egandro commented 12 months ago

Proposed Changes

For some reason, MetalLB keeps an empty ReplicaSet around after upgrading to a new version. It shows 0/0 replicas and is never removed.

Checklist

Tested with all variations from

metal_lb_speaker_tag_version: "v0.13.9"
metal_lb_controller_tag_version: "v0.13.9"

to

metal_lb_speaker_tag_version: "v0.13.12"
metal_lb_controller_tag_version: "v0.13.12"

Test Code

# Kubernetes server version
kubectl version -o json 2>/dev/null | jq -r '.serverVersion.gitVersion'

# MetalLB controller version
POD=$(kubectl -n metallb-system get pod -l "component=controller,app=metallb" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace metallb-system exec -it "${POD}" -- /controller version | head -n 1 | jq -rM '.version'

# MetalLB speaker version (tolerate a non-zero exit from the binary)
POD=$(kubectl -n metallb-system get pod -l "component=speaker,app=metallb" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace metallb-system exec -it "${POD}" -- /bin/sh -c '/speaker version || true' | head -n 1 | jq -rM '.version' || true

# kube-vip version
POD=$(kubectl -n kube-system get pod -l "name=kube-vip-ds" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace kube-system exec -it "${POD}" -- /kube-vip version | grep Version | sed -e 's/^.*v//g'

egandro commented 12 months ago

The bash script is a bit of a hack. I wish jq were installed everywhere.

As far as I know, kubectl -o jsonpath has no support for queries.

Somebody probably has a smarter solution than what I hacked together in bash, but it's working now.
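
For the simple case in the test code above (pulling a `version` string out of a single-line JSON object), a jq-free fallback might look like the sketch below. The helper name `json_version` and the sample input line are assumptions made for illustration; they are not taken from the PR, and the pattern only works when the field is written exactly as `"version":"..."` with no spaces around the colon.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): print the value of the "version"
# string field from a single-line JSON object read on stdin.
# This is a fragile pattern match, not a JSON parser.
json_version() {
  sed -n 's/.*"version":"\([^"]*\)".*/\1/p'
}

# Illustrative input only; a real log line may contain other fields.
printf '%s\n' '{"level":"info","msg":"starting","version":"0.13.12"}' | json_version
```

Where jq is available it remains the more robust choice, since it handles whitespace, escaping, and nested objects.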

onedr0p commented 12 months ago

What's the purpose of deleting previous replicasets? What problem are you trying to solve?

egandro commented 12 months ago

> What's the purpose of deleting previous replicasets? What problem are you trying to solve?

This problem: https://github.com/techno-tim/k3s-ansible/blob/e880f08d26989299cdd1b8a39f7e1f7c8a85f163/roles/k3s_server_post/tasks/metallb.yml#L53

Because the replicaset list contains a 0/0 entry next to the new 1/1 entry, the code linked above doesn't work.

Also, the 0/0 replicaset with the old image doesn't serve any purpose.
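
To make the 0/0 case concrete, here is a minimal sketch of how such entries could be picked out of the tabular `kubectl get replicasets` output without jq. The helper names `empty_replicasets` and `sample_rs_output` and the sample rows are hypothetical; this is not the code from the PR.

```shell
#!/bin/sh
# Hypothetical helper: print the names of ReplicaSets whose DESIRED count is 0.
# Reads `kubectl get replicasets` tabular output on stdin
# (columns: NAME DESIRED CURRENT READY AGE); NR > 1 skips the header row.
empty_replicasets() {
  awk 'NR > 1 && $2 == 0 { print $1 }'
}

# Illustrative sample only; the names and ages are made up.
sample_rs_output() {
  cat <<'EOF'
NAME                    DESIRED   CURRENT   READY   AGE
controller-aaaaaaaaaa   0         0         0       30d
controller-bbbbbbbbbb   1         1         1       5m
EOF
}

sample_rs_output | empty_replicasets
```

Against a live cluster the same filter could be fed from `kubectl -n metallb-system get replicasets -l "component=controller,app=metallb"`. Note that Kubernetes Deployments normally keep old ReplicaSets scaled to 0 as rollout history (controlled by `spec.revisionHistoryLimit`), so deleting them gives up the ability to roll back to those revisions.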

timothystewart6 commented 9 months ago

This PR is failing tests. If you want this to be merged it will need to be fixed 😅. Thank you!

egandro commented 9 months ago

> This PR is failing tests. If you want this to be merged it will need to be fixed 😅. Thank you!

I think I fixed it. The task was missing:

  args:
    executable: /bin/bash

timothystewart6 commented 9 months ago

Thank you!

egandro commented 9 months ago

I don't understand why that test is failing. It reports that some nginx is missing.

Can you please enlighten me? Or, much better, how do I run these tests on my own setup? I know there is some kind of local runner for GitHub Actions. Can you point me to some documentation, please?

Thx.

timothystewart6 commented 9 months ago

I am going to run the tests again; this might have been a side effect of switching to self-hosted runners. Stay tuned!

timothystewart6 commented 9 months ago

Disregard, it was the runners! Thank you for the PR!