osism / k8s-capi-images

Images intended for use with Kubernetes CAPI providers
https://www.osism.tech

Images above 1.28 not working #251

Closed flyersa closed 2 months ago

flyersa commented 4 months ago

Hello,

I finally managed to get the Magnum CAPI driver working. I updated the images with osism manage image clusterapi.

Only 1.27.15 and 1.28.11 work flawlessly. Everything above that (1.29.x and 1.30.x) fails with exactly the same error, something to do with the Calico CNI install not working properly:

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  85s                default-scheduler  Successfully assigned kube-system/calico-node-bcsqq to kube-dj61o-gpdjp-r8hlk
  Normal   Pulling    84s                kubelet            Pulling image "docker.io/calico/cni:v3.24.2"
  Normal   Pulled     78s                kubelet            Successfully pulled image "docker.io/calico/cni:v3.24.2" in 5.972s (5.972s including waiting)
  Normal   Created    78s                kubelet            Created container upgrade-ipam
  Normal   Started    78s                kubelet            Started container upgrade-ipam
  Normal   Pulled     44s (x2 over 77s)  kubelet            Container image "docker.io/calico/cni:v3.24.2" already present on machine
  Normal   Created    44s (x2 over 77s)  kubelet            Created container install-cni
  Normal   Started    44s (x2 over 77s)  kubelet            Started container install-cni
  Warning  BackOff    12s                kubelet            Back-off restarting failed container install-cni in pod calico-node-bcsqq_kube-system(1e4183b2-fc13-4fd2-8ca7-313b8cb1c200)

(.install-venv) root@open-mgmt:~# kubectl logs calico-node-bcsqq -n kube-system
Defaulted container "calico-node" out of: calico-node, upgrade-ipam (init), install-cni (init), mount-bpffs (init)
Error from server: no preferred addresses found; known addresses: []

The clusterctl components, clusterapi helm and the clusterapi components are all on the latest versions. I see this in two different environments. Does anyone else use the images and can confirm they work?

After digging around a bit, this may have something to do with internal IPs and the FIP not being distributed, or something like that. Will continue digging into it. Not sure whether this issue belongs here or somewhere else.
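For anyone who wants to check the address theory: the message "no preferred addresses found; known addresses: []" suggests the Node object reports no addresses at all. Below is a minimal sketch, not a definitive diagnostic, that lists nodes with an empty status.addresses field. It assumes the input has the shape of `kubectl get nodes -o json`; the function name and the sample data (apart from the node name taken from the events above) are made up for illustration.

```python
import json


def nodes_without_addresses(node_list):
    """Return the names of nodes whose status.addresses is empty or missing.

    `node_list` is assumed to be the parsed output of
    `kubectl get nodes -o json` (a dict with an "items" list).
    """
    missing = []
    for node in node_list.get("items", []):
        addresses = node.get("status", {}).get("addresses") or []
        if not addresses:
            missing.append(node["metadata"]["name"])
    return missing


# Hypothetical sample data, only to show the expected input shape.
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "kube-dj61o-gpdjp-r8hlk"},
     "status": {"addresses": []}},
    {"metadata": {"name": "example-node-with-ip"},
     "status": {"addresses": [{"type": "InternalIP", "address": "10.0.0.5"}]}}
  ]
}
""")

print(nodes_without_addresses(sample))
```

Against a real cluster, the same function could be fed with json.load(sys.stdin) while piping in the output of kubectl get nodes -o json. If it returns the affected worker nodes, that would point at the node address registration rather than at the images themselves.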

flyersa commented 4 months ago

Seems to be related to this, IMHO:

https://github.com/kubernetes/kubernetes/issues/125348

berendt commented 4 months ago

Jan from SCS did some quick tests yesterday and 1.29 and 1.30 worked. Can you test if it works with Cilium? Jan's guess was that it was due to Calico. Cilium is only used in SCS.

flyersa commented 4 months ago

Sure, will try and report, but I think the Magnum stuff has only worked with Calico so far. Will most likely need to sync with vexxhost if it doesn't.

Update:

Can't, it doesn't even give me the option. Looks like this will be added; at least there was a commit about 5 months ago?

https://opendev.org/openstack/magnum/commit/de6796bd1023d36fa84d3bd154b89b35354413f7?style=split&whitespace=show-all&show-outdated=#diff-38460226c8c0f7e84439cdbdf6d4e9da0b23ba05

Guess we don't really need to do anything here. The next release should support Cilium then, I guess, and the issue should also be fixed with 1.31 in August and with the next 1.29 and 1.30 releases. Until then, at least 1.28 is still not EOL.

I guess Jan tested with the SCS CAPI implementation and not Magnum, right?

berendt commented 2 months ago

@flyersa This is resolved?

flyersa commented 2 months ago

Yes, sorry. Works flawlessly up to 1.30.x now; 1.31 is not yet supported by the vexxhost CAPI driver.