aws / eks-anywhere

Run Amazon EKS on your own infrastructure 🚀
https://anywhere.eks.amazonaws.com
Apache License 2.0
1.95k stars 284 forks source link

upgrade EKSA cluster with rook/ceph/wordpress installed stuck #7900

Open ygao-armada opened 5 months ago

ygao-armada commented 5 months ago

What happened: In EKSA cluster, if I install storage service rook/ceph and create wordpress in it. Then upgrade the version of kuernetes of the EKSA cluster will stuck.

What you expected to happen: I expect to see upgrade works even with rook/ceph/wordpress

How to reproduce it (as minimally and precisely as possible):

  1. create EKSA cluster with kubernetes 1.26.7
  2. create rook/ceph cluster https://rook.io/docs/rook/latest-release/Getting-Started/quickstart/
  3. install metallb v0.13.12 https://metallb.universe.tf/installation/
  4. install storage, mysql and wordpress https://rook.io/docs/rook/v1.12/Storage-Configuration/Block-Storage-RBD/block-storage/
  5. upgrade EKSA cluster with kubernetes 1.27.11 (stuck)

Anything else we need to know?: There is smoking boots log like this: {"level":"info","ts":1711485549.163217,"caller":"syslog/receiver.go:113","msg":"host=10.20.22.226 facility=daemon severity=ERR app-name=9c576943b82e procid=3895 msg=\"Error: worker Finished with error: failed to get workflow context: rpc error: code = Unavailable desc = connection error: desc = \\\"transport: Error while dialing dial tcp 10.20.22.223:42113: connect: connection refused\\\"\\n\"","service":"github.com/tinkerbell/boots","pkg":"syslog"}

Environment:

ygao-armada commented 5 months ago

Look like it's related to MetalLB