aws / eks-anywhere

Run Amazon EKS on your own infrastructure 🚀
https://anywhere.eks.amazonaws.com
Apache License 2.0

migrate EKSA nodes across vSphere Vcenters #7416

Open saiteja313 opened 8 months ago

saiteja313 commented 8 months ago

What happened:

What you expected to happen:

jiayiwang7 commented 8 months ago

Hello, currently we do not support migrating nodes between different vSphere datacenters during an EKS-A upgrade. Please recreate your EKS-A cluster in the new datacenter.

pbs-jyu commented 8 months ago

When VMware says vMotion, it can happen between different ESXi servers. The source and destination ESXi servers can be in different vCenters; or in the same vCenter but different virtual datacenters; or in the same virtual datacenter but different clusters; or in the same cluster on different ESXi servers. What does AWS support? Has this been tested and looked into in detail, with these different levels in mind?

ahreehong commented 8 months ago

EKS-A nodes can be moved to another ESXi host within the same vSphere cluster. We do not support cross-datacenter migration or upgrade.

vMotion EKS-A requirements: EKS-A nodes cannot be moved between vSphere vCenters. The network configuration of EKS-A nodes must not change during migration (i.e., the IP address, default gateway, and subnet mask of each node must stay the same).

Recommendations: VMware best practices for vMotion should be followed when possible. The biggest factor is the network speed between ESXi hosts. We recommend a 10GbE or faster vMotion network; using 10GbE or faster in place of 1GbE results in significant improvements in vMotion performance. When using very large virtual machines (EKS-A nodes), for example 64GB or more, shared storage that is common to the ESXi hosts (vSAN, Fibre Channel SAN, NFS, etc.) is strongly recommended. This reduces the data transferred between ESXi hosts to just the in-memory footprint (i.e., Storage vMotion is not required).
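To put the 1GbE versus 10GbE recommendation in rough numbers, here is a back-of-the-envelope sketch in Python. It is an idealized lower bound only: it assumes the full link rate is available to vMotion, treats 1 GB as 8 gigabits, and ignores protocol overhead and the re-copying of memory pages that change during the migration. The function name is hypothetical and not part of any VMware or EKS-A tooling.

```python
def ideal_transfer_seconds(memory_gb: float, link_gbps: float) -> float:
    """Idealized time to copy a VM's memory over a link.

    Assumption: the whole link rate is usable and no pages are re-sent,
    so real vMotion times will be longer than this.
    """
    return memory_gb * 8 / link_gbps


# Compare link speeds for the 64GB example mentioned above.
for link_gbps in (1, 10, 25):
    seconds = ideal_transfer_seconds(64, link_gbps)
    print(f"{link_gbps:>2} GbE: at least {seconds:.1f} s")
```

Even under these optimistic assumptions, a 64GB node needs several minutes on 1GbE and under a minute on 10GbE, which is consistent with the recommendation above.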

Testing: An EKS-A cluster with 3 control plane, 3 etcd, and 3 worker nodes was successfully migrated between vSphere ESXi hosts multiple times. EKS-A nodes (vSphere VMs) were moved one at a time between ESXi hosts. Cluster status was checked after each EKS-A node was migrated, using the kubectl get nodes and kubectl get pods commands. No node or pod restarts were observed during vMotion operations, and the cluster remained operational throughout.
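The per-node check described in the testing notes could be scripted roughly as follows. This is a minimal sketch, not the procedure the EKS-A team actually used: it assumes kubectl is on the PATH and already pointed at the cluster, and the helper names (`not_ready_nodes`, `restart_counts`, `check_cluster`) are hypothetical. It relies only on `kubectl get ... -o json` and on the standard `status.conditions` and `status.containerStatuses[].restartCount` fields.

```python
import json
import subprocess


def not_ready_nodes(node_list: dict) -> list:
    """Return names of nodes whose Ready condition is not "True"."""
    bad = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            bad.append(node["metadata"]["name"])
    return bad


def restart_counts(pod_list: dict) -> dict:
    """Map "namespace/pod" to the sum of its container restart counts."""
    counts = {}
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        key = f'{meta.get("namespace", "default")}/{meta["name"]}'
        statuses = pod.get("status", {}).get("containerStatuses", [])
        counts[key] = sum(s.get("restartCount", 0) for s in statuses)
    return counts


def kubectl_json(*args: str) -> dict:
    """Run kubectl with JSON output (assumes kubectl is configured)."""
    result = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def check_cluster(baseline_restarts: dict) -> dict:
    """Print nodes that are not Ready and pods that restarted since baseline."""
    nodes = kubectl_json("get", "nodes")
    pods = kubectl_json("get", "pods", "--all-namespaces")
    current = restart_counts(pods)
    restarted = {
        name: count for name, count in current.items()
        if count > baseline_restarts.get(name, 0)
    }
    print("Nodes not Ready:", not_ready_nodes(nodes) or "none")
    print("Pods restarted since baseline:", restarted or "none")
    return current
```

One way to use it: call `check_cluster({})` once before the first migration to capture a baseline, then pass the returned dictionary back in after each VM is moved, so any new restart stands out.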

pbs-jyu commented 8 months ago

When VMware people say vMotion is supported, that can mean support at different levels:

I do not know exactly what AWS supports for the time being. I ran a migration test that crossed two datacenters (or virtual datacenters). The migration was successful, and afterwards everything looked fine. Then I modified the cluster.yaml file to match the current state and tested a cluster upgrade; that is where it failed. I migrated the nodes back, and everything still looked fine. The whole cluster got totally messed up after I changed the cluster.yaml file back and ran another cluster upgrade. So the AWS team needs to do a thorough test; then we will know clearly what is supported and what is not.

I have done another similar test. The only difference this time is that the migration happened between two clusters within the same virtual datacenter (or datacenter), and of course within the same vCenter server. I did a cluster upgrade after the migration, and everything seems to be working. We just want to know: is our test scenario supported by AWS or not? Or was this success just luck?