giantswarm / roadmap

Giant Swarm Product Roadmap
https://github.com/orgs/giantswarm/projects/273
Apache License 2.0

Unified KaaS Infrastructure #2850

Open teemow opened 1 year ago

teemow commented 1 year ago

Outcome

All Vintage clusters are migrated to CAPI.

Q4

### Goals
- [ ] https://github.com/giantswarm/giantswarm/issues/31802
- [ ] Start Vodafone Migration to CAPA
- [ ] [Have first Clusters moved to Kubernetes v1.29](https://github.com/giantswarm/roadmap/issues/3633)
- [ ] [Have v1.30 available on all providers](https://github.com/giantswarm/roadmap/issues/3606)

Q3

### Goals
- [ ] https://github.com/giantswarm/giantswarm/issues/31467
- [ ] Upgrade all Vodafone clusters to v20 (Luca Rui)
- [ ] Have versions with Kubernetes v1.29 for all CAPI providers (Alex Dabija)

Q2

### Providers
- [ ] https://github.com/giantswarm/roadmap/issues/2726
- [ ] https://github.com/giantswarm/roadmap/issues/1787
- [ ] https://github.com/giantswarm/roadmap/issues/308
### Tasks
- [ ] https://github.com/giantswarm/giantswarm/issues/27758
- [ ] https://github.com/giantswarm/giantswarm/issues/27910
- [ ] https://github.com/giantswarm/giantswarm/issues/17491
- [ ] https://github.com/giantswarm/giantswarm/issues/28149
- [ ] https://github.com/giantswarm/roadmap/issues/2826
- [ ] https://github.com/giantswarm/giantswarm/issues/28148
- [ ] https://github.com/giantswarm/giantswarm/issues/27911
- [ ] https://github.com/giantswarm/giantswarm/issues/27912
- [ ] https://github.com/giantswarm/roadmap/issues/2727

Links

teemow commented 1 year ago

@alex-dabija @T-Kukawka we've created an overview epic in horizon for the migration. There are some duplicate issues imo for the migration. Can you clean them up?

The next step is to refine this for Phoenix so we can figure out what can already be done by Q2, as we really want to migrate customers before September. @puja108 will coordinate.

T-Kukawka commented 1 year ago

should we just close the ones we created?

puja108 commented 1 year ago

@T-Kukawka yes, close whatever you don't need. Then let's split it into an epic each for:

  1. Customer Work for migration
  2. Technical Work for migration

For 1 we can already start with a list of customers and check with each of them and their AE what their thoughts are on migration (we already know some are planning to redo their setups and move to new clusters). We can then create either sub-issues or sub-tasklists with the clusters that will need migrating. This will also inform some of the work we might need to do around worries that customers have about this migration, and might feed into how soon we will want to give them access to something where they can test.

puja108 commented 1 year ago

Focus should be on AWS for now.

T-Kukawka commented 1 year ago

okay, will go over those with @alex-dabija tomorrow

T-Kukawka commented 1 year ago

cleaned up

github-actions[bot] commented 1 year ago

This issue has had no activity for 100 days. It will be closed in 1 week, unless activity occurs or the label lifecycle/keep is added.

teemow commented 9 months ago

For jour fixe

We need to get AWS v20 out. This is planned for the end of January. Team Phoenix is working on this, and Cilium (and its network policies) was the main blocker for getting this release out.

In the meantime we are already upgrading customers to v19.3 (PSS migration).

We also need to migrate our CAPA release to Kubernetes 1.25. This has already been done for CAPZ (because of WEPA), so it should be relatively easy.

We have to schedule meetings with customers to tell them how the migration works, and about new features and CAPA in general. This will happen at the end of January or the beginning of February.

Once v20 is out, CAPA is on 1.25 and customers have migrated to it, we can start migrating the first clusters. The first migration will probably be in mid-February.

There is an overview of which customers are upgraded and migrated here:

Risks

We haven't yet tested the migration very well. Work has been done by Phoenix and Honey Badger, but we do need to do more testing. Phoenix and Honey Badger are currently working on a small tool that lets us test the migration more easily.

We don't know yet what kind of problems we will find.

We backported features in Cilium to unblock v20. But there are more conflicts and problems than expected. If this doesn't work we might need to relax the security constraints for CAPA. This has been done in vintage too, so it shouldn't be a big problem for customers. We will address this once a new version of Cilium is released in Q2.

Customer workloads - we don't know what kind of problems we will run into once we migrate customer workloads. The workloads might have special configuration and behave unexpectedly.

teemow commented 2 months ago

@alex-dabija another good metric for the migration is the number of vintage management clusters that we have to maintain. There are currently 16 left. 3 are our test environments and eagle is being decommissioned atm.

teemow commented 1 month ago

@alex-dabija please update this with the goals for Q4