canonical / istio-operators

Charmed Istio

Design the migration path for istio in a different namespace than `kubeflow` #373

Open ca-scribner opened 5 months ago

ca-scribner commented 5 months ago

Context

#369 outlines how to architect Charmed Kubeflow so that Istio is deployed in a different namespace than Kubeflow. Whichever solution #369 chooses, we must design a migration path that lets existing deployments move their Istio.

Potential issues:

  1. Istio does not appear to have a supported path for moving from one namespace to another (there is no `istioctl upgrade --new-namespace=X`), so we likely need to uninstall and reinstall.
  2. We typically avoid destroying CRDs during a migration in order to preserve user configuration. If we need to keep the CRDs, we cannot simply run `istioctl uninstall -n kubeflow; istioctl install -n istio-system`; instead we need to uninstall everything except the CRDs. The charm handles upgrades in the target version (e.g. when upgrading from 1 to 2, version 2 is the one executing the upgrade logic), so this should be doable.
    • If we were OK with destroying the CRDs, the upgrade would be easier. Our charms and workloads might be resilient to this, but we would have to check. In the past we avoided this on guidance from the field, who did not want us to destroy customer configurations.
  3. If we install a new Istio, we need to avoid duplicating the webhooks, otherwise we will end up with a broken cluster.
  4. If istiod is moved, all existing sidecars need to be updated. This discussion is an example of what could be done, either by a human operator or perhaps as an action in the charm, although that feels pretty invasive.
    • This should probably happen during any Istio upgrade, not just a namespace migration, but so far we do not do it.
  5. If istiod is separated from istio-pilot into its own charm (either in the Kubeflow namespace or elsewhere), how can we deploy the new charm and "take over" the old CRDs?
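
To illustrate point 3, here is a rough sketch of how duplicated webhooks could be detected before or after a migration. It is not charm code: it works on plain dictionaries shaped like `MutatingWebhookConfiguration` / `ValidatingWebhookConfiguration` objects (for example, parsed from `kubectl get ... -o json`), and the helper name `find_duplicate_webhooks` is hypothetical. It assumes that a duplicate shows up as the same webhook name being served by services in more than one namespace, which may not cover every way a second install can collide with the first.

```python
from collections import defaultdict


def find_duplicate_webhooks(webhook_configs):
    """Find webhook names that are served from more than one namespace.

    webhook_configs: list of dicts shaped like Mutating/Validating
    WebhookConfiguration objects.
    Returns {webhook_name: sorted list of namespaces} for every webhook
    name whose backing service appears in two or more namespaces.
    """
    namespaces_by_name = defaultdict(set)
    for config in webhook_configs:
        for webhook in config.get("webhooks", []):
            service = webhook.get("clientConfig", {}).get("service")
            if service is None:
                # URL-based webhook, not backed by an in-cluster service
                continue
            namespaces_by_name[webhook["name"]].add(service["namespace"])
    return {
        name: sorted(namespaces)
        for name, namespaces in namespaces_by_name.items()
        if len(namespaces) > 1
    }


# Illustrative input only: two injector configurations left behind by an
# old install in "kubeflow" and a new install in "istio-system".
example_configs = [
    {
        "kind": "MutatingWebhookConfiguration",
        "metadata": {"name": "old-injector"},
        "webhooks": [
            {
                "name": "sidecar-injector.example",
                "clientConfig": {"service": {"name": "istiod", "namespace": "kubeflow"}},
            }
        ],
    },
    {
        "kind": "MutatingWebhookConfiguration",
        "metadata": {"name": "new-injector"},
        "webhooks": [
            {
                "name": "sidecar-injector.example",
                "clientConfig": {"service": {"name": "istiod", "namespace": "istio-system"}},
            }
        ],
    },
]

duplicates = find_duplicate_webhooks(example_configs)
```

An empty result would suggest that no webhook name is being served from two namespaces at once; a non-empty one lists the namespaces that need cleaning up.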

What needs to get done

  1. decide whether we will delete CRDs or keep them alive during migration
  2. define the work needed to enable migration, such as
    1. changes to any charms
  3. define the steps needed for someone to actually do a migration, such as:
    1. any relations that need to be broken/established
    2. how to deploy the new charm, if required

Definition of Done

  1. document the design and open any resulting tasks
  2. document the (rough) procedures for the upgrade

syncronize-issues-to-jira[bot] commented 5 months ago

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5246.

This message was autogenerated

ca-scribner commented 5 months ago

For context on point (5) above (where istiod is separated from istio-pilot): we may need to release an updated version of the current istio-pilot that allows it to "disown" its CRDs (e.g. an option that leaves the CRDs in the cluster during the remove event). The alternative would be a juju remove flag that always skips the remove event handling (does `juju remove X --force --no-wait` execute the remove event handler?).
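
A minimal sketch of the "disown" idea, assuming the charm can enumerate the resources it created as plain manifests. The function name `resources_to_delete` and the `keep_crds` flag are hypothetical and are not an existing istio-pilot option; in a real charm the flag would presumably come from a config value read inside the remove handler.

```python
def resources_to_delete(manifests, keep_crds=True):
    """Select which charm-owned resources to delete when the charm is removed.

    manifests: list of dicts shaped like Kubernetes objects (each has a "kind").
    keep_crds: when True, CustomResourceDefinitions are left in the cluster,
               so the custom resources users created from them survive.
    """
    if not keep_crds:
        return list(manifests)
    return [m for m in manifests if m.get("kind") != "CustomResourceDefinition"]


# Illustrative manifests only; the names are placeholders.
owned = [
    {"kind": "CustomResourceDefinition", "metadata": {"name": "virtualservices.networking.istio.io"}},
    {"kind": "Deployment", "metadata": {"name": "istiod"}},
    {"kind": "Service", "metadata": {"name": "istiod"}},
]

to_delete = resources_to_delete(owned, keep_crds=True)
kinds_deleted = [m["kind"] for m in to_delete]
```

Filtering by kind matters here because deleting a CRD also deletes every custom resource of that kind, which is exactly the user configuration the migration is trying to preserve.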