canonical / bundle-kubeflow

Charmed Kubeflow

Investigate using ambient mesh with Charmed Kubeflow #1114

Open kimwnasptd opened 1 week ago

kimwnasptd commented 1 week ago

Context

Since the service-mesh team will be implementing the ambient profile of Istio, we'll need to evaluate how to integrate with their work. This will include integrating with the following repos

The service-mesh team is currently focusing on the K8s Gateway API and its HTTPRoute CustomResources for their implementation. Upstream Kubeflow, however, currently relies on Istio Gateway and VirtualService CRs.

We'll need to investigate the potential ways for us to integrate with Ambient mesh.

There are 2 approaches that we are considering for now:

  1. Don't change anything in upstream Kubeflow, since ambient still supports the older Istio APIs, and find out whether we could use the service-mesh team's charms with Gateway and VirtualService support
  2. Change all components of upstream Kubeflow (Knative, KServe, web apps, controllers) to potentially create and work with the new K8s Gateway API

What needs to get done

  1. Understand if upstream Kubeflow can be deployed as is with ambient mesh (Gateway and VirtualService CRs)
  2. If so, understand whether there is a path for using the service-mesh team's charms
  3. Produce a list of upstream components that need to be updated, or ideally have this discussion upstream
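
For the third item, a rough inventory of which namespaces still create Istio routing objects could help scope that list. The sketch below assumes `kubectl` access to a cluster running upstream Kubeflow; the helper name `count_by_namespace` is hypothetical and only illustrates the idea.

```shell
# Count objects per namespace from `kubectl get <kind> -A --no-headers` output.
# With -A, the first column of each row is the namespace.
count_by_namespace() {
    awk 'NF { count[$1]++ } END { for (ns in count) print ns, count[ns] }' | sort
}

# Assumed usage against a live cluster (not run here):
#   kubectl get virtualservices.networking.istio.io -A --no-headers | count_by_namespace
#   kubectl get gateways.networking.istio.io -A --no-headers | count_by_namespace
#   kubectl get httproutes.gateway.networking.k8s.io -A --no-headers | count_by_namespace
```

Namespaces that show up for the first two kinds but not for HTTPRoute are candidates for the list of components to discuss upstream.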

Definition of Done

  1. We know if we can use ambient mesh with Gateway and VirtualService objects
  2. We know if we can use service-mesh team's charms and stick with Gateway and VirtualService objects
  3. If not, start discussions with upstream about which components need to be updated for ambient mesh

syncronize-issues-to-jira[bot] commented 1 week ago

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6426.

This message was autogenerated

kimwnasptd commented 1 week ago

As a first part of this, I tried to understand whether upstream Kubeflow can work with ambient mesh. For this experiment I did the following:

1. Install MicroK8s

# microk8s
sudo snap install microk8s --classic
sudo microk8s enable dns
sudo microk8s enable hostpath-storage
sudo microk8s enable ingress
sudo microk8s enable rbac

2. istioctl

wget https://github.com/istio/istio/releases/download/1.23.2/istioctl-1.23.2-linux-amd64.tar.gz \
    -O istioctl.tar.gz
tar -xzvf istioctl.tar.gz
mv istioctl ~/.local/bin
rm istioctl.tar.gz
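
One thing worth checking after this step is that `~/.local/bin` exists and is on `PATH`; if the directory is missing, `mv` would rename the binary to a file called `bin` instead. A minimal sketch, where `path_contains` is a hypothetical helper and not part of istioctl:

```shell
# Return success if directory $1 is one of the entries in $PATH.
path_contains() {
    case ":${PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed usage around the steps above (not run here):
#   mkdir -p ~/.local/bin                      # before the mv
#   path_contains "$HOME/.local/bin" || export PATH="$HOME/.local/bin:$PATH"
#   istioctl version --remote=false            # client version only, no cluster needed
```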

3. ambient mesh manifests

The following commands generate the ambient mesh manifests and also include an Istio IngressGateway Deployment, alongside the cluster-local-gateway. The IngressGateway is backed by an Envoy proxy and handles the Gateway CR.

git clone https://github.com/kubeflow/manifests
cd manifests/common
cp -r istio-1-22 istio-ambient
cd istio-ambient
istioctl manifest generate \
    --set profile=ambient \
    --set "components.ingressGateways[0].enabled=true" \
    --set "components.ingressGateways[0].name=istio-ingressgateway" \
    --set "components.ingressGateways[1].enabled=true" \
    --set "components.ingressGateways[1].name=cluster-local-gateway" \
    --set "components.ingressGateways[1].label.app=cluster-local-gateway" \
    --set "components.ingressGateways[1].label.istio=cluster-local-gateway" \
    > istio-ambient-manifests.yaml

./split-istio-packages -f istio-ambient-manifests.yaml
mv crd.yaml istio-crds/base
mv install.yaml istio-install/base
mv cluster-local-gateway.yaml cluster-local-gateway/base
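
Before wiring these files into kustomize, it may be worth confirming that the ambient data-plane components made it into the generated output. The sketch below assumes the `ztunnel` and `istio-cni-node` DaemonSets end up in `istio-install/base/install.yaml` and that names are written unquoted; `has_daemonset` is a hypothetical helper.

```shell
# Succeed if the multi-document YAML file $1 contains a DaemonSet named $2.
# This is a line-based check, not a real YAML parser.
has_daemonset() {
    awk -v want="$2" '
        function end_doc() { if (is_ds && named) found = 1; is_ds = 0; named = 0 }
        /^---/ { end_doc(); next }                       # document separator
        /^kind:/ && $2 == "DaemonSet" { is_ds = 1 }      # top-level kind
        /^  name:/ && $2 == want { named = 1 }           # metadata.name
        END { end_doc(); exit !found }
    ' "$1"
}

# Assumed usage from inside the istio-ambient directory (not run here):
#   for ds in ztunnel istio-cni-node; do
#       has_daemonset istio-install/base/install.yaml "$ds" || echo "missing DaemonSet: $ds"
#   done
```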

4. kustomization.yaml

Now that the manifests are in place, I updated example/kustomization.yaml to contain the following:

# Istio
- ../common/istio-ambient/istio-crds/base
- ../common/istio-ambient/istio-namespace/base
- ../common/istio-ambient/istio-install/overlays/oauth2-proxy

Note that there's also an upstream PR that tries to introduce these manifests (https://github.com/kubeflow/manifests/pull/2822), but I couldn't get it to work. The steps above should also be turned into an upstream PR.
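
The thread does not show how the updated example was applied. A plausible way, sketched here under the assumption that `kustomize` and `kubectl` are installed and the command runs from the root of the manifests checkout, is to build and apply with retries, since the first passes tend to fail until the CRDs are registered. The `retry` and `apply_example` helpers are hypothetical names.

```shell
# Re-run a command until it succeeds: at most $1 attempts, $2 seconds apart.
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Build the example kustomization and apply it to the current cluster.
apply_example() {
    kustomize build example | kubectl apply -f -
}

# Assumed usage (not run here):
#   retry 10 20 apply_example
```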

kimwnasptd commented 1 week ago

With all the above, I managed to connect to the dashboard as expected via the IngressGateway, and the VirtualServices configured the gateway correctly!

I can also confirm that ambient mesh was running in the cluster:

$ kubectl get pods -n istio-system

NAME                                            READY   STATUS      RESTARTS   AGE
cluster-local-gateway-589f74fc8d-m8lxr          1/1     Running     0          58m
istio-cni-node-45hxq                            1/1     Running     0          58m
istio-ingressgateway-7c4fb549bb-cbwfd           1/1     Running     0          58m
istiod-59b7b45c57-rpslq                         1/1     Running     0          58m
kubeflow-m2m-oidc-configurator-28809720-n728n   0/1     Completed   0          5m38s
kubeflow-m2m-oidc-configurator-28809725-dks67   0/1     Completed   0          38s
ztunnel-zbq9j                                   1/1     Running     0          58m
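
The same check can be scripted so it is repeatable across experiments. This is a minimal sketch that reads the `kubectl get pods` table from stdin; `require_running` is a hypothetical helper, and the pod name prefixes are taken from the output above.

```shell
# Read `kubectl get pods` table output on stdin and succeed only if, for every
# name prefix given as an argument, at least one matching pod is Running.
require_running() {
    local table prefix
    table="$(cat)"
    for prefix in "$@"; do
        printf '%s\n' "$table" |
            awk -v p="$prefix" 'index($1, p) == 1 && $3 == "Running" { ok = 1 } END { exit !ok }' ||
            { echo "no Running pod matching: $prefix" >&2; return 1; }
    done
}

# Assumed usage against the cluster (not run here):
#   kubectl get pods -n istio-system | require_running ztunnel istio-cni-node istiod
```

Running ztunnel and istio-cni pods only show that the ambient components are installed. Whether Kubeflow workloads are actually captured is a separate question: in ambient mode, namespaces are opted in with the `istio.io/dataplane-mode=ambient` label, which the steps in this thread do not cover.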