canonical / bundle-kubeflow

Charmed Kubeflow
Apache License 2.0

Create an airgapped release of CKF 1.8 #818

Closed AlexanderSing closed 1 week ago

AlexanderSing commented 5 months ago

Context

Recently, further scripts and tests for deploying CKF in an airgapped environment were merged, and I am grateful for these contributions. Based on them, I tried deploying stable/1.8 in an airgapped manner but ran into some problems.

  1. It seems that the envoy-operator is trying to reach raw.githubusercontent.com/canonical/operator-schemas/master/grpc.yaml as part of its set_pod_spec call, which doesn't work in an airgapped environment
  2. I'm not sure whether that's because of my environment, but when using the argo-controller 398 charm I get a missing-authorization error: lightkube.core.exceptions.ApiError: customresourcedefinitions.apiextensions.k8s.io is forbidden: User "system:serviceaccount:kubeflow:argo-controller" cannot list resource "customresourcedefinitions" in API group "apiextensions.k8s.io" at the cluster scope
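
One way to spot the first kind of problem before deploying is to search an unpacked charm for hard-coded URLs. The following is only a sketch under stated assumptions: `find_external_urls` is a hypothetical helper (not part of any charm tooling), and it assumes the charm has already been unpacked into a directory (a `.charm` file is a zip archive).

```shell
#!/bin/sh
# Hypothetical helper: print every distinct http(s) URL found in text files
# under the directory given as $1. Binary files are skipped (-I).
find_external_urls() {
  grep -rhoIE 'https?://[A-Za-z0-9./_~%:?=&#-]+' "$1" | sort -u
}

# Example usage (paths are assumptions, adjust to your layout):
#   unzip -q ./envoy_r101.charm -d /tmp/envoy-charm
#   find_external_urls /tmp/envoy-charm/src
```

The output can be noisy, because dependencies bundled in a charm often mention URLs in comments or metadata; pointing the helper at the charm's own source directory usually narrows it down.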

In my case, I actually currently have access to the docker.io and gcr.io container registries, that's why they are still referred to in the .yaml file. However, this will be changed later on to a hosted registry.
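
When the references are later switched to a hosted registry, each upstream image needs a predictable name in the mirror. A minimal sketch of one possible mapping follows. It assumes the mirror keeps the original registry host as the first path component (the layout used further down in this thread, e.g. `$OCI_REGISTRY/docker.io/istio`), and that short names such as `busybox` follow Docker's usual expansion to `docker.io/library/busybox`. `normalize_ref` and `mirror_ref` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: expand a short image reference to its fully
# qualified form, following Docker's naming convention.
normalize_ref() {
  ref=$1
  first=${ref%%/*}
  case $ref in
    */*)
      case $first in
        # First component looks like a registry host (has a dot or a port).
        *.*|*:*|localhost) printf '%s\n' "$ref" ;;
        *) printf 'docker.io/%s\n' "$ref" ;;
      esac ;;
    # No slash at all: an official image such as "busybox:1.28".
    *) printf 'docker.io/library/%s\n' "$ref" ;;
  esac
}

# Hypothetical helper: map an upstream reference ($2) into the mirror ($1).
mirror_ref() {
  printf '%s/%s\n' "$1" "$(normalize_ref "$2")"
}
```

For example, `mirror_ref 10.10.11.39:32000 busybox:1.28` prints `10.10.11.39:32000/docker.io/library/busybox:1.28`.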

I used the following bundle-airgap.yaml:

bundle: kubernetes
name: kubeflow
applications:
  admission-webhook:
    charm: ./admission-webhook_r275.charm
    scale: 1
    trust: true
    resources:
      oci-image: docker.io/kubeflownotebookswg/poddefaults-webhook:v1.8.0
  dex-auth:
    charm: ./dex-auth_r358.charm
    scale: 1
    trust: true
    resources:
      oci-image: docker.io/dexidp/dex:v2.36.0
  istio-ingressgateway:
    charm: ./istio-gateway_r723.charm
    options:
      kind: ingress
      proxy-image: docker.io/istio/proxyv2:1.17.3
    scale: 1
    trust: true
  istio-pilot:
    charm: ./istio-pilot_r711.charm
    scale: 1
    trust: true
    options:
      default-gateway: kubeflow-gateway
      image-configuration: '{"pilot-image": "pilot", "global-tag": "1.17.3", "global-hub": "docker.io/istio", "global-proxy-image": "proxyv2", "global-proxy-init-image": "proxyv2", "grpc-bootstrap-init": "busybox:1.28"}'
  jupyter-controller:
    charm: ./jupyter-controller_r824.charm
    scale: 1
    trust: true
    resources:
      oci-image: docker.io/kubeflownotebookswg/notebook-controller:v1.8.0
  jupyter-ui:
    charm: ./jupyter-ui_r746.charm
    scale: 1
    trust: true
    resources:
      oci-image: docker.io/kubeflownotebookswg/jupyter-web-app:v1.8.0
    options:
      jupyter-images: "['docker.io/kubeflownotebookswg/jupyter-scipy:v1.8.0','docker.io/kubeflownotebookswg/jupyter-pytorch-full:v1.8.0','docker.io/kubeflownotebookswg/jupyter-pytorch-cuda-full:v1.8.0','docker.io/kubeflownotebookswg/jupyter-tensorflow-full:v1.8.0','docker.io/kubeflownotebookswg/jupyter-tensorflow-cuda-full:v1.8.0']"
      rstudio-images: "['docker.io/kubeflownotebookswg/rstudio-tidyverse:v1.8.0']"
      vscode-images: "['docker.io/kubeflownotebookswg/codeserver-python:v1.8.0']"

  katib-db:
    charm: ./mysql-k8s_r113.charm
    scale: 1
    trust: true
    constraints: mem=2G
    resources:
      mysql-image: ghcr.io/canonical/charmed-mysql:8.0.35-22.04_edge
  katib-db-manager:
    charm: ./katib-db-manager_r411.charm
    scale: 1
    trust: true
    resources:
      oci-image: docker.io/kubeflowkatib/katib-db-manager:v0.16.0
  katib-ui:
    charm: ./katib-ui_r422.charm
    scale: 1
    trust: true
    resources:
      oci-image: docker.io/kubeflowkatib/katib-ui:v0.16.0
  kfp-api:
    charm: ./kfp-api_r1035.charm
    scale: 1
    trust: true
    resources:
      oci-image: ghcr.io/charmedkubeflow/api-server:2.0.3-e037d33
    options:
      cache-image: busybox
  kfp-db:
    charm: ./mysql-k8s_r113.charm
    scale: 1
    trust: true
    constraints: mem=2G
    resources:
      mysql-image: ghcr.io/canonical/charmed-mysql:8.0.35-22.04_edge
  kfp-metadata-writer:
    charm: ./kfp-metadata-writer_r118.charm
    scale: 1
    trust: true
    resources:
      oci-image: gcr.io/ml-pipeline/metadata-writer:2.0.3
  kfp-persistence:
    charm: ./kfp-persistence_r1039.charm
    scale: 1
    trust: true
    resources:
      oci-image: ghcr.io/charmedkubeflow/persistenceagent:2.0.3-a3714a9
  kfp-profile-controller:
    charm: ./kfp-profile-controller_r998.charm
    scale: 1
    trust: true
    resources:
      oci-image: docker.io/python:3.7
  kfp-schedwf:
    charm: ./kfp-schedwf_r1052.charm
    scale: 1
    trust: true
    resources:
      oci-image: ghcr.io/charmedkubeflow/scheduledworkflow:2.0.3-7d6d3e4
  kfp-ui:
    charm: ./kfp-ui_r1034.charm
    scale: 1
    trust: true
    resources:
      ml-pipeline-ui: ghcr.io/charmedkubeflow/frontend:2.0.3-d4ac42b
  kfp-viewer:
    charm: ./kfp-viewer_r1064.charm
    scale: 1
    trust: true
    resources:
      kfp-viewer-image: ghcr.io/charmedkubeflow/viewer-crd-controller:2.0.3-d89d9fc
  kfp-viz:
    charm: ./kfp-viz_r985.charm
    scale: 1
    trust: true
    resources:
      oci-image: ghcr.io/charmedkubeflow/visualization-server:2.0.3-8169d0c
  knative-eventing:
    charm: ./knative-eventing_r353.charm
    scale: 1
    trust: true
    options:
      namespace: knative-eventing
      custom_images: '{ "eventing-webhook/eventing-webhook": "gcr.io/knative-releases/knative.dev/eventing/cmd/webhook:c9c582f530155d22c01b43957ae0dba549b1cc903f77ec6cc1acb9ae9085be62", "eventing-controller/eventing-controller": "gcr.io/knative-releases/knative.dev/eventing/cmd/controller:cbc452f35842cc8a78240642adc1ebb11a4c4d7c143c8277edb49012f6cfc5d3", "mt-broker-filter/filter": "gcr.io/knative-releases/knative.dev/eventing/cmd/broker/filter:33ea8a657b974d7bf3d94c0b601a4fc287c1fb33430b3dda028a1a189e3d9526", "mt-broker-ingress/ingress": "gcr.io/knative-releases/knative.dev/eventing/cmd/broker/ingress:f4a9dfce9eec5272c90a19dbdf791fffc98bc5a6649ee85cb8a29bd5145635b1", "mt-broker-controller/mt-broker-controller": "gcr.io/knative-releases/knative.dev/eventing/cmd/mtchannel_broker:c5d3664780b394f6d3e546eb94c972965fbd9357da5e442c66455db7ca94124c", "imc-controller/controller": "gcr.io/knative-releases/knative.dev/eventing/cmd/in_memory/channel_controller:3ced549336c7ccf3bb2adf23a558eb55bd1aec7be17837062d21c749dfce8ce5", "imc-dispatcher/dispatcher": "gcr.io/knative-releases/knative.dev/eventing/cmd/in_memory/channel_dispatcher:e17bbdf951868359424cd0a0465da8ef44c66ba7111292444ce555c83e280f1a", "pingsource-mt-adapter/dispatcher": "gcr.io/knative-releases/knative.dev/eventing/cmd/mtping:bc200a12cbad35bea51aabe800a365f28a5bd1dd65b3934b3db2e7e22df37efd", "migrate": "gcr.io/knative-releases/knative.dev/pkg/apiextensions/storageversion/cmd/migrate:59431cf8337532edcd9a4bcd030591866cc867f13bee875d81757c960a53668d", }'
  knative-operator:
    charm: ./knative-operator_r328.charm
    scale: 1
    trust: true
    resources:
      knative-operator-image: gcr.io/knative-releases/knative.dev/operator/cmd/operator:v1.10.3
      knative-operator-webhook-image: gcr.io/knative-releases/knative.dev/operator/cmd/webhook:v1.10.3
    options:
      otel-collector-image: docker.io/otel/opentelemetry-collector:latest
  knative-serving:
    charm: ./knative-serving_r354.charm
    scale: 1
    trust: true
    options:
      namespace: knative-serving
      istio.gateway.namespace: kubeflow
      istio.gateway.name: kubeflow-gateway
      version: 1.8.0
      custom_images: '{ "activator": "gcr.io/knative-releases/knative.dev/serving/cmd/activator:c3bbf3a96920048869dcab8e133e00f59855670b8a0bbca3d72ced2f512eb5e1", "autoscaler": "gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler:caae5e34b4cb311ed8551f2778cfca566a77a924a59b775bd516fa8b5e3c1d7f", "controller": "gcr.io/knative-releases/knative.dev/serving/cmd/controller:38f9557f4d61ec79cc2cdbe76da8df6c6ae5f978a50a2847c22cc61aa240da95", "webhook": "gcr.io/knative-releases/knative.dev/serving/cmd/webhook:bc13765ba4895c0fa318a065392d05d0adc0e20415c739e0aacb3f56140bf9ae", "autoscaler-hpa": "gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa:7003443f0faabbaca12249aa16b73fa171bddf350abd826dd93b06f5080a146d", "net-istio-controller/controller": "gcr.io/knative-releases/knative.dev/net-istio/cmd/controller:2b484d982ef1a5d6ff93c46d3e45f51c2605c2e3ed766e20247d1727eb5ce918", "net-istio-webhook/webhook": "gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook:59b6a46d3b55a03507c76a3afe8a4ee5f1a38f1130fd3d65c9fe57fff583fa8d", "domain-mapping": "gcr.io/knative-releases/knative.dev/serving/cmd/domain-mapping:763d648bf1edee2b4471b0e211dbc53ba2d28f92e4dae28ccd39af7185ef2c96", "domainmapping-webhook": "gcr.io/knative-releases/knative.dev/serving/cmd/domain-mapping-webhook:a4ba0076df2efaca2eed561339e21b3a4ca9d90167befd31de882bff69639470", "migrate": "gcr.io/knative-releases/knative.dev/pkg/apiextensions/storageversion/cmd/migrate:d0095787bc1687e2d8180b36a66997733a52f8c49c3e7751f067813e3fb54b66", "queue-proxy": "gcr.io/knative-releases/knative.dev/serving/cmd/queue:505179c0c4892ea4a70e78bc52ac21b03cd7f1a763d2ecc78e7bbaa1ae59c86c", }'
  kserve-controller:
    charm: ./kserve-controller_r435.charm
    scale: 1
    trust: true
    options:
      deployment-mode: rawdeployment
      custom_images: '{ "configmap__agent": "docker.io/kserve/agent:v0.11.1", "configmap__batcher": "docker.io/kserve/agent:v0.11.1", "configmap__explainers__alibi": "docker.io/kserve/alibi-explainer:latest", "configmap__explainers__aix": "docker.io/kserve/aix-explainer:latest", "configmap__explainers__art": "docker.io/kserve/art-explainer:latest", "configmap__logger": "docker.io/kserve/agent:v0.11.1", "configmap__router": "docker.io/kserve/router:v0.11.1", "configmap__storageInitializer": "docker.io/kserve/storage-initializer:v0.11.1", "serving_runtimes__lgbserver": "docker.io/kserve/lgbserver:v0.11.1", "serving_runtimes__kserve_mlserver": "docker.io/seldonio/mlserver:1.3.2", "serving_runtimes__paddleserver": "docker.io/kserve/paddleserver:v0.11.1", "serving_runtimes__pmmlserver": "docker.io/kserve/pmmlserver:v0.11.1", "serving_runtimes__sklearnserver": "docker.io/kserve/sklearnserver:v0.11.1", "serving_runtimes__tensorflow_serving": "docker.io/tensorflow/serving:2.6.2", "serving_runtimes__torchserve": "docker.io/pytorch/torchserve-kfs:0.7.0", "serving_runtimes__tritonserver": "nvcr.io/nvidia/tritonserver:21.09-py3", "serving_runtimes__xgbserver": "docker.io/kserve/xgbserver:v0.11.1", }'
    resources:
      kserve-controller-image: docker.io/kserve/kserve-controller:v0.11.1
      kube-rbac-proxy-image: docker.io/kubebuilder/kube-rbac-proxy:v0.13.1
  kubeflow-dashboard:
    charm: ./kubeflow-dashboard_r454.charm
    scale: 1
    trust: true
    resources:
      oci-image: docker.io/kubeflownotebookswg/centraldashboard:v1.8.0
  kubeflow-profiles:
    charm: ./kubeflow-profiles_r355.charm
    scale: 1
    trust: true
    resources:
      profile-image: docker.io/kubeflownotebookswg/profile-controller:v1.8.0
      kfam-image: docker.io/kubeflownotebookswg/kfam:v1.8.0
  kubeflow-roles:
    charm: ./kubeflow-roles_r187.charm
    scale: 1
    trust: true
  metacontroller-operator:
    charm: ./metacontroller-operator_r226.charm
    scale: 1
    trust: true
    options:
      metacontroller-image: docker.io/metacontrollerio/metacontroller:v3.0.0
  oidc-gatekeeper:
    charm: ./oidc-gatekeeper_r294.charm
    scale: 1
    resources:
      oci-image: docker.io/kubeflowmanifestswg/oidc-authservice:e236439
  pvcviewer-operator:
    charm: ./pvcviewer-operator_r30.charm
    scale: 1
    series: focal
    trust: true
    resources:
      oci-image: docker.io/kubeflownotebookswg/pvcviewer-controller:v1.8.0
  seldon-controller-manager:
    charm: ./seldon-core_r590.charm
    scale: 1
    trust: true
    resources:
      oci-image: ghcr.io/charmedkubeflow/seldon-core-operator:1.17.1-c95840c
    options:
      custom_images: '{ "configmap__predictor__tensorflow__tensorflow": "docker.io/tensorflow/serving:2.1.0", "configmap__predictor__tensorflow__seldon": "docker.io/seldonio/tfserving-proxy:1.15.0", "configmap__predictor__sklearn__seldon": "ghcr.io/charmedkubeflow/sklearnserver:v1.16.0_20.04_1", "configmap__predictor__sklearn__v2": "ghcr.io/charmedkubeflow/mlserver-sklearn:1.2.0_22.04_1", "configmap__predictor__xgboost__seldon": "docker.io/seldonio/xgboostserver:1.15.0", "configmap__predictor__xgboost__v2": "ghcr.io/charmedkubeflow/mlserver-xgboost:1.2.0_22.04_1", "configmap__predictor__mlflow__seldon": "docker.io/seldonio/mlflowserver:1.15.0", "configmap__predictor__mlflow__v2": "ghcr.io/charmedkubeflow/mlserver-mlflow:1.2.0_22.04_1", "configmap__predictor__triton__v2": "nvcr.io/nvidia/tritonserver:21.08-py3", "configmap__predictor__huggingface__v2": "ghcr.io/charmedkubeflow/mlserver-huggingface:1.2.4_22.04_1", "configmap__predictor__tempo_server__v2": "docker.io/seldonio/mlserver:1.2.0-slim", "configmap_storageInitializer": "docker.io/seldonio/rclone-storage-initializer:1.14.1", "configmap_explainer": "docker.io/seldonio/alibiexplainer:1.15.0", "configmap_explainer_v2": "docker.io/seldonio/mlserver:1.2.0-alibi-explain", }'
      executor-container-image-and-version: docker.io/seldonio/seldon-core-executor:1.17.1
  tensorboard-controller:
    charm: ./tensorboard-controller_r257.charm
    scale: 1
    trust: true
    resources:
      tensorboard-controller-image: docker.io/kubeflownotebookswg/tensorboard-controller:v1.8.0
  tensorboards-web-app:
    charm: ./tensorboards-web-app_r245.charm
    scale: 1
    trust: true
    resources:
      tensorboards-web-app-image: docker.io/kubeflownotebookswg/tensorboards-web-app:v1.8.0
  training-operator:
    charm: ./training-operator_r330.charm
    scale: 1
    trust: true
    resources:
      training-operator-image: docker.io/kubeflow/training-operator:v1-855e096
relations:
  - [dex-auth:oidc-client, oidc-gatekeeper:oidc-client]
  - [istio-pilot:ingress, dex-auth:ingress]
  - [istio-pilot:ingress, jupyter-ui:ingress]
  - [istio-pilot:ingress, katib-ui:ingress]
  - [istio-pilot:ingress, kfp-ui:ingress]
  - [istio-pilot:ingress, kubeflow-dashboard:ingress]
  - [istio-pilot:ingress, oidc-gatekeeper:ingress]
  - [istio-pilot:ingress-auth, oidc-gatekeeper:ingress-auth]
  - [istio-pilot:istio-pilot, istio-ingressgateway:istio-pilot]
  - [istio-pilot:ingress, tensorboards-web-app:ingress]
  - [istio-pilot:gateway-info, tensorboard-controller:gateway-info]
  - [katib-db-manager:relational-db, katib-db:database]
  - [kfp-api:relational-db, kfp-db:database]
  - [kfp-api:kfp-api, kfp-persistence:kfp-api]
  - [kfp-api:kfp-api, kfp-ui:kfp-api]
  - [kfp-api:kfp-viz, kfp-viz:kfp-viz]
  - [kserve-controller:ingress-gateway, istio-pilot:gateway-info]
  - [kserve-controller:local-gateway, knative-serving:local-gateway]
  - [kubeflow-profiles, kubeflow-dashboard]
  - [kubeflow-dashboard:links, jupyter-ui:dashboard-links]
  - [kubeflow-dashboard:links, katib-ui:dashboard-links]
  - [kubeflow-dashboard:links, kfp-ui:dashboard-links]
  - [kubeflow-dashboard:links, tensorboards-web-app:dashboard-links]

and followed that up with the following script:

juju deploy ./argo-controller_r398.charm --resource oci-image=docker.io/argoproj/workflow-controller:v3.3.10 --config executor-image=docker.io/argoproj/argoexec:v3.3.10
juju deploy ./katib-controller_r446.charm --resource oci-image=docker.io/kubeflowkatib/katib-controller:v0.16.0 --config custom_images='{"default_trial_template": "docker.io/kubeflowkatib/mxnet-mnist:v0.16.0","early_stopping__medianstop": "docker.io/kubeflowkatib/earlystopping-medianstop:v0.16.0","enas_cpu_template": "docker.io/kubeflowkatib/enas-cnn-cifar10-cpu:v0.16.0","metrics_collector_sidecar__stdout": "docker.io/kubeflowkatib/file-metrics-collector:v0.16.0","metrics_collector_sidecar__file": "docker.io/kubeflowkatib/file-metrics-collector:v0.16.0","metrics_collector_sidecar__tensorflow_event": "docker.io/kubeflowkatib/tfevent-metrics-collector:v0.16.0","pytorch_job_template__master": "docker.io/kubeflowkatib/pytorch-mnist-cpu:v0.16.0","pytorch_job_template__worker": "docker.io/kubeflowkatib/pytorch-mnist-cpu:v0.16.0","suggestion__random": "docker.io/kubeflowkatib/suggestion-hyperopt:v0.16.0","suggestion__tpe": "docker.io/kubeflowkatib/suggestion-hyperopt:v0.16.0","suggestion__grid": "docker.io/kubeflowkatib/suggestion-optuna:v0.16.0","suggestion__hyperband": "docker.io/kubeflowkatib/suggestion-hyperband:v0.16.0","suggestion__bayesianoptimization": "docker.io/kubeflowkatib/suggestion-skopt:v0.16.0","suggestion__cmaes": "docker.io/kubeflowkatib/suggestion-goptuna:v0.16.0","suggestion__sobol": "docker.io/kubeflowkatib/suggestion-goptuna:v0.16.0","suggestion__multivariate_tpe": "docker.io/kubeflowkatib/suggestion-optuna:v0.16.0","suggestion__enas": "docker.io/kubeflowkatib/suggestion-enas:v0.16.0","suggestion__darts": "docker.io/kubeflowkatib/suggestion-darts:v0.16.0","suggestion__pbt": "docker.io/kubeflowkatib/suggestion-pbt:v0.16.0", }'
juju deploy ./kubeflow-volumes_r260.charm --resource oci-image=docker.io/kubeflownotebookswg/volumes-web-app:v1.8.0
juju deploy ./minio_r258.charm --resource oci-image=docker.io/minio/minio:RELEASE.2021-09-03T03-56-13Z
juju deploy ./mlmd_r127.charm --resource oci-image=gcr.io/tfx-oss-public/ml_metadata_store_server:1.14.0
juju deploy ./envoy_r101.charm --resource oci-image=gcr.io/ml-pipeline/metadata-envoy:2.0.2

juju relate argo-controller minio
juju relate istio-pilot:ingress envoy:ingress
juju relate istio-pilot:ingress kubeflow-volumes:ingress
juju relate kubeflow-dashboard:links kubeflow-volumes:dashboard-links
juju relate kfp-api:object-storage minio:object-storage
juju relate kfp-profile-controller:object-storage minio:object-storage
juju relate kfp-ui:object-storage minio:object-storage
juju relate mlmd:grpc envoy:grpc
juju relate mlmd:grpc kfp-metadata-writer:grpc
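
Because a bundle like the one above refers to charms by local path, a small pre-flight check can catch a missing `.charm` file before `juju deploy` is run. This is a sketch only: `missing_charms` is a hypothetical helper, and it assumes the `./` paths are resolved relative to the directory containing the bundle file.

```shell
#!/bin/sh
# Hypothetical helper: print every local charm path referenced by the
# bundle file $1 that does not exist next to that bundle.
missing_charms() {
  bundle=$1
  dir=$(dirname "$bundle")
  # Pull out the value of each "charm: ./..." line, de-duplicated
  # (mysql-k8s, for instance, is referenced by two applications).
  sed -n 's/^[[:space:]]*charm:[[:space:]]*\(\.\/[^[:space:]]*\).*$/\1/p' "$bundle" |
    sort -u |
    while IFS= read -r charm; do
      [ -e "$dir/$charm" ] || printf '%s\n' "$charm"
    done
}

# Example usage:
#   missing_charms ./bundle-airgap.yaml
```

An empty result means every referenced charm file was found.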

What needs to get done

  1. Create a bundle-airgap.yaml file for 1.8/stable
  2. Ensure that everything is deployed as expected (ideally through automated tests)
  3. Enhance the documentation of the process
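
One check that such automated tests could include is that no workload still pulls from a public registry. The sketch below is an illustration under assumptions, not an existing test from the repository: `not_mirrored` is a hypothetical helper that reads image references on stdin, one per line, and prints those that are not served from the given registry.

```shell
#!/bin/sh
# Hypothetical helper: print the image references read from stdin that do
# not start with "<registry>/", where <registry> is $1. Blank lines are ignored.
not_mirrored() {
  awk -v prefix="$1/" 'NF && index($0, prefix) != 1'
}

# Example usage against a running model (covers regular containers only,
# not init containers; the namespace is an assumption):
#   kubectl get pods -n kubeflow \
#     -o jsonpath='{range .items[*].spec.containers[*]}{.image}{"\n"}{end}' |
#     sort -u | not_mirrored 10.10.11.39:32000
```

An empty result suggests every listed container image comes from the internal registry; any line printed is an image that would fail to pull once outside access is removed.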

Definition of Done

  1. CKF 1.8/stable can be deployed in an airgapped environment using the provided bundle-airgap.yaml and documentation
  2. Functionality is ensured through automated testing

syncronize-issues-to-jira[bot] commented 5 months ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5306.

This message was autogenerated

kimwnasptd commented 4 months ago

Hey @AlexanderSing thanks for raising this!

We have it on our roadmap to provide clear instructions for the 1.8 installation in an airgapped environment and to verify it. For now we are waiting for the upstream 1.8.1 release, since it includes changes in KFP that are needed to make the launcher and driver images configurable: https://github.com/kubeflow/pipelines/blob/2.0.5/CHANGELOG.md and https://github.com/kubeflow/pipelines/pull/10269

Once those are available, we'll also work on the other aspects of the charms that need changes to work in an airgapped environment.

gustavosr98 commented 4 months ago

Here is the current output when trying to use only a bundle (no scripts):

Executing changes:
- upload charm /home/ubuntu/charms/kubeflow/unversioned/admission-webhook for series focal with architecture=amd64
- deploy application admission-webhook with 1 unit on focal
  added resource oci-image
- upload charm /home/ubuntu/charms/kubeflow/unversioned/argo-controller for series focal with architecture=amd64
- deploy application argo-controller with 1 unit on focal
  added resource oci-image
- upload charm /home/ubuntu/charms/kubeflow/unversioned/dex-auth for series focal with architecture=amd64
- deploy application dex-auth with 1 unit on focal
  added resource oci-image
- upload charm /home/ubuntu/charms/kubeflow/unversioned/envoy for series focal with architecture=amd64
- deploy application envoy with 1 unit on focal
ERROR cannot deploy bundle: series "kubernetes" is not supported, supported series are: focal

Below is the script I am currently using to work around not being able to use a bundle. It is written specifically to avoid hardcoding the versions of the OCI images:

OCI_REGISTRY=10.10.11.39:32000
IMAGES=~/k8s/images_kubeflow.txt

# Look up the last entry in $IMAGES matching $1 and prefix it with the local registry
img() { echo "$OCI_REGISTRY/$(grep -- "$1" "$IMAGES" | tail -n1)"; }
kfn=kubeflownotebookswg

juju deploy --trust --debug ./admission-webhook admission-webhook --resource oci-image=$(img $kfn/poddefaults-webhook)
juju deploy --trust --debug ./argo-controller argo-controller --resource oci-image=$(img argoproj/workflow-controller)  --config executor-image=$(img argoproj/argoexec)
juju deploy --trust --debug ./dex-auth dex-auth --resource oci-image=$(img charmedkubeflow/dex)
juju deploy --trust --debug ./envoy envoy --resource oci-image=$(img ml-pipeline/metadata-envoy)
juju deploy --trust --debug ./istio-gateway istio-ingressgateway --config kind=ingress --config proxy-image=$(img istio/proxyv2)

# Extract the tag (the text after the last colon) from the proxyv2 image reference
tmp=$(img istio/proxyv2 | rev)
version=$(echo "${tmp/:/ }" | awk '{print $1}' | rev)

juju deploy --trust --debug ./istio-pilot istio-pilot --config default-gateway=kubeflow-gateway --config image-configuration="pilot-image: 'pilot'
global-tag: '$version'
global-hub: '$OCI_REGISTRY/docker.io/istio'
global-proxy-image: 'proxyv2'
global-proxy-init-image: 'proxyv2'
grpc-bootstrap-init: 'busybox:1.28'
"

juju deploy --trust --debug ./jupyter-controller jupyter-controller --resource oci-image=$(img $kfn/notebook-controller)
juju deploy --trust --debug ./jupyter-ui jupyter-ui --resource oci-image=$(img $kfn/jupyter-web-app) \
    --config jupyter-images="['$(img $kfn/jupyter-scipy)','$(img $kfn/jupyter-pytorch-full)','$(img $kfn/jupyter-pytorch-cuda-full)','$(img $kfn/jupyter-tensorflow-full)','$(img $kfn/jupyter-tensorflow-cuda-full)']" \
    --config rstudio-images="['$(img $kfn/rstudio-tidyverse)']" \
    --config vscode-images="['$(img $kfn/codeserver-python)']"

juju deploy --trust --debug ./katib-controller katib-controller  --resource oci-image=$(img kubeflowkatib/katib-controller) \
    --config custom_images="default_trial_template: '$(img kubeflowkatib/mxnet-mnist)'
early_stopping__medianstop: '$(img kubeflowkatib/earlystopping-medianstop)'
enas_cpu_template: '$(img kubeflowkatib/enas-cnn-cifar10-cpu)'
metrics_collector_sidecar__stdout: '$(img kubeflowkatib/file-metrics-collector)'
metrics_collector_sidecar__file: '$(img kubeflowkatib/file-metrics-collector)'
metrics_collector_sidecar__tensorflow_event: '$(img kubeflowkatib/tfevent-metrics-collector)'
pytorch_job_template__master: '$(img kubeflowkatib/pytorch-mnist-cpu)'
pytorch_job_template__worker: '$(img kubeflowkatib/pytorch-mnist-cpu)'
suggestion__random: '$(img kubeflowkatib/suggestion-hyperopt)'
suggestion__tpe: '$(img kubeflowkatib/suggestion-hyperopt)'
suggestion__grid: '$(img kubeflowkatib/suggestion-optuna)'
suggestion__hyperband: '$(img kubeflowkatib/suggestion-hyperband)'
suggestion__bayesianoptimization: '$(img kubeflowkatib/suggestion-skopt)'
suggestion__cmaes: '$(img kubeflowkatib/suggestion-goptuna)'
suggestion__sobol: '$(img kubeflowkatib/suggestion-goptuna)'
suggestion__multivariate_tpe: '$(img kubeflowkatib/suggestion-optuna)'
suggestion__enas: '$(img kubeflowkatib/suggestion-enas)'
suggestion__darts: '$(img kubeflowkatib/suggestion-darts)'
suggestion__pbt: '$(img kubeflowkatib/suggestion-pbt)'
"

juju deploy --trust --debug ./mysql-k8s katib-db --constraints="mem=2G" --resource mysql-image=$(img canonical/charmed-mysql)
juju deploy --trust --debug ./katib-db-manager katib-db-manager  --resource oci-image=$(img kubeflowkatib/katib-db-manager)
juju deploy --trust --debug ./katib-ui katib-ui  --resource oci-image=$(img kubeflowkatib/katib-ui)
juju deploy --trust --debug ./kfp-api kfp-api --resource oci-image=$(img charmedkubeflow/api-server)
juju deploy --trust --debug ./mysql-k8s kfp-db --constraints="mem=2G" --resource mysql-image=$(img canonical/charmed-mysql)
juju deploy --trust --debug ./kfp-metadata-writer kfp-metadata-writer --resource oci-image=$(img gcr.io/ml-pipeline/metadata-writer)
juju deploy --trust --debug ./kfp-persistence kfp-persistence --resource oci-image=$(img charmedkubeflow/persistenceagent)
juju deploy --trust --debug ./kfp-profile-controller kfp-profile-controller --resource oci-image=$(img python:3.7)
juju deploy --trust --debug ./kfp-schedwf kfp-schedwf --resource oci-image=$(img charmedkubeflow/scheduledworkflow)
juju deploy --trust --debug ./kfp-ui kfp-ui --resource ml-pipeline-ui=$(img charmedkubeflow/frontend)
juju deploy --trust --debug ./kfp-viewer kfp-viewer --resource kfp-viewer-image=$(img charmedkubeflow/viewer-crd-controller)
juju deploy --trust --debug ./kfp-viz kfp-viz --resource oci-image=$(img charmedkubeflow/visualization-server)
juju deploy --trust --debug ./knative-eventing knative-eventing --config namespace=knative-eventing
juju deploy --trust --debug ./knative-operator knative-operator --resource knative-operator-image=$(img gcr.io/knative-releases/knative.dev/operator/cmd/operator) --resource knative-operator-webhook-image=$(img gcr.io/knative-releases/knative.dev/operator/cmd/webhook) --config otel-collector-image=$(img otel/opentelemetry-collector)
juju deploy --trust --debug ./knative-serving knative-serving --config namespace=knative-serving --config istio.gateway.namespace=kubeflow --config istio.gateway.name=kubeflow-gateway \
    --config custom_images="activator: $(img gcr.io/knative-releases/knative.dev/serving/cmd/activator | sed 's%@.*%%g')
autoscaler: $(img gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler | sed 's%@.*%%g')
controller: $(img gcr.io/knative-releases/knative.dev/serving/cmd/controller | sed 's%@.*%%g')
webhook: $(img gcr.io/knative-releases/knative.dev/serving/cmd/webhook | sed 's%@.*%%g')
autoscaler-hpa: $(img gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler-hpa | sed 's%@.*%%g')
net-istio-controller/controller: $(img gcr.io/knative-releases/knative.dev/net-istio/cmd/controller | sed 's%@.*%%g')
net-istio-webhook/webhook: $(img gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook | sed 's%@.*%%g')
queue-proxy: $(img gcr.io/knative-releases/knative.dev/serving/cmd/queue | sed 's%@.*%%g')
"

juju deploy --trust --debug ./kserve-controller kserve-controller --resource kserve-controller-image=$(img kserve/kserve-controller) --resource kube-rbac-proxy-image=$(img gcr.io/kubebuilder/kube-rbac-proxy) --config custom_images="configmap__agent: $(img kserve/agent)
configmap__batcher: $(img kserve/agent)
configmap__explainers__alibi: $(img kserve/alibi-explainer)
configmap__explainers__art: $(img kserve/art-explainer)
configmap__logger: $(img kserve/agent)
configmap__router: $(img kserve/router)
configmap__storageInitializer: $(img kserve/storage-initializer)
serving_runtimes__lgbserver: $(img kserve/lgbserver)
serving_runtimes__kserve_mlserver: $(img docker.io/seldonio/mlserver)
serving_runtimes__paddleserver: $(img kserve/paddleserver)
serving_runtimes__pmmlserver: $(img kserve/pmmlserver)
serving_runtimes__sklearnserver: $(img kserve/sklearnserver)
serving_runtimes__tensorflow_serving: $(img tensorflow/serving)
serving_runtimes__torchserve: $(img pytorch/torchserve-kfs)
serving_runtimes__tritonserver: $(img nvcr.io/nvidia/tritonserver)
serving_runtimes__xgbserver: $(img kserve/xgbserver)
"

juju deploy --trust --debug ./kubeflow-dashboard kubeflow-dashboard --resource oci-image=$(img kubeflownotebookswg/centraldashboard)
juju deploy --trust --debug ./kubeflow-profiles kubeflow-profiles --resource profile-image=$(img kubeflownotebookswg/profile-controller) --resource kfam-image=$(img kubeflownotebookswg/kfam)
juju deploy --trust --debug ./kubeflow-roles kubeflow-roles
juju deploy --trust --debug ./kubeflow-volumes kubeflow-volumes --resource oci-image=$(img kubeflownotebookswg/volumes-web-app)
juju deploy --trust --debug ./metacontroller-operator metacontroller-operator --config metacontroller-image=$(img metacontrollerio/metacontroller)
juju deploy --trust --debug ./mlmd mlmd --resource oci-image=$(img gcr.io/tfx-oss-public/ml_metadata_store_server)

# FIXME Needs tweaks and use version without restrictive license
juju deploy --trust --debug ./minio minio --resource oci-image=$(img minio/minio)
juju deploy --trust --debug ./oidc-gatekeeper oidc-gatekeeper --resource oci-image=$(img charmedkubeflow/oidc-authservice)
juju deploy --trust --debug ./pvcviewer-operator pvcviewer-operator --series=focal --resource oci-image=$(img docker.io/kubeflownotebookswg/pvcviewer-controller) --resource oci-image-proxy=$(img kubebuilder/kube-rbac-proxy)
juju deploy --trust --debug ./seldon-core seldon-controller-manager --resource oci-image=$(img charmedkubeflow/seldon-core-operator) \
    --config executor-container-image-and-version=$(img docker.io/seldonio/seldon-core-executor) \
    --config custom_images="configmap__predictor__tensorflow__tensorflow: $(img charmedkubeflow/tensorflow-serving)
configmap__predictor__tensorflow__seldon: $(img seldonio/tfserving-proxy)
configmap__predictor__sklearn__seldon: $(img charmedkubeflow/sklearnserver)
configmap__predictor__sklearn__v2: $(img charmedkubeflow/mlserver-sklearn)
configmap__predictor__xgboost__seldon: $(img seldonio/xgboostserver)
configmap__predictor__xgboost__v2: $(img charmedkubeflow/mlserver-xgboost)
configmap__predictor__mlflow__seldon: $(img seldonio/mlflowserver)
configmap__predictor__mlflow__v2: $(img charmedkubeflow/mlserver-mlflow)
configmap__predictor__triton__v2: $(img nvcr.io/nvidia/tritonserver)
configmap__predictor__huggingface__v2: $(img charmedkubeflow/mlserver-huggingface)
configmap__predictor__tempo_server__v2: $(img seldonio/mlserver)
configmap_storageInitializer: $(img seldonio/rclone-storage-initializer)
configmap_explainer: $(img seldonio/alibiexplainer)
configmap_explainer_v2: $(img seldonio/mlserver)
"

juju deploy --trust --debug ./tensorboard-controller tensorboard-controller --resource tensorboard-controller-image=$(img kubeflownotebookswg/tensorboard-controller)
juju deploy --trust --debug ./tensorboards-web-app tensorboards-web-app --resource tensorboards-web-app-image=$(img kubeflownotebookswg/tensorboards-web-app)
juju deploy --trust --debug ./training-operator training-operator --resource training-operator-image=$(img kubeflow/training-operator)

# ----- Relations
juju relate argo-controller minio
juju relate dex-auth:oidc-client oidc-gatekeeper:oidc-client
juju relate istio-pilot:ingress dex-auth:ingress
juju relate istio-pilot:ingress envoy:ingress
juju relate istio-pilot:ingress jupyter-ui:ingress
juju relate istio-pilot:ingress katib-ui:ingress
juju relate istio-pilot:ingress kfp-ui:ingress
juju relate istio-pilot:ingress kubeflow-dashboard:ingress
juju relate istio-pilot:ingress kubeflow-volumes:ingress
juju relate istio-pilot:ingress oidc-gatekeeper:ingress
juju relate istio-pilot:ingress-auth oidc-gatekeeper:ingress-auth
juju relate istio-pilot:istio-pilot istio-ingressgateway:istio-pilot
juju relate istio-pilot:ingress tensorboards-web-app:ingress
juju relate istio-pilot:gateway-info tensorboard-controller:gateway-info
juju relate katib-db-manager:relational-db katib-db:database
juju relate kfp-api:relational-db kfp-db:database
juju relate kfp-api:kfp-api kfp-persistence:kfp-api
juju relate kfp-api:kfp-api kfp-ui:kfp-api
juju relate kfp-api:kfp-viz kfp-viz:kfp-viz
juju relate kfp-api:object-storage minio:object-storage
juju relate kfp-profile-controller:object-storage minio:object-storage
juju relate kfp-ui:object-storage minio:object-storage
juju relate kserve-controller:ingress-gateway istio-pilot:gateway-info
juju relate kserve-controller:local-gateway knative-serving:local-gateway
juju relate kubeflow-profiles kubeflow-dashboard
juju relate kubeflow-dashboard:links jupyter-ui:dashboard-links
juju relate kubeflow-dashboard:links katib-ui:dashboard-links
juju relate kubeflow-dashboard:links kfp-ui:dashboard-links
juju relate kubeflow-dashboard:links kubeflow-volumes:dashboard-links
juju relate kubeflow-dashboard:links tensorboards-web-app:dashboard-links
juju relate mlmd:grpc envoy:grpc
juju relate mlmd:grpc kfp-metadata-writer:grpc
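
As written, the `img` helper in the script above prints just the registry prefix when nothing matches, so a missing entry in the image list only shows up later as a failed pull. A slightly more defensive variant could look like the sketch below. It keeps the same assumptions as the script (`$IMAGES` is a text file with one image reference per line, and the mirror is at `$OCI_REGISTRY`); `image_tag` is a hypothetical helper that replaces the `rev`/`awk` pipeline with a parameter expansion.

```shell
#!/bin/sh
# Defaults mirror the values used in the script above.
OCI_REGISTRY=${OCI_REGISTRY:-10.10.11.39:32000}
IMAGES=${IMAGES:-$HOME/k8s/images_kubeflow.txt}

# Print "<registry>/<last line of $IMAGES containing $1>".
# Uses a fixed-string match (-F) so dots in image names are not treated as
# regex wildcards, and returns non-zero when nothing matches.
img() {
  match=$(grep -F -- "$1" "$IMAGES" | tail -n1)
  if [ -z "$match" ]; then
    echo "no image matching '$1' in $IMAGES" >&2
    return 1
  fi
  printf '%s/%s\n' "$OCI_REGISTRY" "$match"
}

# Hypothetical helper: print the text after the last colon of a reference,
# i.e. the tag for references of the form name:tag.
image_tag() {
  printf '%s\n' "${1##*:}"
}

# Example usage:
#   version=$(image_tag "$(img istio/proxyv2)")
```

Note that `image_tag` is only meaningful for `name:tag` references; for a digest reference such as `name@sha256:...` it would return the digest's hex string rather than a tag.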

The current output I am getting when trying to deploy 1.8 airgapped is in https://github.com/canonical/envoy-operator/issues/72

NohaIhab commented 1 week ago

Hi @AlexanderSing and @gustavosr98, we have fixed and tested the 1.8/stable bundle to work in an airgapped environment. The guide for installing Charmed Kubeflow in an airgapped environment is now updated and published here.