SAME-Project / same-project

https://sameproject.ml/
Apache License 2.0

Detect if Kubeflow has been setup "correctly" #126

Open aronchick opened 2 years ago

aronchick commented 2 years ago

Not sure how this came about - it started after I merged main back into my vertex branch.

https://github.com/kubeflow/pipelines/issues/7676

Thoughts?

aronchick commented 2 years ago

ARGH, I'm stumped. Here are the two compiles, one working and one not working.

Differences between the two - the working one was deployed to AWS, and reads 1.8.1 as the KFP version (not sure how this is possible) gcp-not-working.tar.gz

The non-working one was deployed to GCP, and doesn't work on release-1.4, release-1.5 or master. [Uploading aws-working.tar.gz…]()

Thoughts? Happy to pair if this helps.

aronchick commented 2 years ago

OK, after hacking at this for the day, it LOOKS like the issue was how I set Kubeflow up on GKE.

I was installing from this page - https://www.kubeflow.org/docs/components/pipelines/installation/localcluster-deployment/:

KFP_ENV=platform-agnostic
kubectl apply -k cluster-scoped-resources/
kubectl wait crd/applications.app.k8s.io --for condition=established --timeout=60s
kubectl apply -k "env/${KFP_ENV}/"
kubectl wait pods -l application-crd-id=kubeflow-pipelines -n kubeflow --for condition=Ready --timeout=1800s

But it looks like this page is the correct one - https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/:

1.8.1

export PIPELINE_VERSION=1.8.1
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"

2.0.0-alpha.2

export PIPELINE_VERSION=2.0.0-alpha.2
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"

aronchick commented 2 years ago

Turning this into a feature - it'd be great to help diagnose Kubeflow a TEENY bit. If we can figure out SOME way of detecting the thing is going to fail, that'd be awesome.
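
As a starting point, here is a rough sketch of what such a check could look like. It is not an existing SAME feature. The two `kubectl wait` calls are the ones quoted earlier in this thread; the function names, exit codes and the `KFP_*` environment variables are hypothetical. The optional version check assumes the KFP API server runs as a deployment named `ml-pipeline` whose image tag is the KFP version, which may not hold for every install. Note that the failing GKE install above ran the same `kubectl wait` commands, so the first two steps alone would probably not have caught it; the version comparison is the part aimed at that case.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a Kubeflow Pipelines (KFP) install.
# Function names, exit codes and KFP_* variables are made up for this sketch.

KFP_NAMESPACE="${KFP_NAMESPACE:-kubeflow}"
KFP_TIMEOUT="${KFP_TIMEOUT:-120s}"

# Print the tag part of an image reference: "repo/image:1.8.1" -> "1.8.1".
kfp_image_tag() {
    printf '%s\n' "${1##*:}"
}

# Returns 0 when all checks pass, 1-3 depending on which check failed.
kfp_preflight() {
    # 1. The Application CRD has to be established.
    if ! kubectl wait --for condition=established --timeout=60s \
            crd/applications.app.k8s.io >/dev/null 2>&1; then
        echo "preflight: crd/applications.app.k8s.io is not established" >&2
        return 1
    fi

    # 2. The pipelines pods have to be Ready.
    if ! kubectl wait pods -l application-crd-id=kubeflow-pipelines \
            -n "$KFP_NAMESPACE" --for condition=Ready \
            --timeout="$KFP_TIMEOUT" >/dev/null 2>&1; then
        echo "preflight: KFP pods in namespace '$KFP_NAMESPACE' are not Ready" >&2
        return 2
    fi

    # 3. Optional: compare the deployed version with the expected one.
    # ASSUMPTION: the API server is the "ml-pipeline" deployment and its
    # image tag equals the KFP version. Adjust if your manifests differ.
    if [ -n "${KFP_EXPECTED_VERSION:-}" ]; then
        image=$(kubectl get deployment ml-pipeline -n "$KFP_NAMESPACE" \
            -o jsonpath='{.spec.template.spec.containers[0].image}' \
            2>/dev/null) || image=""
        actual=$(kfp_image_tag "$image")
        if [ "$actual" != "$KFP_EXPECTED_VERSION" ]; then
            echo "preflight: expected KFP $KFP_EXPECTED_VERSION, found '$actual'" >&2
            return 3
        fi
    fi

    echo "preflight: ok"
}
```

Something like `KFP_EXPECTED_VERSION=1.8.1; kfp_preflight || exit 1` before compiling would then fail fast with a readable message instead of an opaque compile error.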

Bubblyworld commented 2 years ago

Might be a more generally useful feature for backends - the functions backend needs to provision stuff on Azure, for instance, and it would be nice to check that it's all in place correctly before executing notebooks.
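
One way to generalise this, sketched under assumptions: each backend supplies its own check and a small dispatcher runs it before execution. The `preflight_<backend>` naming convention and `run_preflight` are hypothetical and not part of SAME. The only concrete check shown is the CRD wait from the install commands earlier in the thread; the functions backend would define its own `preflight_functions`, since what it needs on Azure isn't spelled out here.

```shell
#!/bin/sh
# Hypothetical per-backend pre-flight dispatch.
# Convention (made up for this sketch): a backend named "foo" may define a
# shell function "preflight_foo" that returns 0 when its resources are ready.

# Example check for the Kubeflow backend, reusing a command from this thread.
preflight_kubeflow() {
    kubectl wait --for condition=established --timeout=60s \
        crd/applications.app.k8s.io >/dev/null 2>&1
}

# Run the check for the given backend, if one is defined.
# Returns 1 only when a check exists and fails.
run_preflight() {
    backend="$1"
    check="preflight_$backend"

    # Backends without a check are allowed through with a warning.
    if ! command -v "$check" >/dev/null 2>&1; then
        echo "preflight: no check defined for backend '$backend', skipping" >&2
        return 0
    fi

    if "$check"; then
        echo "preflight: backend '$backend' looks ready"
    else
        echo "preflight: backend '$backend' is not ready, refusing to execute" >&2
        return 1
    fi
}
```

Calling `run_preflight kubeflow || exit 1` before executing a notebook would stop early on a broken backend, while backends that have no check yet keep working unchanged.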