Open aronchick opened 2 years ago
ARGH, I'm stumped. Here are the two compiles, one working and one not.
Differences between the two: the working one was deployed to AWS and reports 1.8.1 as the KFP version (not sure how this is possible). The non-working one was deployed to GCP, and doesn't work on release-1.4, release-1.5 or master.
Attachments: gcp-not-working.tar.gz, [Uploading aws-working.tar.gz…]()
Thoughts? Happy to pair if this helps.
OK, after hacking at this for the day, it LOOKS like the issue was how I set Kubeflow up on GKE.
I was installing from this page - https://www.kubeflow.org/docs/components/pipelines/installation/localcluster-deployment/:
KFP_ENV=platform-agnostic
kubectl apply -k cluster-scoped-resources/
kubectl wait crd/applications.app.k8s.io --for condition=established --timeout=60s
kubectl apply -k "env/${KFP_ENV}/"
kubectl wait pods -l application-crd-id=kubeflow-pipelines -n kubeflow --for condition=Ready --timeout=1800s
But it looks like this page is the correct one - https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/:
1.8.1
export PIPELINE_VERSION=1.8.1
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"
2.0.0-alpha.2
export PIPELINE_VERSION=2.0.0-alpha.2
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"
Turning this into a feature - it'd be great to help diagnose Kubeflow a TEENY bit. If we can figure out SOME way of detecting the thing is going to fail, that'd be awesome.
Might be a more generally useful feature for backends - the functions backend needs to provision stuff on Azure, for instance, and it would be nice to check that it's all in place correctly before executing notebooks.
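A rough sketch of what such a pre-flight check could look like for the Kubeflow case. It reuses only the CRD name and pod label from the install commands above. The function name `kfp_preflight`, the default namespace and the messages are made up for illustration, and passing these checks would not prove that a pipeline run will succeed - it only catches an install that is obviously incomplete.

```shell
# Hypothetical pre-flight check (not part of any existing tool).
# Prints one FAIL line per problem and returns the number of problems found.
kfp_preflight() {
  local ns="${1:-kubeflow}"   # namespace is an assumption; defaults to "kubeflow"
  local problems=0
  local pods
  local not_running

  # 1. The Application CRD that the install steps wait on must exist.
  if ! kubectl get crd applications.app.k8s.io >/dev/null 2>&1; then
    echo "FAIL: CRD applications.app.k8s.io is not installed"
    problems=$((problems + 1))
  fi

  # 2. There must be Kubeflow Pipelines pods, selected by the same label
  #    the install steps use with "kubectl wait".
  if ! pods="$(kubectl get pods -n "$ns" -l application-crd-id=kubeflow-pipelines --no-headers 2>/dev/null)" \
      || [ -z "$pods" ]; then
    echo "FAIL: no Kubeflow Pipelines pods found in namespace $ns"
    problems=$((problems + 1))
  else
    # 3. Column 3 of "kubectl get pods --no-headers" is STATUS.
    not_running="$(printf '%s\n' "$pods" \
      | awk '$3 != "Running" && $3 != "Completed" { printf "%s%s", sep, $1; sep = " " }')"
    if [ -n "$not_running" ]; then
      echo "FAIL: pods not running: $not_running"
      problems=$((problems + 1))
    fi
  fi

  if [ "$problems" -eq 0 ]; then
    echo "OK: Kubeflow Pipelines looks installed in namespace $ns"
  fi
  return "$problems"
}
```

Calling `kfp_preflight kubeflow` before compiling or submitting anything would at least surface a half-installed cluster early. Other backends could expose a function of the same shape with their own checks.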
Not sure how this came about - it was after I merged main back into my vertex branch.
https://github.com/kubeflow/pipelines/issues/7676
Thoughts?