Closed ca-scribner closed 3 years ago
I think the root issue you're seeing here is that istio-ingressgateway wasn't set up correctly. Although it eventually came up, at the time this code ran, service/istio-ingressgateway didn't have a loadbalancer IP set up by metallb yet.
You should be able to fix this by calculating the right hostname and running these commands again. That would end up being:
juju config dex-auth public-url=10.64.140.43.nip.io
juju config oidc-gatekeeper public-url=10.64.140.43.nip.io
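For reference, the hostname can be derived from the external IP that metallb assigns to service/istio-ingressgateway. This is only a minimal sketch, not the code microk8s itself runs: the function names are made up for illustration, it assumes the service lives in the kubeflow namespace, and it sets public-url in the same form as the two commands above.

```shell
# Hypothetical helpers; the names are illustrative and not part of microk8s or juju.

# Print the external IP assigned to the ingress gateway service
# (prints nothing if metallb has not assigned one yet).
ingress_ip() {
  microk8s kubectl get service istio-ingressgateway -n kubeflow \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Turn an IP address into the matching nip.io hostname.
nip_hostname() {
  printf '%s.nip.io\n' "$1"
}

# Point both charms at the hostname; refuses to run with an empty IP so that
# a missing loadbalancer IP does not silently produce a bad public-url.
set_public_url() {
  if [ -z "$1" ]; then
    echo "no loadbalancer IP yet; try again later" >&2
    return 1
  fi
  host="$(nip_hostname "$1")"
  juju config dex-auth public-url="$host"
  juju config oidc-gatekeeper public-url="$host"
}

# Usage: set_public_url "$(ingress_ip)"
```

With the IP from this thread, nip_hostname 10.64.140.43 prints 10.64.140.43.nip.io.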
As for why istio-ingressgateway had issues, it's a little hard to diagnose what exactly went wrong. That charm relies on reading a configmap generated by istio-pilot. That code only runs when Juju executes the operator code, and the default update-status-hook-interval is 5m for a model. So if istio-ingressgateway can't read that configmap in the first few hooks that run, it only checks every 5 minutes by default. We should probably import this code over to microk8s enable kubeflow to handle that situation a little better.
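One way a deploy script could cope with this kind of timing problem is to poll until the resource it depends on is actually ready, instead of reading it once. The sketch below is an assumption about how that might look, not what microk8s enable kubeflow does: wait_until and has_lb_ip are hypothetical names, and the check shown is for the loadbalancer IP in the kubeflow namespace.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between tries. Returns 0 on success, 1 if every try failed.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Succeeds once the ingress gateway service reports a loadbalancer IP.
has_lb_ip() {
  [ -n "$(microk8s kubectl get service istio-ingressgateway -n kubeflow \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)" ]
}

# Usage: wait_until 60 5 has_lb_ip && echo "ingress IP is ready"
```

Polling like this only covers the script side. The charm itself would still re-check the configmap on its own hook schedule.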
After more debugging I could not get the oidc-gatekeeper application to behave, so I tried removing and redeploying dex-auth and oidc-gatekeeper:
juju remove-application dex-auth oidc-gatekeeper --force
microk8s juju deploy dex-auth
microk8s juju deploy oidc-gatekeeper
microk8s juju relate dex-auth:oidc-client oidc-gatekeeper:oidc-client
microk8s juju relate istio-pilot:ingress oidc-gatekeeper:ingress
microk8s juju relate istio-pilot:ingress-auth oidc-gatekeeper:ingress-auth
microk8s juju config dex-auth static-username=admin
microk8s juju config dex-auth static-password=admin
microk8s juju config dex-auth public-url=http://10.64.140.43.nip.io
microk8s juju config oidc-gatekeeper public-url=http://10.64.140.43.nip.io
After waiting for everything to normalize, I successfully connected to the Kubeflow dashboard. I'm not sure why the oidc-gatekeeper charm became unrecoverable (maybe related to the incorrect public-url=http://localhost? maybe something else that just made that harder to deal with?) or what to do about that.
After doing a microk8s enable kubeflow to deploy the bundle (trying both full and lite bundles) on a local machine, the Kubeflow web portal was inaccessible, returning a 403 (forbidden), and some juju applications weren't working as expected. It feels like both microk8s and kubeflow bundle problems are being encountered. For the microk8s-specific ones I'll open something else there as well, but I think I worked around them and have only kubeflow-bundle problems remaining.
During the deployment, I hit a few snags:
1. When deploying the istio-ingressgateway, the app was stuck waiting with status "Waiting for Istio Pilot information". I resolved this (I think) by step 7 here then waiting ~10 minutes.
2. When all applications were up, microk8s could not identify the hostname automatically, stating "WARNING: Unable to determine hostname, defaulting to localhost" and "The dashboard is available at http://localhost". I manually retrieved the istio-ingressgateway IP and tried logging into the dashboard there (INGRESSIP.nip.io) but received a 403 error.
3. After doing step 1, I left the deployment overnight to resolve. When I returned in the morning the oidc-gatekeeper/0 unit appeared stuck in a crash loop, unable to get a pod. Not sure if this happened immediately after deployment or somewhere overnight. juju status shows:
At other times, this has shown the agent being lost (see full juju status below).
4. I think as a result of 2, dex-auth.public-url=http://localhost and oidc-gatekeeper.public-url=http://localhost, which is likely wrong (see step 6). I updated those (juju config) to be INGRESSIP.nip.io. I think dex-auth accepted this and restarted, but oidc-gatekeeper did not appear to update properly.
Other resources:
microk8s inspect: inspection-report-20210716_100355.tar.gz
juju status:
App Version Status Scale Charm Store Channel Rev OS Address Message
admission-webhook res:oci-image@1abb127 active 1 admission-webhook charmstore stable 10 kubernetes 10.152.183.19
argo-controller res:oci-image@c1746ae active 1 argo-controller charmstore stable 51 kubernetes
dex-auth res:oci-image@af9c1b3 active 1 dex-auth charmstore stable 60 kubernetes 10.152.183.13
istio-ingressgateway res:oci-image@89b5fe2 active 1 istio-ingressgateway charmstore stable 20 kubernetes 10.64.140.43
istio-pilot res:oci-image@e3e03b3 active 1 istio-pilot charmstore stable 20 kubernetes 10.152.183.233
jupyter-controller res:oci-image@8c7be42 active 1 jupyter-controller charmstore stable 55 kubernetes
jupyter-ui res:oci-image@af3b8ce active 1 jupyter-ui charmstore stable 9 kubernetes 10.152.183.133
kfp-api res:oci-image@8e60840 active 1 kfp-api charmstore stable 10 kubernetes 10.152.183.220
kfp-db mariadb/server:10.3 active 1 mariadb-k8s charmstore stable 35 kubernetes 10.152.183.239
kfp-persistence res:oci-image@9338d08 active 1 kfp-persistence charmstore stable 7 kubernetes
kfp-schedwf res:oci-image@4ab6488 active 1 kfp-schedwf charmstore stable 7 kubernetes
kfp-ui res:oci-image@04a4348 active 1 kfp-ui charmstore stable 9 kubernetes 10.152.183.226
kfp-viewer res:oci-image@bae62bf active 1 kfp-viewer charmstore stable 7 kubernetes
kfp-viz res:oci-image@c90a581 active 1 kfp-viz charmstore stable 6 kubernetes 10.152.183.24
kubeflow-dashboard res:oci-image@126c9a9 active 1 kubeflow-dashboard charmstore stable 56 kubernetes 10.152.183.213
kubeflow-profiles res:profile-image@582b8eb active 1 kubeflow-profiles charmstore stable 52 kubernetes 10.152.183.108
minio res:oci-image@4707912 active 1 minio charmstore stable 55 kubernetes 10.152.183.118
mlmd res:oci-image@78eb66d active 1 mlmd charmstore stable 5 kubernetes 10.152.183.214
oidc-gatekeeper res:oci-image@9bb01f7 active 0/1 oidc-gatekeeper charmstore stable 53 kubernetes 10.152.183.144
pytorch-operator res:oci-image@08c3373 active 1 pytorch-operator charmstore stable 53 kubernetes
seldon-controller-manager res:oci-image@82fd029 active 1 seldon-core charmstore stable 50 kubernetes 10.152.183.69
tfjob-operator res:oci-image@3fabaf3 active 1 tfjob-operator charmstore stable 1 kubernetes
Unit Workload Agent Address Ports Message
admission-webhook/0 active idle 10.1.182.77 443/TCP
argo-controller/0 active idle 10.1.182.114
dex-auth/6 active idle 10.1.182.124 5556/TCP
istio-ingressgateway/0 active idle 10.1.182.121 15020/TCP,80/TCP,443/TCP,15029/TCP,15030/TCP,15031/TCP,15032/TCP,15443/TCP,15011/TCP,8060/TCP,853/TCP
istio-pilot/0 active idle 10.1.182.90 8080/TCP,15010/TCP,15012/TCP,15017/TCP
jupyter-controller/0 active idle 10.1.182.88
jupyter-ui/0 active idle 10.1.182.83 5000/TCP
kfp-api/0 active idle 10.1.182.116 8888/TCP,8887/TCP
kfp-db/0 active idle 10.1.182.89 3306/TCP ready
kfp-persistence/0 active idle 10.1.182.115
kfp-schedwf/0 active idle 10.1.182.104
kfp-ui/0 active idle 10.1.182.117 3000/TCP
kfp-viewer/0 active idle 10.1.182.111
kfp-viz/0 active idle 10.1.182.106 8888/TCP
kubeflow-dashboard/0 active idle 10.1.182.112 8082/TCP
kubeflow-profiles/0 active idle 10.1.182.105 8080/TCP,8081/TCP
minio/0 active idle 10.1.182.108 9000/TCP
mlmd/0 active idle 10.1.182.110 8080/TCP
oidc-gatekeeper/0 unknown lost 10.1.182.113 8080/TCP agent lost, see 'juju show-status-log oidc-gatekeeper/0'
oidc-gatekeeper/1 unknown lost 10.1.182.119 8080/TCP agent lost, see 'juju show-status-log oidc-gatekeeper/1'
pytorch-operator/0 active idle 10.1.182.107 8443/TCP
seldon-controller-manager/0* active idle 10.1.182.100 8080/TCP,4443/TCP
tfjob-operator/0 active idle 10.1.182.109 8443/TCP
NAMESPACE NAME READY STATUS RESTARTS AGE
metallb-system pod/speaker-hv4ll 1/1 Running 0 18h
metallb-system pod/controller-559b68bfd8-zvhsk 1/1 Running 0 18h
kube-system pod/hostpath-provisioner-5c65fbdb4f-6cz2m 1/1 Running 0 18h
controller-uk8s pod/modeloperator-649944bf89-swqsg 1/1 Running 0 18h
kubeflow pod/modeloperator-658b4b6c58-gn5fc 1/1 Running 0 18h
kubeflow pod/admission-webhook-operator-0 1/1 Running 0 18h
kubeflow pod/argo-controller-operator-0 1/1 Running 0 18h
kubeflow pod/dex-auth-operator-0 1/1 Running 0 18h
kubeflow pod/admission-webhook-795c896784-n5bcc 1/1 Running 0 18h
kubeflow pod/jupyter-ui-operator-0 1/1 Running 0 18h
kubeflow pod/istio-ingressgateway-operator-0 1/1 Running 0 18h
kubeflow pod/istio-pilot-operator-0 1/1 Running 0 18h
kubeflow pod/jupyter-controller-operator-0 1/1 Running 0 18h
kubeflow pod/kfp-db-operator-0 1/1 Running 0 18h
kubeflow pod/kfp-api-operator-0 1/1 Running 0 18h
kubeflow pod/kfp-persistence-operator-0 1/1 Running 0 18h
kubeflow pod/seldon-controller-manager-operator-0 1/1 Running 0 18h
kubeflow pod/kubeflow-profiles-operator-0 1/1 Running 0 18h
kubeflow pod/kfp-ui-operator-0 1/1 Running 0 18h
kubeflow pod/kfp-schedwf-operator-0 1/1 Running 0 18h
kubeflow pod/kfp-viz-operator-0 1/1 Running 0 18h
kubeflow pod/minio-operator-0 1/1 Running 0 18h
kubeflow pod/oidc-gatekeeper-operator-0 1/1 Running 0 18h
kubeflow pod/pytorch-operator-operator-0 1/1 Running 0 18h
kubeflow pod/tfjob-operator-operator-0 1/1 Running 0 18h
kubeflow pod/kfp-viewer-operator-0 1/1 Running 0 18h
kubeflow pod/kubeflow-dashboard-operator-0 1/1 Running 0 18h
kubeflow pod/mlmd-operator-0 1/1 Running 0 18h
kubeflow pod/jupyter-ui-6d87f8dc8-4j4nz 1/1 Running 0 18h
kubeflow pod/jupyter-controller-5b9b44fdc4-brtxs 1/1 Running 0 18h
kubeflow pod/kfp-schedwf-bc96bbbd6-dvspd 1/1 Running 0 18h
kubeflow pod/minio-0 1/1 Running 0 18h
kubeflow pod/kfp-viewer-6d96fbf466-khqq4 1/1 Running 0 18h
kubeflow pod/kubeflow-dashboard-694d66ffb6-nxl2n 1/1 Running 0 17h
kubeflow pod/kubeflow-profiles-7d54db8b75-qnnvf 2/2 Running 0 18h
kubeflow pod/oidc-gatekeeper-748687b564-429zg 1/1 Running 0 17h
kubeflow pod/kfp-persistence-648f685479-jbcwn 1/1 Running 2 17h
kubeflow pod/istio-ingressgateway-957447478-cd8tr 1/1 Running 0 17h
kubeflow pod/kfp-viz-67775f7888-z7zp9 1/1 Running 0 18h
kubeflow pod/kfp-api-54dd7dc858-2xrgp 1/1 Running 0 17h
kubeflow pod/kfp-ui-869494c98c-8m8wj 1/1 Running 0 17h
kube-system pod/calico-kube-controllers-f7868dd95-47v6p 1/1 Running 0 18h
kubeflow pod/kfp-db-0 1/1 Running 0 18h
kubeflow pod/istio-pilot-7bfdbc474b-prtlk 1/1 Running 0 18h
ingress pod/nginx-ingress-microk8s-controller-59dhj 1/1 Running 0 18h
kube-system pod/coredns-7f9c69c78c-fffsj 1/1 Running 0 18h
controller-uk8s pod/controller-0 2/2 Running 1 18h
kubeflow pod/tfjob-operator-965d5c769-7gltp 1/1 Running 1 18h
kubeflow pod/mlmd-0 1/1 Running 0 18h
kube-system pod/calico-node-jn6gn 1/1 Running 0 18h
kubeflow pod/pytorch-operator-568d56c769-pf2pj 1/1 Running 1 18h
kubeflow pod/dex-auth-5f68f57bc9-jcb8w 1/1 Running 0 141m
kubeflow pod/seldon-controller-manager-5c8fbffc67-hfhgt 0/1 CrashLoopBackOff 148 18h
kubeflow pod/argo-controller-84468669d4-h4x6g 0/1 CrashLoopBackOff 148 17h
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.152.183.1 443/TCP 18h
kube-system service/kube-dns ClusterIP 10.152.183.10 53/UDP,53/TCP,9153/TCP 18h
controller-uk8s service/controller-service ClusterIP 10.152.183.135 17070/TCP 18h
controller-uk8s service/modeloperator ClusterIP 10.152.183.229 17071/TCP 18h
kubeflow service/modeloperator ClusterIP 10.152.183.254 17071/TCP 18h
kubeflow service/admission-webhook-operator ClusterIP 10.152.183.191 30666/TCP 18h
kubeflow service/argo-controller-operator ClusterIP 10.152.183.143 30666/TCP 18h
kubeflow service/dex-auth-operator ClusterIP 10.152.183.140 30666/TCP 18h
kubeflow service/istio-ingressgateway-operator ClusterIP 10.152.183.129 30666/TCP 18h
kubeflow service/admission-webhook ClusterIP 10.152.183.19 443/TCP 18h
kubeflow service/istio-pilot-operator ClusterIP 10.152.183.212 30666/TCP 18h
kubeflow service/jupyter-controller-operator ClusterIP 10.152.183.3 30666/TCP 18h
kubeflow service/jupyter-ui-operator ClusterIP 10.152.183.227 30666/TCP 18h
kubeflow service/dex-auth ClusterIP 10.152.183.13 5556/TCP 18h
kubeflow service/kfp-api-operator ClusterIP 10.152.183.63 30666/TCP 18h
kubeflow service/kfp-db-operator ClusterIP 10.152.183.219 30666/TCP 18h
kubeflow service/jupyter-ui ClusterIP 10.152.183.133 5000/TCP 18h
kubeflow service/kfp-persistence-operator ClusterIP 10.152.183.246 30666/TCP 18h
kubeflow service/kfp-db ClusterIP 10.152.183.239 3306/TCP 18h
kubeflow service/kfp-db-endpoints ClusterIP None 18h
kubeflow service/istio-pilot ClusterIP 10.152.183.233 8080/TCP,15010/TCP,15012/TCP,15017/TCP 18h
kubeflow service/seldon-controller-manager-operator ClusterIP 10.152.183.36 30666/TCP 18h
kubeflow service/kfp-schedwf-operator ClusterIP 10.152.183.117 30666/TCP 18h
kubeflow service/kfp-ui-operator ClusterIP 10.152.183.238 30666/TCP 18h
kubeflow service/kfp-viz-operator ClusterIP 10.152.183.119 30666/TCP 18h
kubeflow service/kubeflow-profiles-operator ClusterIP 10.152.183.247 30666/TCP 18h
kubeflow service/minio-operator ClusterIP 10.152.183.81 30666/TCP 18h
kubeflow service/oidc-gatekeeper-operator ClusterIP 10.152.183.109 30666/TCP 18h
kubeflow service/pytorch-operator-operator ClusterIP 10.152.183.244 30666/TCP 18h
kubeflow service/tfjob-operator-operator ClusterIP 10.152.183.134 30666/TCP 18h
kubeflow service/kfp-viewer-operator ClusterIP 10.152.183.151 30666/TCP 18h
kubeflow service/seldon-controller-manager ClusterIP 10.152.183.69 8080/TCP,4443/TCP 18h
kubeflow service/kubeflow-dashboard-operator ClusterIP 10.152.183.130 30666/TCP 18h
kubeflow service/mlmd-operator ClusterIP 10.152.183.160 30666/TCP 18h
kubeflow service/kubeflow-profiles ClusterIP 10.152.183.108 8080/TCP,8081/TCP 18h
kubeflow service/kfp-viz ClusterIP 10.152.183.24 8888/TCP 18h
kubeflow service/minio ClusterIP 10.152.183.118 9000/TCP 18h
kubeflow service/minio-endpoints ClusterIP None 18h
kubeflow service/mlmd ClusterIP 10.152.183.214 8080/TCP 18h
kubeflow service/mlmd-endpoints ClusterIP None 18h
kubeflow service/ml-pipeline ClusterIP 10.152.183.220 8887/TCP,8888/TCP 18h
kubeflow service/kubeflow-dashboard ClusterIP 10.152.183.213 8082/TCP 18h
kubeflow service/oidc-gatekeeper ClusterIP 10.152.183.144 8080/TCP 17h
kubeflow service/kfp-api ClusterIP 10.152.183.104 8888/TCP,8887/TCP 17h
kubeflow service/kfp-ui ClusterIP 10.152.183.226 3000/TCP 17h
kubeflow service/istio-ingressgateway LoadBalancer 10.152.183.131 10.64.140.43 15020:30975/TCP,80:30867/TCP,443:30879/TCP,15029:30529/TCP,15030:31280/TCP,15031:32546/TCP,15032:30608/TCP,15443:31540/TCP,15011:30785/TCP,8060:32290/TCP,853:30859/TCP 17h
kubeflow service/seldon-webhook-service ClusterIP 10.152.183.9 4443/TCP 17h
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/calico-node 1 1 1 1 1 kubernetes.io/os=linux 18h
metallb-system daemonset.apps/speaker 1 1 1 1 1 beta.kubernetes.io/os=linux 18h
ingress daemonset.apps/nginx-ingress-microk8s-controller 1 1 1 1 1 18h
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/calico-kube-controllers 1/1 1 1 18h
kube-system deployment.apps/coredns 1/1 1 1 18h
metallb-system deployment.apps/controller 1/1 1 1 18h
kube-system deployment.apps/hostpath-provisioner 1/1 1 1 18h
controller-uk8s deployment.apps/modeloperator 1/1 1 1 18h
kubeflow deployment.apps/modeloperator 1/1 1 1 18h
kubeflow deployment.apps/jupyter-controller 1/1 1 1 18h
kubeflow deployment.apps/admission-webhook 1/1 1 1 18h
kubeflow deployment.apps/jupyter-ui 1/1 1 1 18h
kubeflow deployment.apps/kfp-schedwf 1/1 1 1 18h
kubeflow deployment.apps/istio-pilot 1/1 1 1 18h
kubeflow deployment.apps/kfp-viz 1/1 1 1 18h
kubeflow deployment.apps/pytorch-operator 1/1 1 1 18h
kubeflow deployment.apps/tfjob-operator 1/1 1 1 18h
kubeflow deployment.apps/kfp-viewer 1/1 1 1 18h
kubeflow deployment.apps/kubeflow-dashboard 1/1 1 1 17h
kubeflow deployment.apps/kfp-ui 1/1 1 1 17h
kubeflow deployment.apps/istio-ingressgateway 1/1 1 1 17h
kubeflow deployment.apps/kfp-api 1/1 1 1 17h
kubeflow deployment.apps/kubeflow-profiles 1/1 1 1 18h
kubeflow deployment.apps/oidc-gatekeeper 1/1 1 1 17h
kubeflow deployment.apps/kfp-persistence 1/1 1 1 17h
kubeflow deployment.apps/dex-auth 1/1 1 1 18h
kubeflow deployment.apps/seldon-controller-manager 0/1 1 0 18h
kubeflow deployment.apps/argo-controller 0/1 1 0 17h
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/calico-kube-controllers-f7868dd95 1 1 1 18h
kube-system replicaset.apps/coredns-7f9c69c78c 1 1 1 18h
metallb-system replicaset.apps/controller-559b68bfd8 1 1 1 18h
kube-system replicaset.apps/hostpath-provisioner-5c65fbdb4f 1 1 1 18h
controller-uk8s replicaset.apps/modeloperator-649944bf89 1 1 1 18h
kubeflow replicaset.apps/modeloperator-658b4b6c58 1 1 1 18h
kubeflow replicaset.apps/admission-webhook-795c896784 1 1 1 18h
kubeflow replicaset.apps/jupyter-ui-6d87f8dc8 1 1 1 18h
kubeflow replicaset.apps/jupyter-controller-5b9b44fdc4 1 1 1 18h
kubeflow replicaset.apps/istio-pilot-7bfdbc474b 1 1 1 18h
kubeflow replicaset.apps/kfp-schedwf-bc96bbbd6 1 1 1 18h
kubeflow replicaset.apps/kfp-viz-67775f7888 1 1 1 18h
kubeflow replicaset.apps/pytorch-operator-568d56c769 1 1 1 18h
kubeflow replicaset.apps/tfjob-operator-965d5c769 1 1 1 18h
kubeflow replicaset.apps/kfp-viewer-6d96fbf466 1 1 1 18h
kubeflow replicaset.apps/kubeflow-dashboard-694d66ffb6 1 1 1 17h
kubeflow replicaset.apps/kfp-ui-869494c98c 1 1 1 17h
kubeflow replicaset.apps/istio-ingressgateway-957447478 1 1 1 17h
kubeflow replicaset.apps/kfp-api-54dd7dc858 1 1 1 17h
kubeflow replicaset.apps/kubeflow-profiles-7d54db8b75 1 1 1 18h
kubeflow replicaset.apps/oidc-gatekeeper-748687b564 1 1 1 17h
kubeflow replicaset.apps/kfp-persistence-648f685479 1 1 1 17h
kubeflow replicaset.apps/dex-auth-5f68f57bc9 1 1 1 141m
kubeflow replicaset.apps/seldon-controller-manager-5c8fbffc67 1 1 0 18h
kubeflow replicaset.apps/argo-controller-84468669d4 1 1 0 17h
NAMESPACE NAME READY AGE
controller-uk8s statefulset.apps/controller 1/1 18h
kubeflow statefulset.apps/admission-webhook-operator 1/1 18h
kubeflow statefulset.apps/argo-controller-operator 1/1 18h
kubeflow statefulset.apps/dex-auth-operator 1/1 18h
kubeflow statefulset.apps/jupyter-ui-operator 1/1 18h
kubeflow statefulset.apps/istio-ingressgateway-operator 1/1 18h
kubeflow statefulset.apps/istio-pilot-operator 1/1 18h
kubeflow statefulset.apps/jupyter-controller-operator 1/1 18h
kubeflow statefulset.apps/kfp-db-operator 1/1 18h
kubeflow statefulset.apps/kfp-api-operator 1/1 18h
kubeflow statefulset.apps/kfp-persistence-operator 1/1 18h
kubeflow statefulset.apps/seldon-controller-manager-operator 1/1 18h
kubeflow statefulset.apps/kubeflow-profiles-operator 1/1 18h
kubeflow statefulset.apps/kfp-ui-operator 1/1 18h
kubeflow statefulset.apps/kfp-schedwf-operator 1/1 18h
kubeflow statefulset.apps/kfp-viz-operator 1/1 18h
kubeflow statefulset.apps/minio-operator 1/1 18h
kubeflow statefulset.apps/oidc-gatekeeper-operator 1/1 18h
kubeflow statefulset.apps/pytorch-operator-operator 1/1 18h
kubeflow statefulset.apps/tfjob-operator-operator 1/1 18h
kubeflow statefulset.apps/kfp-viewer-operator 1/1 18h
kubeflow statefulset.apps/kubeflow-dashboard-operator 1/1 18h
kubeflow statefulset.apps/mlmd-operator 1/1 18h
kubeflow statefulset.apps/kfp-db 1/1 18h
kubeflow statefulset.apps/minio 1/1 18h
kubeflow statefulset.apps/mlmd 1/1 18h
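Listings this long are easy to misread, so a small filter can help pick out the entries that need attention. The two functions below are a rough sketch with hypothetical names: one reads juju status text and the other reads kubectl pod listings, both on stdin, and both assume whitespace-separated columns in the order shown in this report.

```shell
# Print units from `juju status` output that are not "active idle".
# Unit rows are recognised by the "/" in their first column (e.g. dex-auth/6).
unhealthy_units() {
  awk 'index($1, "/") > 0 && !($2 == "active" && $3 == "idle") { print $1, $2, $3 }'
}

# Print pods whose STATUS column is neither Running nor Completed.
# Expects columns NAMESPACE NAME READY STATUS RESTARTS AGE, as printed by
# `kubectl get pods -A`; the header row is skipped.
unhealthy_pods() {
  awk '$1 != "NAMESPACE" && NF >= 4 && $4 != "Running" && $4 != "Completed" { print $1, $2, $4 }'
}

# Usage:
#   juju status | unhealthy_units
#   microk8s kubectl get pods -A | unhealthy_pods
```

Checking both views is worthwhile here: in the output above, argo-controller/0 and seldon-controller-manager/0 are reported as active idle by juju while their pods are in CrashLoopBackOff.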