<pending> EXTERNAL-IP when linking the domain to the load balancer

PaulSteffen-betclic commented 3 years ago

I am following the instructions at https://github.com/datarevenue-berlin/OpenMLOps/blob/master/tutorials/set-up-open-source-production-mlops-architecture-aws.md

I reached the step that runs: kubectl get svc -n ambassador

But I got the following output:

NAME              TYPE          CLUSTER-IP     EXTERNAL-IP  PORT(S)            AGE
ambassador        LoadBalancer  172.20.12.231  <pending>    443:30209/TCP      55m
ambassador-admin  ClusterIP     172.20.48.162  <none>       8877/TCP,8005/TCP  55m

pipatth commented 3 years ago

Thank you for reaching out @PaulSteffen-betclic

There are a few possibilities here. Can you please check whether any pods are also pending?

You can check this by running kubectl get pods --all-namespaces. If a few pods are pending, you may not have enough resources (you are probably deploying to a region with more than three availability zones, e.g. us-east-1).

If this is the case, you can:

1. Destroy the cluster by running terraform destroy -var-file=my_vars.tfvars
2. For the repo OpenMLOps-EKS-cluster, check out the branch pipatth/fix-autoscaler and use it instead of master. This branch deploys the cluster to only two availability zones, so you shouldn't have a problem with computing resources.
3. Continue the tutorial from this step: https://github.com/datarevenue-berlin/OpenMLOps/blob/master/tutorials/set-up-open-source-production-mlops-architecture-aws.md#creating-the-terraform-plan
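
With this many pods, scanning the full listing by eye is error-prone. The check for pending pods described above can be sketched as a small filter. This is only an illustration: the function name pods_not_running is hypothetical, and it assumes the default column layout of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Print "namespace/name status" for every pod whose STATUS column is
# neither Running nor Completed. Reads the output of
# `kubectl get pods --all-namespaces` on stdin and skips the header row.
# pods_not_running is a hypothetical helper name, not part of kubectl.
pods_not_running() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Usage (requires access to the cluster):
#   kubectl get pods --all-namespaces | pods_not_running
#
# kubectl can also filter on the pod phase directly:
#   kubectl get pods --all-namespaces --field-selector=status.phase=Pending
```

If the filter prints nothing, no pod is stuck and a shortage of computing resources is less likely to be the cause.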

Let me know if this works out.

PaulSteffen-betclic commented 3 years ago

Thank you for your answer @pipatth.

No other pods are pending.

NAMESPACE    NAME                                                READY  STATUS     RESTARTS  AGE
ambassador   ambassador-55dd56c9d8-n97ff                         1/1    Running    0         8m4s
dask         dask-jupyter-85f4d96c8d-jddz8                       1/1    Running    0         8m29s
dask         dask-scheduler-6b59ddb864-v9drh                     1/1    Running    0         8m28s
dask         dask-worker-cb785d955-fw2p7                         1/1    Running    0         8m28s
dask         dask-worker-cb785d955-j99n7                         1/1    Running    0         8m29s
dask         dask-worker-cb785d955-mvxgd                         1/1    Running    0         8m28s
feast        feast-feast-core-5ff9695c44-z5z8l                   1/1    Running    0         8m15s
feast        feast-feast-jobservice-75d6fd6c58-h4hz6             1/1    Running    0         8m15s
feast        feast-feast-online-serving-5847c98d77-4hnz4         1/1    Running    0         8m16s
feast        feast-postgresql-0                                  1/1    Running    0         8m15s
feast        feast-redis-master-0                                1/1    Running    0         8m15s
feast        feast-redis-slave-0                                 1/1    Running    0         8m15s
feast        feast-redis-slave-1                                 1/1    Running    0         7m4s
feast        feast-spark-spark-operator-85fc4fb995-brxcs         1/1    Running    0         8m19s
jhub         continuous-image-puller-brf7w                       1/1    Running    0         2m8s
jhub         continuous-image-puller-pwmn4                       1/1    Running    0         2m8s
jhub         continuous-image-puller-ztz92                       1/1    Running    0         2m8s
jhub         hub-57fc96d677-95777                                1/1    Running    0         2m8s
jhub         proxy-69896f594b-fwg7f                              1/1    Running    0         2m8s
jhub         user-scheduler-5464f84c96-7ts2b                     1/1    Running    0         2m8s
jhub         user-scheduler-5464f84c96-m6hsv                     1/1    Running    0         2m8s
kube-system  autoscaler-aws-cluster-autoscaler-6fdb5cf446-mzq2f  1/1    Running    0         11m
kube-system  aws-node-6668f                                      1/1    Running    0         10m
kube-system  aws-node-72vgl                                      1/1    Running    0         10m
kube-system  aws-node-mq2cw                                      1/1    Running    0         10m
kube-system  coredns-6ddcfb5bcf-6ft45                            1/1    Running    0         14m
kube-system  coredns-6ddcfb5bcf-mtpfg                            1/1    Running    0         14m
kube-system  kube-proxy-5xwhk                                    1/1    Running    0         10m
kube-system  kube-proxy-dbcsl                                    1/1    Running    0         10m
kube-system  kube-proxy-xvzcx                                    1/1    Running    0         10m
kube-system  metrics-server-c65bf9997-wwsl2                      1/1    Running    0         8m49s
kube-system  seldon-spartakus-volunteer-8488bc5849-ht8cl         1/1    Running    0         8m21s
mlflow       mlflow-595f7556c9-cw9x8                             1/1    Running    0         8m12s
mlflow       postgres-postgresql-0                               1/1    Running    0         8m57s
ory          ory-kratos-5f777789c7-mwfs8                         0/1    Running    0         6m47s
ory          ory-kratos-courier-0                                1/1    Running    0         6m47s
ory          ory-kratos-ui-5857cc6d9b-kwdmt                      1/1    Running    0         100s
ory          ory-oathkeeper-6bf994cf97-b6n2b                     1/1    Running    0         8m30s
ory          postgres-postgresql-0                               1/1    Running    0         8m19s
prefect      prefect-server-agent-784d877787-wgbzz               1/1    Running    1         8m16s
prefect      prefect-server-apollo-57fc96dfcd-22hnx              1/1    Running    0         8m16s
prefect      prefect-server-create-tenant-job-6cwnl              0/1    Completed  2         8m15s
prefect      prefect-server-graphql-96fb476c4-b7k4d              1/1    Running    0         8m16s
prefect      prefect-server-hasura-5d45596fd-pdwl7               1/1    Running    3         8m16s
prefect      prefect-server-postgresql-0                         1/1    Running    0         8m15s
prefect      prefect-server-towel-686fd94f7f-2hppd               1/1    Running    0         8m16s
prefect      prefect-server-ui-575b447fbc-dvz5w                  1/1    Running    0         8m16s
seldon       seldon-controller-manager-6b6c65f4c4-b2dr4          1/1    Running    0         8m21s

I changed the region to eu-west-2, as in your tutorial, destroyed the cluster, and checked out the branch pipatth/fix-autoscaler before redoing the tutorial.

But the issue persists.

pipatth commented 3 years ago

@PaulSteffen-betclic Can you please send me the log from that pod?

i.e. kubectl logs -n ambassador ambassador-55dd56c9d8-n97ff
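
Alongside the pod log, the events recorded on the Service itself may help, since that is typically where a failure to provision a cloud load balancer is reported. A rough sketch, assuming the default column layout of kubectl get svc; the helper name pending_load_balancers is hypothetical:

```shell
# Print the names of LoadBalancer services that have no external address yet.
# Reads the output of `kubectl get svc -n <namespace>` on stdin.
# pending_load_balancers is a hypothetical helper name, not part of kubectl.
pending_load_balancers() {
  awk 'NR > 1 && $2 == "LoadBalancer" && ($4 == "<pending>" || $4 == "pending") { print $1 }'
}

# Usage (requires access to the cluster):
#   kubectl get svc -n ambassador | pending_load_balancers
#
# For each service it prints, inspect the Events section at the end of:
#   kubectl describe svc ambassador -n ambassador
```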

datarevenue-berlin / OpenMLOps

<pending> EXTERNAL-IP when linking the domain to the load balancer #73