bentoml / yatai-deployment

🚀 Launching Bento in a Kubernetes cluster

Failed to reconcile BentoDeployment #57

Closed aditya624 closed 1 year ago

aditya624 commented 1 year ago

Failed to reconcile BentoDeployment: DoJsonRequest Error: [GET]http://yatai.yatai-system.svc.cluster.local/api/v1/current_org: Get "http://yatai.yatai-system.svc.cluster.local/api/v1/current_org": read tcp 10.244.2.243:46536->10.97.35.56:80: read: connection reset by peer

Why does the deployment always fail, whether I use the UI or the CLI?

yetone commented 1 year ago

First, you should follow the documentation to check if the yatai pod is working properly:

https://docs.bentoml.org/projects/yatai/en/latest/installation/yatai.html#verify-the-yatai-installation

If it works properly, then check the network of your k8s cluster:

kubectl run test-network --rm --tty -i --restart='Never' \
    --namespace default \
    --image curlimages/curl \
    --command -- sh -c 'curl http://yatai.yatai-system.svc.cluster.local/api/v1/info'

The above command should return:

{"is_saas":false,"saas_domain_suffix":""}pod "test-network" deleted

aditya624 commented 1 year ago

I verified the installation:

NAME                     READY   STATUS    RESTARTS   AGE
yatai-56d9c649b5-5wpsl   1/1     Running   0          2d10h

But when I run the test-network command, I get Error: ImagePullBackOff.

yetone commented 1 year ago

@aditya624 which k8s distribution do you use? k3s?

aditya624 commented 1 year ago

k8s

The test result looks like this:

curl: (28) Failed to connect to yatai.yatai-system.svc.cluster.local port 80 after 129553 ms: Connection timed out
pod "test-network" deleted
pod default/test-network terminated (Error)

yetone commented 1 year ago

@aditya624 can you create an example service for testing?

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app.kubernetes.io/name: proxy
spec:
  containers:
  - name: nginx
    image: nginx:stable
    ports:
      - containerPort: 80
        name: http-web-svc

---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app.kubernetes.io/name: proxy
  ports:
  - name: name-of-service-port
    protocol: TCP
    port: 80
    targetPort: http-web-svc
EOF

And then test the network connection again:

kubectl run test-network --rm --tty -i --restart='Never' \
    --namespace default \
    --image curlimages/curl \
    --command -- sh -c 'curl http://nginx-service.default.svc.cluster.local'

The expected output should be:

[image]

aditya624 commented 1 year ago

I get output like this:

curl: (6) Could not resolve host: nginx-service.default.svc.cluster.local
pod "test-network" deleted
pod default/test-network terminated (Error)

yetone commented 1 year ago

@aditya624 Can you see the nginx-service service in default namespace?

kubectl get svc

This is my output:

[image]

aditya624 commented 1 year ago

The nginx-service svc already exists. Is there something wrong with the Kubernetes network?

[image: Screenshot from 2022-11-30 22-32-48]

yetone commented 1 year ago

@aditya624

It looks like your kube-dns is not working properly. You should check the `kube-dns` component of your k8s cluster.

For example, you can try to restart the coredns deployment:

kubectl -n kube-system rollout restart deploy/coredns
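
Before restarting anything, it can help to look at the DNS add-on first. The following is a minimal sketch, assuming the DNS pods carry the `k8s-app=kube-dns` label and the Service is named `kube-dns`, as in the official Kubernetes DNS debugging guide. `check_cluster_dns` is a hypothetical helper name, not part of kubectl or Yatai.

```shell
# Inspect the cluster DNS add-on.
# Assumption: DNS pods are labeled k8s-app=kube-dns and the Service is
# called kube-dns; adjust both if your cluster is set up differently.
check_cluster_dns() {
  # Are the DNS pods running and ready?
  kubectl -n kube-system get pods -l k8s-app=kube-dns
  # Does the DNS Service exist and have a cluster IP?
  kubectl -n kube-system get svc kube-dns
  # Any recent errors in the DNS pods' logs?
  kubectl -n kube-system logs -l k8s-app=kube-dns --tail=20
}

# Usage (needs kubectl access to the cluster):
#   check_cluster_dns
```
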
aditya624 commented 1 year ago

@yetone I have a question. Can I change the internal URL?

[image: Screenshot from 2022-12-01 11-01-39]

yetone commented 1 year ago

@aditya624 Since this URL is based on deployment service name and namespace, this URL cannot be changed:

${deploymentName}.${namespace}.svc.cluster.local
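
To illustrate how that name is put together, here is a small shell sketch. `svc_fqdn` is a hypothetical helper written for this example, and the optional third argument assumes a cluster whose domain is something other than the default `cluster.local`.

```shell
# Build the in-cluster DNS name of a Service from its name and namespace.
# The third argument is optional and defaults to "cluster.local".
svc_fqdn() {
  name="$1"
  namespace="$2"
  cluster_domain="${3:-cluster.local}"
  printf '%s.%s.svc.%s\n' "$name" "$namespace" "$cluster_domain"
}

# prints yatai.yatai-system.svc.cluster.local
svc_fqdn yatai yatai-system
```
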

aditya624 commented 1 year ago

First, you should follow the documentation to check if the yatai pod is working properly:

https://docs.bentoml.org/projects/yatai/en/latest/installation/yatai.html#verify-the-yatai-installation

If it works properly, then check the network of your k8s cluster:

kubectl run test-network --rm --tty -i --restart='Never' \
    --namespace default \
    --image curlimages/curl \
    --command -- sh -c 'curl http://yatai.yatai-system.svc.cluster.local/api/v1/current_org'

The above command should return:

{"message":"username in cookie is empty"}pod "test-network" deleted

Your command fails for me because it uses the svc.cluster.local suffix. However, it works when I leave the suffix off.

Do you know where the cluster domain is configured in k8s? I think curl failed because my cluster's domain doesn't match svc.cluster.local.

yetone commented 1 year ago

@aditya624 cluster.local is the default cluster domain for k8s, so Service names end in svc.cluster.local by default. If that suffix does not work or has been changed, you should talk to your k8s cluster administrator. If you are the k8s cluster administrator, you should debug the cluster DNS following the official documentation: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
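
One way to see which cluster domain is actually in effect is to read the DNS search list that the kubelet writes into a pod's /etc/resolv.conf. The sketch below is only an illustration: `cluster_domain_from_resolv` is a hypothetical helper, and the `busybox` image and `dns-check` pod name in the usage comment are assumptions.

```shell
# Extract the cluster domain from resolv.conf content read on stdin.
# It looks on the "search" line for an entry of the form "svc.<cluster-domain>"
# and prints the part after "svc.".
cluster_domain_from_resolv() {
  awk '$1 == "search" {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^svc\./) { sub(/^svc\./, "", $i); print $i; exit }
  }'
}

# Usage against a live cluster (needs kubectl access):
#   kubectl run dns-check --rm -i --restart='Never' --image busybox \
#       --command -- cat /etc/resolv.conf | cluster_domain_from_resolv
```

If the printed value is not `cluster.local`, the fully qualified Service names used earlier in this thread need that value in place of `cluster.local`.
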

aditya624 commented 1 year ago

This problem is solved. I just changed the curl image. I am now using buildpack-deps:curl.
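
For anyone hitting the same thing, the earlier test command with the image swapped looks roughly like this. It is a sketch: `test_network` is a hypothetical wrapper, and it runs curl directly instead of through `sh -c`. Why the other image resolves the name is not confirmed in this thread; one unverified guess is that the two images ship different C library DNS resolvers.

```shell
# Run a throwaway pod that curls a URL from inside the cluster.
# Arguments: the container image to run, then the URL to fetch.
test_network() {
  image="$1"
  url="$2"
  kubectl run test-network --rm --tty -i --restart='Never' \
      --namespace default \
      --image "$image" \
      --command -- curl "$url"
}

# Usage (needs kubectl access to the cluster):
#   test_network buildpack-deps:curl http://yatai.yatai-system.svc.cluster.local/api/v1/info
```
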

Thank you.