c0c0n3 / kitt4sme.live

On a mission to bring AI to the shop floor: https://kitt4sme.eu/
MIT License

Istiod timeout when pursuing local deployment of kitt4sme.live cluster #242

Open elisabet-cg opened 1 year ago

elisabet-cg commented 1 year ago

Describe the bug

Following the Bootstrap cluster procedure, we provisioned a Multipass virtual machine running Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-42-generic x86_64) with 8 CPUs, 16GB RAM, 4GB swap, and 120GB storage. Installing the Istio mesh infra (istioctl install -y --verify -f profile.yaml) fails with the error shown under Additional context.
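The setup described above could be reproduced roughly as follows. This is only a sketch: the VM name `kitt4sme` is an assumption taken from the node name in the pod description, `MULTIPASS` and `ISTIOCTL` are hypothetical override variables, the 4GB swap is not configured here, and some older Multipass releases spell the memory flag `--mem` instead of `--memory`.

```shell
# Sketch only: approximate the environment from this report.
# MULTIPASS and ISTIOCTL are hypothetical override variables (handy for dry runs).
MULTIPASS="${MULTIPASS:-multipass}"
ISTIOCTL="${ISTIOCTL:-istioctl}"

provision_vm() {
  # 8 CPUs, 16 GB RAM, 120 GB disk, Ubuntu 20.04 image; the VM name is assumed.
  $MULTIPASS launch --name kitt4sme --cpus 8 --memory 16G --disk 120G 20.04
}

install_istio() {
  # Run inside the VM, from the directory holding profile.yaml, once the
  # Kubernetes cluster from the bootstrap procedure is up.
  $ISTIOCTL install -y --verify -f profile.yaml
}
```

Setting `MULTIPASS=echo` or `ISTIOCTL=echo` before calling either function prints the command instead of running it.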

Additional context

We are attempting a local deployment of the kitt4sme.live instance on our own machine, following the instructions of the Bootstrap cluster procedure. We provisioned a Multipass virtual machine running Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-42-generic x86_64) with 8 CPUs, 16GB RAM, 4GB swap, and 120GB storage. When installing the Istio mesh infra (istioctl install -y --verify -f profile.yaml), we get the following error:

✔ Istio core installed                                                                                    
✘ Istiod encountered an error: failed to wait for resource: resources not ready after 5m0s: timed out waiting for the condition
  Deployment/istio-system/istiod (container failed to start: CrashLoopBackOff: back-off 2m40s restarting failed container=discovery pod=istiod-5847c59c69-clbz4_istio-system(c989f690-8a80-4135-be38-944f8141c7d5))'

When debugging, describing the failing istiod pod gives the following information:

Name:         istiod-5847c59c69-clbz4
Namespace:    istio-system
Priority:     0
Node:         kitt4sme/192.168.64.2
Start Time:   Thu, 27 Apr 2023 08:37:44 +0200
Labels:       app=istiod
              install.operator.istio.io/owning-resource=unknown
              istio=pilot
              istio.io/rev=default
              operator.istio.io/component=Pilot
              pod-template-hash=5847c59c69
              sidecar.istio.io/inject=false
Annotations:  cni.projectcalico.org/podIP: 10.1.124.236/32
              cni.projectcalico.org/podIPs: 10.1.124.236/32
              prometheus.io/port: 15014
              prometheus.io/scrape: true
              sidecar.istio.io/inject: false
Status:       Running
IP:           10.1.124.236
IPs:
  IP:           10.1.124.236
Controlled By:  ReplicaSet/istiod-5847c59c69
Containers:
  discovery:
    Container ID:  containerd://a44af1dbb4b2382fea77435670965204e2e6693169e3219318e1cd3f902cbc6c
    Image:         docker.io/istio/pilot:1.11.4
    Image ID:      docker.io/istio/pilot@sha256:c590783fc54aec5d3edb44e3f588be5431db9f0844d44c4314a896728cdbbf77
    Ports:         8080/TCP, 15010/TCP, 15017/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      discovery
      --monitoringAddr=:15014
      --log_output_level=default:info
      --domain
      cluster.local
      --keepaliveMaxServerConnectionAge
      30m
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 27 Apr 2023 08:43:21 +0200
      Finished:     Thu, 27 Apr 2023 08:43:21 +0200
    Ready:          False
    Restart Count:  6
    Requests:
      cpu:      10m
      memory:   100Mi
    Readiness:  http-get http://:8080/ready delay=1s timeout=5s period=3s #success=1 #failure=3
    Environment:
      REVISION:                                     default
      JWT_POLICY:                                   third-party-jwt
      PILOT_CERT_PROVIDER:                          istiod
      POD_NAME:                                     istiod-5847c59c69-clbz4 (v1:metadata.name)
      POD_NAMESPACE:                                istio-system (v1:metadata.namespace)
      SERVICE_ACCOUNT:                               (v1:spec.serviceAccountName)
      KUBECONFIG:                                   /var/run/secrets/remote/config
      ENABLE_LEGACY_FSGROUP_INJECTION:              false
      PILOT_TRACE_SAMPLING:                         100
      PILOT_ENABLE_PROTOCOL_SNIFFING_FOR_OUTBOUND:  true
      PILOT_ENABLE_PROTOCOL_SNIFFING_FOR_INBOUND:   true
      ISTIOD_ADDR:                                  istiod.istio-system.svc:15012
      PILOT_ENABLE_ANALYSIS:                        false
      CLUSTER_ID:                                   Kubernetes
    Mounts:
      /etc/cacerts from cacerts (ro)
      /var/run/secrets/istio-dns from local-certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vl895 (ro)
      /var/run/secrets/remote from istio-kubeconfig (ro)
      /var/run/secrets/tokens from istio-token (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  local-certs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  istio-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  43200
  cacerts:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cacerts
    Optional:    true
  istio-kubeconfig:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  istio-kubeconfig
    Optional:    true
  kube-api-access-vl895:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                     From               Message
  ----     ------     ----                    ----               -------
  Normal   Scheduled  8m31s                   default-scheduler  Successfully assigned istio-system/istiod-5847c59c69-clbz4 to kitt4sme
  Normal   Pulled     7m9s (x5 over 8m31s)    kubelet            Container image "docker.io/istio/pilot:1.11.4" already present on machine
  Normal   Created    7m9s (x5 over 8m31s)    kubelet            Created container discovery
  Normal   Started    7m9s (x5 over 8m31s)    kubelet            Started container discovery
  Warning  BackOff    3m17s (x30 over 8m29s)  kubelet            Back-off restarting failed container
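
The description above shows the discovery container exiting with code 1 within the same second it starts, but not the reason. A minimal sketch of commands that might surface the cause, assuming `kubectl` can reach this cluster; `KUBECTL` and `NS` are hypothetical convenience variables, not part of the kitt4sme.live tooling:

```shell
# Sketch only: gather clues about why the istiod "discovery" container exits.
# KUBECTL is left unquoted on purpose so a multi-word value such as a
# distribution-specific kubectl wrapper can be used.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-istio-system}"

istiod_diagnostics() {
  pod=$1   # e.g. istiod-5847c59c69-clbz4

  # Output of the previous (crashed) run; the Events table does not include it.
  $KUBECTL logs "$pod" -n "$NS" -c discovery --previous

  # Events in the namespace, sorted by last timestamp.
  $KUBECTL get events -n "$NS" --sort-by=.lastTimestamp

  # Node conditions (memory, disk, PID pressure) and allocated resources.
  $KUBECTL describe nodes
}
```

For the pod in this report the call would be `istiod_diagnostics istiod-5847c59c69-clbz4`. The first command is the most likely to help, since a process that exits with code 1 right after starting usually writes its error to the container log.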

Do you have any hints on how we could resolve this issue? We have tried installing several compatible versions of Istio (1.11.4, 1.12, 1.13.3, and 1.17), and none of them worked. We also increased the RAM to 32GB and the CPU count to 10, but that did not help either. We have followed all the solutions proposed here, with no results.

RyanKelvinFord commented 1 year ago

Is this still an issue?

kostasgrevenitis commented 1 year ago

A similar issue is discussed here: https://discuss.istio.io/t/istiod-installation-fails/7093/14