giantswarm / roadmap

Giant Swarm Product Roadmap
https://github.com/orgs/giantswarm/projects/273

PVC not provisioned on `goat` #1687

Closed · alex-dabija closed this issue 1 year ago

alex-dabija commented 1 year ago

Issue

The PVC is not provisioned on `goat`, so the Prometheus pod is stuck in `Pending`.

Example:

❯ kubectl -n goat-prometheus describe pod prometheus-goat-0
Name:                 prometheus-goat-0
Namespace:            goat-prometheus
Priority:             500000000
Priority Class Name:  prometheus
Service Account:      default
Node:                 <none>
Labels:               app.kubernetes.io/instance=goat
                      app.kubernetes.io/managed-by=prometheus-operator
                      app.kubernetes.io/name=prometheus
                      app.kubernetes.io/version=2.32.1
                      controller-revision-hash=prometheus-goat-58d57f7f6d
                      giantswarm.io/cluster=goat
                      giantswarm.io/monitoring=true
                      operator.prometheus.io/name=goat
                      operator.prometheus.io/shard=0
                      prometheus=goat
                      statefulset.kubernetes.io/pod-name=prometheus-goat-0
Annotations:          kubectl.kubernetes.io/default-container: prometheus
                      kubernetes.io/psp: prometheus-psp
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        StatefulSet/prometheus-goat
Init Containers:
  init-config-reloader:
    Image:      docker.io/giantswarm/prometheus-config-reloader:v0.54.0
    Port:       8080/TCP
    Host Port:  0/TCP
    Command:
      /bin/prometheus-config-reloader
    Args:
      --watch-interval=0
      --listen-address=:8080
      --config-file=/etc/prometheus/config/prometheus.yaml.gz
      --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
      --watched-dir=/etc/prometheus/rules/prometheus-goat-rulefiles-0
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_NAME:  prometheus-goat-0 (v1:metadata.name)
      SHARD:     0
    Mounts:
      /etc/prometheus/config from config (rw)
      /etc/prometheus/config_out from config-out (rw)
      /etc/prometheus/rules/prometheus-goat-rulefiles-0 from prometheus-goat-rulefiles-0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-q2b94 (ro)
Containers:
  prometheus:
    Image:      docker.io/giantswarm/prometheus:v2.39.1
    Port:       9090/TCP
    Host Port:  0/TCP
    Args:
      --web.console.templates=/etc/prometheus/consoles
      --web.console.libraries=/etc/prometheus/console_libraries
      --storage.tsdb.retention.size=90GB
      --config.file=/etc/prometheus/config_out/prometheus.env.yaml
      --storage.tsdb.path=/prometheus
      --storage.tsdb.retention.time=2w
      --web.enable-lifecycle
      --web.page-title=goat/goat Prometheus
      --enable-feature=remote-write-receiver
      --web.external-url=https://prometheus.goat.gaws.gigantic.io/goat
      --web.route-prefix=/goat
      --storage.tsdb.wal-compression
      --web.config.file=/etc/prometheus/web_config/web-config.yaml
    Limits:
      cpu:     150m
      memory:  1288490188
    Requests:
      cpu:      100m
      memory:   1073741824
    Readiness:  http-get http://:web/goat/-/ready delay=300s timeout=3s period=5s #success=1 #failure=3
    Startup:    http-get http://:web/goat/-/ready delay=0s timeout=3s period=15s #success=1 #failure=60
    Environment:
      HTTP_PROXY:   http://proxy.goatproxy.gaws.gigantic.io:4000
      http_proxy:   http://proxy.goatproxy.gaws.gigantic.io:4000
      HTTPS_PROXY:  http://proxy.goatproxy.gaws.gigantic.io:4000
      https_proxy:  http://proxy.goatproxy.gaws.gigantic.io:4000
      NO_PROXY:     10.224.0.0/16,100.64.0.0/12,172.31.0.0/16,internal-goat-apiserver-1888485199.eu-north-1.elb.amazonaws.com,svc,127.0.0.1,localhost,no-domain.com
      no_proxy:     10.224.0.0/16,100.64.0.0/12,172.31.0.0/16,internal-goat-apiserver-1888485199.eu-north-1.elb.amazonaws.com,svc,127.0.0.1,localhost,no-domain.com
    Mounts:
      /etc/prometheus/certs from tls-assets (ro)
      /etc/prometheus/config_out from config-out (ro)
      /etc/prometheus/rules/prometheus-goat-rulefiles-0 from prometheus-goat-rulefiles-0 (rw)
      /etc/prometheus/secrets/etcd-certificates from secret-etcd-certificates (ro)
      /etc/prometheus/web_config/web-config.yaml from web-config (ro,path="web-config.yaml")
      /prometheus from prometheus-goat-db (rw,path="prometheus-db")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-q2b94 (ro)
  config-reloader:
    Image:      docker.io/giantswarm/prometheus-config-reloader:v0.54.0
    Port:       8080/TCP
    Host Port:  0/TCP
    Command:
      /bin/prometheus-config-reloader
    Args:
      --listen-address=:8080
      --reload-url=http://127.0.0.1:9090/goat/-/reload
      --config-file=/etc/prometheus/config/prometheus.yaml.gz
      --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
      --watched-dir=/etc/prometheus/rules/prometheus-goat-rulefiles-0
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      HTTP_PROXY:   http://proxy.goatproxy.gaws.gigantic.io:4000
      http_proxy:   http://proxy.goatproxy.gaws.gigantic.io:4000
      HTTPS_PROXY:  http://proxy.goatproxy.gaws.gigantic.io:4000
      https_proxy:  http://proxy.goatproxy.gaws.gigantic.io:4000
      NO_PROXY:     10.224.0.0/16,100.64.0.0/12,172.31.0.0/16,internal-goat-apiserver-1888485199.eu-north-1.elb.amazonaws.com,svc,127.0.0.1,localhost,no-domain.com
      no_proxy:     10.224.0.0/16,100.64.0.0/12,172.31.0.0/16,internal-goat-apiserver-1888485199.eu-north-1.elb.amazonaws.com,svc,127.0.0.1,localhost,no-domain.com
      POD_NAME:     prometheus-goat-0 (v1:metadata.name)
      SHARD:        0
    Mounts:
      /etc/prometheus/config from config (rw)
      /etc/prometheus/config_out from config-out (rw)
      /etc/prometheus/rules/prometheus-goat-rulefiles-0 from prometheus-goat-rulefiles-0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-q2b94 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  prometheus-goat-db:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  prometheus-goat-db-prometheus-goat-0
    ReadOnly:   false
  config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-goat
    Optional:    false
  tls-assets:
    Type:                Projected (a volume that contains injected data from multiple sources)
    SecretName:          prometheus-goat-tls-assets-0
    SecretOptionalName:  <nil>
  config-out:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  prometheus-goat-rulefiles-0:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      prometheus-goat-rulefiles-0
    Optional:  false
  web-config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-goat-web-config
    Optional:    false
  secret-etcd-certificates:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  etcd-certificates
    Optional:    false
  kube-api-access-q2b94:
    Type:                     Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:   3607
    ConfigMapName:            kube-root-ca.crt
    ConfigMapOptional:        <nil>
    DownwardAPI:              true
QoS Class:                    Burstable
Node-Selectors:               <none>
Tolerations:                  node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  kubernetes.io/hostname:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/name=prometheus
Events:
  Type     Reason            Age                     From               Message
  ----     ------            ----                    ----               -------
  Warning  FailedScheduling  13s (x3888 over 2d17h)  default-scheduler  0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
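
The scheduler event points at the claim rather than the pod: `prometheus-goat-db-prometheus-goat-0` never binds, so the pod cannot be placed on any of the 6 nodes. One way to narrow down where binding stalls is to look at the claim itself and at the storage classes available on the cluster (claim name and namespace taken from the output above):

❯ kubectl -n goat-prometheus get pvc prometheus-goat-db-prometheus-goat-0
❯ kubectl -n goat-prometheus describe pvc prometheus-goat-db-prometheus-goat-0
❯ kubectl get storageclass

The events on the PVC itself usually say whether a provisioner picked the claim up at all or whether it is still waiting for a volume to be created.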

Questions

tuladhar commented 1 year ago

Why does this occur?

What can be done to fix the problem?
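
The `unbound immediate PersistentVolumeClaims` message generally means the claim is still `Pending` at scheduling time: with an `Immediate` volume binding mode the PV has to exist and bind before the pod can be placed, and that never happens if no StorageClass or CSI provisioner serves the claim. Whether that is what is happening on `goat` is an assumption, not something confirmed in this thread; a sketch of the checks, with `gp3` as a purely hypothetical class name:

❯ kubectl -n goat-prometheus get pvc prometheus-goat-db-prometheus-goat-0 -o jsonpath='{.spec.storageClassName}'
❯ kubectl get storageclass
❯ kubectl patch storageclass gp3 -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

If the claim references a class that does not exist, or it has no class set and none is marked as default, the last command (pointed at a class that actually exists on the cluster) is one possible fix; if the class exists but its provisioner is not running, the fix lies on the CSI driver side instead.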