kubecost / features-bugs

A public repository for filing Kubecost feature requests and bugs. Please read the issue guidelines before filing an issue here.

Kubecost Prometheus server pod stuck in ContainerCreating state #50

Closed nidiculageorge closed 8 months ago

nidiculageorge commented 9 months ago

Kubecost Version

2.0.2

Kubernetes Version

1.27.3

Kubernetes Platform

AKS

Description

Hi Team,

I have deployed Kubecost using the following command, as per the install instructions at https://www.kubecost.com/install#show-instructions:

helm install kubecost cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --namespace kubecost --create-namespace --set kubecostToken="bmlkaWN1bGFnZW9yZ2VAaW4ucm0uY29txm343yadf98" --set nodeSelector."kubernetes.io/os"=linux

After deploying Kubecost, I can see the Prometheus server pod is stuck in the ContainerCreating state.

(screenshot)

Steps to reproduce

  1. Install Kubecost as described at https://www.kubecost.com/install#show-instructions:

helm install kubecost cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --namespace kubecost --create-namespace --set kubecostToken="bmlkaWN1bGFnZW9yZ2VAaW4ucm0uY29txm343yadf98" --set nodeSelector."kubernetes.io/os"=linux

Expected behavior

The Kubecost Prometheus server pod should be in the Running state.

Impact

Unable to load the Grafana dashboard.

Screenshots

(screenshot)

Logs

PS C:\Users\nidicula\OneDrive - RM PLC\PlatformandEngineering> kubectl describe pod kubecost-prometheus-server-657fb89fdc-5qzp6 -n kubecost    
Name:             kubecost-prometheus-server-657fb89fdc-5qzp6
Namespace:        kubecost
Priority:         0
Service Account:  kubecost-prometheus-server
Node:             aksnpwin00002s/172.28.194.101
Start Time:       Tue, 06 Feb 2024 11:30:59 +0530
Labels:           app=prometheus
                  component=server
                  heritage=Helm
                  pod-template-hash=657fb89fdc
                  release=kubecost
Annotations:      <none>
Status:           Pending
SeccompProfile:   RuntimeDefault
IP:
IPs:              <none>
Controlled By:    ReplicaSet/kubecost-prometheus-server-657fb89fdc
Containers:
  prometheus-server:
    Container ID:
    Image:         quay.io/prometheus/prometheus:v2.49.1
    Image ID:
    Port:          9090/TCP
    Host Port:     0/TCP
    Args:
      --storage.tsdb.retention.time=15d
      --config.file=/etc/config/prometheus.yml
      --storage.tsdb.path=/data
      --web.console.libraries=/etc/prometheus/console_libraries
      --web.console.templates=/etc/prometheus/consoles
      --web.enable-lifecycle
      --query.max-concurrency=1
      --query.max-samples=1e+08
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:9090/-/healthy delay=30s timeout=30s period=10s #success=1 #failure=3
    Readiness:      http-get http://:9090/-/ready delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /data from storage-volume (rw)
      /etc/config from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tfnxm (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kubecost-prometheus-server
    Optional:  false
  storage-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  kubecost-prometheus-server
    ReadOnly:   false
  kube-api-access-tfnxm:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                   From               Message
  ----     ------       ----                  ----               -------
  Normal   Scheduled    58m                   default-scheduler  Successfully assigned kubecost/kubecost-prometheus-server-657fb89fdc-5qzp6 to aksnpwin00002s
  Warning  FailedMount  58m                   kubelet            MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_00_59.2247055574\token: not supported by windows
  Warning  FailedMount  58m                   kubelet            MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_01_00.4030118981\token: not supported by windows
  Warning  FailedMount  58m                   kubelet            MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_01_01.2512958877\token: not supported by windows
  Warning  FailedMount  58m                   kubelet            MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_01_03.2033108737\token: not supported by windows
  Warning  FailedMount  58m                   kubelet            MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_01_07.3884509833\token: not supported by windows
  Warning  FailedMount  58m                   kubelet            MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_01_15.1952477052\token: not supported by windows
  Warning  FailedMount  58m                   kubelet            MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_01_32.795173360\token: not supported by windows
  Warning  FailedMount  57m                   kubelet            MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_02_04.2687926109\token: not supported by windows
  Warning  FailedMount  13m (x18 over 56m)    kubelet            Unable to attach or mount volumes: unmounted volumes=[kube-api-access-tfnxm], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
  Warning  FailedMount  3m43s (x30 over 56m)  kubelet            (combined from similar events): MountVolume.SetUp failed for volume "kube-api-access-tfnxm" : chown c:\var\lib\kubelet\pods\a1e95dac-07cb-4fa3-9be4-940f6d25f1a7\volumes\kubernetes.io~projected\kube-api-access-tfnxm\..2024_02_06_06_56_03.814271718\token: not supported by windows
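The "chown ... not supported by windows" events above indicate the pod was scheduled onto a Windows node (aksnpwin00002s), where the projected service-account token volume cannot be set up. One possible mitigation is to pin the bundled Prometheus server to Linux nodes via a Helm values file. This is only a sketch: the `prometheus.server.nodeSelector` value path is an assumption based on the upstream Prometheus chart layout, and the file name is arbitrary; verify against your chart version (Kubecost's own `values-windows-node-affinity.yaml`, referenced later in this thread, is the documented approach).

```shell
# Sketch (assumption: the chart exposes prometheus.server.nodeSelector):
# write a values file that pins the Prometheus server to Linux nodes.
cat > linux-only-values.yaml <<'EOF'
prometheus:
  server:
    nodeSelector:
      kubernetes.io/os: linux
EOF

# Apply it on top of the existing release (requires cluster access):
# helm upgrade kubecost cost-analyzer \
#   --repo https://kubecost.github.io/cost-analyzer/ \
#   --namespace kubecost -f linux-only-values.yaml

grep -q 'kubernetes.io/os: linux' linux-only-values.yaml && echo "values file written"
```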

Slack discussion

No response

Troubleshooting

nidiculageorge commented 9 months ago

Hi Team,

Just an update: I used the command below, as per the following link.

https://docs.kubecost.com/install-and-configure/advanced-configuration/windows-node-support

helm upgrade kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer --namespace kubecost --create-namespace -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml

Now the containers are in the following state

(screenshot)

When I describe the Prometheus pod:

PS C:\Users\nidicula\OneDrive - RM PLC\PlatformandEngineering\Work\KubecostSetup> kubectl describe pod kubecost-prometheus-server-54fd884d8f-spvpl -n kubecost
Name:             kubecost-prometheus-server-54fd884d8f-spvpl
Namespace:        kubecost
Priority:         0
Service Account:  kubecost-prometheus-server
Node:             aks-linux8core-34370389-vmss000474/172.28.194.90
Start Time:       Tue, 06 Feb 2024 12:40:53 +0530
Labels:           app=prometheus
                  component=server
                  heritage=Helm
                  pod-template-hash=54fd884d8f
                  release=kubecost
Annotations:      <none>
Status:           Pending
SeccompProfile:   RuntimeDefault
IP:
IPs:              <none>
Controlled By:    ReplicaSet/kubecost-prometheus-server-54fd884d8f
Containers:
  prometheus-server:
    Container ID:
    Image:         quay.io/prometheus/prometheus:v2.49.1
    Image ID:
    Port:          9090/TCP
    Host Port:     0/TCP
    Args:
      --storage.tsdb.retention.time=15d
      --config.file=/etc/config/prometheus.yml
      --storage.tsdb.path=/data
      --web.console.libraries=/etc/prometheus/console_libraries
      --web.console.templates=/etc/prometheus/consoles
      --web.enable-lifecycle
      --query.max-concurrency=1
      --query.max-samples=1e+08
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:9090/-/healthy delay=30s timeout=30s period=10s #success=1 #failure=3
    Readiness:      http-get http://:9090/-/ready delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /data from storage-volume (rw)
      /etc/config from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-b2g84 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kubecost-prometheus-server
    Optional:  false
  storage-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  kubecost-prometheus-server
    ReadOnly:   false
  kube-api-access-b2g84:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               6m15s                 default-scheduler        Successfully assigned kubecost/kubecost-prometheus-server-54fd884d8f-spvpl to aks-linux8core-34370389-vmss000474
  Warning  FailedAttachVolume      6m15s                 attachdetach-controller  Multi-Attach error for volume "pvc-6327d72e-0a32-449b-88c9-1262223ee313" Volume is already exclusively attached to one node and can't be attached to another
  Normal   SuccessfulAttachVolume  5m49s                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-6327d72e-0a32-449b-88c9-1262223ee313"
  Warning  FailedMount             116s (x2 over 4m12s)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[storage-volume], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
  Warning  FailedMount             82s (x10 over 5m47s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-6327d72e-0a32-449b-88c9-1262223ee313" : rpc error: code = Internal desc = could not format /dev/disk/azure/scsi1/lun0(lun: 0), and mount it at /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/2b1cb69d9a695d1d12e8b8e0b51c541acd66d4c6fee1e97197a576c86589ab87/globalmount, failed with mount failed: exit status 32 Mounting command: mount Mounting arguments: -t ext4 -o defaults /dev/disk/azure/scsi1/lun0 /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/2b1cb69d9a695d1d12e8b8e0b51c541acd66d4c6fee1e97197a576c86589ab87/globalmount Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/2b1cb69d9a695d1d12e8b8e0b51c541acd66d4c6fee1e97197a576c86589ab87/globalmount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error.

dwbrown2 commented 9 months ago

Not an expert here, @AjayTripathy @thomasvn may know more, but @nidiculageorge is this potentially the issue? https://stackoverflow.com/questions/70945223/kubernetes-multi-attach-error-for-volume-pvc-volume-is-already-exclusively-att

One quick test would be to disable the Prom PV to make sure everything deploys correctly.

AjayTripathy commented 9 months ago

I don't think this is a multi-attach error. We have had some issues with Azure CSI filesystems in the past: see https://github.com/kubecost/docs/pull/697/files

@nidiculageorge would it be possible to try a different storageclass?

nidiculageorge commented 8 months ago

@AjayTripathy thanks for the response. Could you please clarify how I can try a different storage class?

I was using the below command to install the application:

helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer --namespace kubecost --create-namespace -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml --set kubecostToken="xxxx"

AjayTripathy commented 8 months ago

The storage class can be set here: https://github.com/kubecost/cost-analyzer-helm-chart/blob/cdcb75e3fbf473b1e33b670480b35df52ab4f2f3/cost-analyzer/templates/prometheus-server-pvc.yaml#L20

nidiculageorge commented 8 months ago

@AjayTripathy

Thanks for the response. I am running the below command, as mentioned earlier:

helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer --namespace kubecost --create-namespace -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml --set kubecostToken="xxxx"

How can I pass the Prometheus configuration in the above command?

thomasvn commented 8 months ago

@nidiculageorge To pass the configuration in, you will need to create a Helm values.yaml file (docs ref). For example, it may look something like this:

# values.yaml
prometheus:
  server:
    persistentVolume:
      storageClass: YOUR_STORAGE_CLASS_NAME_HERE
# Use this command to pass the `values.yaml` file created above
helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="xxxx" \
  -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml \
  -f values.yaml
nidiculageorge commented 8 months ago

@thomasvn I have uninstalled Kubecost and redeployed using the same command:

helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="xxxx" \
  -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml

Now I can see the following pods running, including Prometheus, but some Prometheus node exporter pods are in the Pending state.

(screenshot)

Queries :

  1. Why is the Prometheus pod now running with the same helm command I used previously, when before it was stuck in the ContainerCreating state and then went into a CrashLoopBackOff state?
  2. The Prometheus node exporter pods are in the Pending state.

Please see the logs below

PS C:\Users\nidicula\OneDrive - RM PLC\PlatformandEngineering\Work\KubecostSetup> kubectl describe pod kubecost-prometheus-node-exporter-45kvs -n kubecost
Name:             kubecost-prometheus-node-exporter-45kvs
Namespace:        kubecost
Priority:         0
Service Account:  kubecost-prometheus-node-exporter
Node:             <none>
Labels:           app=prometheus
                  component=node-exporter
                  controller-revision-hash=bf46fdf8c
                  heritage=Helm
                  pod-template-generation=1
                  release=kubecost
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Controlled By:    DaemonSet/kubecost-prometheus-node-exporter
Containers:
  prometheus-node-exporter:
    Image:      prom/node-exporter:v1.7.0
    Port:       9100/TCP
    Host Port:  9100/TCP
    Args:
      --path.procfs=/host/proc
      --path.sysfs=/host/sys
      --web.listen-address=:9100
    Environment:  <none>
    Mounts:
      /host/proc from proc (ro)
      /host/sys from sys (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tlxrd (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc
    HostPathType:
  sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:
  kube-api-access-tlxrd:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason            Age                     From               Message
  ----     ------            ----                    ----               -------
  Warning  FailedScheduling  9m36s (x5472 over 17h)  default-scheduler  0/19 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/19 nodes are available: 19 No preemption victims found for incoming pod..
  Warning  FailedScheduling  4m53s (x2 over 4m54s)   default-scheduler  0/19 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/19 nodes are available: 19 No preemption victims found for incoming pod..

thomasvn commented 8 months ago

Why is the Prometheus pod now running with the same helm command I used previously, when before it was stuck in ContainerCreating and then went into CrashLoopBackOff?

Previously the pod went into CrashLoopBackOff because kubecost-prometheus-server was attempting to mount a PV on a Windows node, which it could not do. This should not happen anymore.

Prometheus node exporter pods are in the Pending state

These pods are optional in a Kubecost installation, and can be disabled by adding --set prometheus.nodeExporter.enabled=false to your install command:

helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="xxxx" \
  -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml \
  --set prometheus.nodeExporter.enabled=false
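For repeatable installs, the same toggle can be kept in a values file instead of a --set flag. A minimal sketch (the file name is arbitrary; the value path matches the --set flag above):

```shell
# Write the equivalent of --set prometheus.nodeExporter.enabled=false
# to a reusable values file.
cat > disable-node-exporter-values.yaml <<'EOF'
prometheus:
  nodeExporter:
    enabled: false
EOF

# Then pass it with -f on install/upgrade (requires cluster access):
# helm upgrade kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
#   --namespace kubecost -f disable-node-exporter-values.yaml

grep -q 'enabled: false' disable-node-exporter-values.yaml && echo "wrote disable-node-exporter-values.yaml"
```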
nidiculageorge commented 8 months ago

@thomasvn thanks for the update. Could you please let me know what the node exporter is used for? Will it affect any functionality if we disable it?

I used the below command to install:

helm upgrade kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer --namespace kubecost -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml --set prometheus.nodeExporter.enabled=false

(screenshot)

PS C:\Users\nidicula> kubectl describe pod kubecost-cost-analyzer-747f68c8b7-wbmwx -n kubecost
Name:             kubecost-cost-analyzer-747f68c8b7-wbmwx
Namespace:        kubecost
Priority:         0
Service Account:  kubecost-cost-analyzer
Node:             aks-linux8core-34370389-vmss00050q/172.28.196.58
Start Time:       Thu, 14 Mar 2024 08:43:26 +0530
Labels:           app=cost-analyzer
                  app.kubernetes.io/instance=kubecost
                  app.kubernetes.io/name=cost-analyzer
                  pod-template-hash=747f68c8b7
Annotations:      <none>
Status:           Pending
SeccompProfile:   RuntimeDefault
IP:
IPs:              <none>
Controlled By:    ReplicaSet/kubecost-cost-analyzer-747f68c8b7
Containers:
  cost-model:
    Container ID:
    Image:          gcr.io/kubecost1/cost-model:prod-2.1.1
    Image ID:
    Ports:          9003/TCP, 9090/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:     200m
      memory:  55Mi
    Liveness:   http-get http://:9003/healthz delay=10s timeout=1s period=10s #success=1 #failure=200
    Readiness:  http-get http://:9003/healthz delay=10s timeout=1s period=10s #success=1 #failure=200
    Environment:
  GRAFANA_ENABLED:                            true

  READ_ONLY:                                  false
  PROMETHEUS_SERVER_ENDPOINT:                 <set to the key 'prometheus-server-endpoint' of config map 'kubecost-cost-analyzer'>  Optional: false
  CLOUD_COST_ENABLED:                         false
  CLOUD_PROVIDER_API_KEY:                     AIzaSyDXQPG_MHUEy9neR7stolq6l0ujXmjJlvk
  CONFIG_PATH:                                /var/configs/
  DB_PATH:                                    /var/db/
  CLUSTER_PROFILE:                            production
  EMIT_POD_ANNOTATIONS_METRIC:                false
  EMIT_NAMESPACE_ANNOTATIONS_METRIC:          false
  EMIT_KSM_V1_METRICS:                        true
  EMIT_KSM_V1_METRICS_ONLY:                   false
  LOG_COLLECTION_ENABLED:                     true
  PRODUCT_ANALYTICS_ENABLED:                  true
  ERROR_REPORTING_ENABLED:                    true
  VALUES_REPORTING_ENABLED:                   true
  SENTRY_DSN:                                 https://71964476292e4087af8d5072afe43abd@o394722.ingest.sentry.io/5245431
  LEGACY_EXTERNAL_API_DISABLED:               false
  OUT_OF_CLUSTER_PROM_METRICS_ENABLED:        false
  CACHE_WARMING_ENABLED:                      false
  SAVINGS_ENABLED:                            true
  ETL_ENABLED:                                true
  ETL_STORE_READ_ONLY:                        false
  ETL_CLOUD_USAGE_ENABLED:                    false
  CLOUD_ASSETS_EXCLUDE_PROVIDER_ID:           false
  ETL_RESOLUTION_SECONDS:                     300
  ETL_MAX_PROMETHEUS_QUERY_DURATION_MINUTES:  1440
  ETL_DAILY_STORE_DURATION_DAYS:              91
  ETL_HOURLY_STORE_DURATION_HOURS:            49
  ETL_WEEKLY_STORE_DURATION_WEEKS:            53
  ETL_FILE_STORE_ENABLED:                     true
  ETL_ASSET_RECONCILIATION_ENABLED:           true
  ETL_USE_UNBLENDED_COST:                     false
  CONTAINER_STATS_ENABLED:                    false
  RECONCILE_NETWORK:                          true
  KUBECOST_METRICS_POD_ENABLED:               false
  PV_ENABLED:                                 true
  MAX_QUERY_CONCURRENCY:                      5
  UTC_OFFSET:                                 +00:00
  CLUSTER_ID:                                 cluster-one
  COST_EVENTS_AUDIT_ENABLED:                  false
  RELEASE_NAME:                               kubecost
  KUBECOST_NAMESPACE:                         kubecost
  POD_NAME:                                   kubecost-cost-analyzer-747f68c8b7-wbmwx (v1:metadata.name)
  KUBECOST_TOKEN:                             <set to the key 'kubecost-token' of config map 'kubecost-cost-analyzer'>  Optional: false
  WATERFOWL_ENABLED:                          true
  DIAGNOSTICS_RUN_IN_COST_MODEL:              false
Mounts:
  /var/configs from persistent-configs (rw)
  /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ltdmv (ro)

  cost-analyzer-frontend:
    Container ID:
    Image:          gcr.io/kubecost1/frontend:prod-2.1.1
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:     10m
      memory:  55Mi
    Liveness:   http-get http://:9003/healthz delay=10s timeout=1s period=10s #success=1 #failure=200
    Readiness:  http-get http://:9003/healthz delay=10s timeout=1s period=10s #success=1 #failure=200
    Environment:
      GET_HOSTS_FROM:  dns
    Mounts:
      /etc/nginx/conf.d/ from nginx-conf (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ltdmv (ro)
  aggregator:
    Container ID:
    Image:          gcr.io/kubecost1/cost-model:prod-2.1.1
    Image ID:
    Port:           9004/TCP
    Host Port:      0/TCP
    Args:
      waterfowl
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Readiness:      http-get http://:9004/healthz delay=10s timeout=1s period=10s #success=1 #failure=200
    Environment:
      CLUSTER_ID:                     cluster-one
      NUM_DB_COPY_CHUNKS:             25
      CONFIG_PATH:                    /var/configs/
      ETL_ENABLED:                    false
      CLOUD_PROVIDER_API_KEY:         AIzaSyDXQPG_MHUEy9neR7stolq6l0ujXmjJlvk
      DB_CONCURRENT_INGESTION_COUNT:  3
      DB_READ_THREADS:                1
      DB_WRITE_THREADS:               1
      LOG_LEVEL:                      info
      KUBECOST_NAMESPACE:             kubecost
    Mounts:
      /var/configs from persistent-configs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ltdmv (ro)
  cloud-cost:
    Container ID:
    Image:          gcr.io/kubecost1/cost-model:prod-2.1.1
    Image ID:
    Port:           9005/TCP
    Host Port:      0/TCP
    Args:
      cloud-cost
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Readiness:      http-get http://:9005/healthz delay=10s timeout=1s period=10s #success=1 #failure=200
    Environment:
      CONFIG_PATH:                    /var/configs/
      CLOUD_COST_REFRESH_RATE_HOURS:  6
      CLOUD_COST_QUERY_WINDOW_DAYS:   7
      CLOUD_COST_RUN_WINDOW_DAYS:     3
      CLOUD_COST_IS_INCLUDE_LIST:     false
      CLOUD_COST_LABEL_LIST:
      CLOUD_COST_TOP_N:               1000
    Mounts:
      /var/configs from persistent-configs (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  nginx-conf:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nginx-conf
    Optional:  false
  persistent-configs:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  kubecost-cost-analyzer
    ReadOnly:   false
  kube-api-access-ltdmv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                 From                     Message
  ----     ------                  ----                ----                     -------
  Normal   Scheduled               18m                 default-scheduler        Successfully assigned kubecost/kubecost-cost-analyzer-747f68c8b7-wbmwx to aks-linux8core-34370389-vmss00050q
  Warning  FailedAttachVolume      18m                 attachdetach-controller  Multi-Attach error for volume "pvc-f6ae5c27-ef94-4f94-9c39-54cd0e1f1e7d" Volume is already exclusively attached to one node and can't be attached to another
  Normal   SuccessfulAttachVolume  17m                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-f6ae5c27-ef94-4f94-9c39-54cd0e1f1e7d"
  Warning  FailedMount             61s (x8 over 16m)   kubelet                  Unable to attach or mount volumes: unmounted volumes=[persistent-configs], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
  Warning  FailedMount             55s (x16 over 17m)  kubelet                  MountVolume.MountDevice failed for volume "pvc-f6ae5c27-ef94-4f94-9c39-54cd0e1f1e7d" : rpc error: code = Internal desc = could not format /dev/disk/azure/scsi1/lun0(lun: 0), and mount it at /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/32b8a94f7c7b64cb1e5a6c0994c9f7099f982998deba6e4d52a3b92b4334adc9/globalmount, failed with mount failed: exit status 32 Mounting command: mount Mounting arguments: -t ext4 -o defaults /dev/disk/azure/scsi1/lun0 /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/32b8a94f7c7b64cb1e5a6c0994c9f7099f982998deba6e4d52a3b92b4334adc9/globalmount Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/32b8a94f7c7b64cb1e5a6c0994c9f7099f982998deba6e4d52a3b92b4334adc9/globalmount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error. dmesg(1) may have more information after failed mount system call.
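The "could not format /dev/disk/azure/scsi1/lun0 ... wrong fs type, bad option, bad superblock" error above is the Azure Disk CSI driver failing to format/mount the provisioned disk, which points back at the storage class rather than at Kubecost itself. Following the earlier suggestion to try a different storage class, one possible approach is to define a dedicated Azure Disk class and point Kubecost's PVCs at it. This is a sketch only: the class name `kubecost-managed-csi` is hypothetical, the `persistentVolume.storageClass` and `prometheus.server.persistentVolume.storageClass` value paths are assumptions to verify against your chart version, and the CSI parameters follow the disk.csi.azure.com driver's documented options.

```shell
# Sketch: a dedicated Azure Disk storage class (name is hypothetical).
cat > kubecost-storageclass.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kubecost-managed-csi
provisioner: disk.csi.azure.com
parameters:
  skuName: StandardSSD_LRS
  fsType: ext4
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF

# Point Kubecost's PVCs at it (value paths are assumptions; verify
# against the cost-analyzer chart's values.yaml for your version).
cat > storage-values.yaml <<'EOF'
persistentVolume:
  storageClass: kubecost-managed-csi
prometheus:
  server:
    persistentVolume:
      storageClass: kubecost-managed-csi
EOF

# Then (requires cluster access):
# kubectl apply -f kubecost-storageclass.yaml
# helm upgrade kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
#   --namespace kubecost -f storage-values.yaml

grep -q 'disk.csi.azure.com' kubecost-storageclass.yaml && echo "manifests written"
```

Note that a PVC's storage class is immutable once bound, so switching classes generally means deleting and recreating the PVC (losing its data).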

thomasvn commented 8 months ago

Node Exporter metrics are primarily used for the Reserved Instance Recommendations feature (more details here). In Kubecost 2.0, this pod has been disabled by default!