Closed nidiculageorge closed 8 months ago
Hi Team,
Just an update: I ran the command below, following this guide:
https://docs.kubecost.com/install-and-configure/advanced-configuration/windows-node-support
helm upgrade kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer --namespace kubecost --create-namespace -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml
Now the containers are in the following state
When I describe the Prometheus server pod:
PS C:\Users\nidicula\OneDrive - RM PLC\PlatformandEngineering\Work\KubecostSetup> kubectl describe pod kubecost-prometheus-server-54fd884d8f-spvpl -n kubecost
Name: kubecost-prometheus-server-54fd884d8f-spvpl
Namespace: kubecost
Priority: 0
Service Account: kubecost-prometheus-server
Node: aks-linux8core-34370389-vmss000474/172.28.194.90
Start Time: Tue, 06 Feb 2024 12:40:53 +0530
Labels: app=prometheus
        component=server
        heritage=Helm
        pod-template-hash=54fd884d8f
        release=kubecost
Annotations:
Normal Scheduled 6m15s default-scheduler Successfully assigned kubecost/kubecost-prometheus-server-54fd884d8f-spvpl to aks-linux8core-34370389-vmss000474
Warning FailedAttachVolume 6m15s attachdetach-controller Multi-Attach error for volume "pvc-6327d72e-0a32-449b-88c9-1262223ee313" Volume is already exclusively attached to one node and can't be attached to another
Normal SuccessfulAttachVolume 5m49s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-6327d72e-0a32-449b-88c9-1262223ee313"
Warning FailedMount 116s (x2 over 4m12s) kubelet Unable to attach or mount volumes: unmounted volumes=[storage-volume], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
Warning FailedMount 82s (x10 over 5m47s) kubelet MountVolume.MountDevice failed for volume "pvc-6327d72e-0a32-449b-88c9-1262223ee313" : rpc error: code = Internal desc = could not format /dev/disk/azure/scsi1/lun0(lun: 0), and mount it at /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/2b1cb69d9a695d1d12e8b8e0b51c541acd66d4c6fee1e97197a576c86589ab87/globalmount, failed with mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/azure/scsi1/lun0 /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/2b1cb69d9a695d1d12e8b8e0b51c541acd66d4c6fee1e97197a576c86589ab87/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/2b1cb69d9a695d1d12e8b8e0b51c541acd66d4c6fee1e97197a576c86589ab87/globalmount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error.
Not an expert here, @AjayTripathy @thomasvn may know more, but @nidiculageorge is this potentially the issue? https://stackoverflow.com/questions/70945223/kubernetes-multi-attach-error-for-volume-pvc-volume-is-already-exclusively-att
One quick test would be to disable the Prom PV to make sure everything deploys correctly.
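A minimal sketch of that test, assuming the Prometheus sub-chart bundled with Kubecost honors `prometheus.server.persistentVolume.enabled` (the file name `values-no-prom-pv.yaml` is made up for illustration). Without a PV, Prometheus data does not survive pod restarts, so this is only meant to check that the pod schedules and starts:

```shell
# Override file for a one-off test: run the bundled Prometheus server
# without a persistent volume (it then uses ephemeral storage).
# Assumption: the chart honors prometheus.server.persistentVolume.enabled.
# The file name is arbitrary.
cat > values-no-prom-pv.yaml <<'EOF'
prometheus:
  server:
    persistentVolume:
      enabled: false
EOF

# Show what will be passed to helm.
cat values-no-prom-pv.yaml
```

To try it, pass `-f values-no-prom-pv.yaml` as the last `-f` argument of the `helm upgrade` command above, after the Windows node affinity values file. If the pod starts cleanly this way, the problem is likely in the volume or storage class rather than in Prometheus itself.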
I don't think this is a multi-attach error. We have had some issues with Azure CSI filesystems in the past: see https://github.com/kubecost/docs/pull/697/files
@nidiculageorge would it be possible to try a different storageclass?
@ajayTripathy thanks for the response. Could you please clarify how I can try a different storage class?
I was using the command below to install the application:
helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer --namespace kubecost --create-namespace -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml --set kubecostToken="xxxx"
@ajaytripathy
Thanks for the response. I am running the command below, as mentioned earlier:
helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer --namespace kubecost --create-namespace -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml --set kubecostToken="xxxx"
How can I pass the Prometheus configuration in the above command?
@nidiculageorge To pass the configuration in, you will need to create a Helm values.yaml file (docs ref). For example, it may look something like this:
# values.yaml
prometheus:
  server:
    persistentVolume:
      storageClass: YOUR_STORAGE_CLASS_NAME_HERE
# Use this command to pass the `values.yaml` file created above
helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
--namespace kubecost --create-namespace \
--set kubecostToken="xxxx" \
-f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml \
-f values.yaml
@thomasvn I have uninstalled kubecost and redeployed using the same commands
helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
--namespace kubecost --create-namespace \
--set kubecostToken="xxxx" \
-f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml
Now I can see the following pods running, including Prometheus, but some Prometheus node exporter pods are in Pending state.
Queries:
Please see the logs below
PS C:\Users\nidicula\OneDrive - RM PLC\PlatformandEngineering\Work\KubecostSetup> kubectl describe pod kubecost-prometheus-node-exporter-45kvs -n kubecost
Name: kubecost-prometheus-node-exporter-45kvs
Namespace: kubecost
Priority: 0
Service Account: kubecost-prometheus-node-exporter
Node:
Warning FailedScheduling 9m36s (x5472 over 17h) default-scheduler 0/19 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/19 nodes are available: 19 No preemption victims found for incoming pod..
Warning FailedScheduling 4m53s (x2 over 4m54s) default-scheduler 0/19 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/19 nodes are available: 19 No preemption victims found for incoming pod..
How come the same helm command I used to install Kubecost previously now has the Prometheus pod running? Previously it was in ContainerCreating state and then went to CrashLoopBackOff state.
Previously you were in a CrashLoopBackOff state because kubecost-prometheus-server was attempting to mount a PV to a Windows node, which it could not do. This should not happen anymore.
Prometheus node exporter pods are in Pending state
These pods are optional in a Kubecost installation, and can be disabled by adding --set prometheus.nodeExporter.enabled=false to your install command:
helm install kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
--namespace kubecost --create-namespace \
--set kubecostToken="xxxx" \
-f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml \
--set prometheus.nodeExporter.enabled=false
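The same override can also be kept in a values file, which may be easier to carry across later `helm upgrade` runs than a growing list of `--set` flags. A minimal sketch, assuming a local file named `values-overrides.yaml` (the name is arbitrary); the nested keys mirror the `prometheus.nodeExporter.enabled` path used above:

```shell
# Values-file equivalent of --set prometheus.nodeExporter.enabled=false.
# The file name is an arbitrary choice for illustration.
cat > values-overrides.yaml <<'EOF'
prometheus:
  nodeExporter:
    enabled: false
EOF

# Sanity-check the file before handing it to helm.
grep -n 'enabled: false' values-overrides.yaml
```

The file would then replace the `--set prometheus.nodeExporter.enabled=false` flag: add `-f values-overrides.yaml` after the Windows node affinity `-f` argument, since later `-f` files take precedence over earlier ones.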
@thomasvn thanks for the update. Could you please let me know what the node exporter is used for? Will disabling it affect any functionality?
I used the command below to install:
helm upgrade kubecost --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer --namespace kubecost -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/cost-analyzer/values-windows-node-affinity.yaml --set prometheus.nodeExporter.enabled=false
PS C:\Users\nidicula> kubectl describe pod kubecost-cost-analyzer-747f68c8b7-wbmwx -n kubecost
Name: kubecost-cost-analyzer-747f68c8b7-wbmwx
Namespace: kubecost
Priority: 0
Service Account: kubecost-cost-analyzer
Node: aks-linux8core-34370389-vmss00050q/172.28.196.58
Start Time: Thu, 14 Mar 2024 08:43:26 +0530
Labels: app=cost-analyzer
        app.kubernetes.io/instance=kubecost
        app.kubernetes.io/name=cost-analyzer
        pod-template-hash=747f68c8b7
Annotations:
READ_ONLY: false
PROMETHEUS_SERVER_ENDPOINT: <set to the key 'prometheus-server-endpoint' of config map 'kubecost-cost-analyzer'> Optional: false
CLOUD_COST_ENABLED: false
CLOUD_PROVIDER_API_KEY: AIzaSyDXQPG_MHUEy9neR7stolq6l0ujXmjJlvk
CONFIG_PATH: /var/configs/
DB_PATH: /var/db/
CLUSTER_PROFILE: production
EMIT_POD_ANNOTATIONS_METRIC: false
EMIT_NAMESPACE_ANNOTATIONS_METRIC: false
EMIT_KSM_V1_METRICS: true
EMIT_KSM_V1_METRICS_ONLY: false
LOG_COLLECTION_ENABLED: true
PRODUCT_ANALYTICS_ENABLED: true
ERROR_REPORTING_ENABLED: true
VALUES_REPORTING_ENABLED: true
SENTRY_DSN: https://71964476292e4087af8d5072afe43abd@o394722.ingest.sentry.io/5245431
LEGACY_EXTERNAL_API_DISABLED: false
OUT_OF_CLUSTER_PROM_METRICS_ENABLED: false
CACHE_WARMING_ENABLED: false
SAVINGS_ENABLED: true
ETL_ENABLED: true
ETL_STORE_READ_ONLY: false
ETL_CLOUD_USAGE_ENABLED: false
CLOUD_ASSETS_EXCLUDE_PROVIDER_ID: false
ETL_RESOLUTION_SECONDS: 300
ETL_MAX_PROMETHEUS_QUERY_DURATION_MINUTES: 1440
ETL_DAILY_STORE_DURATION_DAYS: 91
ETL_HOURLY_STORE_DURATION_HOURS: 49
ETL_WEEKLY_STORE_DURATION_WEEKS: 53
ETL_FILE_STORE_ENABLED: true
ETL_ASSET_RECONCILIATION_ENABLED: true
ETL_USE_UNBLENDED_COST: false
CONTAINER_STATS_ENABLED: false
RECONCILE_NETWORK: true
KUBECOST_METRICS_POD_ENABLED: false
PV_ENABLED: true
MAX_QUERY_CONCURRENCY: 5
UTC_OFFSET: +00:00
CLUSTER_ID: cluster-one
COST_EVENTS_AUDIT_ENABLED: false
RELEASE_NAME: kubecost
KUBECOST_NAMESPACE: kubecost
POD_NAME: kubecost-cost-analyzer-747f68c8b7-wbmwx (v1:metadata.name)
KUBECOST_TOKEN: <set to the key 'kubecost-token' of config map 'kubecost-cost-analyzer'> Optional: false
WATERFOWL_ENABLED: true
DIAGNOSTICS_RUN_IN_COST_MODEL: false
Mounts:
/var/configs from persistent-configs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ltdmv (ro)
cost-analyzer-frontend:
Container ID:
Image: gcr.io/kubecost1/frontend:prod-2.1.1
Image ID:
Port:
Normal Scheduled 18m default-scheduler Successfully assigned kubecost/kubecost-cost-analyzer-747f68c8b7-wbmwx to aks-linux8core-34370389-vmss00050q
Warning FailedAttachVolume 18m attachdetach-controller Multi-Attach error for volume "pvc-f6ae5c27-ef94-4f94-9c39-54cd0e1f1e7d" Volume is already exclusively attached to one node and can't be attached to another
Normal SuccessfulAttachVolume 17m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-f6ae5c27-ef94-4f94-9c39-54cd0e1f1e7d"
Warning FailedMount 61s (x8 over 16m) kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-configs], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
Warning FailedMount 55s (x16 over 17m) kubelet MountVolume.MountDevice failed for volume "pvc-f6ae5c27-ef94-4f94-9c39-54cd0e1f1e7d" : rpc error: code = Internal desc = could not format /dev/disk/azure/scsi1/lun0(lun: 0), and mount it at /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/32b8a94f7c7b64cb1e5a6c0994c9f7099f982998deba6e4d52a3b92b4334adc9/globalmount, failed with mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/azure/scsi1/lun0 /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/32b8a94f7c7b64cb1e5a6c0994c9f7099f982998deba6e4d52a3b92b4334adc9/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/32b8a94f7c7b64cb1e5a6c0994c9f7099f982998deba6e4d52a3b92b4334adc9/globalmount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error.
dmesg(1) may have more information after failed mount system call.
Node Exporter metrics are primarily used for the Reserved Instance Recommendations feature (more details here). In Kubecost 2.0, this pod has been disabled by default!
Kubecost Version
2.0.2
Kubernetes Version
1.27.3
Kubernetes Platform
AKS
Description
Hi Team
I have deployed Kubecost using the following command, as per the link below:
https://www.kubecost.com/install#show-instructions
Installing Kubecost
https://www.kubecost.com/install#show-instructions
helm install kubecost cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --namespace kubecost --create-namespace --set kubecostToken="bmlkaWN1bGFnZW9yZ2VAaW4ucm0uY29txm343yadf98" --set nodeSelector."kubernetes.io/os"=linux
After I deployed Kubecost, I could see that the Prometheus server pod is stuck in ContainerCreating state.
Steps to reproduce
Installing Kubecost
https://www.kubecost.com/install#show-instructions
helm install kubecost cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --namespace kubecost --create-namespace --set kubecostToken="bmlkaWN1bGFnZW9yZ2VAaW4ucm0uY29txm343yadf98" --set nodeSelector."kubernetes.io/os"=linux
Expected behavior
The Kubecost Prometheus server pod should be in Running state.
Impact
Unable to load the Grafana dashboard
Screenshots
Logs
Slack discussion
No response
Troubleshooting