opencost / opencost-helm-chart

OpenCost Helm chart
Apache License 2.0

OpenCost failed to query metrics to VictoriaMetrics #201

Open PabloSebastian opened 2 months ago

PabloSebastian commented 2 months ago

Setup

Issue

After deploying the VictoriaMetrics and OpenCost charts using ArgoCD, I got the following error in the OpenCost pod:

2024-05-02T20:38:35.104153093Z ??? Log level set to info
2024-05-02T20:38:35.104649495Z INF Starting cost-model version 1.110 (a1dfd1c)
2024-05-02T20:38:35.104677595Z INF Kubernetes enabled: true
2024-05-02T20:38:35.104772296Z INF Prometheus/Thanos Client Max Concurrency set to 5
2024-05-02T20:38:35.118462666Z ERR Failed to query prometheus at http://vmselect-victoria-metrics-k8s-stack.victoria-metrics.svc:8481/select/0/prometheus. Error: no running jobs on Prometheus at /select/0/prometheus/api/v1/query . Troubleshooting help available at: https://www.opencost.io/docs/integrations/prometheus
2024-05-02T20:38:35.11913717Z INF Retrieved a prometheus config file from: http://vmselect-victoria-metrics-k8s-stack.victoria-metrics.svc:8481/select/0/prometheus
2024-05-02T20:38:35.11918957Z INF Using scrape interval of 60.000000
2024-05-02T20:38:35.120119675Z INF NAMESPACE: opencost
2024-05-02T20:38:35.221552597Z INF Done waiting
2024-05-02T20:38:35.221917699Z INF Starting *v1.Deployment controller
2024-05-02T20:38:35.2221153Z INF Starting *v1.PersistentVolumeClaim controller
2024-05-02T20:38:35.222155601Z INF Starting *v1.StatefulSet controller
2024-05-02T20:38:35.222175101Z INF Starting *v1.ReplicaSet controller
2024-05-02T20:38:35.222193301Z INF Starting *v1.PersistentVolume controller
2024-05-02T20:38:35.222211401Z INF Starting *v1.Job controller
2024-05-02T20:38:35.222229801Z INF Starting *v1.StorageClass controller
2024-05-02T20:38:35.222247301Z INF Starting *v1.PodDisruptionBudget controller
2024-05-02T20:38:35.222264901Z INF Starting *v1.Pod controller
2024-05-02T20:38:35.222283001Z INF Starting *v1.Namespace controller
2024-05-02T20:38:35.222300001Z INF Starting *v1.Node controller
2024-05-02T20:38:35.222316701Z INF Starting *v1.ConfigMap controller
2024-05-02T20:38:35.222333502Z INF Starting *v1.Service controller
2024-05-02T20:38:35.222350602Z INF Starting *v1.DaemonSet controller
2024-05-02T20:38:35.222549003Z INF Starting *v1.ReplicationController controller
2024-05-02T20:38:35.230480643Z INF Found ProviderID starting with "azure", using Azure Provider
2024-05-02T20:38:35.240538995Z INF No metrics-config configmap found at install time, using existing configs: configmaps "metrics-config" not found
2024-05-02T20:38:35.244698417Z INF No pricing-configs configmap found at install time, using existing configs: configmaps "pricing-configs" not found
2024-05-02T20:38:35.245232519Z WRN Controller: pullWatchers: failed to load config statuses from file: open /var/configs/cloud-configurations.json: no such file or directory. Proceeding to create the file
2024-05-02T20:38:35.24537692Z ERR Controller: pullWatchers: failed to save statuses failed to save config statuses to file: open /var/configs/cloud-configurations.json: no such file or directory
2024-05-02T20:38:35.245762322Z INF Using ratecard query OfferDurableId eq 'MS-AZR-0003p' and Currency eq 'USD' and Locale eq 'en-US' and RegionInfo eq 'US'
2024-05-02T20:38:45.246325936Z WRN Controller: pullWatchers: failed to load config statuses from file: open /var/configs/cloud-configurations.json: no such file or directory. Proceeding to create the file
2024-05-02T20:38:45.246477137Z ERR Controller: pullWatchers: failed to save statuses failed to save config statuses to file: open /var/configs/cloud-configurations.json: no such file or directory
2024-05-02T20:38:45.246670737Z ERR Controller: pullWatchers: failed to save statuses failed to save config statuses to file: open /var/configs/cloud-configurations.json: no such file or directory
2024-05-02T20:38:55.247012701Z WRN Controller: pullWatchers: failed to load config statuses from file: open /var/configs/cloud-configurations.json: no such file or directory. Proceeding to create the file

The OpenCost UI works, but it shows no data: [image]

asdfgugus commented 2 months ago

Same issue here since upgrading to v1.110.0 (Chart 1.35.0).

@PabloSebastian does it work when you downgrade to 1.109.0 (Chart 1.34.0)?

PabloSebastian commented 2 months ago

@asdfgugus By mounting a volume and setting an env var CONFIG_PATH, the following error disappeared:

ERR Controller: pullWatchers: failed to save statuses failed to save config statuses to file: open /var/configs/cloud-configurations.json: no such file or directory

But I still have this error at the top of the logs:

ERR Failed to query prometheus at http://vmselect-victoria-metrics-k8s-stack.victoria-metrics.svc:8481/select/0/prometheus. Error: no running jobs on Prometheus at /select/0/prometheus/api/v1/query . Troubleshooting help available at: https://www.opencost.io/docs/integrations/prometheus

Since it can't retrieve any metrics from VictoriaMetrics, the UI is empty.
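The comment above does not say which volume type or CONFIG_PATH value was used, so the following is only a sketch of what such a workaround could look like. It builds a Kubernetes strategic merge patch for a Deployment, assuming an emptyDir volume, a container named opencost (as in the target labels later in this thread), and CONFIG_PATH pointing at /var/configs (the directory from the error message). The function name and the volume name are hypothetical.

```python
import json


def config_volume_patch(container: str = "opencost",
                        mount_path: str = "/var/configs") -> dict:
    """Build a Deployment patch that mounts a writable directory for configs.

    Assumptions (not confirmed in the thread): an emptyDir volume is
    acceptable, and CONFIG_PATH should point at the mounted directory.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    # Writable scratch volume; contents are lost on pod restart.
                    "volumes": [{"name": "configs", "emptyDir": {}}],
                    "containers": [
                        {
                            # Strategic merge matches containers by name.
                            "name": container,
                            "volumeMounts": [
                                {"name": "configs", "mountPath": mount_path}
                            ],
                            "env": [
                                {"name": "CONFIG_PATH", "value": mount_path}
                            ],
                        }
                    ],
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(config_volume_patch(), indent=2))
```

The printed JSON could be passed to a patch command or translated into the chart's values; which values keys the chart exposes for extra volumes and env vars should be checked against the chart itself.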

asdfgugus commented 2 months ago

My bad, I thought that you meant the error at the bottom of the logs.

When initializing, OpenCost queries the up metrics to extract the jobs. It seems to me that this query returns nothing and causes the error... Can you verify that?

PabloSebastian commented 2 months ago

Yes, I made a curl request to the vmcluster service:

~ $ curl http://vmselect-victoria-metrics-k8s-stack.victoria-metrics.svc:8481/select/0/prometheus/api/v1/query?query=up
{"status":"success","isPartial":false,"data":{"resultType":"vector","result":[]},"stats":{"seriesFetched": "0","executionTimeMsec":1}}

I also made another curl request, this time to the vmagent targets:

~ $ curl http://vmagent-victoria-metrics-k8s-stack.victoria-metrics.svc:8429/targets | grep opencost
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 65289    0 65289    0     0  4739k      0 --:--:-- --:--:-- --:--:-- 4904k
job=serviceScrape/opencost/opencost/0 (1/1 up)
        state=up, endpoint=http://10.0.0.192:9003/metrics, labels={container="opencost",endpoint="http",instance="10.0.0.192:9003",job="opencost",namespace="opencost",pod="opencost-56ddf67cff-jhhqf",service="opencost"}, scrapes_total=113, scrapes_failed=2, last_scrape=25.084s ago, scrape_duration=25ms, samples_scraped=1641, error=

That means the metrics are being scraped.
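For checking more than one job at a time, the text form of the vmagent targets page shown above can be parsed instead of grepped. This is a minimal sketch that assumes the job summary lines keep the `job=<name> (<up>/<total> up)` shape seen in the output above; the function name is hypothetical. Note that a target reported as up only shows that scraping succeeds; it does not by itself show that the samples reach the storage that vmselect reads from.

```python
import re

# Matches summary lines such as: job=serviceScrape/opencost/opencost/0 (1/1 up)
JOB_LINE = re.compile(r"^job=(?P<job>\S+) \((?P<up>\d+)/(?P<total>\d+) up\)$")


def job_health(targets_text: str) -> dict[str, tuple[int, int]]:
    """Map each job name to (targets up, targets total)."""
    health = {}
    for line in targets_text.splitlines():
        match = JOB_LINE.match(line.strip())
        if match:
            health[match["job"]] = (int(match["up"]), int(match["total"]))
    return health


# Two lines taken from the curl output above (second line shortened).
sample = (
    "job=serviceScrape/opencost/opencost/0 (1/1 up)\n"
    "        state=up, endpoint=http://10.0.0.192:9003/metrics\n"
)

if __name__ == "__main__":
    print(job_health(sample))  # {'serviceScrape/opencost/opencost/0': (1, 1)}
```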

asdfgugus commented 2 months ago

~ $ curl http://vmselect-victoria-metrics-k8s-stack.victoria-metrics.svc:8481/select/0/prometheus/api/v1/query?query=up
{"status":"success","isPartial":false,"data":{"resultType":"vector","result":[]},"stats":{"seriesFetched": "0","executionTimeMsec":1}}

This should return the up metrics in the result, but it is empty.
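To make the check concrete, here is a small sketch that extracts the distinct job labels from a Prometheus-style instant-query response. It is not OpenCost's actual startup code, only an illustration of the kind of check described above; the function name and the second sample payload are hypothetical.

```python
import json


def jobs_from_up_response(body: str) -> set[str]:
    """Return the distinct `job` label values in an instant-query response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    results = payload.get("data", {}).get("result", [])
    return {r["metric"]["job"] for r in results if "job" in r.get("metric", {})}


# Response body from the curl call above: the result vector is empty.
empty = (
    '{"status":"success","isPartial":false,'
    '"data":{"resultType":"vector","result":[]},'
    '"stats":{"seriesFetched": "0","executionTimeMsec":1}}'
)

# Hypothetical response with one series, for comparison only.
healthy = (
    '{"status":"success","data":{"resultType":"vector","result":['
    '{"metric":{"__name__":"up","job":"opencost"},"value":[1714682315,"1"]}]}}'
)

if __name__ == "__main__":
    print(jobs_from_up_response(empty))    # set()
    print(jobs_from_up_response(healthy))  # {'opencost'}
```

With the response from this thread the function returns an empty set, which matches the "no running jobs" message in the OpenCost log.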

mattray commented 2 months ago

Does this need to be over in the opencost/opencost repo? Is it a Helm issue or how OpenCost works with VictoriaMetrics?

PabloSebastian commented 2 months ago

In my case, it is how OpenCost works with VictoriaMetrics. I followed the documentation explaining how to configure both applications but couldn't get it to work. Documentation:

AndrewChubatiuk commented 2 months ago

@PabloSebastian this guide is outdated, use only this. Regarding your setup, have you checked the VMAgent logs? If your cluster is empty, it means that VMAgent cannot, for some reason, ingest data there. Also, among your scrape targets there's only opencost. The VictoriaMetrics Helm chart provides an out-of-the-box set of scrape configs and services that OpenCost relies on (node exporter, etc.), so the initial query should return a bunch of metrics if vmagent works properly.

PabloSebastian commented 2 months ago

Yes, I know that the latest is the one from the VictoriaMetrics blog, but I tested both configurations to make sure I tried everything I could find to make it work. Regarding the VMAgent targets, the query retrieves many scrape targets; in this case, I used the grep command to return only the opencost target. The VMAgent logs show that target discovery is working: [image]

mattray commented 2 months ago

@AndrewChubatiuk should we create a VictoriaMetrics section in the docs? I think this would be pretty useful and we could position VM as a Prometheus alternative directly in the installation since I know it's picking up popularity with OpenCost users.