Closed bsunkad closed 3 years ago
Hi @bsunkad!
In order to have a docker-compose setup that fetches data from multiple clusters (CCLOUD_CLUSTER), you will need to provide a custom configuration file. The environment variable allows only one cluster (though this could be improved in the future). Please find below an example docker-compose file that mounts a custom configuration file:
```yaml
version: '3.1'
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    volumes:
      - ./prometheus.yml:/prometheus.yml
    command:
      - '--config.file=/prometheus.yml'
    ports:
      - 9090:9090
    restart: always
  ccloud_exporter:
    image: dabz/ccloudexporter
    container_name: ccloud_exporter
    command:
      - '--config=/config.yaml'
    environment:
      CCLOUD_API_KEY: ${CCLOUD_API_KEY}
      CCLOUD_API_SECRET: ${CCLOUD_API_SECRET}
      CCLOUD_CLUSTER: ${CCLOUD_CLUSTER}
      CCLOUD_CONNECTOR: ${CCLOUD_CONNECTOR}
      CCLOUD_KSQL: ${CCLOUD_KSQL}
    volumes:
      - ./config.simple.yaml:/config.yaml
```
You can find examples of configuration files on: https://github.com/Dabz/ccloudexporter/tree/master/config
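To give an idea of the shape of such a file, here is a sketch of a multi-cluster configuration based on the examples in that directory. The cluster IDs (`lkc-abc123`, `lkc-def456`) and the metric list are placeholders; verify the exact field names against the examples in the repo:

```yaml
# Sketch of a ccloudexporter config covering two clusters.
# lkc-abc123 / lkc-def456 are placeholder cluster IDs.
rules:
  - clusters:
      - lkc-abc123
      - lkc-def456
    metrics:
      - io.confluent.kafka.server/received_bytes
      - io.confluent.kafka.server/sent_bytes
      - io.confluent.kafka.server/retained_bytes
```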
Let me know if you need further assistance!
Thanks, Dabz, for the complete information.
Now I have a different problem. Prometheus is receiving the metrics, but the default Grafana "Confluent Cloud" dashboard (the one provided in this repo) is not populated; no values are visible in the panels. Upon verifying the variables and metrics, I see that the metric names start with ccloud_ (e.g. ccloud_metric_retained_bytes, ccloud_metric_partition_count, etc.).
Only the Metrics API latency and Scrape duration dashboard sections are populated with values.
Am I missing anything? If not, how can I fix this so that the default dashboard is populated with data? I appreciate your help in advance.
If only the Metrics API latency is populated, it probably means that the queries are not succeeding on the Metrics API. Can you check the logs of ccloudexporter? It should give you a good idea of what the problem is.
Hi Dabz, based on the logs, I am seeing the following errors in the ccloudexporter container:

```
{
  "Endpoint": "https://api.telemetry.confluent.cloud//v2/metrics/cloud/query",
  "StatusCode": 403,
  "body": "{\"errors\":[{\"status\":\"403\",\"detail\":\"Query must filter by at least one of your authorized resources\"}]}",
  "level": "error",
  "msg": "Received invalid response",
  "time": "2021-04-29T12:15:58Z"
}
{
  "error": "Received status code 403 instead of 200 for POST on https://api.telemetry.confluent.cloud//v2/metrics/cloud/query ({\"errors\":[{\"status\":\"403\",\"detail\":\"Query must filter by at least one of your authorized resources\"}]})",
  "level": "error",
  "msg": "Query did not succeed",
  "optimizedQuery": {
    "aggregations": [
      { "agg": "SUM", "metric": "io.confluent.kafka.server/sent_records" }
    ],
    "filter": {
      "op": "AND",
      "filters": [
        {
          "op": "OR",
          "filters": [
            { "field": "resource.kafka.id", "op": "EQ", "value": "STG-EASTUS2-CLUS-0" },
            { "field": "resource.kafka.id", "op": "EQ", "value": "PRD-EASTUS2-CLUS-0" },
            { "field": "resource.kafka.id", "op": "EQ", "value": "dev-eastus2-clus-1"
```
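For readability, a minimal sketch that pulls the human-readable detail message out of such an error body (the JSON string below is copied verbatim from the log above):

```python
import json

# Error body as returned by the Metrics API (copied from the log output).
body = '{"errors":[{"status":"403","detail":"Query must filter by at least one of your authorized resources"}]}'

# The payload is a JSON object with an "errors" array; each entry carries
# a "status" code and a "detail" message explaining the failure.
detail = json.loads(body)["errors"][0]["detail"]
print(detail)  # Query must filter by at least one of your authorized resources
```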
> { "field": "resource.kafka.id", "op": "EQ", "value": "STG-EASTUS2-CLUS-0" }

This is not a proper cluster ID. A cluster ID should look something like `lkc-xxxx`.
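A quick sanity check can catch this mistake before it reaches the Metrics API. The regex below is only an illustration of the `lkc-` prefix convention visible in this thread, not an official format specification:

```python
import re

# Confluent Cloud logical Kafka cluster IDs start with "lkc-" followed by
# an alphanumeric suffix (e.g. "lkc-abc123"). Cluster *display names* such
# as "STG-EASTUS2-CLUS-0" are not valid values for resource.kafka.id.
LKC_ID = re.compile(r"^lkc-[a-z0-9]+$")

def is_cluster_id(value: str) -> bool:
    """Return True when `value` looks like a logical cluster ID."""
    return bool(LKC_ID.match(value))

print(is_cluster_id("lkc-abc123"))          # True
print(is_cluster_id("STG-EASTUS2-CLUS-0"))  # False
```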
Hi Dabz, excellent observation, and you made my day. Thank you. It was a little confusing to see the cluster name. In confluent.io, if I go deep into the cluster settings, I can see the cluster ID in the lkc-xxxx form. Now the dashboard is populated in most of the panels. I will observe and work on fixing the remaining panels.
Best regards, Bhaskar .S
I am glad that this issue has been fixed ;) Have a good day!
Hi
Using this code, I tried to connect to Confluent Cloud using the docker-compose method. The exporter is able to connect and pull metrics only when I pass a single cluster in CCLOUD_CLUSTER. I am having the following issues and need them resolved.
I would appreciate it if anyone could help me with this.