Are all components on the same version?
@GiedriusS sidecars have 0.32.4, the rest 0.32.5.
Hey, are you able to share some downstream blocks so we can try to reproduce locally? Also, what downstream Store APIs are you querying?
@mhoffm-aiven
what downstream Store APIs are you querying
Not sure what downstream Store APIs are. But we have a global Thanos that queries a Querier on another cluster via gRPC, and that works with the Thanos sidecar. Data is stored in a GCP bucket.
are you able to share some downstream blocks so we can try to reproduce locally
Can you point me to a guide? Should I just send you some chunks from the bucket? If so, how can I find the necessary chunks (we have too many of them and no "created at" in the GCP bucket)?
Mh, ok, so sharing might not be practical. With "downstream store api" I meant essentially the "--endpoint"s!
Can you bump the sidecars to 0.32.5? There was this:
https://github.com/thanos-io/thanos/pull/6816 Store: fix prometheus store label values matches for external labels
which feels somewhat related.
@MichaHoffmann I suspect it is enough to just bump the sidecar on the cluster where the issue is reproducible? If so, I bumped it to 0.32.5:
{
"status": "success",
"data": [
{
"__name__": "org_apache_cassandra_metrics_thread_pools_completed_tasks",
"cassandra_datastax_com_cluster": "-cassandra",
"cassandra_datastax_com_datacenter": "dc1",
"cluster": "cit1-k8s",
"container": "cassandra",
"datacenter": "dc1",
"endpoint": "metrics",
"exported_instance": "10.2.145.73",
"host": "3a2b71b6-8026-4323-a5d9-6b9420258bc5",
"instance": "10.2.145.73:9000",
"job": "-cassandra-dc1-all-pods-service",
"namespace": "cit1--core",
"node_name": "gke-cit1-k8s-cit1-nodepool-1-331970fb-6w90",
"pod": "-cassandra-dc1-r3-sts-0",
"pod_name": "-cassandra-dc1-r3-sts-0",
"pool_name": "PerDiskMemtableFlushWriter_0",
"pool_type": "internal",
"prometheus": "monitoring/kube-prometheus-stack-prometheus",
"prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0",
"rack": "r3",
"service": "-cassandra-dc1-all-pods-service"
},
{
"__name__": "org_apache_cassandra_metrics_thread_pools_completed_tasks",
"cassandra_datastax_com_cluster": "-cassandra",
"cassandra_datastax_com_datacenter": "dc1",
"cluster": "cit1-k8s",
"container": "cassandra",
"datacenter": "dc1",
"endpoint": "metrics",
"exported_instance": "10.2.147.193",
"host": "1252ec4c-66b7-47de-9745-42d368198c3e",
"instance": "10.2.147.193:9000",
"job": "-cassandra-dc1-all-pods-service",
"namespace": "cit1--core",
"node_name": "gke-cit1-k8s-cit1-nodepool-1-331970fb-xmr5",
"pod": "-cassandra-dc1-r2-sts-0",
"pod_name": "-cassandra-dc1-r2-sts-0",
"pool_name": "PerDiskMemtableFlushWriter_0",
"pool_type": "internal",
"prometheus": "monitoring/kube-prometheus-stack-prometheus",
"prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0",
"rack": "r2",
"service": "-cassandra-dc1-all-pods-service"
},
{
"__name__": "org_apache_cassandra_metrics_thread_pools_completed_tasks",
"cassandra_datastax_com_cluster": "-cassandra",
"cassandra_datastax_com_datacenter": "dc1",
"cluster": "-cassandra",
"container": "cassandra",
"datacenter": "dc1",
"endpoint": "metrics",
"exported_instance": "10.2.150.131",
"host": "1cae4b22-a89b-451f-8f02-d276b86efb83",
"instance": "10.2.150.131:9000",
"job": "-cassandra-dc1-all-pods-service",
"namespace": "cit1--core",
"node_name": "gke-cit1-k8s-cit1-nodepool-1-331970fb-movi",
"pod": "-cassandra-dc1-r1-sts-0",
"pod_name": "-cassandra-dc1-r1-sts-0",
"pool_name": "PerDiskMemtableFlushWriter_0",
"pool_type": "internal",
"prometheus": "monitoring/kube-prometheus-stack-prometheus",
"prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0",
"rack": "r1",
"service": "-cassandra-dc1-all-pods-service"
}
]
}
Issue not gone.
what downstream Store APIs are you querying
With "downstream store api" I meant essentially the "--endpoint"s!
Well, we have many endpoints.
- query
- '--log.level=info'
- '--log.format=logfmt'
- '--grpc-address=0.0.0.0:10901'
- '--http-address=0.0.0.0:10902'
- '--query.replica-label=replica'
- '--endpoint=thanos-sidecar-querier-query-grpc.monitoring.svc:10901'
- '--endpoint=thanos-storegateway.monitoring.svc:10901'
- '--endpoint=lv01-prometheus01.int.company.live:10903'
- '--endpoint=lv01-prometheus02.int.company.live:10903'
- '--endpoint=ro01-prometheus01.int.company.live:10903'
- '--endpoint=ro01-prometheus02.int.company.live:10903'
- '--endpoint=ge01-prometheus01.int.company.live:10903'
- '--endpoint=ge01-prometheus02.int.company.live:10903'
- '--endpoint=thanos.ci.int.company.live:443'
- '--endpoint=thanos.ci-en1.int.company.live:443'
- '--endpoint=thanos.dev.int.company.live:443'
- '--endpoint=thanos.live.int.company.live:443'
- '--endpoint=thanos-1.global.int.company.live:443'
- >-
--endpoint=astradb-thanos-sidecar-querier-query-grpc.monitoring.svc:10901
- '--grpc-client-tls-secure'
- '--grpc-client-tls-cert=/certs/client/tls.crt'
- '--grpc-client-tls-key=/certs/client/tls.key'
- '--grpc-client-tls-ca=/certs/client/ca.crt'
The one that has the metrics in question is thanos.dev.int.company.live:443.
@andrejshapal Can you try bumping the version? It seems to be the same bug fixed in v0.32.5.
@yeya24 Hello, I bumped everything to 0.32.5 and still see the same issue.
Hey @andrejshapal, can you share the configuration of the offending thanos.dev.int.company.live, please?
@MichaHoffmann Sure:
spec:
project: application-support
sources:
- repoURL: https://helm.onairent.live
chart: any-resource
targetRevision: "0.1.0"
helm:
values: |
anyResources:
- repoURL: https://charts.bitnami.com/bitnami
chart: thanos
targetRevision: "12.13.12"
helm:
values: |
fullnameOverride: thanos-sidecar-querier
query:
dnsDiscovery:
enabled: true
sidecarsService: kube-prometheus-stack-thanos-discovery
sidecarsNamespace: monitoring
service:
annotations:
traefik.ingress.kubernetes.io/service.serversscheme: h2c
serviceGrpc:
annotations:
traefik.ingress.kubernetes.io/service.serversscheme: h2c
ingress:
grpc:
enabled: true
ingressClassName: traefik-internal
annotations:
traefik.ingress.kubernetes.io/router.tls.options: monitoring-thanos@kubernetescrd
hostname: thanos.dev.int.company.live
extraTls:
- hosts:
- thanos.dev.int.company.live
secretName: thanos-client-server-cert-1
bucketweb:
enabled: false
compactor:
enabled: false
storegateway:
enabled: false
receive:
enabled: false
metrics:
enabled: true
serviceMonitor:
enabled: true
labels:
prometheus: main
I also noticed it returns one cluster until 07:00 27/10/2023 (local time, now is 12:41), and at 07:05 already two "clusters".
Can you share the Prometheus configurations from the instances that monitor the offending Cassandra cluster too, please?
We use kube-prometheus-stack. Nothing really special:
- repoURL: https://prometheus-community.github.io/helm-charts
chart: kube-prometheus-stack
targetRevision: "50.3.1"
helm:
values: |
fullnameOverride: kube-prometheus-stack
commonLabels:
prometheus: main
defaultRules:
create: false
kube-state-metrics:
fullnameOverride: kube-state-metrics
prometheus:
monitor:
enabled: true
additionalLabels:
prometheus: main
metricRelabelings:
- action: labeldrop
regex: container_id
- action: labeldrop
regex: uid
- sourceLabels: [__name__]
action: drop
regex: 'kube_configmap_(annotations|created|info|labels|metadata_resource_version)'
collectors:
- certificatesigningrequests
- configmaps
- cronjobs
- daemonsets
- deployments
- endpoints
- horizontalpodautoscalers
- ingresses
- jobs
- limitranges
- mutatingwebhookconfigurations
- namespaces
- networkpolicies
- nodes
- persistentvolumeclaims
- persistentvolumes
- poddisruptionbudgets
- pods
- replicasets
- replicationcontrollers
- resourcequotas
- secrets
- services
- statefulsets
- storageclasses
- validatingwebhookconfigurations
- volumeattachments
metricLabelsAllowlist:
- pods=[version]
kubeScheduler:
enabled: false
kubeEtcd:
enabled: false
kubeProxy:
enabled: false
kubeControllerManager:
enabled: false
prometheus-node-exporter:
fullnameOverride: node-exporter
extraArgs:
- --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
- --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
prometheus:
monitor:
enabled: true
additionalLabels:
prometheus: main
relabelings:
- action: replace
sourceLabels:
- __meta_kubernetes_pod_node_name
targetLabel: instance
coreDns:
enabled: false
kubelet:
enabled: true
serviceMonitor:
cAdvisorMetricRelabelings:
- sourceLabels: [__name__]
action: drop
regex: 'container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)'
- sourceLabels: [__name__]
action: drop
regex: 'container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)'
- sourceLabels: [__name__]
action: drop
regex: 'container_memory_(mapped_file|swap)'
- sourceLabels: [__name__]
action: drop
regex: 'container_(file_descriptors|tasks_state|threads_max)'
- sourceLabels: [__name__]
action: drop
regex: 'container_spec.*'
- sourceLabels: [id, pod]
action: drop
regex: '.+;'
- action: labeldrop
regex: id
- action: labeldrop
regex: name
- action: labeldrop
regex: uid
cAdvisorRelabelings:
- action: replace
sourceLabels: [__metrics_path__]
targetLabel: metrics_path
probesMetricRelabelings:
- action: labeldrop
regex: pod_uid
probesRelabelings:
- action: replace
sourceLabels: [__metrics_path__]
targetLabel: metrics_path
resourceRelabelings:
- action: replace
sourceLabels: [__metrics_path__]
targetLabel: metrics_path
relabelings:
- action: replace
sourceLabels: [__metrics_path__]
targetLabel: metrics_path
grafana:
enabled: false
alertmanager:
enabled: false
prometheus:
enabled: true
monitor:
additionalLabels:
prometheus: main
serviceAccount:
create: true
name: "prometheus"
thanosService:
enabled: true
thanosServiceMonitor:
enabled: true
ingress:
enabled: true
annotations:
kubernetes.io/ingress.class: traefik-internal
hosts:
- prometheus.dev.int.company.live
tls:
- hosts:
- prometheus.dev.int.company.live
secretName: wildcard-dev-int-company-live
prometheusSpec:
enableRemoteWriteReceiver: true
serviceAccountName: prometheus
enableAdminAPI: true
disableCompaction: true
scrapeInterval: 10s
retention: 2h
additionalScrapeConfigsSecret:
enabled: false
storageSpec:
volumeClaimTemplate:
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 20Gi
externalLabels:
cluster: cit1-k8s
replica: prometheus-cit1-1
additionalAlertManagerConfigs:
- scheme: https
static_configs:
- targets:
- alertmanager.company.live
thanos:
image: quay.io/thanos/thanos:v0.32.5
objectStorageConfig:
name: thanos-objstore
key: objstore.yml
ruleSelector:
matchLabels:
evaluation: prometheus
serviceMonitorSelector:
matchLabels:
prometheus: main
podMonitorSelector:
matchLabels:
prometheus: main
probeSelector:
matchLabels:
prometheus: main
resources:
requests:
cpu: "3.2"
memory: 14Gi
limits:
cpu: 8
memory: 20Gi
Is there maybe another replica somewhere? Asking since it has the external "replica" label.
@MichaHoffmann Nope. We have HA Prometheuses on some clusters, but we added the replica label everywhere just for consistency.
Having a replica label on things that are not replicas of one another feels like it could be an issue.
@MichaHoffmann I can try to remove the replica label. But this should not be an issue, since it is just used as a deduplication label?
@MichaHoffmann I have removed replica label, but no effect on issue in question.
Ah well, an attempt was made. Do you have the same issue if you uncheck "Use Deduplication"?
@MichaHoffmann In Thanos Query the issue is not noticeable, with or without deduplication. I don't think querying there goes via api/v1/series.
You can specify ?dedup=false on the API request, I think ( https://thanos.io/v0.33/components/query.md/#deduplication-enabled ).
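For example, something like this against the querier's HTTP port (the host is a placeholder, 10902 is the HTTP port from your query config, and the matcher is the metric from your earlier output):
curl -sG 'http://<thanos-query-host>:10902/api/v1/series' \
  --data-urlencode 'match[]=org_apache_cassandra_metrics_thread_pools_completed_tasks' \
  --data-urlencode 'dedup=false'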
dedup false:
{
"status": "success",
"data": [
{
"__name__": "org_apache_cassandra_metrics_thread_pools_completed_tasks",
"cassandra_datastax_com_cluster": "-cassandra",
"cassandra_datastax_com_datacenter": "dc1",
"cluster": "cit1-k8s",
"container": "cassandra",
"datacenter": "dc1",
"endpoint": "metrics",
"exported_instance": "10.2.145.73",
"host": "3a2b71b6-8026-4323-a5d9-6b9420258bc5",
"instance": "10.2.145.73:9000",
"job": "-cassandra-dc1-all-pods-service",
"namespace": "cit1--core",
"node_name": "gke-cit1-k8s-cit1-nodepool-1-331970fb-6w90",
"pod": "-cassandra-dc1-r3-sts-0",
"pod_name": "-cassandra-dc1-r3-sts-0",
"pool_name": "InternalResponseStage",
"pool_type": "internal",
"prometheus": "monitoring/kube-prometheus-stack-prometheus",
"prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0",
"rack": "r3",
"service": "-cassandra-dc1-all-pods-service"
},
{
"__name__": "org_apache_cassandra_metrics_thread_pools_completed_tasks",
"cassandra_datastax_com_cluster": "-cassandra",
"cassandra_datastax_com_datacenter": "dc1",
"cluster": "-cassandra",
"container": "cassandra",
"datacenter": "dc1",
"endpoint": "metrics",
"exported_instance": "10.2.147.193",
"host": "1252ec4c-66b7-47de-9745-42d368198c3e",
"instance": "10.2.147.193:9000",
"job": "-cassandra-dc1-all-pods-service",
"namespace": "cit1--core",
"node_name": "gke-cit1-k8s-cit1-nodepool-1-331970fb-xmr5",
"pod": "-cassandra-dc1-r2-sts-0",
"pod_name": "-cassandra-dc1-r2-sts-0",
"pool_name": "InternalResponseStage",
"pool_type": "internal",
"prometheus": "monitoring/kube-prometheus-stack-prometheus",
"prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0",
"rack": "r2",
"service": "-cassandra-dc1-all-pods-service"
}
]
}
dedup true:
{
"status": "success",
"data": [
{
"__name__": "org_apache_cassandra_metrics_thread_pools_completed_tasks",
"cassandra_datastax_com_cluster": "-cassandra",
"cassandra_datastax_com_datacenter": "dc1",
"cluster": "-cassandra",
"container": "cassandra",
"datacenter": "dc1",
"endpoint": "metrics",
"exported_instance": "10.2.145.73",
"host": "3a2b71b6-8026-4323-a5d9-6b9420258bc5",
"instance": "10.2.145.73:9000",
"job": "-cassandra-dc1-all-pods-service",
"namespace": "cit1--core",
"node_name": "gke-cit1-k8s-cit1-nodepool-1-331970fb-6w90",
"pod": "-cassandra-dc1-r3-sts-0",
"pod_name": "-cassandra-dc1-r3-sts-0",
"pool_name": "InternalResponseStage",
"pool_type": "internal",
"prometheus": "monitoring/kube-prometheus-stack-prometheus",
"prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0",
"rack": "r3",
"service": "-cassandra-dc1-all-pods-service"
},
{
"__name__": "org_apache_cassandra_metrics_thread_pools_completed_tasks",
"cassandra_datastax_com_cluster": "-cassandra",
"cassandra_datastax_com_datacenter": "dc1",
"cluster": "-cassandra",
"container": "cassandra",
"datacenter": "dc1",
"endpoint": "metrics",
"exported_instance": "10.2.147.193",
"host": "1252ec4c-66b7-47de-9745-42d368198c3e",
"instance": "10.2.147.193:9000",
"job": "-cassandra-dc1-all-pods-service",
"namespace": "cit1--core",
"node_name": "gke-cit1-k8s-cit1-nodepool-1-331970fb-xmr5",
"pod": "-cassandra-dc1-r2-sts-0",
"pod_name": "-cassandra-dc1-r2-sts-0",
"pool_name": "InternalResponseStage",
"pool_type": "internal",
"prometheus": "monitoring/kube-prometheus-stack-prometheus",
"prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0",
"rack": "r2",
"service": "-cassandra-dc1-all-pods-service"
}
]
}
Would it be possible to send promtool tsdb dump output with an appropriate matcher from the offending Prometheus? (With the labels censored like in this example.) I could build a block and try to debug locally from that!
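Roughly something like this against the Prometheus data directory (the path, the time range, and the matcher below are placeholders; the --match flag needs a reasonably recent promtool, otherwise dump everything and filter afterwards):
promtool tsdb dump \
  --match='{__name__="org_apache_cassandra_metrics_thread_pools_completed_tasks"}' \
  --min-time=<start-ms> --max-time=<end-ms> \
  /prometheus/data > dump.txt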
@MichaHoffmann Sorry for the long wait, I had a busy week. dump.zip
Hey,
I did a small local setup of Prometheus, sidecar, and querier (on latest main) with your data, and I can reproduce!
$ curl -sq -g '0.0.0.0:10904/api/v1/series?' --data-urlencode 'match[]=foo' | jq '.data.[].cluster'
"xxx-cassandra"
$ curl -sq -g '0.0.0.0:10904/api/v1/series?' --data-urlencode 'match[]=foo' | jq '.data.[].cluster'
"xxx-cassandra"
$ curl -sq -g '0.0.0.0:10904/api/v1/series?' --data-urlencode 'match[]=foo' | jq '.data.[].cluster'
"cluster_1"
with Prometheus configured like:
global:
external_labels:
cluster: cluster_1
The querier and sidecar are configured mostly with defaults.
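In other words, roughly something like the following (the exact flags are the component defaults and the ports are shifted so the querier serves HTTP on 10904, matching the curls above; treat this as a sketch, not an exact transcript):
prometheus --config.file=prometheus.yml --storage.tsdb.path=./data

thanos sidecar \
  --prometheus.url=http://localhost:9090 \
  --tsdb.path=./data \
  --grpc-address=0.0.0.0:10901 \
  --http-address=0.0.0.0:10902

thanos query \
  --endpoint=localhost:10901 \
  --grpc-address=0.0.0.0:10903 \
  --http-address=0.0.0.0:10904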
Thanks, I'll look into this in the debugger a bit later!
OK, I think I have found the issue and have a fix; I was able to reproduce it in a minimal acceptance test case.
Hello,
I am using Thanos 0.32.5.
Issue: We noticed a flaky behaviour: Thanos always exposes the external_label value when executing queries, but it randomly returns either the external_label or the internal label value when the label key is the same and api/v1/series is queried.
Here is the Prometheus output:
And after applying the external_label, the new cluster label shows up in Thanos:
But when I query the api/v1/series endpoint, it randomly gives the value of cluster:
{ "status": "success", "data": [ { "__name__": "collectd_collectd_queue_length", "cassandra_datastax_com_cluster": "cassandra", "cassandra_datastax_com_datacenter": "dc1", "cluster": "cassandra", "collectd": "write_queue", "container": "cassandra", "dc": "dc1", "endpoint": "prometheus", "exported_instance": "10.2.150.192", "instance": "10.2.150.192:9103", "job": "cassandra-dc1-all-pods-service", "namespace": "cit1-core", "pod": "cassandra-dc1-r2-sts-0", "prometheus": "monitoring/kube-prometheus-stack-prometheus", "prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0", "rack": "r2", "service": "cassandra-dc1-all-pods-service" }, { "__name__": "collectd_collectd_queue_length", "cassandra_datastax_com_cluster": "cassandra", "cassandra_datastax_com_datacenter": "dc1", "cluster": "cassandra", "collectd": "write_queue", "container": "cassandra", "dc": "dc1", "endpoint": "prometheus", "exported_instance": "10.2.151.7", "instance": "10.2.151.7:9103", "job": "cassandra-dc1-all-pods-service", "namespace": "cit1-core", "pod": "cassandra-dc1-r3-sts-0", "prometheus": "monitoring/kube-prometheus-stack-prometheus", "prometheus_replica": "prometheus-kube-prometheus-stack-prometheus-0", "rack": "r3", "service": "cassandra-dc1-all-pods-service" } ] }
Expected: Any API call should prioritise the external_label and return it as the result of the request.
Possible solution: The current workaround is to rename the internal label in the scrape config. But we mostly use configs via Helm out of the box, meaning we do not set scrape configs ourselves, so there is always a chance that an external label collides with some random label of some random metric. Because of that, it is worth fixing this discrepancy. The series API endpoint is used by Grafana.
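For reference, the rename workaround would look roughly like this as metricRelabelings on whatever ServiceMonitor scrapes the Cassandra pods (that ServiceMonitor is not shown above, and the target label name cassandra_cluster is just illustrative):
metricRelabelings:
  # copy the scraped "cluster" label into a differently named label
  - sourceLabels: [cluster]
    targetLabel: cassandra_cluster
    action: replace
  # then drop the scraped "cluster" label so only the external one remains
  - action: labeldrop
    regex: cluster
This way the per-metric label no longer collides with the external cluster label added by Prometheus.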