thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

failed to lookup SRV records #5366

Open Dauber01 opened 2 years ago

Dauber01 commented 2 years ago
image

When I add the prefix "dnssrv+_grpc._tcp." to the address, I get an error like the one in the picture: caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-receive-ingestor-default.namespace.svc err="no such host". When I don't add the prefix, it works fine. What should I do if I want to use the "dnssrv+_grpc._tcp." prefix at version 24, on Kubernetes v1.20.10?
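For readers following along: with the "dnssrv+" prefix, the rest of the address is queried as an SRV record name of the form _service._proto.target. In Kubernetes, SRV records are published only for named Service ports, so "_grpc._tcp." resolves only if the Service has a TCP port named "grpc". Whether that was the cause here is not confirmed in this thread. The following is a minimal sketch (not Thanos code; parse_dnssrv and SrvQuery are hypothetical names) that splits such an address so the required port name is easy to see:

```python
from typing import NamedTuple, Optional


class SrvQuery(NamedTuple):
    """Parts of an SRV query name (hypothetical helper type, not a Thanos type)."""
    port_name: str  # e.g. "grpc"; must match a named Service port in Kubernetes
    protocol: str   # e.g. "tcp"
    target: str     # e.g. "my-svc.my-namespace.svc"


def parse_dnssrv(addr: str) -> Optional[SrvQuery]:
    """Split "dnssrv+_name._proto.target" into its parts.

    Returns None if the address has no "dnssrv+" prefix or the remainder
    does not start with two underscore-prefixed labels.
    """
    prefix = "dnssrv+"
    if not addr.startswith(prefix):
        return None
    labels = addr[len(prefix):].split(".", 2)
    if len(labels) < 3:
        return None
    name, proto, target = labels
    if not (name.startswith("_") and proto.startswith("_")):
        return None
    return SrvQuery(name[1:], proto[1:], target)


query = parse_dnssrv(
    "dnssrv+_grpc._tcp.thanos-receive-ingestor-default.namespace.svc"
)
print(query)
```

If the Service port is unnamed or has a different name, the SRV query returns no record even though a plain lookup of the Service name succeeds.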

subbuedcast commented 2 years ago

level=warn ts=2022-09-20T08:30:14.347379507Z caller=proxy.go:279 component=proxy request="min_time:1663662000000 max_time:1663662600000 matchers:<name:\"cluster\" > matchers:<type:RE name:\"node\" > matchers:<type:NEQ name:\"container\" > matchers:<name:\"name\" value:\"node_namespace_pod_container:container_memory_working_set_bytes\" > max_resolution_window:6000 aggregates:COUNT aggregates:SUM " err="No StoreAPIs matched for this query" stores="store Addr: 10.2.2.192:10901 LabelSets: {prometheus=\"platform/prometheus-operator-prometheus\", prometheus_replica=\"prometheus-prometheus-operator-prometheus-0\"} Mint: 1630390089788 Maxt: 1663653600000 filtered out: does not have data within this time period: [1663662000000,1663662600000]. Store time ranges: [1630390089788,1663653600000]"
level=error ts=2022-09-20T08:30:39.290908121Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"
level=error ts=2022-09-20T08:31:09.243997623Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"
level=error ts=2022-09-20T08:31:39.297070824Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"
level=error ts=2022-09-20T08:32:09.264274188Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"
level=error ts=2022-09-20T08:32:39.275291006Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"
level=error ts=2022-09-20T08:33:09.306956837Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"
level=error ts=2022-09-20T08:33:39.27053823Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"
level=error ts=2022-09-20T08:34:09.288647186Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"
level=error ts=2022-09-20T08:34:39.294578868Z caller=resolver.go:99 msg="failed to lookup SRV records" host=_grpc._tcp.thanos-sidecar-grpc.platform.svc.cluster.loca err="no such host"

subbuedcast commented 2 years ago

The error is fixed. The cause was that labeling was not done properly.

osipovdaniil commented 1 year ago

I have the same problem when I try to connect to memcached. I have a service named memcached-thanos-system. In thanos-store:

        - |-
          --index-cache.config="config":
            "addresses":
            - "dnssrv+_client._tcp.memcached-thanos-system.{{ .Release.Namespace }}.svc.cluster.local"
          "type": "memcached"
        - |-
          --store.caching-bucket.config="blocks_iter_ttl": "5m"
          "chunk_object_attrs_ttl": "24h"
          "chunk_subrange_size": 16000
          "chunk_subrange_ttl": "24h"
          "config":
            "addresses":
            - "dnssrv+_client._tcp.memcached-thanos-system.{{ .Release.Namespace }}.svc.cluster.local"
          "type": "memcached"

I got msg="failed to lookup SRV records" host=_grpc._tcp.memcached-thanos-system.prometheus-system-layer err="no such host"

But when I run nslookup memcached-thanos-system.prometheus-system-layer from another pod, it works: DNS resolves the IP.

Help please, any ideas? @Dauber01, did you solve the problem?
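A note on why a plain nslookup can succeed while the SRV lookup fails: by default nslookup queries address (A) records, whereas "dnssrv+" needs an SRV record, which is a separate record type (nslookup -type=SRV queries it). Kubernetes publishes one SRV record per named Service port, in the form _port-name._protocol.service.namespace.svc.cluster-domain. A minimal sketch, assuming a hypothetical port list for the Service (the real port names are not shown in this thread), of which SRV names would exist:

```python
from typing import List, Optional, Tuple


def srv_names_for_service(
    service: str,
    namespace: str,
    ports: List[Tuple[Optional[str], str]],
    cluster_domain: str = "cluster.local",
) -> List[str]:
    """List the SRV names Kubernetes DNS would publish for a Service.

    `ports` holds (port_name, protocol) pairs. Unnamed ports get no SRV
    record, so they are skipped. This is a hypothetical helper, not part
    of Thanos or of any Kubernetes client library.
    """
    return [
        f"_{name}._{proto.lower()}.{service}.{namespace}.svc.{cluster_domain}"
        for name, proto in ports
        if name
    ]


# Assumption for illustration only: the Service exposes one TCP port
# named "memcache" and one unnamed port.
published = srv_names_for_service(
    "memcached-thanos-system",
    "prometheus-system-layer",
    [("memcache", "TCP"), (None, "TCP")],
)

wanted = "_client._tcp.memcached-thanos-system.prometheus-system-layer.svc.cluster.local"
print(published)
print("SRV record would exist:", wanted in published)
```

Under that assumed port list, the "_client._tcp." name from the config would not be published, so either the port must be named "client" or the address must use the actual port name.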

UPD:

I use

"addresses":
            - "memcached-thanos-system.{{ .Release.Namespace }}.svc.cluster.local:11211"

instead of

"addresses":
            - "dnssrv+_client._tcp.memcached-thanos-system.{{ .Release.Namespace }}.svc.cluster.local"

and this works for me.

ricardov1 commented 1 year ago

The error is fixed because labeling was not done properly

@subbuedcast could you elaborate on this? Which labels on which pod(s) were incorrect, and how did you fix them?