thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0
13.07k stars 2.09k forks source link

Query frontend crashing when ttl specified in cache config #6996

Open hagaibarel opened 10 months ago

hagaibarel commented 10 months ago

Thanos, Prometheus and Golang version used:

Thanos - quay.io/thanos/thanos:v0.33.0 (deployed on kubernetes) Prometheus / Golang = N/A

Object Storage Provider: GCS

What happened:

Specifying ttl in the redis cache config causes query fronted to fail to start with an exception

What you expected to happen:

Normal startup

How to reproduce it (as minimally and precisely as possible):

Start query frontend with the following redis config:

type: REDIS
config:
  addr: my-redis.default.svc.cluster.local:6379
  ttl: 48h

Full logs to relevant components:

ts=2023-12-19T09:37:03.011912624Z caller=factory.go:43 level=info msg="loading tracing configuration"
ts=2023-12-19T09:37:03.012570367Z caller=main.go:135 level=error err="yaml: unmarshal errors:\n  line 3: field ttl not found in type queryfrontend.RedisResponseCacheConfig\ninitializing the query range cache config\nmain.runQueryFrontend\n\t/app/cmd/thanos/query_frontend.go:234\nmain.registerQueryFrontend.func1\n\t/app/cmd/thanos/query_frontend.go:160\nmain.main\n\t/app/cmd/thanos/main.go:133\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650\npreparing query-frontend command failed\nmain.main\n\t/app/cmd/thanos/main.go:135\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650"
MichaHoffmann commented 10 months ago

Oh damn, did it work on 0.32.5? I think we unified cache configs but frontend parses stricter then bucket caches.

hagaibarel commented 10 months ago

I haven't tried in with 0.32.5, AFAIK this was added in https://github.com/thanos-io/thanos/pull/6773 which was part of 0.33.0

ahurtaud commented 10 months ago

Hello @hagaibarel ,

The query-frontend does not have a ttl config for its cache, but an expiration :/ https://thanos.io/tip/components/query-frontend.md/#redis

hagaibarel commented 10 months ago

Thanks @ahurtaud for clearing that up, are you aware of any plans to unify the configuration?

ahurtaud commented 10 months ago

Thanks @ahurtaud for clearing that up, are you aware of any plans to unify the configuration?

no, not aware. it should be done I think to avoid confusion

MichaHoffmann commented 10 months ago

I agree! Do you mind opening a pull request for it?

ahurtaud commented 10 months ago

I dont feel I have the time currently to do it sorry. Also, that would mean a breaking change now that 0.33 is released? I am not sure we want to do it right now. Maybe a refactor of the cache configs so they use the same code (query-frontend and store).

MichaHoffmann commented 10 months ago

I dont feel I have the time currently to do it sorry. Also, that would mean a breaking change now that 0.33 is released? I am not sure we want to do it right now. Maybe a refactor of the cache configs so they use the same code (query-frontend and store).

I would accept both and then error if both were provided i think. Ill see if i can muster some time on the weekend!

yeya24 commented 9 months ago

We should definitely unify configurations at some point. The current QFE cache config is from Cortex code so not the same cache client code.