thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io

Thanos stores are unable to share the same redis database (bucket_cache) #6939

Open Chr0my opened 10 months ago

Chr0my commented 10 months ago

Hello guys,

I'm currently rolling out the Redis index cache & bucket_cache on some of our store gateways, and I'm seeing some weird behavior when enabling the Redis bucket_cache.

The Redis setup is a 3-node cluster running version 7:

Redis server v=7.0.11 sha=00000000:0 malloc=jemalloc-5.3.0 bits=64 build=c4e7f6bf175a885b

Please find below the cache configs that I use:

Index cache config:

type: REDIS
config:
  addr: thns-cache1.domain:6379,thns-cache2.domain:6379,thns-cache3.domain:6379
  db: 0
  dial_timeout: 10s
  read_timeout: 5s
  write_timeout: 5s
  max_get_multi_concurrency: 100
  get_multi_batch_size: 100
  max_set_multi_concurrency: 100
  set_multi_batch_size: 100
  tls_enabled: false
  cache_size: 2GB
  expiration: 24h0m0s
  username: default
  password: SuperPassword

Bucket_cache config:

type: REDIS
config:
  addr: thns-cache1.domain:6379,thns-cache2.domain:6379,thns-cache3.domain:6379
  db: 0
  dial_timeout: 10s
  read_timeout: 5s
  write_timeout: 5s
  max_get_multi_concurrency: 100
  get_multi_batch_size: 100
  max_set_multi_concurrency: 100
  set_multi_batch_size: 100
  tls_enabled: false
  cache_size: 2GB
  expiration: 24h0m0s
  username: default
  password: SuperPassword

chunk_subrange_size: 16000
max_chunks_get_range_requests: 3
chunk_object_attrs_ttl: 24h
chunk_subrange_ttl: 24h
blocks_iter_ttl: 5m
metafile_exists_ttl: 2h
metafile_doesnt_exist_ttl: 15m
metafile_content_ttl: 24h
metafile_max_size: 1MiB

Here's the initial status of the loaded blocks before enabling cache & bucket_cache:


Here's what I see after enabling it:


... and I'm getting the following logs:

Logs

```
Nov 29 10:16:21 thns-store.domain thanos[305695]: ts=2023-11-29T10:16:21.355740843Z caller=bucket.go:681 level=warn msg="loading block failed" elapsed=11.963466ms id=01HGCQ5GH923J1A0FKTW3FGC22 err="create index header reader: write index header: new index reader: get object attributes of 01HGCQ5GH923J1A0FKTW3FGC22/index: The specified key does not exist."
Nov 29 10:16:21 thns-store.domain thanos[305695]: ts=2023-11-29T10:16:21.355784695Z caller=bucket.go:681 level=warn msg="loading block failed" elapsed=11.53174ms id=01HGBDZ543XFFB4VQDP9MJREYD err="create index header reader: write index header: new index reader: get object attributes of 01HGBDZ543XFFB4VQDP9MJREYD/index: The specified key does not exist."
Nov 29 10:16:21 thns-store.domain thanos[305695]: ts=2023-11-29T10:16:21.355849129Z caller=bucket.go:681 level=warn msg="loading block failed" elapsed=11.693688ms id=01HGCG9SB4Q62KZ1YGY4TRY77D err="create index header reader: write index header: new index reader: get object attributes of 01HGCG9SB4Q62KZ1YGY4TRY77D/index: The specified key does not exist."
Nov 29 10:16:21 thns-store.domain thanos[305695]: ts=2023-11-29T10:16:21.355894902Z caller=bucket.go:681 level=warn msg="loading block failed" elapsed=11.84724ms id=01HGCY17NT22Z59X03JTE5AZ69 err="create index header reader: write index header: new index reader: get object attributes of 01HGCY17NT22Z59X03JTE5AZ69/index: The specified key does not exist."
```

Out of curiosity, I wanted to see if I was able to reproduce it exactly, so I did the same operation one more time:

This time, when the store restarted, everything looked just fine 🤷‍♂️

Thanos version:

thanos, version 0.32.4 (branch: HEAD, revision: fcd5683e3049924ae26a680e166ae6f27a344896)
  build user:       root@afb5016d2fc4
  build date:       20231002-07:45:12
  go version:       go1.20.8
  platform:         linux/amd64
  tags:             netgo

What you expected to happen:

I expect all blocks to load without issues when the bucket_cache is enabled.

A second point, which may or may not be related (if not, my apologies, I can create another issue): I then tried to load some metrics (last 3d) and saw the following in the logs:

```
Nov 29 11:00:35 thns-store.domain thanos[306234]: ts=2023-11-29T11:00:35.149732061Z caller=memcached.go:164 level=error msg="failed to cache series in memcached" err="the async buffer is full"
Nov 29 11:00:35 thns-store.domain thanos[306234]: ts=2023-11-29T11:00:35.149828754Z caller=memcached.go:164 level=error msg="failed to cache series in memcached" err="the async buffer is full"
Nov 29 11:00:35 thns-store.domain thanos[306234]: ts=2023-11-29T11:00:35.149927293Z caller=memcached.go:164 level=error msg="failed to cache series in memcached" err="the async buffer is full"
Nov 29 11:00:35 thns-store.domain thanos[306234]: ts=2023-11-29T11:00:35.15002694Z caller=memcached.go:164 level=error msg="failed to cache series in memcached" err="the async buffer is full"
Nov 29 11:00:35 thns-store.domain thanos[306234]: ts=2023-11-29T11:00:35.150109701Z caller=memcached.go:164 level=error msg="failed to cache series in memcached" err="the async buffer is full"
```

I'm not using any memcached in the configs, so I tried to find where this could come from, and I discovered that it might be related to the NewRemoteIndexCache function that is called from factory.go and lives in memcached.go.

Is it really trying to call memcached even when using redis? Or is it just a matter of logging, and it's the async buffer of the redis implementation that needs to be increased? In both scenarios (if I'm not mistaken), a small explanation/clarification is needed here, because it's confusing. 🤔

Environment:

Thanks a lot! 🙏

MichaHoffmann commented 10 months ago

Same for us

MichaHoffmann commented 10 months ago

I think the memcached thing is only cosmetic because the remote index cache is situated in a file called memcached.go

MichaHoffmann commented 10 months ago

The Redis cache should have a config option "max_async_buffer_size", which does feel related. Why blocks are not loading because of it is a mystery, though.
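
For anyone hitting the "async buffer is full" errors, here is a minimal sketch of what bumping that option could look like in the Redis cache config, reusing the settings from the original post (the `max_async_concurrency` field and both values are illustrative assumptions on my part; please verify the field names against the cache docs for your Thanos version):

```yaml
type: REDIS
config:
  addr: thns-cache1.domain:6379,thns-cache2.domain:6379,thns-cache3.domain:6379
  username: default
  password: SuperPassword
  # Queue feeding asynchronous SET operations; "the async buffer is full"
  # errors suggest it is being exhausted under load (illustrative value).
  max_async_buffer_size: 100000
  # Number of workers draining that queue (assumed field, illustrative value).
  max_async_concurrency: 50
```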

Chr0my commented 10 months ago

Thanks for your answer @MichaHoffmann.

> I think the memcached thing is only cosmetic because the remote index cache is situated in a file called memcached.go

Yes, that was my first feeling, but I wasn't 100% sure. Thank you.

> Why blocks are not loading because of it is a mystery, though.

Actually, I'm not sure the logs and the failing block loads are linked, as I only ran the queries on stores that had started successfully with all blocks.

> The Redis cache should have a config option "max_async_buffer_size", which does feel related.

Exactly, I played with it quickly and haven't found the right setup ATM. I can try to tweak the conf, but I won't go deeper right now until the initial issue is fixed; we currently have ~40 different stores to set up and I cannot afford to lose them randomly 😄

I'd be glad to run any tests you need to help track down this issue 👍

MichaHoffmann commented 10 months ago

We also face the same issue (though sporadically); I wonder if we just hit the redis cache too hard and, because of network latency, the buffer fills too quickly.

Chr0my commented 10 months ago

Hi,

I did another round of tests with the log level increased to debug and focused on a specific block.

Dec 04 10:08:08 thns-store.domain thanos[309001]: ts=2023-12-04T10:08:08.597954706Z caller=bucket.go:672 level=debug msg="loading new block" id=01HGK3X3JE5XYB796KRXD4HDA9
Dec 04 10:08:08 thns-store.domain thanos[309001]: ts=2023-12-04T10:08:08.598146159Z caller=binary_reader.go:536 level=debug msg="failed to read index-header from disk; recreating" path=/mnt/space1/thanos/store/01HGK3X3JE5XYB796KRXD4HDA9/index-header err="try lock file: open /mnt/space1/thanos/store/01HGK3X3JE5XYB796KRXD4HDA9/index-header: no such file or directory"
Dec 04 10:08:08 thns-store.domain thanos[309001]: ts=2023-12-04T10:08:08.743357367Z caller=bucket.go:681 level=warn msg="loading block failed" elapsed=145.403965ms id=01HGK3X3JE5XYB796KRXD4HDA9 err="create index header reader: write index header: new index reader: get object attributes of 01HGK3X3JE5XYB796KRXD4HDA9/index: The specified key does not exist."
Dec 04 10:08:08 thns-store.domain thanos[309001]: ts=2023-12-04T10:08:08.786764973Z caller=bucket.go:672 level=debug msg="loading new block" id=01HGK3X3JE5XYB796KRXD4HDA9
Dec 04 10:08:08 thns-store.domain thanos[309001]: ts=2023-12-04T10:08:08.786894177Z caller=binary_reader.go:536 level=debug msg="failed to read index-header from disk; recreating" path=/mnt/space1/thanos/store/01HGK3X3JE5XYB796KRXD4HDA9/index-header err="try lock file: open /mnt/space1/thanos/store/01HGK3X3JE5XYB796KRXD4HDA9/index-header: no such file or directory"

Could it be more global than just redis? I've read this issue and we can observe the same kind of logs using memcached 🤔 Sadly it was closed without a real answer.

Chr0my commented 10 months ago

Thanks to the issue mentioned above ☝️ I realized that the failing blocks come from other stores. After ensuring that only one store uses the redis cluster, it restarted without any issue.

@MichaHoffmann Can you tell me if you are in the same situation, with more than one store using the same cluster?

Is it a known/intended behavior that the cache system cannot be shared across different stores, or can it be marked as a bug? That's not very practical if we have to deploy a cluster for each store 😕
Edit: I'll try to run some tests using a different redis db per store.
Edit 2: I'm unable to select a different database, as mentioned here 😞

MichaHoffmann commented 10 months ago

The cache should be shared across different deployments; this is for sure a bug, I think.

MichaHoffmann commented 10 months ago

Ah, so we are using redis only for the index cache, that's why I don't see the bucket meta sync issues, I think! For the bucket cache we use groupcache, which works fine as far as I can tell!

Chr0my commented 10 months ago

Yes, it looks like only bucket_cache is affected; I didn't notice any trouble using only the index cache, but my tests didn't last long because of the issue mentioned.

A potential solution could be to add some kind of storeID that would prefix the keys in the cache, so each store would only retrieve the keys that belong to it 🤔 And this would work for either memcached or redis, no?

On our side, right now, we will go back to deploying simple redis instances and use a dedicated database for each store.
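
As a rough sketch of that workaround (the hostname is a placeholder, and this assumes standalone Redis instances, since logical databases other than 0 are not available in cluster mode), each store would get its own bucket_cache config along these lines:

```yaml
# Bucket cache config for store A -- points at its own logical database.
type: REDIS
config:
  addr: thns-cache-standalone.domain:6379
  db: 1
  username: default
  password: SuperPassword

# Store B would keep the same layout but point at db: 2 (or at a different
# standalone instance entirely), so the two stores' keys never collide.
```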

About groupcache, if I understand correctly, a group should only contain stores that point to the same object storage; on our side that would mean multiplying the number of stores by at least 2 or 3 (from ~40 to 80 or 120). I'm not sure it's worth it.

GiedriusS commented 5 months ago

I added the path to the hash here https://github.com/thanos-io/thanos/pull/7158, so this should help, given that you have a separate file for specifying the bucket cache configuration. The long-term fix would be to add a field like `name` to the cache config.
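
If I read that correctly, the practical upshot is that each store gateway should reference its own bucket-cache file (for example via `--store.caching-bucket.config-file`), since the file path now feeds into the hash. A hedged sketch, with the suggested `name` field shown purely as a hypothetical future option:

```yaml
# /etc/thanos/bucket-cache-store-a.yaml -- one file per store gateway; store B
# points at its own copy so the hashed paths (and thus the cache keys) differ.
type: REDIS
config:
  addr: thns-cache1.domain:6379,thns-cache2.domain:6379,thns-cache3.domain:6379
  username: default
  password: SuperPassword
# Hypothetical long-term option mentioned above -- not implemented yet:
# name: store-a
```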

gjshao44 commented 3 weeks ago

I ran into a similar issue with memcached: with two or more stores, long-term metrics would eventually become unusable. I have reverted to the in-memory cache for the bucket for now. I would definitely appreciate this fix being prioritized.

kaiohenricunha commented 1 day ago

I'm having the same issue on a single-instance ElastiCache Redis with the index cache on db 0 and the bucket cache on db 1. So the problem is not that they're sharing the same database.

Changing the bucket cache to in-memory, as mentioned by @gjshao44, solved it.
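
For reference, the in-memory fallback is just a different `type` in the same caching-bucket configuration; here is a minimal sketch (the sizes are illustrative, and the top-level chunk_*/metafile_* options from the earlier Redis example carry over unchanged):

```yaml
type: IN-MEMORY
config:
  max_size: 2GB         # total memory the bucket cache may use
  max_item_size: 128MiB # largest single item that will be cached
```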