Open · duj4 opened this issue 3 months ago
I am also facing this error, using the grafana/loki Helm chart 6.8.0 with app version 3.1.0:
```
level=error ts=2024-08-08T05:23:38.230753658Z caller=http.go:107 msg="error getting delete requests from the store" err="unexpected status code: 404"
ts=2024-08-08T05:23:38.230776322Z caller=spanlogger.go:109 user=fake level=error msg="failed loading deletes for user" err="unexpected status code: 404"
```
The Loki config is below:

```yaml
auth_enabled: false
chunk_store_config:
  chunk_cache_config:
    embedded_cache:
      enabled: false
    memcached:
      batch_size: 100
      expiration: 30m
      parallelism: 100
    memcached_client:
      consistent_hash: true
      host: memcached-chunk.loki.svc.cluster.local
      service: memcached-chunk
  write_dedupe_cache_config:
    memcached:
      batch_size: 100
      expiration: 30m
      parallelism: 100
    memcached_client:
      consistent_hash: true
      host: memcached-write.loki.svc.cluster.local
      service: memcached-write
common:
  compactor_address: http://loki-read:3100
  path_prefix: /var/loki
  replication_factor: 1
  ring:
    kvstore:
      store: memberlist
  storage:
    s3:
      bucketnames: loki-data
      insecure: false
      region: eu-central-1
      s3forcepathstyle: false
compactor:
  delete_request_store: s3
  retention_enabled: true
frontend:
  compress_responses: true
  log_queries_longer_than: 20s
  max_outstanding_per_tenant: 4096
frontend_worker:
  grpc_client_config:
    max_recv_msg_size: 50331648
    max_send_msg_size: 50331648
ingester:
  chunk_encoding: snappy
  chunk_idle_period: 15m
  chunk_retain_period: 30s
  chunk_target_size: 1572864
  max_chunk_age: 1h
ingester_client:
  grpc_client_config:
    grpc_compression: snappy
    max_recv_msg_size: 50331648
    max_send_msg_size: 50331648
limits_config:
  ingestion_burst_size_mb: 1000
  ingestion_rate_mb: 1000
  max_cache_freshness_per_query: 10m
  max_query_parallelism: 2
  max_query_series: 2000
  per_stream_rate_limit: 20MB
  per_stream_rate_limit_burst: 20MB
  query_timeout: 2m
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  retention_period: 8760h
  split_queries_by_interval: 15m
  deletion_mode: filter-and-delete
memberlist:
  join_members:
```
@Hitesh-Agrawal it seems that your error is different from mine. Which mode are you using to deploy your Loki stack?
@duj4 I am running it with `deploymentMode: SimpleScalable`. The storage is in AWS S3.
ok, if that is the case, I think you may need to modify your compactor address per https://github.com/grafana/loki/blob/9315b3d03d790506cf8e69fb7407b476de9d0ed6/production/helm/loki/templates/_helpers.tpl#L1000
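For illustration, here is a minimal sketch of that override in the Loki config. The service name and port are assumptions: the address should point at whichever service actually runs the compactor target, which is loki-backend in the chart's default SSD layout, or loki-read if no backend pods are deployed.

```yaml
# Sketch only: point compactor_address at the service that runs the compactor
# target. The service name below assumes the chart's default SSD layout,
# where the backend pods run the compactor.
common:
  compactor_address: http://loki-backend.loki.svc.cluster.local:3100
```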
@duj4 The compactor address is already set as needed. I am not using any backend targets and only have loki-read and loki-write pods with loki-gateway:

```yaml
common:
  compactor_address: http://loki-read:3100/
```
@Hitesh-Agrawal ok, so it is a mixed config of SSD and distributed LOL, which is beyond my knowledge, sorry man
Configuration questions have a better chance of being answered if you ask them on the community forums.
Describe the bug
I am running Loki 3.1.0 in SSD mode with `retention_enabled` set to `true`, but after the stack has been up and running for a while, the loki-read pod starts complaining with the error below:

Per https://grafana.com/docs/loki/latest/operations/troubleshooting/#cache-generation-errors, I found that the metric `loki_delete_cache_gen_load_failures_total` is larger than 1, and the docs say to set `allow_deletes` to `true`. However, that flag has been marked as `deprecated` in the current version, and its replacement, `deletion_mode`, is already set to `filter-and-delete`:

compactor:
limits:
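For context, this is a minimal sketch of the compactor and limits_config settings involved in retention with filter-and-delete; the values are illustrative, not my exact config.

```yaml
# Illustrative sketch of retention with filter-and-delete; values are examples.
compactor:
  retention_enabled: true           # turn on the retention/deletion loop
  delete_request_store: s3          # persists delete requests; needed when retention is enabled in Loki 3.x

limits_config:
  retention_period: 8760h           # example global retention
  deletion_mode: filter-and-delete  # replacement for the deprecated allow_deletes
```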
If `deletion_mode` has been set in limits_config, do I have to set it again in runtime_config for each tenant? And since `allow_deletes` has been marked as deprecated, do I still need to set it to `true`?
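For reference, this is how I understand per-tenant overrides would be wired up via runtime_config; the file path is hypothetical, and `fake` is the tenant ID used when auth_enabled is false.

```yaml
# Sketch of per-tenant overrides via runtime_config; the path is hypothetical.
runtime_config:
  file: /etc/loki/runtime-overrides.yaml

# Contents of /etc/loki/runtime-overrides.yaml:
# only needed for tenants that should deviate from the limits_config defaults.
overrides:
  fake:
    deletion_mode: filter-and-delete
```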
To Reproduce
Steps to reproduce the behavior:
Expected behavior
There should be no such error logged if `filter-and-delete` is set correctly.

Environment: