Closed: cbartz closed this issue 2 weeks ago
Hi @cbartz
Please, may we have a screenshot of the Loki dashboard as well as the content of the Loki config file?
@Abuelodelanada Sure
The dashboard definition: https://github.com/canonical/github-runner-operator/blob/b9dce2243310ef2c627fe7244b62824e5d44b541/src/grafana_dashboards/metrics.json
Loki charm config:
juju config loki cpu='10000'
juju config loki memory='2Gi'
juju config loki retention-period='30'
juju config loki trust='True'
Loki config:
~$ kubectl exec loki-0 -c loki -it -- bash
root@loki-0:/# cat /etc/loki/loki-local-config.yaml
auth_enabled: false
chunk_store_config:
  chunk_cache_config:
    embedded_cache:
      enabled: true
common:
  path_prefix: /loki
  replication_factor: 1
  ring:
    instance_addr: loki-0.loki-endpoints.prod-cos-k8s-ps6-is-charms.svc.cluster.local
    kvstore:
      store: inmemory
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
compactor:
  retention_enabled: true
  shared_store: filesystem
  working_directory: /loki/compactor
frontend:
  compress_responses: true
  max_outstanding_per_tenant: 8192
ingester:
  wal:
    dir: /loki/chunks/wal
    enabled: true
    flush_on_shutdown: true
limits_config:
  ingestion_burst_size_mb: 15.0
  ingestion_rate_mb: 4.0
  per_stream_rate_limit: 4MB
  per_stream_rate_limit_burst: 15MB
  retention_period: 30d
  split_queries_by_interval: '0'
querier:
  max_concurrent: 20
query_range:
  parallelise_shardable_queries: false
  results_cache:
    cache:
      embedded_cache:
        enabled: true
ruler:
  alertmanager_url: http://alertmanager-0.alertmanager-endpoints.prod-cos-k8s-ps6-is-charms.svc.cluster.local:9093
  enable_alertmanager_v2: true
  external_url: https://cos-ps6.is-devops.canonical.com/prod-cos-k8s-ps6-is-charms-loki-0
schema_config:
  configs:
  - from: '2020-10-24'
    index:
      period: 24h
      prefix: index_
    object_store: filesystem
    schema: v11
    store: boltdb-shipper
server:
  http_listen_address: 0.0.0.0
  http_listen_port: 3100
storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    shared_store: filesystem
  filesystem:
    directory: /loki/chunks
target: all
It feels like this is a Loki (workload) bug:
Can you confirm that the chunk files in the Loki workload container are all non-zero size? If you do not have empty chunks, then perhaps deleting the time range for when the error occurs may "solve" the issue.
https://grafana.com/docs/loki/latest/reference/loki-http-api/#request-log-deletion
This can also be "solved" once the affected chunks are deleted by the retention period.
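Something along these lines could be used for both checks (a rough sketch, assuming curl is available in the workload container and that the delete API is enabled for this deployment; the stream selector and timestamps are placeholders):
# Look for zero-byte chunk files in the Loki workload container
kubectl exec loki-0 -c loki -it -- find /loki/chunks -type f -size 0
# Request deletion of the affected time range (epoch seconds) via the Loki delete API
kubectl exec loki-0 -c loki -it -- \
  curl -s -G -X POST 'http://localhost:3100/loki/api/v1/delete' \
    --data-urlencode 'query={juju_application=~".+"}' \
    --data-urlencode 'start=1714521600' \
    --data-urlencode 'end=1714608000'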
Besides, this issue seems to be fixed in a newer version of Loki that we might update to.
Closing this; feel free to re-open if something new occurs.
Bug Description
We have a Grafana dashboard that uses Loki queries.
Today (and in the past https://matrix.to/#/!nHXpRkcSNJHlHdUGbQ:ubuntu.com/$5ZtI4_oARaV2Dlzz6UonBQ6-vSX62d31PXPZhBDKFUg?via=ubuntu.com&via=matrix.org) many panels failed to render because Loki responded with "invalid chunk checksum size".
Changing the time range made the dashboard work again.
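The error can also be reproduced outside Grafana by querying Loki's HTTP API directly over the affected time range, for example (a minimal sketch, assuming curl is available in the workload container; the stream selector and timestamps are placeholders, the real queries are in the dashboard definition):
kubectl exec loki-0 -c loki -it -- \
  curl -sG 'http://localhost:3100/loki/api/v1/query_range' \
    --data-urlencode 'query={juju_application=~".+"}' \
    --data-urlencode 'start=2024-05-01T00:00:00Z' \
    --data-urlencode 'end=2024-05-01T06:00:00Z' \
    --data-urlencode 'limit=10'
For a failing range, the response body contains the "invalid chunk checksum size" error instead of log lines.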
There were failing liveness probes in that time range:
(https://pastebin.canonical.com/p/Hqphq85CyS/)
To Reproduce
Usual COS deployment.
Environment
juju
3.1.8
Relevant log output
Additional context
No response