This charmed operator is part of automating the operational procedures of running Grafana Mimir, an open-source metrics backend, in microservices mode.
Environment
Model  Controller  Cloud/Region        Version  SLA          Timestamp
mimir  microk8s    microk8s/localhost  3.4.0    unsupported  08:52:20-03:00

App            Version  Status  Scale  Charm                  Channel  Rev  Address         Exposed  Message
coord                   active      1  mimir-coordinator-k8s            37  10.152.183.60   no
mimir          2.10.0   active      1  mimir-worker-k8s                  4  10.152.183.133  no
prom           2.50.1   active      1  prometheus-k8s         edge     173  10.152.183.101  no
s3-integrator           active      1  s3-integrator          edge      17  10.152.183.171  no

Unit              Workload  Agent  Address       Ports  Message
coord/0*          active    idle   10.1.200.104
mimir/0*          active    idle   10.1.200.96
prom/0*           active    idle   10.1.200.91
s3-integrator/0*  active    idle   10.1.200.108

Integration provider               Requirer                           Interface            Type     Message
coord:mimir-cluster                mimir:mimir-cluster                mimir_cluster        regular
coord:self-metrics-endpoint        prom:metrics-endpoint              prometheus_scrape    regular
prom:prometheus-peers              prom:prometheus-peers              prometheus_peers     peer
s3-integrator:s3-credentials       coord:s3                           s3                   regular
s3-integrator:s3-integrator-peers  s3-integrator:s3-integrator-peers  s3-integrator-peers  peer
Relevant log output
ts=2024-04-11T11:49:53.788253976Z caller=sanity_check.go:39 level=info msg="Checking object storage config"
ts=2024-04-11T11:49:55.813866533Z caller=seed.go:127 level=warn msg="failed to read cluster seed file from object storage" err="Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:49:55.814834682Z caller=sanity_check.go:115 level=warn msg="Unable to successfully connect to configured object storage (will retry)" err="2 errors: blocks storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host; ruler storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:49:57.161752961Z caller=sanity_check.go:115 level=warn msg="Unable to successfully connect to configured object storage (will retry)" err="2 errors: blocks storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host; ruler storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:49:57.591882368Z caller=seed.go:127 level=warn msg="failed to read cluster seed file from object storage" err="Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:49:59.940176817Z caller=sanity_check.go:115 level=warn msg="Unable to successfully connect to configured object storage (will retry)" err="2 errors: blocks storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host; ruler storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:50:01.2347005Z caller=seed.go:127 level=warn msg="failed to read cluster seed file from object storage" err="Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:50:04.403887627Z caller=sanity_check.go:115 level=warn msg="Unable to successfully connect to configured object storage (will retry)" err="2 errors: blocks storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host; ruler storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:50:07.125160902Z caller=seed.go:127 level=warn msg="failed to read cluster seed file from object storage" err="Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:50:08.474983073Z caller=sanity_check.go:115 level=warn msg="Unable to successfully connect to configured object storage (will retry)" err="2 errors: blocks storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host; ruler storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:50:12.652371086Z caller=sanity_check.go:115 level=warn msg="Unable to successfully connect to configured object storage (will retry)" err="2 errors: blocks storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host; ruler storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:50:17.319101362Z caller=sanity_check.go:115 level=warn msg="Unable to successfully connect to configured object storage (will retry)" err="2 errors: blocks storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host; ruler storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:50:19.67827222Z caller=seed.go:127 level=warn msg="failed to read cluster seed file from object storage" err="Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
ts=2024-04-11T11:50:21.970860115Z caller=sanity_check.go:115 level=warn msg="Unable to successfully connect to configured object storage (will retry)" err="2 errors: blocks storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host; ruler storage: unable to successfully send a request to object storage: Get \"https://endpoint/mimir/?location=\": dial tcp: lookup endpoint on 10.152.183.10:53: no such host"
Additional context
This is probably happening because the S3 connection details (credentials, endpoint) we receive from s3-integrator are invalid.
It would be helpful if the mimir charm could detect when the workload's S3 storage is misconfigured. Maybe we can do this through health checks, or we might need something more advanced (grep logs? other?).
s3-integrator could validate the credentials before passing them to us, but every other S3 provider would then need the same feature.
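As a rough illustration of the "grep logs" idea, the sketch below scans log text for the warning that appears in the log output of this report and maps it to a suggested unit status. The function names and the status message are hypothetical. This is not how the charm works today, and how the charm would obtain the workload log is left open.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and exits 0 when it contains
# Mimir's object-storage connection warning, 1 otherwise.
object_storage_unreachable() {
    grep -q 'Unable to successfully connect to configured object storage'
}

# Hypothetical mapping from the check result to a status line.
# The wording of the message is an assumption, not the charm's actual output.
suggest_status() {
    if object_storage_unreachable; then
        echo "blocked: cannot reach S3 object storage; check the s3-integrator configuration"
    else
        echo "active"
    fi
}
```

Old warnings stay in the log after the storage becomes reachable again, so a real check would probably need to look only at recent lines, or rely on a proper health endpoint instead.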
Bug Description
A misconfiguration in s3-integrator causes the Mimir worker to stop working, but it still reports active status. The Mimir worker should be in blocked status with a meaningful message.
To Reproduce
Deploy coordinator:
juju deploy ./*.charm coord --resource nginx-image=ubuntu/nginx:1.18-22.04_beta --resource nginx-prometheus-exporter-image=nginx/nginx-prometheus-exporter:1.1.0 --trust
Deploy worker:
juju deploy ./*.charm mimir --resource mimir-image=ubuntu/mimir:2.10.0-22.04 --trust --config all=True
Deploy s3-integrator:
juju deploy s3-integrator --channel edge --trust
Deploy prometheus:
juju deploy prometheus-k8s prom --channel edge --trust
Configure s3-integrator:
juju run s3-integrator/leader sync-s3-credentials access-key=AccessKey secret-key=SecretKey bucket="mimir" endpoint="endpoint"
Relate coord to mimir:
juju relate coord mimir
Relate coord to prometheus:
juju relate prom:metrics-endpoint coord:self-metrics-endpoint
Check in prometheus that the scrape job is scrapeable.
Relate coord to s3-integrator:
juju relate coord s3-integrator
Check in prometheus that the scrape job is NOT scrapeable.
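The scrape check can also be done from the command line against the Prometheus HTTP API (/api/v1/targets), which reports a health field for each active target. A minimal sketch, assuming the response is the compact JSON that Prometheus normally emits; count_down_targets is a hypothetical helper, and a JSON-aware tool such as jq would be more robust than grep.

```shell
#!/bin/sh
# Hypothetical helper: reads a /api/v1/targets response on stdin and prints
# how many targets are reported with "health":"down".
count_down_targets() {
    grep -o '"health":"down"' | wc -l | tr -d ' '
}

# Example invocation (the address is a placeholder and port 9090 is the
# Prometheus default, assumed here):
#   curl -s http://<prom-unit-address>:9090/api/v1/targets | count_down_targets
```

A result of 0 before relating coord to s3-integrator and a non-zero result afterwards would match the behaviour described in the steps above.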