grafana / mimir

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
https://grafana.com/oss/mimir/
GNU Affero General Public License v3.0

feat: add support for docker `healthcheck` to `grafana/mimir` image #9034

Closed (DeadNews closed this issue 2 months ago)

DeadNews commented 2 months ago

Is your feature request related to a problem? Please describe.

Enable the healthcheck for the docker container.

Describe the solution you'd like

Add wget to the docker image via busybox.

Like loki does:

healthcheck:
  test:
    [
      "CMD-SHELL",
      "wget --no-verbose --tries=1 --spider http://localhost:3100/ready || exit 1",
    ]
  interval: 10s
  timeout: 5s
  retries: 5

Describe alternatives you've considered

Use internal commands to check the health of the service.

Like traefik, redis, postgres, minio, and others do:

healthcheck:
  test: [CMD, traefik, healthcheck, --ping]
  interval: 1m
  retries: 3
  timeout: 10s
  start_period: 1m

or this way.

Additional context

ref: https://github.com/grafana/loki/issues/11590 https://github.com/grafana/loki/pull/11711

narqo commented 2 months ago

There is a related discussion in #3202 about publishing a "debug" version of the image that ships with a shell. So far, there aren't many strong arguments for doing it.

If your use case requires running Mimir under an orchestrator that requires running a health check from inside the container, consider using the grafana/mimir-alpine image. It comes with the shell and wget. The main grafana/mimir is expected to run in a production-like environment, and come with as few external dependencies as possible, to minimize both hypothetical attack surface, and the noise from different CVE-watching tooling.

DeadNews commented 2 months ago

There is a related discussion in #3202 about publishing a "debug" version of the image that ships with a shell. So far, there aren't many strong arguments for doing it.

mimir-debug image to use for debugging purposes (with all our tooling inside)

I don't see what this has to do with regular healthcheck.

If your use case requires running Mimir under an orchestrator that requires running a health check from inside the container,

This is how docker works. How can it be done differently? Not in kubernetes.

consider using the grafana/mimir-alpine image. It comes with the shell and wget.

https://github.com/grafana/mimir/blob/970fe27e1b9a11e474db8cfbdbebdc5289f63b35/CHANGELOG.md?plain=1#L207

mimir-alpine is deprecated.

The main grafana/mimir is expected to run in a production-like environment

Without health checks? And other grafana/images are not for production? They support healthchecks by default.

and come with as few external dependencies as possible, to minimize both hypothetical attack surface, and the noise from different CVE-watching tooling.

The alternative is to implement an internal command to check the health of the service, as noted in the issue.

pstibrany commented 2 months ago

Hello. Our only supported target for deployment is Kubernetes. We have deliberately based the default Mimir images on distroless with as few additional dependencies as possible, to reduce the attack surface and the number of CVEs reported for unrelated components (e.g., OpenSSL). We do not plan to add more dependencies to grafana/mimir images.

If your use case requires more tools, it should be easy enough to build a custom Mimir image.

As @narqo said above, for now we still publish grafana/mimir-alpine images, which you can use as a base image, although we plan to eventually deprecate those.