stefanprodan / dockprom

Docker hosts and containers monitoring with Prometheus, Grafana, cAdvisor, NodeExporter and AlertManager
MIT License

Alerting for any docker container #268

Closed r3nor closed 1 year ago

r3nor commented 2 years ago

I have several servers running similar stacks (e.g. nginx, wordpress...). On every machine, the stacks have the same names. I am trying to set up an alert that fires if any container on any server goes down.

Imagine machines A and B, both running a compose file with nginx and wordpress. If nginx on machine A has a problem, I want to be notified. I don't want to create an alert for each machine and each container, as I have many more than two machines. I want a single alert that fires if ANY container on ANY server is down. Ideally I could also extract the last() data so I know which instance is down.

Is there any way to achieve this?

philyuchkoff commented 2 years ago

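Try using the Prometheus function absent(), for example (Prometheus 2.x rule file syntax):

groups:
  - name: containers
    rules:
      - alert: nginx_absent
        # absent() returns a value only when no series matches the selector.
        expr: absent(container_cpu_usage_seconds_total{com_docker_compose_service="nginx"})
        for: 5s
        labels:
          severity: critical
        annotations:
          # absent() keeps only labels from equality matchers, so instance and job are not available here.
          summary: "Service {{ $labels.com_docker_compose_service }} down"
          description: "Service {{ $labels.com_docker_compose_service }} has been absent for more than 5 sec."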

Maybe this will work for you.
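
One limitation worth keeping in mind: absent() yields a single series carrying only the labels from equality matchers in the selector, so it fires only when the series is missing on every host and cannot say which host lost the container. A rough sketch of one alternative is to compare the current series against those seen a few minutes earlier. The group name, alert name, and 10-minute window below are assumptions for illustration, and the sketch assumes cAdvisor exposes the container name in the `name` label.

```yaml
groups:
  - name: container_presence        # hypothetical group name
    rules:
      - alert: ContainerMissing     # hypothetical alert name
        # Series that existed 10 minutes ago but have no current sample.
        # `unless` matches on the full label set, so instance and name
        # are kept on the result and can be used in annotations.
        expr: |
          container_last_seen{name!=""} offset 10m
            unless
          container_last_seen{name!=""}
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Container {{ $labels.name }} missing on {{ $labels.instance }}"
```

With this shape the alert resolves on its own once the offset window moves past the point where the container disappeared, and it also fires for every container on a host whose cAdvisor target stops being scraped, so the window and the `for` duration would need tuning.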

philyuchkoff commented 2 years ago

or another option:

groups:
  - name: containers
    rules:
      - alert: ContainerKilled
        # Fires for each container that cAdvisor has not seen for more than 60 seconds.
        expr: time() - container_last_seen > 60
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Container killed (instance {{ $labels.instance }})"
          description: "A container has disappeared\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
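
Because this expression names no host or container, it is evaluated against every container_last_seen series Prometheus holds, and the instance label on each firing alert identifies the machine. That only covers several servers if one Prometheus scrapes cAdvisor on all of them. A minimal sketch of such a scrape job follows; the hostnames are hypothetical, and port 8080 assumes cAdvisor is published on its usual port on each machine.

```yaml
scrape_configs:
  - job_name: cadvisor
    scrape_interval: 15s
    static_configs:
      - targets:
          - machine-a.example:8080   # hypothetical host A
          - machine-b.example:8080   # hypothetical host B
```

Each target becomes a distinct instance label value, so a single rule can report which host lost a container without one rule per machine.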

nightah commented 1 year ago

Some examples have been provided so I'm going to close this issue. If you're still having issues please explain what you have tried and we may be able to assist.