eclipse-mosquitto / mosquitto

Eclipse Mosquitto - An open source MQTT broker
https://mosquitto.org

healthcheck #1270

Open frank3427 opened 5 years ago

frank3427 commented 5 years ago

Does anyone have a healthcheck for the container, or is one already included in the image? How would I use it in docker-compose?

Helloworld-zyt commented 4 years ago

i want to know

mozzhead164 commented 4 years ago

bump

kaizensparc commented 4 years ago

Hello, you can use mosquitto_sub to try to connect to the broker, with the -E option so that it exits immediately once the connection works. To avoid automatic reconnection and make the probe fail if it has not succeeded after some time, you can run it inside a timeout command.
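A minimal sketch of that probe as a reusable shell function. The function name probe_broker, the default host and port, the topic probe/ping and the 3-second limit are placeholders chosen for illustration, and it assumes the timeout command in your image accepts a plain number of seconds.

```shell
# Hypothetical helper: succeed only if the broker acknowledges a subscription.
# Usage: probe_broker [host] [port]
probe_broker() {
    host=${1:-localhost}   # assumed default
    port=${2:-1883}        # assumed default

    # -E makes mosquitto_sub exit as soon as the subscription is acknowledged;
    # timeout stops it with a non-zero status if that takes longer than 3 seconds.
    timeout 3 mosquitto_sub -h "$host" -p "$port" -t 'probe/ping' -E -i probe
}
```

Called as "probe_broker localhost 1883 || exit 1", it returns non-zero whenever the broker refuses the connection or does not answer in time.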

genieai-vikas commented 2 years ago

@didjcodt could you please paste the sample. I am getting error; tried multiple ways

kaizensparc commented 2 years ago

Example:

timeout 1 mosquitto_sub -h localhost -p 1883 -t 'topic' -E -i probe

What kind of error do you have? Can you paste the logs?

genieai-vikas commented 2 years ago

@didjcodt This is the error "Output": "Connection error: Connection Refused: not authorised.\n"

kaizensparc commented 2 years ago

Did you set up any authentication method (like username/password), or are you perhaps filtering based on client ID? You can check for that in the mosquitto configuration file (look for options such as password_file, acl_file, allow_anonymous or plugin, for instance).

genieai-vikas commented 2 years ago

Yes. I have created a password file. Below is my conf file:

allow_anonymous false
password_file /mosquitto/config/pwfile
port 1883
listener 9001
persistence true
persistence_location /mosquitto/data/
log_dest file /mosquitto/log/mosquitto.log

kaizensparc commented 2 years ago

So that means your probe also needs a username/password :) If you have created a user named probe_user with password probe_password you can add the following flags in the command: -u probe_user -P "probe_password"

genieai-vikas commented 2 years ago

So this is the problem: I have created the pwfile, which contains a username and password. If I pass those in the healthcheck, they will be exposed. There should be a healthcheck that does not require authentication.

Daedaluz commented 2 years ago

I think it should be possible to configure another listener that only listens on localhost and has allow_anonymous true. This way you don't need a username/password for the probes, but authentication is still required for remote connections.

Something like this (very untested config)

persistence true
persistence_location /mosquitto/data
log_dest file /mosquitto/log/mosquitto.log

per_listener_settings true

listener 1883 0.0.0.0
allow_anonymous false
password_file /mosquitto/config/pwfile

# why have this listener?
# listener 9001

listener 1880 127.0.0.1
allow_anonymous true

Then you could use mosquitto_sub as a probe without a password: mosquitto_sub -p 1880 -t 'topic' -C 1 -E -i probe -W 3 (also untested)

LostOnTheLine commented 1 year ago

Still nothing for this?

I don't need great security (though I'm not really sure why that's an issue as a healthcheck runs on the container's CLI ) but I have found a few things online... none of which work

version: "3"
services:
  mosquitto:
    image: eclipse-mosquitto
    container_name: mosquitto
    user: 1000:1000
    environment:
      - PUID=1000 #optional
      - PGID=1000 #optional
      - TZ=America/Phoenix
    ports:
      - 1883:1883
      #- 9001:9001
    volumes:
      - /docker/homeassistant/mqtt/mosquitto/config:/mosquitto/config
      - /docker/homeassistant/mqtt/mosquitto/data:/mosquitto/data
      - /docker/log/var/log/mosquitto:/mosquitto/log
      - /docker/log/var/log:/var/log:rw
      - /etc/localtime:/etc/localtime:ro
    restart: always
    healthcheck:
      #test: ["mosquitto_sub", "-h", "localhost", "-p", "1883", "-t", "test", "-C", "1"] #Stuck [Running]
      #test: ["CMD-SHELL", "timeout -t 5 mosquitto_sub -t '$$SYS/#' -C 1 | grep -v Error || exit 1"] #Stuck [Starting] but runs. Becomes [Unhealthy] after 7-8 minutes
      test: ["CMD-SHELL", "mosquitto_sub -h localhost -t test -C 1"] #stuck [starting] but runs & logs active. Becomes [Unhealthy] after 5-12 minutes (7-8 typical)
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s
    #security_opt:
    #  - no-new-privileges:true
    labels:
      - "com.centurylinklabs.watchtower.scope=dockerhub"

Looking through things I think the best solution is going to be to add a healthcheck topic that has the -r (retain last message) flag. Ideally I was trying to set it to publish the time every 5 minutes, the idea being that the healthcheck could print that topic then grep it to determine if the time was in the last 10 minutes, if not it'd be unhealthy. I was trying with this

sh -c date | mosquitto_pub -h localhost -t healthcheck -l -r --quiet --repeat 999999 --repeat-delay 60 

hoping that it'd update the timecheck every minute, but that sadly didn't work. But I think this is the direction that will serve best for a healthcheck
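A rough sketch of that heartbeat idea, split into a publisher and a freshness check. The function names, the healthcheck topic and the 600-second limit are illustrative choices, and it assumes date +%s is available in the container and that localhost accepts the connection without credentials.

```shell
# Hypothetical topic name and limit; adjust to your setup.
HEARTBEAT_TOPIC=healthcheck
MAX_AGE=600   # seconds (10 minutes)

# Publish the current Unix time as a retained message (-r), so a later
# subscriber receives the last value immediately.
publish_heartbeat() {
    mosquitto_pub -h localhost -t "$HEARTBEAT_TOPIC" -r -m "$(date +%s)"
}

# heartbeat_is_fresh <timestamp> <max_age> [now]
# Succeeds if <timestamp> is a number no older than <max_age> seconds.
heartbeat_is_fresh() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric payload
    esac
    now=${3:-$(date +%s)}
    [ $((now - $1)) -le "$2" ]
}

# Healthcheck side: read one retained message, give up after 3 seconds.
check_heartbeat() {
    stamp=$(mosquitto_sub -h localhost -t "$HEARTBEAT_TOPIC" -C 1 -W 3) || return 1
    heartbeat_is_fresh "$stamp" "$MAX_AGE"
}
```

Something still has to call publish_heartbeat on a schedule, for example a loop such as "while publish_heartbeat; do sleep 60; done" running next to the broker. Note that this checks the publisher as much as the broker: if the loop dies, the check fails even though the broker is fine.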

    healthcheck:
      #test: ["mosquitto_sub", "-h", "localhost", "-p", "1883", "-t", "healthcheck", "-C", "1"] #Stuck [Running] with no healthcheck status
      #test: ["CMD-SHELL", "timeout -t 5 mosquitto_sub -t '$$SYS/#' -C 1 | grep -v Error || exit 1"] #Stuck [Starting] but runs. Becomes [Unhealthy] after 7-8 minutes
      #test: ["CMD-SHELL", "mosquitto_sub -h localhost -t healthcheck -C 1"] #stuck [starting] but runs & logs active. Becomes [Unhealthy] after 5-12 minutes (7-8 typical)
      #test: ["sh", "-c", "date | mosquitto_pub -h localhost -t healthcheck -l"] # Publishes date & time to "healthcheck" topic
      #test: ["mosquitto_sub", "-h", "localhost", "-t", "healthcheck", "-C", "1"]
      #test: ["mosquitto_sub", "-h", "localhost", "-p", "1883", "-t", "healthcheck", "-C", "1", "-W", "5"]
      test: ["sh", "-c", "mosquitto_sub -h localhost -C 1 -t healthcheck | grep ."] #Stuck [Running] with no healthcheck status
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s

Daedaluz commented 1 year ago

This seems to work for me...

    healthcheck:
      test: ["CMD", "mosquitto_sub", "-t", "$$SYS/#", "-C", "1", "-i", "healthcheck", "-W", "3"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s

LostOnTheLine commented 1 year ago

    healthcheck:
      test: ["CMD", "mosquitto_sub", "-t", "$$SYS/#", "-C", "1", "-i", "healthcheck", "-W", "3"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s

Thanks. This seems to work for me too. I don't know why I didn't find this when I searched; I found a bunch of similar things, but not this.

EDIT: Although when I try it in the CLI I get

/ $ mosquitto_sub -t $$SYS/# -C 1 -i healthcheck -W 3
Timed out
/ $ 

Daedaluz commented 1 year ago

Note that $$ in bash expands to the current shell's PID, so echo mosquitto_sub -t $$SYS/# -C 1 -i healthcheck1 -W 3 prints mosquitto_sub -t 107926SYS/# -C 1 -i healthcheck1 -W 3, which is not the topic you want to listen on.

Try mosquitto_sub -t '$SYS/#' -C 1 -i healthcheck1 -W 3 when running locally instead.
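To make the two cases concrete, here is a small sketch. compose_unescape is a made-up helper that only imitates Compose's "$$ becomes $" escaping for illustration; it is not part of Docker or Mosquitto.

```shell
# Hypothetical helper that mimics what Compose does to "$$" in a YAML value:
# each "$$" becomes one literal "$".
compose_unescape() {
    printf '%s\n' "$1" | sed 's/\$\$/$/g'
}

# In Compose, "-t", "$$SYS/#" reaches mosquitto_sub as $SYS/#
compose_unescape '$$SYS/#'     # prints: $SYS/#

# In an interactive shell, single quotes keep "$SYS" from being expanded.
topic='$SYS/#'
printf '%s\n' "$topic"         # prints: $SYS/#
```

So the compose file needs the doubled $$, while a command typed directly into the container's shell needs a single $ inside single quotes.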

LostOnTheLine commented 1 year ago

Ah, so it's escaped with the 2nd $ or something like that. That makes sense

/ $ mosquitto_sub -t '$SYS/#' -C 1 -i healthcheck1 -W 3
mosquitto version 2.0.15
/ $ 

So it's just a test if it shows a non-null value when version is requested? I mean, it works & shows as Healthy... but I'm not confident that it won't show as Healthy even when it isn't working, which defeats the whole point of the healthcheck. Is there not a way to have it check to see that a topic can be subscribed to? Ideally I'd want a topic that outputs the time every 10 minutes & then a check that sees if the last message in that topic is less than 30 minutes old

Daedaluz commented 1 year ago

If you turn on verbose logging in the broker, you should see log entries for the healthcheck client making its subscriptions.

If you add the -v flag to the mosquitto_sub command, you'll also see that the version is actually sent over a topic, $SYS/broker/version, and that's what you are seeing there.

If you still aren't convinced, you could try changing the topic to one that nothing publishes to. Does it still show as healthy?

If you really want to go the extra step, you could write a simple program that connects, subscribes to some topic and publishes to the same topic, then waits for the message to arrive and measures the time difference. This way you test the whole chain and get an idea of how much work the broker is doing. This obviously involves creating your own container that includes the test program.
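The round-trip test described above can be approximated with the stock client tools, without building a separate program. This is only a sketch: the function name, the topic healthcheck/roundtrip, the 1-second pause and the 5-second limit are assumptions, timing is in whole seconds, and it assumes localhost accepts the connection without credentials.

```shell
# roundtrip_check [topic]
# Subscribe, publish a unique token to the same topic, and confirm it comes back.
roundtrip_check() {
    topic=${1:-healthcheck/roundtrip}      # hypothetical topic name
    token="probe-$$-$(date +%s)"           # reasonably unique payload
    out=$(mktemp) || return 1

    # Wait for exactly one message, for at most 5 seconds.
    mosquitto_sub -h localhost -t "$topic" -C 1 -W 5 > "$out" &
    sub_pid=$!

    sleep 1   # crude: give the subscription time to be established
    start=$(date +%s)
    mosquitto_pub -h localhost -t "$topic" -m "$token"
    wait "$sub_pid"
    end=$(date +%s)

    received=$(cat "$out")
    rm -f "$out"
    [ "$received" = "$token" ] || return 1
    echo "round trip took about $((end - start))s"
}
```

Because it relies on a fixed sleep before publishing, a slow broker could miss the message and report a false failure; a real client library would wait for the subscription acknowledgement instead.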

LostOnTheLine commented 1 year ago

  • -t '$SYS/#' topic to subscribe to
  • -C 1 receive one message and exit
  • -i healthcheck set client-id to "healthcheck"
  • -W 3 time out after 3 seconds if no message has been received

Ah. So it's subscribing to a topic, alright, that should be good then. I've just seen too many homemade "healthchecks" that don't actually check the health of the container that I'm always skeptical until I know what it's doing

ingoratsdorf commented 1 year ago

If you have a separate listener on a non-standard port, you also have to specify the port in your healthcheck, ie:

healthcheck:
      test: ["CMD", "mosquitto_sub", "-p", "1880", "-t", "$$SYS/#", "-C", "1", "-i", "healthcheck", "-W", "3"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s

s0170071 commented 1 year ago

I am wondering: mosquitto_sub is probably part of the mosquitto package. If I want to test the mosquitto container, I have to install mosquitto on the host system as well ?

ingoratsdorf commented 1 year ago

No, because the test is executing within the container, not on the host. Would not make sense otherwise.

LostOnTheLine commented 1 year ago

Docker healthchecks work by running a test command, essentially in a CLI inside the container. You can have a healthcheck test whether a page is reachable, whether a link leads to an actual page or, more elaborately, whether a link leads to a page that contains certain words, but you can only do those things if the necessary tools are installed inside the container. Checks often do things like test whether a page is reachable and larger than some number of KB.

What this healthcheck does is subscribe to a topic from the container's CLI. If no message arrives on that topic, the container is considered unhealthy. But everything happens inside the OS of the container.
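Docker decides the health state purely from the exit status of the test command: 0 means healthy and 1 means unhealthy. A small wrapper sketch that reduces any mosquitto_sub failure to those two values; the name broker_healthcheck is made up, and the flags are the ones already used earlier in this thread.

```shell
# Hypothetical wrapper: collapse any mosquitto_sub failure into exit status 1.
broker_healthcheck() {
    if mosquitto_sub -t '$SYS/#' -C 1 -i healthcheck -W 3 >/dev/null 2>&1; then
        return 0   # a message arrived: the check passes
    fi
    return 1       # timeout, refused connection, etc.: the check fails
}
```

Saved as a script inside the image and ending with a call to broker_healthcheck, this could be used as the healthcheck test command instead of calling mosquitto_sub directly.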

There are certain containers that are designed to test if a thing on your local machine is present, but that is usually done via BIND MOUNTS or by pinging the machine over the network, in either case your machine only needs to be running docker, which is required for it to be running the docker container, & have the variables set for the container in the Docker-Compose or the command used to start the container

Guiorgy commented 1 year ago

The only annoyance with this is that the logs get filled by:

[timestamp]: New connection from 127.0.0.1:[port] on port 1880.
[timestamp]: New client connected from 127.0.0.1:[port] as healthcheck (p2, c1, k60).
[timestamp]: Client healthcheck disconnected.

Unfortunately, Mosquitto doesn't support per-listener logging configuration, otherwise I would've disabled logging for the localhost listener.

I tried using grep to filter out those logs:

command: ['/bin/sh', '-c', '/usr/sbin/mosquitto -c /mosquitto/config/mosquitto.conf 2>&1 | grep -v -E "^.*:[ ]New connection from 127\\.0\\.0\\.1:[0-9]+ on port 1880\\.$"']

However, grep would buffer the output, and when the container was stopped, SIGTERM would not be propagated to mosquitto (since it was started through a shell), so the container was killed instead and logs were lost.

LostOnTheLine commented 1 year ago

In that case, setting the healthcheck interval to an hour could be an option. Problems would be noticed less quickly, but an hourly log entry doesn't seem bad to me.

Guiorgy commented 8 months ago

Just an FYI, made a docker image that adds a check-health.sh script, and filters out the healthcheck client messages from the logs. You can grab the source and build it yourself too.

PS. Not yet well tested.

TimChaubet-I4U commented 4 months ago

Seems to me like good practice could also be to use a .env file, when using docker compose.

MQTTADMIN=myuser
MQTTADMINPASS=somesupersecurepw

Then, use a more modern approach to docker-compose & yml.
The entrypoint has 'exec "$@"' which makes it also execute the CMD command, so we can override with an autocreation of default config.

x-volume-localtime:
  &etclocaltime
  type: 'bind'
  source: /etc/localtime
  target: /etc/localtime
  read_only: true
x-volume-mosquitto-mqtt-config:
  &mosquitto-mqtt-config
  type: 'bind'
  source: /mnt/user/docker_volumes/domo/mosquitto-mqtt/config
  target: /mosquitto/config
  bind:
    create_host_path: true
x-volume-mosquitto-mqtt-data:
  &mosquitto-mqtt-data
  type: 'bind'
  source: /mnt/user/docker_volumes/domo/mosquitto-mqtt/data
  target: /mosquitto/data
  bind:
    create_host_path: true
x-volume-mosquitto-mqtt-log:
  &mosquitto-mqtt-log
  type: 'bind'
  source: /mnt/user/docker_volumes/domo/mosquitto-mqtt/log
  target: /mosquitto/log
  bind:
    create_host_path: true 

services:
  mosquitto-mqtt:
    ## prerequisites (https://github.com/sukesh-ak/setup-mosquitto-with-docker)
    # - The entrypoint has 'exec "$@"' which makes it also execute the CMD command.
    # - Set MQTTADMIN='mqttadmin' and MQTTADMINPASS='something' in .env
    image: eclipse-mosquitto:latest
    labels:
      net.unraid.docker.icon: "https://timmer.ninja/images/ico/mqtt.ico" 
    env_file:
      - ./.env      
    environment:
      DEFAULT_CONFIG: |-
        allow_anonymous false
        listener 1883
        listener 9001
        protocol websockets
        persistence true
        password_file /mosquitto/config/pwfile
        persistence_file mosquitto.db
        persistence_location /mosquitto/data/
    command: 
      - /bin/sh
      - -c
      - |
        cf="/mosquitto/config/mosquitto.conf"
        pf="/mosquitto/config/pwfile"
        if [ ! -f $$cf ]; then
          echo "> Creating config file $$cf"
          echo -n "$$CONFIG_CONTENT" > $$cf
        else
          echo "> Found config file $$cf"
        fi
        cat $$cf
        echo " "
        if [ ! -f $$pf ]; then
          echo "> Creating password file"
          touch $$pf
          chown mosquitto:mosquitto $$pf
          chmod 600 $$pf
          /usr/bin/mosquitto_passwd -b $$pf $$MQTTADMIN $$MQTTADMINPASS
        fi
        echo "> Running Eclipse-Mosquitto MQTT"
        /usr/sbin/mosquitto -c $$cf
    ports:
      - "1883:1883" #default mqtt port
      - "9001:9001" #default mqtt port for websockets
    volumes:
      - <<: *etclocaltime
      - <<: *mosquitto-mqtt-config
      - <<: *mosquitto-mqtt-data
      - <<: *mosquitto-mqtt-log
    logging: 
      options:
        max-size: "10m"
        max-file: "3"      
    healthcheck:
      test: mosquitto_sub -u $$MQTTADMIN -P $$MQTTADMINPASS -t '$$SYS/#' -C 1 -i healthcheck -W 3
      interval: 10s
      timeout: 10s
      retries: 3     
    restart: unless-stopped      
    network_mode: bridge

edit: thx @Guiorgy, I went for that syntax

Guiorgy commented 4 months ago

@TimChaubet-I4U I prefer defining file contents inside environment variables instead of writing them line by line:

mosquitto:
    image: eclipse-mosquitto:latest
    # ...
    environment:
      CONFIG_CONTENT: |-
        # Docker Internal MQTT
        listener 1881
        socket_domain ipv4
        allow_anonymous false
        password_file /mosquitto/config/passwords_1881
        acl_file /mosquitto/config/access_control_1881
      PASSWORDS_1881_CONTENT: |-
        docker:docker
      ACCESS_CONTROL_1881_CONTENT: |-
        user docker
        topic read $$SYS/broker/uptime
    entrypoint: >
      /bin/sh -c '
        echo -n "$$ACCESS_CONTROL_1881_CONTENT" > /mosquitto/config/access_control_1881 &&
        chmod 0700 /mosquitto/config/access_control_1881 &&
        echo -n "$$PASSWORDS_1881_CONTENT" > /mosquitto/config/passwords_1881 &&
        chmod 0700 /mosquitto/config/passwords_1881 &&
        mosquitto_passwd -U /mosquitto/config/passwords_1881 &&
        echo -n "$$CONFIG_CONTENT" > /mosquitto/config/mosquitto.conf &&
        exec /docker-entrypoint.sh /usr/sbin/mosquitto -c /mosquitto/config/mosquitto.conf
      '
    healthcheck:
      test: mosquitto_sub -u docker -P docker -t '$$SYS/broker/uptime' -C 1 -i healthcheck -W 3
      interval: 10s
      timeout: 5s
      retries: 2     
    restart: unless-stopped

EDIT: @TimChaubet-I4U In your modified compose file, you define the environment variable as DEFAULT_CONFIG, yet in the command you try to use the CONFIG_CONTENT variable, which doesn't exist. Double check this :P