vernemq / docker-vernemq

VerneMQ Docker image - Starts the VerneMQ MQTT broker and listens on 1883 and 8080 (for websockets).
https://vernemq.com
Apache License 2.0

Docker swarm deployment with multiple networks not possible #372

Closed GrigoriOH closed 7 months ago

GrigoriOH commented 7 months ago

Hello there,

While fiddling around with VerneMQ in a Docker swarm environment, I noticed odd behaviour when using multiple networks. Let's start with a working example of the docker-compose.yml:

version: "3.7"
services:
  vmq_discovery_node:
    image: vernemq:1.13.0
    environment:
      DOCKER_VERNEMQ_SWARM: 1
    networks:
      - internal

  vmq-main-nodes:
    image: vernemq:1.13.0
    environment:
      DOCKER_VERNEMQ_SWARM: 1
      DOCKER_VERNEMQ_DISCOVERY_NODE: vmq_discovery_node
    deploy:
      replicas: 2
    networks:
      - internal

networks:
  internal:
    driver: overlay

As expected, this results in a healthy discovery node which supplies two healthy worker nodes.

When introducing another network (like the one I wanted to utilize for traefik as shown below), this changes drastically:

version: "3.7"
services:
  vmq_discovery_node:
    image: vernemq:1.13.0
    environment:
      DOCKER_VERNEMQ_SWARM: 1
    networks:
      - internal

  vmq-main-nodes:
    image: vernemq:1.13.0
    environment:
      DOCKER_VERNEMQ_SWARM: 1
      DOCKER_VERNEMQ_DISCOVERY_NODE: vmq_discovery_node
    deploy:
      replicas: 2
    networks:
      - internal
      - traefik                                            # <- new

networks:
  internal:
    driver: overlay
  traefik:                                                 # <- new
    driver: overlay                                        # <- new

Taking a deeper look into the problem and what might cause it, I noticed that the error log shows the following:

Invalid -name given to erl, VerneMQ@10.0.14.3 10.0.15.3

This indicates that the node name was mangled into something that includes two IPs.

The root of this can be found in the vernemq.sh startup script at lines 55ff:

# Ensure the Erlang node name is set correctly
if env | grep "DOCKER_VERNEMQ_NODENAME" -q; then
    sed -i.bak -r "s/-name VerneMQ@.+/-name VerneMQ@${DOCKER_VERNEMQ_NODENAME}/" ${VERNEMQ_VM_ARGS_FILE}
else
    if [ -n "$DOCKER_VERNEMQ_SWARM" ]; then
        NODENAME=$(hostname -i)
        sed -i.bak -r "s/VerneMQ@.+/VerneMQ@${NODENAME}/" ${VERNEMQ_VM_ARGS_FILE}
    else
        sed -i.bak -r "s/-name VerneMQ@.+/-name VerneMQ@${IP_ADDRESS}/" ${VERNEMQ_VM_ARGS_FILE}
    fi
fi

Here, hostname -i delivers multiple IPs in the case of multiple networks, rendering the script unfit for this case.
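To make the failure concrete, here is a minimal sketch that applies the same sed substitution to a string instead of the vm.args file, first with the two-address value from the error log and then with only the first address. The helper first_ip and the placeholder line -name VerneMQ@127.0.0.1 are assumptions for illustration; neither is taken from vernemq.sh, and this is not necessarily what an actual fix would look like.

```shell
#!/bin/sh
# Sketch only: reproduces the mangled name and one possible workaround.

# Hypothetical helper (not part of vernemq.sh): print the first
# whitespace-separated address from its argument.
first_ip() {
    printf '%s\n' "$1" | awk '{print $1; exit}'
}

# Placeholder vm.args line (assumed) and the two-address value from the error log.
vm_args_line='-name VerneMQ@127.0.0.1'
addresses='10.0.14.3 10.0.15.3'

# Same substitution as the swarm branch, applied to a string instead of a file.
mangled=$(printf '%s\n' "$vm_args_line" | sed -E "s/VerneMQ@.+/VerneMQ@${addresses}/")

# Same substitution, but using only the first address.
fixed=$(printf '%s\n' "$vm_args_line" | sed -E "s/VerneMQ@.+/VerneMQ@$(first_ip "$addresses")/")

echo "$mangled"   # -name VerneMQ@10.0.14.3 10.0.15.3
echo "$fixed"     # -name VerneMQ@10.0.14.3
```

Taking the first address is only a guess at the right one: nothing here guarantees that it belongs to the internal network. Going by the script excerpt above, setting DOCKER_VERNEMQ_NODENAME explicitly bypasses the hostname -i branch altogether.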

Maybe I am missing something here, but to me, multiple networks would be a necessity in a standard setup.

Please let me know if I have gotten something mixed up, or if I can do anything to help with troubleshooting.

Thanks for your help!

ioolkos commented 7 months ago

@GrigoriOH Thanks, I guess you have already found the issue (in the script). If you manage to fix this for yourself (and others), we'd certainly be thankful for a PR.


👉 Thank you for supporting VerneMQ: https://github.com/sponsors/vernemq
👉 Using the binary VerneMQ packages commercially (.deb/.rpm/Docker) requires a paid subscription.

GrigoriOH commented 7 months ago

Thanks for the quick response. I'll try to dig into this and give a heads-up (or even a PR). Not being a professional myself, I'm looking forward to feedback or hints.

ioolkos commented 7 months ago

@GrigoriOH merged your patch, thanks for your contribution! :)