Closed sirockin closed 2 years ago
This example demonstrates a bug when attempting to connect or reconnect to an MQTT broker running on the host machine. The changes I have made to the original project are as follows:
pub/main.go and sub/main.go:
docker-compose.yml:
docker-compose.mosquitto.yml to allow starting mosquitto in a separate network and exposing it to the host
docker-compose up
Result:
pub and sub connect to the broker and behave normally
In terminal 1:
docker-compose up
In terminal 2:
docker-compose stop mosquitto
docker-compose start mosquitto
Result:
pub and sub lose their connections, then successfully reconnect
In terminal 1:
docker-compose up
In terminal 2:
docker-compose stop mosquitto
docker-compose restart sub pub
docker-compose start mosquitto
Result:
After the restart, the pub and sub services initially can't connect, then succeed once mosquitto is started
In terminal 2:
# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d
In terminal 1:
# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up
Result:
pub and sub connect to the broker and behave normally
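The commands above select the broker through the SERVERADDRESS environment variable, and the client logs elsewhere in this thread show the resulting address as tcp://host.docker.internal:1883. As a rough sketch only, and not the example's actual code, this is one way a Go client could turn that variable into a broker URL. The helper name brokerURL and the fallback mosquitto:1883 are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// defaultAddress is a hypothetical fallback, not taken from the example project.
const defaultAddress = "mosquitto:1883"

// brokerURL builds a broker URL from SERVERADDRESS as returned by getenv.
// It falls back to defaultAddress when the variable is empty and prefixes
// "tcp://" when the value carries no scheme.
func brokerURL(getenv func(string) string) string {
	addr := strings.TrimSpace(getenv("SERVERADDRESS"))
	if addr == "" {
		addr = defaultAddress
	}
	if !strings.Contains(addr, "://") {
		addr = "tcp://" + addr
	}
	return addr
}

func main() {
	fmt.Println("broker URL:", brokerURL(os.Getenv))
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the helper easy to exercise without touching the real environment.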
In terminal 2:
# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d
In terminal 1:
# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up
In terminal 2:
# Stop the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml down
# Restart the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d
Result:
Both services display a single reconnection message but never recover (pub keeps publishing)
sub_1 | [ERROR] [client] Connecting to tcp://host.docker.internal:1883 CONNACK was not CONN_ACCEPTED, but rather Connection Error
sub_1 | [DEBUG] [client] Reconnect failed, sleeping for 1 seconds: network Error : EOF
pub_1 | [ERROR] [net] connect got error EOF
pub_1 | [ERROR] [client] Connecting to tcp://host.docker.internal:1883 CONNACK was not CONN_ACCEPTED, but rather Connection Error
pub_1 | [DEBUG] [client] Reconnect failed, sleeping for 1 seconds: network Error : EOF
...
In terminal 2:
# Stop the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml down
In terminal 1:
# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up
In terminal 2:
# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d
Result:
Both services start and hang at connect started, and never recover
sub_1 | SERVERADDRESS: host.docker.internal:1883
sub_1 | [DEBUG] [client] Connect()
sub_1 | [DEBUG] [store] memorystore initialized
pub_1 | SERVERADDRESS: host.docker.internal:1883
pub_1 | [DEBUG] [client] Connect()
pub_1 | [DEBUG] [store] memorystore initialized
pub_1 | [DEBUG] [client] about to write new connect msg
pub_1 | [DEBUG] [client] socket connected to broker
pub_1 | [DEBUG] [client] Using MQTT 3.1.1 protocol
pub_1 | [DEBUG] [net] connect started
sub_1 | [DEBUG] [client] about to write new connect msg
sub_1 | [DEBUG] [client] socket connected to broker
sub_1 | [DEBUG] [client] Using MQTT 3.1.1 protocol
sub_1 | [DEBUG] [net] connect started
Unfortunately I don't have access to a Mac so am going to struggle to debug this (as you mention it works OK under Windows/Linux). I suspect that the connection to the broker is being opened to a black hole (so the connection stays open but nothing is received).
Can you please try https://github.com/ChIoT-Tech/paho.mqtt.golang/tree/Issue597 and see if that resolves the issue? If not some additional logging will be needed to identify what is happening.
Hi @MattBrittan. Thank you for the swift response. Yes, this does resolve the issue. I'll link to that for the time being and await the merge.
Summary
When attempting to connect from a Docker container to a broker running on the host, the client connects successfully as long as the broker is already running. However:
With ConnectRetry set, the client will not connect if the broker starts after the client
With AutoReconnect set, the client will not reconnect if the broker stops then starts
Steps to Reproduce/Minimum Working Example
In this fork I have adapted the /cmd/docker example to allow connection to a broker on the host machine, and documented the changes and steps to reproduce in the example readme. I repeat these in the next comment (below).
System Info
OS: macOS Big Sur Version 11.6.5
Docker version 20.10.13, build a224086
I have separately tested on WSL/Ubuntu and the bug does not appear:
WSL Ubuntu 20.04
Docker version 20.10.14, build a224086