eclipse-paho / paho.mqtt.golang

`ConnectRetry` and `AutoReconnect` do not work when connecting to docker host on MacOS #597

Closed sirockin closed 2 years ago

sirockin commented 2 years ago

Summary

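When connecting from a Docker container to a broker running on the host, the client connects successfully as long as the broker is running. However, it does not reconnect after the broker is stopped and restarted (`AutoReconnect`), and it never connects if it starts before the broker does (`ConnectRetry`).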

Steps to Reproduce/Minimum Working Example

In this fork I have adapted the /cmd/docker example to allow connection to a broker on the host machine, and documented changes and steps to reproduce in the example readme.

I repeat these in the next comment (below).

System Info

OS: macOS Big Sur Version 11.6.5
Docker version 20.10.13, build a224086

I have separately tested on WSL/Ubuntu and the bug does not appear:

WSL Ubuntu 20.04
Docker version 20.10.14, build a224086

sirockin commented 2 years ago

This example demonstrates a bug that occurs when connecting or reconnecting to an MQTT broker running on the host machine. The changes I have made to the original project are as follows:

Standard operation (all working):

Normal operation

docker-compose up

Result: pub and sub connect to the broker and behave normally

To demonstrate successful reconnection when the MQTT broker goes down then up

In terminal 1:

docker-compose up

In terminal 2:

docker-compose stop mosquitto
docker-compose start mosquitto

Result: pub and sub lose connections then successfully reconnect

Successful connection when services start without broker, then broker starts

In terminal 1:

docker-compose up

In terminal 2:

docker-compose stop mosquitto
docker-compose restart sub pub
docker-compose start mosquitto

Result: After restart, pub and sub services initially can't connect, then succeed when mosquitto is started

Using External Host:

Normal Operation

In terminal 2:

# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d

In terminal 1:

# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up

Result: pub and sub connect to the broker and behave normally

Unsuccessful reconnection when the MQTT broker goes down then up

In terminal 2:

# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d

In terminal 1:

# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up

In terminal 2:

# Stop the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml down

# Restart the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d

Result:

Both services display a single reconnection message but never recover (pub keeps publishing)

sub_1        | [ERROR] [client]   Connecting to tcp://host.docker.internal:1883 CONNACK was not CONN_ACCEPTED, but rather Connection Error
sub_1        | [DEBUG] [client]   Reconnect failed, sleeping for 1 seconds: network Error : EOF
pub_1        | [ERROR] [net]      connect got error EOF
pub_1        | [ERROR] [client]   Connecting to tcp://host.docker.internal:1883 CONNACK was not CONN_ACCEPTED, but rather Connection Error
pub_1        | [DEBUG] [client]   Reconnect failed, sleeping for 1 seconds: network Error : EOF
...

Failure to connect when services start without broker, then broker starts

In terminal 2:

# Stop the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml down

In terminal 1:

# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up

In terminal 2:

# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d

Result:

Both services start, hang at `connect started`, and never recover

sub_1        | SERVERADDRESS: host.docker.internal:1883
sub_1        | [DEBUG] [client]   Connect()
sub_1        | [DEBUG] [store]    memorystore initialized
pub_1        | SERVERADDRESS: host.docker.internal:1883
pub_1        | [DEBUG] [client]   Connect()
pub_1        | [DEBUG] [store]    memorystore initialized
pub_1        | [DEBUG] [client]   about to write new connect msg
pub_1        | [DEBUG] [client]   socket connected to broker
pub_1        | [DEBUG] [client]   Using MQTT 3.1.1 protocol
pub_1        | [DEBUG] [net]      connect started
sub_1        | [DEBUG] [client]   about to write new connect msg
sub_1        | [DEBUG] [client]   socket connected to broker
sub_1        | [DEBUG] [client]   Using MQTT 3.1.1 protocol
sub_1        | [DEBUG] [net]      connect started

MattBrittan commented 2 years ago

Unfortunately I don't have access to a Mac so am going to struggle to debug this (as you mention it works OK under Windows/Linux). I suspect that the connection to the broker is being opened to a black hole (so the connection stays open but nothing is received).

Can you please try https://github.com/ChIoT-Tech/paho.mqtt.golang/tree/Issue597 and see if that resolves the issue? If not some additional logging will be needed to identify what is happening.

sirockin commented 2 years ago

Hi @MattBrittan. Thank you for the swift response. Yes, this does resolve the issue. I'll link to that for the time being and await the merge.