eclipse / mosquitto

Eclipse Mosquitto - An open source MQTT broker
https://mosquitto.org

auto reconnect stopped after a few times #3107

Open tdk2nyt opened 1 month ago

tdk2nyt commented 1 month ago

mosquitto version: 2.0.15 system: Linux kernel 3.4.40

Hi,

I use libmosquitto as an MQTT client. One day I found that the MQTT connection had been lost. The log recorded a few reconnection events, but then reconnection stopped with "Unable to connect (Lookup error.)". The log also shows a strange scenario: the app tries to subscribe to a topic (this is invoked in the on_connect callback), and the subscribe fails with code 4 (MOSQ_ERR_NO_CONN).

I'm using these APIs: MQTT_PROTOCOL_V311, mosquitto_connect_bind_async, mosquitto_loop_start, mosquitto_connect_v5_callback_set, mosquitto_subscribe_v5_callback_set, mosquitto_message_v5_callback_set, mosquitto_disconnect_v5_callback_set, mosquitto_publish_v5_callback_set

Any suggestion is appreciated, thanks!

rickvargas commented 3 days ago

Hi, libmosquitto treats some return codes (MOSQ_ERR_<code>) as fatal errors. Once one of them occurs, automatic reconnection stops and your client application has to request the connection again 'manually'. One of these fatal errors is MOSQ_ERR_EAI (Lookup error), which indicates that mosquitto was not able to resolve the DNS name of your broker host. That said, your application should request a new connection.

Since MOSQ_ERR_EAI occurred, your application is not connected to the broker, so any subscribe/publish will fail because there is no connection (rc MOSQ_ERR_NO_CONN).

Note that in the on_connect callback your application should check whether the return code it receives is MOSQ_ERR_SUCCESS. If it is, everything is fine; if it is any other MOSQ_ERR_<code> (MOSQ_ERR_EAI in your case), the application should call mosquitto_connect_bind_async and mosquitto_loop_start again.

Mosquitto fatal return code errors are:

In addition, for any rc, if errno is EPROTO the main loop also exits and you should ask for a new connection (not very programmatic, as you can see).
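
For illustration, the "ask for a new connection when the loop exits" idea could be sketched roughly as below. This is only a sketch under assumptions, not how libmosquitto itself does it: `supervise`, `step_fn` and the `fake_*` names are hypothetical. To keep it compilable without the library, the loop and reconnect steps are passed in as callbacks; in a real client they might wrap mosquitto_loop_forever and mosquitto_reconnect (or mosquitto_connect_bind_async followed by mosquitto_loop_start), and the wrappers are assumed to return 0 only on success or an intentional stop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type. In a real client, run_loop might wrap
 * mosquitto_loop_forever() and reconnect might wrap mosquitto_reconnect().
 * Assumption: both return 0 on success and non-zero on error. */
typedef int (*step_fn)(void *ctx);

/* Runs the network loop; whenever it exits with an error, requests a new
 * connection. Gives up after max_attempts consecutive failed reconnects.
 * Returns 0 if run_loop returned 0, otherwise the last reconnect error. */
static int supervise(step_fn run_loop, step_fn reconnect, void *ctx,
                     int max_attempts)
{
    assert(run_loop != NULL && reconnect != NULL && max_attempts > 0);

    for (;;) {
        if (run_loop(ctx) == 0) {
            return 0; /* assumed to mean the application asked to stop */
        }

        /* The loop exited with an error: request a new connection.
         * A real client would likely sleep or back off between attempts. */
        int attempts = 0;
        int rc;
        do {
            rc = reconnect(ctx);
            attempts++;
        } while (rc != 0 && attempts < max_attempts);

        if (rc != 0) {
            return rc; /* could not get a connection back */
        }
    }
}

/* Stand-in client, used only to exercise the sketch without the library. */
struct fake_client {
    int loop_failures_left;      /* run_loop fails this many times, then 0 */
    int reconnect_failures_left; /* reconnect fails this many times first */
    int reconnect_calls;         /* how often reconnect was requested */
};

static int fake_run_loop(void *ctx)
{
    struct fake_client *c = ctx;
    if (c->loop_failures_left > 0) {
        c->loop_failures_left--;
        return 1; /* arbitrary non-zero code standing in for a fatal error */
    }
    return 0;
}

static int fake_reconnect(void *ctx)
{
    struct fake_client *c = ctx;
    c->reconnect_calls++;
    if (c->reconnect_failures_left > 0) {
        c->reconnect_failures_left--;
        return 1; /* arbitrary non-zero code standing in for a failure */
    }
    return 0;
}
```

With the stand-in client, `supervise(fake_run_loop, fake_reconnect, &client, 3)` keeps requesting a reconnect each time the fake loop fails and stops once the loop returns 0 or three reconnect attempts in a row have failed. The attempt limit is a design choice of this sketch; an application that must stay connected might retry indefinitely with a delay instead.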