eclipse-mosquitto / mosquitto

Eclipse Mosquitto - An open source MQTT broker
https://mosquitto.org
Other
9.12k stars 2.41k forks source link

Combination of `max_inflight_messages 1` and `max_queued_messages 0` causes "Bad socket read/write" error #3088

Open mossblaser opened 3 months ago

mossblaser commented 3 months ago

Mosquitto v2.0.17 - v2.0.18 (latest at time of writing)

Since mosquitto v2.0.17, or more specifically, commit bfb373d774 ("Fix max_queued_message 0 stopping clients from receiving messages."), when the following configuration is used:

max_inflight_messages 1
max_queued_messages 0
log_type all  # Not part of bug, just to see what's happening

Sending two QoS 2 PUBLISH messages from a client in quick succession produces a "Bad socket read/write" error and disconnects the client.

Full mosquitto output (click to expand) $ mosquitto -c mosquitto.conf 1721319767: mosquitto version 2.0.18 starting 1721319767: Config loaded from my_mosquitto.conf. 1721319767: Starting in local only mode. Connections will only be possible from clients running on this machine. 1721319767: Create a configuration file which defines a listener to allow remote access. 1721319767: For more details see https://mosquitto.org/documentation/authentication-methods/ 1721319767: Opening ipv4 listen socket on port 1883. 1721319767: Opening ipv6 listen socket on port 1883. 1721319767: mosquitto version 2.0.18 running 1721319769: New connection from ::1:51211 on port 1883. 1721319769: New client connected from ::1:51211 as auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (p2, c1, k60). 1721319769: No will message specified. 1721319769: Sending CONNACK to auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (0, 0) 1721319769: Received PUBLISH from auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (d0, q2, r0, m1, 'foo', ... (3 bytes)) 1721319769: Sending PUBREC to auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (m1, rc0) 1721319769: Received PUBLISH from auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (d0, q2, r0, m2, 'foo', ... (3 bytes)) 1721319769: Sending PUBREC to auto-44FED64F-E9D6-546E-F356-DC5F36A5111C (m2, rc151) 1721319769: Bad socket read/write on client auto-44FED64F-E9D6-546E-F356-DC5F36A5111C: Unknown error.

The following minimal paho-mqtt based Python client can be used to reproduce this issue:

repro.py script (click to expand) ```python import paho.mqtt.client as mqtt mqttc = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2) mqttc.connect("localhost", 1883, 60) mqttc.on_connect = lambda *_: print("on_connect") mqttc.on_publish = lambda *_: print("on_publish") mqttc.loop_start() mqttc.publish("foo", b"bar", qos=2) mqttc.publish("foo", b"bar", qos=2) input("press enter to exit...") ``` Run using: ```bash $ python3 -m venv venv $ venv/bin/pip install paho-mqtt $ venv/bin/python repro.py ```

It is the combination of both max_* options which triggers this bug -- remove either one and the client can connect and publish successfully.

Similarly, if a lower QoS than 2 is used, the bug does not manifest.

Likewise, don't send the two PUBLISH messages in quick succession and the problem goes away too. For example, just send one or put a time.sleep(1) between them and the bug does not manifest.

Mosquitto 2.0.16 (related, earlier bug)

Though the bug above was first released in mosquitto 2.0.17, the configuration shown also breaks mosquitto 2.0.16 due to a different bug introduced in commit 6113eac95a ("Fix for CVE-2023-28366"). This bug caused all messages to be dropped when max_queued_messages 0 is used with the log message "Outgoing messages are being dropped".

Full mosquitto 2.0.16 output (click to expand) 1721320706: mosquitto version 2.0.16 starting 1721320706: Config loaded from my_mosquitto.conf. 1721320706: Starting in local only mode. Connections will only be possible from clients running on this machine. 1721320706: Create a configuration file which defines a listener to allow remote access. 1721320706: For more details see https://mosquitto.org/documentation/authentication-methods/ 1721320706: Opening ipv4 listen socket on port 1883. 1721320706: Opening ipv6 listen socket on port 1883. 1721320706: mosquitto version 2.0.16 running 1721320708: New connection from ::1:40175 on port 1883. 1721320708: New client connected from ::1:40175 as auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (p2, c1, k60). 1721320708: No will message specified. 1721320708: Sending CONNACK to auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (0, 0) 1721320708: Outgoing messages are being dropped for client auto-F893A8DD-E931-3880-A59C-5BE2C381C192. 1721320708: Received PUBLISH from auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (d0, q2, r0, m1, 'foo', ... (3 bytes)) 1721320708: Sending PUBREC to auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (m1, rc0) 1721320708: Received PUBLISH from auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (d0, q2, r0, m2, 'foo', ... (3 bytes)) 1721320708: Sending PUBREC to auto-F893A8DD-E931-3880-A59C-5BE2C381C192 (m2, rc151) 1721320708: Bad socket read/write on client auto-F893A8DD-E931-3880-A59C-5BE2C381C192: Unknown error.

In this case, the bug is entirely caused by the max_queued_messages 0 option and prevents clients from even connecting.

It appears that the fix for this bug -- the commit bfb373d774 ("Fix max_queued_message 0 stopping clients from receiving messages.") mentioned at the top of this report -- may have been some way incomplete since it introduced a compatibility issue with the max_queued_messages 0 option.

Mosquitto 2.0.15 and earlier (works)

Mosquitto 2.0.15 works correctly with both max_inflight_messages 1 and max_queued_messages 0.

Other details

I have tested this on Arch Linux using Mosquitto binaries built directly from the repository with default settings. (The same bug does, however, appear in Arch's packaged versions).

The commits identified in this report were found using git bisect rather than an exhaustive scan! It is possible (though hopefully unlikely) I've identified the wrong changes. Sorry I haven't had the time to dig any deeper!

In case it is relevant, the reason for my use of both max_inflight_messages 1 and max_queued_messages 0 is that my application requires guaranteed, in-order delivery of messages, even during bursts of messages. (Naturally, all clients are trusted and any bursts in messages are strictly bounded -- I'm aware of the caveats of using these options!).

As ever, thank you so much to everyone for all the hard work in making and maintaining mosquitto!

ftapajos commented 2 months ago

Yep. The problem from 6113eac95a9df634fbc858be542c4a0456bfe7b9 is here:

if(context->out_packet_count >= db.config->max_queued_messages){
  rc = MQTT_RC_QUOTA_EXCEEDED;
}

I am not brave enough to patch it to:

if(context->out_packet_count >= db.config->max_queued_messages && db.config->max_queued_messages > 0 ){
  rc = MQTT_RC_QUOTA_EXCEEDED;
}

Since it may be harmful to prevent CVE-2023-28366 from happening again. Another option is to change documentation and make max_queued_messages=0 equals to max_queued_messages=1000, which is the default.

@ralight could shed some light on this matter

mossblaser commented 2 months ago

Not having the option to have unlimited queued messages does not enable reliable in-order delivery, though. (I realise this isn't achievable in many settings for other reasons but it is in some, in particular mine ;)).

Thanks very much for looking into this!