eclipse-mosquitto / mosquitto

Eclipse Mosquitto - An open source MQTT broker
https://mosquitto.org
Other
9.01k stars 2.39k forks source link

"per_listener_settings true" causes messages for clients to no longer be stored #2618

Open JSchy65 opened 2 years ago

JSchy65 commented 2 years ago

mosquitto broker version 2.0.15 (Oracle Linux 8)

Start mosquitto with the following configuration: sys_interval 1 per_listener_settings true listener 1883 allow_anonymous true

mosquitto_sub -p 1883 -v -t '$SYS/broker/clients/connected' -t '$SYS/broker/clients/total' -t '$SYS/broker/store/messages/count'

what leads to:

$SYS/broker/clients/total 1 $SYS/broker/clients/connected 1 $SYS/broker/store/messages/count 53

mosquitto_sub -p 1883 -q 2 -t test/# -d -c -i clientA

$SYS/broker/clients/connected 2 $SYS/broker/clients/total 2

stop client / ctrl-c / -> disconnect

$SYS/broker/clients/connected 1

mosquitto_pub -p 1883 -q 2 -t test/xxx -m "Testnachricht"

The count of messages should increment by 1 but is not !!!

mosquitto_sub -p 1883 -q 2 -t test/# -d -c -i clientA

Does not return the expected message !!!

Retry with per_listener_settings false

Now it works like expected.

Thank You very much, Regards Joachim Schachermayer

JSchy65 commented 2 years ago

The issue occurs in versions 2.0.12, 2.0.13, 2.0.14 and 2.0.15. The issue occurs NOT in 1.6.9, 1.6.15, 2.0.0, 2.0.10 and 2.0.11 (others not tested).

jh-profilsys-de commented 2 years ago

I found the same Bug, for people who rely on the queueing of QOS 1 or 2 Messages this is a Showstopper for updating to the latest versions.

Steps to reproduce:

Change the Setting to "false" and repeat

You will get the queued messages!

Rotario commented 2 years ago

I've found the same issue today 2.0.14 and 2.0.15 mosquitto.conf

persistence true
persistence_location /mosquitto/data
log_type all

per_listener_settings true

listener 8883
acl_file /mosquitto/config/acl

listener 1883
acl_file /mosquitto/config/acl
allow_anonymous true

test script

timeout 1 mosquitto_sub -c -i test -t /test -q 1
mosquitto_pub -t /test -m "$(date)" -q 1
RET=$(timeout 1 mosquitto_sub -c -i test -t /test -q 1)
if [ ! -z "$RET" ]
   then
    echo "Success! $RET"
else
    echo "Failed!"
fi
JSchy65 commented 2 years ago

To emphasize the importance of this topic, I repeat what jh-profilsys-de already wrote above:

"for people who rely on the queueing of QOS 1 or 2 Messages this is a Showstopper for updating to the latest versions"

I think this problem affects a large group of users. Therefore, in my opinion, its priority should be increased.

strahlc commented 1 year ago

+1

gblach commented 1 year ago

I found this bug was introduced by the commit 79542158f47ac589b2248ad40102a3c8e545d413

pat1 commented 1 year ago

related to #2526

pat1 commented 1 year ago

revert commit https://github.com/eclipse/mosquitto/commit/79542158f47ac589b2248ad40102a3c8e545d413 solve the problem

diff -ru mosquitto-2.0.15-orig/lib/net_mosq.c mosquitto-2.0.15/lib/net_mosq.c
--- mosquitto-2.0.15-orig/lib/net_mosq.c    2022-08-16 15:34:02.000000000 +0200
+++ mosquitto-2.0.15/lib/net_mosq.c 2023-03-06 09:43:51.515905263 +0100
@@ -250,7 +250,7 @@
 #ifdef WITH_BROKER
    if(mosq->listener){
        mosq->listener->client_count--;
-       mosq->listener = NULL;
+       //mosq->listener = NULL;
    }
 #endif

but I do not know other feed back and bugs this can create

LiamDavey commented 1 year ago

+1 took me quite a while to work out this was the cause!

pat1 commented 1 year ago

Reverting commit https://github.com/eclipse/mosquitto/commit/79542158f47ac589b2248ad40102a3c8e545d413 do not solve the problem; after some time the broker do not send delayed messages and need a restart. After one year any news about this bug ?

Jasihien commented 1 year ago

I'm also having exactly the same issue. Here is my dockerized setup that reveals the problem, should be easy to reproduce:

docker-compose.yml

version: "3.9"

services:
  broker:
    image: eclipse-mosquitto:${VERSION}
    command: /usr/sbin/mosquitto -v -c /mosquitto.conf
    volumes:
      - ./mosquitto.conf:/mosquitto.conf

  clients:
    image: eclipse-mosquitto:${VERSION}
    command: >
        /bin/sh -c "
          /usr/bin/mosquitto_sub -h broker -t TEST -i SUB -q 2 -c -d -v -W 5 ;
          /usr/bin/mosquitto_pub -h broker -t TEST -m HELLO -q 2 -i PUB -d ;
          /usr/bin/mosquitto_sub -h broker -t TEST -i SUB -q 2 -c -d -v -W 5
        "
    depends_on:
      - broker

mosquitto.conf (variant 1, fails with mosquitto >= 2.0.12)

per_listener_settings true
listener 1883
allow_anonymous true

mosquitto.conf (variant 2, works fine in all versions)

per_listener_settings false
listener 1883
allow_anonymous true

Run with different broker versions, using either variant 1 or 2 of the config:

VERSION=2.0.10 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.11 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.12 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.13 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.14 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.15 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.16 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.17 docker compose up --force-recreate --abort-on-container-exit
VERSION=2.0.18 docker compose up --force-recreate --abort-on-container-exit

Expected output is something like this:

mosquitto_session-broker-1   | 1695734381: mosquitto version 2.0.11 starting
mosquitto_session-broker-1   | 1695734381: Config loaded from /mosquitto.conf.
mosquitto_session-broker-1   | 1695734381: Opening ipv4 listen socket on port 1883.
mosquitto_session-broker-1   | 1695734381: Opening ipv6 listen socket on port 1883.
mosquitto_session-broker-1   | 1695734381: mosquitto version 2.0.11 running
mosquitto_session-clients-1  | Client SUB sending CONNECT
mosquitto_session-broker-1   | 1695734381: New connection from 172.18.0.3:47330 on port 1883.
mosquitto_session-broker-1   | 1695734381: New client connected from 172.18.0.3:47330 as SUB (p2, c0, k60).
mosquitto_session-broker-1   | 1695734381: No will message specified.
mosquitto_session-broker-1   | 1695734381: Sending CONNACK to SUB (0, 0)
mosquitto_session-broker-1   | 1695734381: Received SUBSCRIBE from SUB
mosquitto_session-broker-1   | 1695734381:      TEST (QoS 2)
mosquitto_session-broker-1   | 1695734381: SUB 2 TEST
mosquitto_session-broker-1   | 1695734381: Sending SUBACK to SUB
mosquitto_session-broker-1   | 1695734386: Received DISCONNECT from SUB
mosquitto_session-broker-1   | 1695734386: Client SUB disconnected.
mosquitto_session-clients-1  | Timed out
mosquitto_session-clients-1  | Client SUB received CONNACK (0)
mosquitto_session-clients-1  | Client SUB sending SUBSCRIBE (Mid: 1, Topic: TEST, QoS: 2, Options: 0x00)
mosquitto_session-clients-1  | Client SUB received SUBACK
mosquitto_session-clients-1  | Subscribed (mid: 1): 2
mosquitto_session-clients-1  | Client SUB sending DISCONNECT
mosquitto_session-clients-1  | Client PUB sending CONNECT
mosquitto_session-broker-1   | 1695734386: New connection from 172.18.0.3:47332 on port 1883.
mosquitto_session-broker-1   | 1695734386: New client connected from 172.18.0.3:47332 as PUB (p2, c1, k60).
mosquitto_session-broker-1   | 1695734386: No will message specified.
mosquitto_session-broker-1   | 1695734386: Sending CONNACK to PUB (0, 0)
mosquitto_session-broker-1   | 1695734386: Received PUBLISH from PUB (d0, q2, r0, m1, 'TEST', ... (5 bytes))
mosquitto_session-broker-1   | 1695734386: Sending PUBREC to PUB (m1, rc0)
mosquitto_session-broker-1   | 1695734386: Received PUBREL from PUB (Mid: 1)
mosquitto_session-broker-1   | 1695734386: Sending PUBCOMP to PUB (m1)
mosquitto_session-broker-1   | 1695734386: Received DISCONNECT from PUB
mosquitto_session-clients-1  | Client PUB received CONNACK (0)
mosquitto_session-broker-1   | 1695734386: Client PUB disconnected.
mosquitto_session-clients-1  | Client PUB sending PUBLISH (d0, q2, r0, m1, 'TEST', ... (5 bytes))
mosquitto_session-clients-1  | Client PUB received PUBREC (Mid: 1)
mosquitto_session-clients-1  | Client PUB sending PUBREL (m1)
mosquitto_session-clients-1  | Client PUB received PUBCOMP (Mid: 1, RC:0)
mosquitto_session-clients-1  | Client PUB sending DISCONNECT
mosquitto_session-clients-1  | Client SUB sending CONNECT
mosquitto_session-broker-1   | 1695734386: New connection from 172.18.0.3:47334 on port 1883.
mosquitto_session-broker-1   | 1695734386: New client connected from 172.18.0.3:47334 as SUB (p2, c0, k60).
mosquitto_session-broker-1   | 1695734386: No will message specified.
mosquitto_session-broker-1   | 1695734386: Sending CONNACK to SUB (1, 0)
mosquitto_session-broker-1   | 1695734386: Sending PUBLISH to SUB (d0, q2, r0, m1, 'TEST', ... (5 bytes))
mosquitto_session-clients-1  | Client SUB received CONNACK (0)
mosquitto_session-broker-1   | 1695734386: Received SUBSCRIBE from SUB
mosquitto_session-clients-1  | Client SUB sending SUBSCRIBE (Mid: 1, Topic: TEST, QoS: 2, Options: 0x00)
mosquitto_session-clients-1  | Client SUB received PUBLISH (d0, q2, r0, m1, 'TEST', ... (5 bytes))
mosquitto_session-clients-1  | Client SUB sending PUBREC (m1, rc0)
mosquitto_session-clients-1  | Client SUB received SUBACK
mosquitto_session-clients-1  | Subscribed (mid: 1): 2
mosquitto_session-broker-1   | 1695734386:      TEST (QoS 2)
mosquitto_session-broker-1   | 1695734386: SUB 2 TEST
mosquitto_session-broker-1   | 1695734386: Sending SUBACK to SUB
mosquitto_session-broker-1   | 1695734386: Received PUBREC from SUB (Mid: 1)
mosquitto_session-broker-1   | 1695734386: Sending PUBREL to SUB (m1)
mosquitto_session-broker-1   | 1695734386: Received PUBCOMP from SUB (Mid: 1, RC:0)
mosquitto_session-clients-1  | Client SUB received PUBREL (Mid: 1)
mosquitto_session-clients-1  | Client SUB sending PUBCOMP (m1)
mosquitto_session-clients-1  | TEST HELLO                         <=============== HELLO received here
mosquitto_session-clients-1  | Timed out
mosquitto_session-broker-1   | 1695734391: Received DISCONNECT from SUB
mosquitto_session-clients-1  | Client SUB sending DISCONNECT
mosquitto_session-broker-1   | 1695734391: Client SUB disconnected.
Summary of my results ("OK" - HELLO received, "NOT OK" - HELLO not received on reconnect): Version Variant 1 Variant 2
2.0.10 OK OK
2.0.11 OK OK
2.0.12 NOT OK OK
2.0.13 NOT OK OK
2.0.14 NOT OK OK
2.0.15 NOT OK OK
2.0.16 NOT OK OK
2.0.17 NOT OK OK
2.0.18 NOT OK OK
torrunes commented 7 months ago

Anyone knows if there is any plans or progress on this bug? I would think having the option of queuing messages while at the same time having several differently configured listeners would be a major feature for many users.

xammmue commented 7 months ago

Anyone knows if there is any plans or progress on this bug? I would think having the option of queuing messages while at the same time having several differently configured listeners would be a major feature for many users.

We also just encountered this in our project. Wasn't easy to find out that the issue of QoS 1 messages not being delivered properly was this (from my pov) unrelated config setting.

llamaonaskateboard commented 3 months ago

After also running into this issue, I managed to track down the problem commit which came in 2.0.12: https://github.com/eclipse/mosquitto/commit/79542158f47ac589b2248ad40102a3c8e545d413

This commit sets mosq->listener = NULL upon disconnect, however mosquitto_acl_check() (called by subs__send() for new messages) requires the listener to still exist if per_listener_settings is true:

if(db.config->per_listener_settings){
        if(context->listener){
            opts = &context->listener->security_options;
        }else{
            return MOSQ_ERR_ACL_DENIED;
        }

I'm not sure how mosq->listener = NULL fixes the original issue of _Fix maxconnections option not being correctly counted. which that commit intended.

@ralight

llamaonaskateboard commented 3 months ago

It looks like the original max_connections issue stems from connect__on_authorised() calling do_disconnect() which in turn calls context__disconnect() if the context already exists upon client connection.

context__disconnect() ends up with net__socket_close() getting called twice in this scenario (decrementing the client_count each time), once directly (_mosq_csduplicate) and then a second time as part of context__add_to_disused()/context__freedisused()/context__cleanup() (_mosq_csdisused) .

The result without mosq->listener = NULL or some other mitigation is that listener->client_count can end up as a negative number after the multiple decrements.

Another possible solution could be as below? If I'm following this all correctly, the state only ends up being _mosq_csduplicate or _mosq_csdisused for persisted or existing clients upon reconnect.

if(mosq->listener && mosq->state != mosq_cs_duplicate && mosq->state != mosq_cs_disused){
    mosq->listener->client_count--;
}
reubenmiller commented 1 week ago

@ralight Is there anything that the community can assist with here?