halfgaar / FlashMQ

FlashMQ is a fast light-weight MQTT broker/server, designed to take good advantage of multi-CPU environments
https://www.flashmq.org/
Open Software License 3.0
173 stars 24 forks source link

Another protocol violations or bugs in FlashMQ #104

Closed songxpu closed 1 month ago

songxpu commented 1 month ago

Hi @halfgaar, some findings still have not been reported in #103, so I have submitted this issue and apologize for any inconvenience caused.

Running command:

./FlashMQBuildRelease/flashmq -c /home/ubuntu/conf/flashmq.conf

flashmq.conf:

listen {
    protocol mqtt
    inet_protocol ip4
    inet4_bind_address 0.0.0.0
    port 1883
}

bridge {
    address 192.168.222.128
    port 1884
    tls off
    protocol_version mqtt5
    keepalive 600
    bridge_protocol_bit false
    publish topic 1
    subscribe topic 1
}

allow_anonymous true
log_file    /tmp/flashmq/flashmq.log
storage_dir /tmp/flashmq
songxpu commented 1 month ago

According to the specification of MQTTv3.1.1 and MQTTv5.0:

[MQTT-3.1.0-1]
After a Network Connection is established by a Client to a Server, the first packet sent from the Client to the Server MUST be a CONNECT packet.

However, FlashMQ unexpectedly responded to any PINGREQ initiated by clients who had not yet established an MQTT connection with the broker, that is, the first request could be a PINGREQ message instead of a CONNECT, which violates protocol specification.

echo c000  | xxd -p -r | nc 172.17.0.5 1883

image


Similarly, as defined by the specification, PINGREQ is sent from the client to the server.

However, this seems to be violated in bridged mode, where FlashMQ and the remotely bridged broker have the identities of client and server, respectively.

It is permissible behavior for FlashMQ to initiate a PINGREQ to the remote broker (IP 192.168.222.128) and receive a PINGRES.

However, I found that if the remote broker accidentally sends a PINGREQ message to FlashMQ and FlashMQ returns a PINGRESP, this seems to be a bug, and I've noticed that other open-source MQTT brokers (EMQX、HiveMQ、Vernemq、NanoMQ) don't seem to have this problem.

I think this is essentially due to the mistake of treating the remote broker and the local client as the same identity, but essentially they are different.

image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

[MQTT-3.3.4-6] 
A PUBLISH packet sent from a Client to a Server MUST NOT contain a Subscription Identifier.

However, if we send such a packet:

echo 106100044d5154540540b3b037119afb60e317001901215c5326000f766e72366d4541644d78553c44327800173049574d36324268715a6179524b5a62536749534a31360013317675535434755733374e64397846585a38570008676235716d5836363554000841684b3146454962f71a16010109000c79a506aff5eef39ed5210cd60bba849b523155596942337761334c376e765936573739413862666a46414e4e3172647544345773415778724a6667386d3258653363 | xxd -p -r | nc 172.17.0.5 1883

FlashMQ unexpectedly returned a response instead of rejecting such a request. image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

[MQTT-3.4.2-1] 
The Client or Server sending the PUBACK packet MUST use one of the PUBACK Reason Codes.

# note that Table 3‑4  define the PUBACK Reason Codes.

However, if we send such a PUBACK (with invalid reason codes like 0x01) to FlashMQ, FlashMQ seems to have received such messages normally.

# the packet contain CONNECT -> PUBACK -> PUBLISH (QoS1)
echo 10800100044d51545405c20001411701212e3b22464426001a5a554c7742485a7a676a6e6579664f77447230496a374a46304c001a38436f4c643657616b6a6a4e74306c436b6f78536b593143356c001b6d6c33456b677234313272796d6d707a5a7664534e6c735965594200086732514e52776a38000b4743684f424f6c7a7967624004b65d0100320f0005746f706963000100746f706963  | xxd -p -r | nc 172.17.0.5 1883

The response is CONACK->PUBACK, which proves that the FlashMQ received PUBACK was not rejected as expected.

image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

[MQTT-3.6.1-1]

Bits 3,2,1 and 0 of the Fixed Header in the PUBREL packet are reserved and MUST be set to 0,0,1 and 0 respectively. The Server MUST treat any other value as malformed and close the Network Connection.

However, the situation is different:

# The PUBREL (the reserve field in fixed header is 0, 0, 0, 0)
echo 107900044d5154540574625e0a1700190121de6022e03400126c5141593872614674494b73536a7956444d250101027f4801a708001b4a43394d424c416c5033654d704d457a626a5177527259663777330008465a416b4f4f4367001638344d77536b3430774f526451496463617259773337000636725930564c603ac6b000361f0017736d73496a39436331524a6d7541307644425a373968792600136e656f3666454367794c4a457948686e64696f000479326175 | xxd -p -r | nc 172.17.0.5 1883 

The expected behavior was considered illegal and the network connection was closed, but the corresponding request was unexpectedly received.

image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

It is a Protocol Error to include Request Problem Information more than once, or to have a value other than 0 or 1.

If we send a Connect (the value of Resequest Problem Information is set to 2), FlashMQ returns the CONNACK with successful code unexpectedly.

echo 105600044d5154540580404724118580d11d16000478697441170219012600075869563769414100083237746a6f4c3972001169587932653251324d6a3845697435517100126e63424f613833736f464e4e4d5944366442 | xxd -p -r | nc 172.17.0.5 1883

image image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

It is a Protocol Error to include Authentication Data if there is no Authentication Method.

Replay:

echo 107500044d515454054069d62216001a3836444175657452417668423854684a735a7474744935793341274e45fab6001974496e5645793058343872467a735361747763776c6849787a001d575a796d784c4e55317a335a7175744936434e673748576f34307a4f42000c4b4c32484b46385933343065 | xxd -p -r | nc 172.17.0.5 1883

image

image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Receive Maximum value more than once or for it to have the value 0.

Replay such packet:

echo 105100044d5154540540ea5c2e11a5b8420316001956343176417a6c647a593745557238787133336b526e324850190021e74721e7472785e0551c000000147648777435796e39333565667a586a5773763766 | xxd -p -r | nc 172.17.0.5 1883

image

image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Message Expiry Interval more than once.

Replay such packet:

echo 108a0100044d515454059469d30519012181a700154e4838466b6a54617152696e326c797052316b50664103001c4935556b465530434d3778366d736371444e5a777a6b35654637386508000950624a747043434d4a0900132cc6f6912ac734b5e6a8b2acdc301886dd273600095a6f6a6632473268710007526a744d756576000b64364d61664257353733593c8d0100036e6b45dd3b510252ab83340252ab833403001854556873333351354b6f42643261696c4258364c6a547138080005724f4a4e6a09001e06ec9601ab2f750b9df8c182d84259566a61556c59d1a95fd38688652caa23409657777a6d484d344e5342766631556c5248653130516865757a665768594b787957636c5a7143394b713848737070504942687469 | xxd -p -r | nc 172.17.0.5 1883

The expected behavior is to consider this message as a protocol error, but unexpectedly received the corresponding PUBREC. image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Response Topic more than once.

Replay:

echo 10ae0100044d51545405c6f8d7021701000f46356d30636e4b34633958413030332c020748306803000026000449454148001b366f694536686241734132583462704368426b6a77355832314d450018416b484869364752656558436c53725a36534a6f5a393962001733555a41655261594764374341486f4f4a78474d756c300014425243504679514b6e42534a4373453552434c570018496451446b6a736d44366a66796d513556557a7a6c4a68703d4a00025959c10f1a010002d31586140800013108000132260003417968000350754f4e616b6937656a796a696e77434a66775573544e377376644f4c38544f32434e6431365972304c4f6ece00 | xxd -p -r | nc 172.17.0.5 1883

The expected behavior is to consider this message as a protocol error, but unexpectedly receive the corresponding PUBREC.

image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Content Type more than once.

Replay such packet:

echo 10950100044d515454054e28d00a1120b290b627036e6116000261453d01000800074679685377495218ef612d5926000c72376a72376d747062733257001b735a595569453539795a487333616b4833494e77496b4762494157000658415873325a00166c47547172693658526b536f4e724666496d486f5573001c306d54567544763978426f526b59374c3672346a433142514c503838318d010014556b7768636d324d5371307667584f696943623351010103001a4e624553556651574a394b354f58367333397a6b6a636157373203001a4e624553556651574a394b354f58367333397a6b6a636157373308000d6856344a4e7873487062727a390bbeaee12f52675542547a5373475876796d644d68373161785a336d715430305a414b5346524c746e55ce00  | xxd -p -r | nc 172.17.0.5 1883

I debug with gdb and found it seems that FlashMQ is iteratively reading multiple ContentType property values and saving them together.

image

image

songxpu commented 1 month ago

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Topic Alias value more than once.

The situation is the same as the above. Replay such packet:

echo 101000044d5154540502003c03210014000030150006746f706963310623000a23000b313233313233 | xxd -p -r | nc 172.17.0.5 1883
halfgaar commented 1 month ago

Hi @halfgaar, some findings still have not been reported in #103, so I have submitted this issue and apologize for any inconvenience caused.

No apologies necessary. I just didn't know you weren't done. I'm OK with you opening a separate issue per finding to avoid that problem.

Anyway, fixes are underway. I'll push some code soon.

Some open up some considerations. Like the invalid puback code. FlashMQ doesn't actually do anything with it; the only thing it can do is clear the message from the queue, which it must do always. Checking it would just add overhead to a potentially hot path of the code. :thinking:

bigsmoke commented 1 month ago

According to the specification of MQTTv5.0:

[…]

I don't understand why and who marked this excellent, well-written comment by @songxpu as “abuse”. I know for a fact that @halfgaar loves to receive such detailed bug reports. Excellent stuff, @songxpu! 👏

songxpu commented 1 month ago

Thanks, I'm honored to help make FlashMQ better!

halfgaar commented 1 month ago

I addressed most of the issues in this branch: https://github.com/halfgaar/FlashMQ/tree/more-protocol-fixes

It's not final yet.

Currently, two are in an undecided state:

JaylinYu commented 1 month ago

According to the specification of MQTTv5.0:

[…]

I don't understand why and who marked this excellent, well-written comment by @songxpu as “abuse”. I know for a fact that @halfgaar loves to receive such detailed bug reports. Excellent stuff, @songxpu! 👏

How come a smiling face is "abuse"?

halfgaar commented 1 month ago

According to the specification of MQTTv5.0:

[…]

I don't understand why and who marked this excellent, well-written comment by @songxpu as “abuse”. I know for a fact that @halfgaar loves to receive such detailed bug reports. Excellent stuff, @songxpu! 👏

How come a smiling face is "abuse"?

That comment was marked as abuse. I unmarked it again. I'm not sure what happened there.

halfgaar commented 1 month ago

I had forgotten one mqtt5 property to check for 'more than once'. I fixed that. It's been pushed to master. A new release with this included will be made soon.

I decided to leave the ping/pong and suback codes unchanged. If you can show how it will actually be a problem, you can always open a new issue, or perhaps a discussion.

Thanks for the reports, and the good reproduction code :+1:. These 'not more than once' things were somewhere on my TODO list, so your report was good incentive.