hiveeyes / terkin-datalogger

Datalogger for MicroPython and CPython.
https://terkin.org
GNU Affero General Public License v3.0
60 stars 28 forks source link

Adjust software tests to use Mosquitto 2.0.12 #97

Closed amotl closed 3 years ago

amotl commented 3 years ago

Hi there,

in order to investigate https://github.com/micropython/micropython-lib/issues/445, this bumps the Mosquitto version from 2.0.11 to 2.0.12. See also [1] ff.

With kind regards, Andreas.

[1] https://community.hiveeyes.org/t/upgrade-to-mosquitto-2-0/4154

amotl commented 3 years ago

Indeed, when invoking pytest -vvv --capture=no -k test_uplink_wifi_mqtt, the test suite croaks with

        resp = self.stream.read(4)
        assert resp[0] == 0x20 and resp[1] == 0x02
        if resp[3] != 0:
>           raise MQTTException(resp[3])
E           umqtt.MQTTException: 2

dist-packages/umqtt.py:110: MQTTException

So, we can confirm something changed with Mosquitto 2.0.12 breaking MicroPython's umqtt.simple module, whereas it was able to connect to Mosquitto 2.0.11 flawlessly.

/cc @cinadr, @ralight

ralight commented 3 years ago

@amotl Any hints on getting this running on a normal (linux) desktop?

Mosquitto 2.0.12 adds a lot of fixes for strict protocol compliance, so it's likely that is the cause. If you could get a wireshark/tcpdump trace of the test being run I'm sure we could figure out what was happening.

amotl commented 3 years ago

Dear Roger,

thank you very much for responding here, I just created this in a hurry today while travelling. I am also suspecting the "Strict protocol compliance fixes, plus test suite." changes to be the cause here, without having looked into any details yet. So, we will probably have to aim at making MicroPython's umqtt.simple module comply in the long run.

On the other hand, many devices might be out there running this module which are not easily upgradable without further ado, so it might be sweet to add a backwards-compatibility option to Mosquitto and release it on behalf of a 2.0.13 version.

In any case, because a full Terkin installation is rather heavy, I will provide you with a short and concise repro which can be used to investigate the problem further.

With kind regards, Andreas.

ralight commented 3 years ago

The change that applies is:

- Fix `max_keepalive` not applying to MQTT v3.1.1 and v3.1 connections.
  These clients are now rejected if their keepalive value exceeds
  max_keepalive. This option allows CVE-2020-13849, which is for the MQTT
  v3.1.1 protocol itself rather than an implementation, to be addressed.

The max_keepalive option has been around since version 1.6, but it didn't apply to MQTT v3.x connections, which meant it was impossible to stop malicious clients using up resources. umqtt.simple sets a keepalive time of 0 by default, which means "never expire this connection". I'd argue that this is really bad behaviour :) At the moment it's not possible to set max_keepalive to 0, which would fix the problem for umqtt.simple. I'll see about getting that changed - but I'd also really suggest changing the default umqtt.simple keepalive value.

amotl commented 3 years ago

Dear Roger,

thanks a stack for sharing your insights. In order to make this easily reproducible with the umqtt module for MicroPython, I created a dedicated repository at [1].

On top of that, both https://github.com/daq-tools/umqtt-example/pull/1 and https://github.com/daq-tools/pycopy-lib/commit/fe68f4186 reflect the improvements based on your suggestion. Do you believe that 60 seconds is a reasonable default value for the keepalive setting from the perspective of a MQTT client library?

With kind regards, Andreas.

[1] https://github.com/daq-tools/umqtt-example [2] https://github.com/micropython/micropython-lib/tree/master/micropython

P.S.: I will also try to carry this update forward to the original umqtt.{simple,robust} modules [2] as well. For doing that, please confirm if you think 60 seconds (as a default) is reasonable.

amotl commented 3 years ago

With https://github.com/daq-tools/pycopy-lib/commit/fe68f4186 being merged, I believe this is good to go. We can have any further discussions about improving upstream umqtt.simple at https://github.com/micropython/micropython-lib/issues/445.

Thanks again for your quick help, @ralight!

ralight commented 3 years ago

Just to wrap this up my side, yes, 60 seconds is a good value - but it does mean that either the client library or the end user application needs to keep on top of the keepalive timeout. In client libraries I have written the library does this itself, but I presume it may be more difficult handling this on a micro.

I've also pushed a commit to mosquitto to allow max_keepalive to be set to 0: https://github.com/eclipse/mosquitto/commit/d942ed7eec1619bc60d3997fecddb85b2fecaa37

amotl commented 3 years ago

Thank you, @ralight. For responding so quickly and for your excellent work on Mosquitto - keep up the spirit!

Maddox-zephyr commented 3 years ago

I made the change in my copy of umqttsimple.py to use the 60 second keepalive. Now I can send messages just fine, but on one of my nodes that was receive only, I only receive messages for the duration of the keepalive time. I changed the keepalive time to 180 and sure enough, I stop receiveing messages after 3 minutes. During this time I was not sending any messages. I then changed my code to send a message once every 45 seconds (arbitrary) and the connection stays alive. Just a note to others who may have had receive only mqtt nodes to send your own keepalive messages