peterhinch / micropython-mqtt

A 'resilient' asynchronous MQTT driver. Recovers from WiFi and broker outages.
MIT License
549 stars 116 forks source link

Interest in library improvements #75

Closed bobveringa closed 2 years ago

bobveringa commented 2 years ago

Hi,

We use a heavily modified version of this library on our devices. I am currently looking at simplifying our external decencies by making removing our own modifications from them. However, after Almost a year of using some version of this library, the modifications have stacked up quite high. I would like to contribute some changes we have made back to this repository. However, this will cost me quite a bit of time, so before I start on that, is there any interest in the changes we have made?

Here is a list of changes we have made or are planned short-term (I have only included the changes that make sense to include).

peterhinch commented 2 years ago

I'm definitely interested in improvements and (especially) bugfixes. A few caveats:

Re adding descriptions to errors, this is fine so long as it doesn't add too much to RAM use.

It might be useful to know the general background behind what you are doing with the library if you're free to tell.

bobveringa commented 2 years ago

I am currently busy with reworking our modifications of the library and separating code that is only needed internally from code that is relevant to the library. As it stands right now, It should be possible to keep the API the same. There might be one exception to this, where we have added a sta_if.disconnect() call the .close() function in other to resolve some issues with sleep. But we will discuss these changes in the related MR, I assume.

As far RAM usage, I haven't noticed an increase. But our use might be slightly different from was most users are doing. We have a wearable running MicroPython, and somewhat simplified it is a data collection device that sends all of its data (like movement data) to the cloud using MQTT where it is then processed. Our data rates are not that fast, but if the device is active, it sends ~1 KB/s of data over MQTT. If the device is not used, it goes to sleep automatically, and also wakes up automatically. And after sleep, it immediately tries to reconnect to our cloud services again.

After trying many other libraries, we have settled on this one, as it was by far the best in reconnecting to the MQTT broker after loss of Wi-Fi signal.

We used to have some issues with this library, not directly that it was using too much memory, but that it was fragmenting our available memory. After implementing some memory improvements in the .as_read() and .as_write() functions, we have had very few issues.

I don't think I will be able to include all the changes we have made, as some things are probably too niche for most users, and would require extensive documentation. Or they depend on one of our custom C modules.

peterhinch commented 2 years ago

Thanks for that, it helps to see where you are coming from.

One point which may be already evident is that we didn't pay a lot of attention to what happens when initial connection fails. This is because we envisaged a user checking for connectivity at the REPL, with failure requiring manual intervention. After this the device's main.py would be edited, USB would be disconnected and the device would run from its normal power source. The assumption was that initial connection would then proceed normally or be manually investigated.

There are applications such as this test script which go into deepsleep periodically, where "initial connection" happens periodically and programmatically. There may well be issues around this: this mode has not been exhaustively tested. It sounds like you have acquired a lot of experience operating this way.

bobveringa commented 2 years ago

We use the initial connection behavior as a sort of validation check. Using the perhaps somewhat naive approach that if we can connect once, then we should be able to do it again. And during initial connection is where most of the error happens, so it is nice that you get some feedback from the function call if things go wrong.

And for our application, we have written a wrapper around the library that deals with a lot of the complicated interaction stuff. That wrapper handles the initial connection with a while True: in an async function, and then handles any errors. It also does some more complicated actions like allowing subscriptions before a connection is established.

With the rework, I have been looking at ways to keep the driver as is, but still have the custom Wi-Fi connect features we need. I have come up with something that allows this, which should hopefully make it a lot easier for us to contribute changes back here in the future.

bobveringa commented 2 years ago

I think this issue can be closed.