technyon / nuki_hub

Use an ESP32 as a Hub between a NUKI Lock and your smarthome.
MIT License

Inconsistent lock binaryState/state in MQTT #51

Closed alexdelprete closed 1 year ago

alexdelprete commented 1 year ago

Today I noticed that lock sensor in HA was reporting unlocked when Nuki was locked, I checked MQTT and noticed this:

image

You will notice that lock/binaryState is unlocked but lock/state is locked.

I restarted Nuki Hub to force it to reset the states from the lock and everything was good again.

BTW: is it possible to add a restart button in the UI? (Right now I go into settings and hit save to force it to restart.)

alexdelprete commented 1 year ago

Could indeed be a library problem. As far as I understand, if an incoming message is received while publishing, it can corrupt memory.

So we old dinosaurs of IT still have a good instinct when troubleshooting. ;)

I'll try the updated firmware and keep you guys posted. Thanks a lot for all the support from everybody.

alexdelprete commented 1 year ago

Jan, I see that https://github.com/knolleary/pubsubclient/pull/835 and also https://github.com/knolleary/pubsubclient/pull/843 were never merged.

So from my perspective, it seems the problem is still in the library. Question: is there another library that could be used to manage MQTT communication?

UPDATE:

technyon commented 1 year ago

Also an option:

https://github.com/bertmelis/espMqttClient

Now which one to choose?

alexdelprete commented 1 year ago

The first one (arduino-libraries) should be maintained by the official Arduino Team:

This organization hosts the official libraries maintained or supervised by the Arduino team

I'd stick with the official...if it does everything you need in Nuki Hub.

technyon commented 1 year ago

Here you go, NUKI Hub with ArduinoMqttClient. I had to work out a few quirks of the new lib, but all in all it wasn't too hard to replace. Please go ahead and test.

nuki_hub-7.0-arduino-mqtt-1.zip

Mincka commented 1 year ago

Thanks again for the effort. I hope this helps. Both ESPs are now running 7.0-arduino-mqtt-1. I removed one from the door and put it next to the ESP just to see if it also makes a difference for my connection issue.

Quick question: why is the nuki_hub.bin in the zip above not the same as the nuki_hub.bin in the ZIP artefact for this build (1488 KB vs 1388 KB)? I was thinking we could use the bin from the build artefacts directly.

alexdelprete commented 1 year ago

Here you go, NUKI Hub with ArduinoMqttClient. I had to work out a few quirks of the new lib, but all in all it wasn't too hard to replace. Please go ahead and test.

Wow Jan, I didn't expect it so quickly. Hope it didn't take too much of your time.

I just upgraded and made sure the MQTT topics were "clean". Now let's wait some days to see what happens.

Thank you so much for this. :)

technyon commented 1 year ago

@Mincka I'd need to check ... I think the CI is still running with Arduino core for ESP32 2.0.5, I'm using 2.0.6 ... or maybe some cmake options, or because I'm using ninja instead of make. The difference in size is somewhat significant though, I'll investigate it when I have time.

alexdelprete commented 1 year ago

Jan, I just checked and I don't see malformed topics under the main nukihub topic:

image

The bad news is that all Nuki sensors were unavailable in HA, so I checked the autodiscovery topics and found a problem: they are getting truncated.

image image

Looks like the maximum payload is 256 chars:

{"dev":{"ids":["nuki_1bf77d65"],"mf":"Nuki","mdl":"SmartLock","name":"Portoncino"},"~":"nuki_hub","name":"Portoncino battery voltage","unique_id":"1bf77d65_battery_voltage","dev_cla":"voltage","ent_cat":"diagnostic","stat_t":"~/battery/voltage","state_cla"
{"dev":{"ids":["nuki_1bf77d65"],"mf":"Nuki","mdl":"SmartLock","name":"Portoncino"},"~":"nuki_hub","name":"Portoncino bluetooth signal strength","unique_id":"1bf77d65_bluetooth_signal_strength","dev_cla":"signal_strength","ent_cat":"diagnostic","stat_t":"~/
{"dev":{"ids":["nuki_1bf77d65"],"mf":"Nuki","mdl":"SmartLock","name":"Portoncino"},"~":"nuki_hub","name":"Portoncino battery voltage","unique_id":"1bf77d65_battery_voltage","dev_cla":"voltage","ent_cat":"diagnostic","stat_t":"~/battery/voltage","state_cla"

technyon commented 1 year ago

The default tx buffer size is only 256 bytes. Upped it to 6144, please give it a try.

nuki_hub-7.0-arduino-mqtt-2.zip
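
To illustrate why a 256-byte transmit buffer makes the sensors unavailable, here is a minimal Python sketch. It assumes the client simply cuts the payload at the buffer size, which matches the truncated payloads shown above but is not confirmed from the library source. The helper names and the fields after `stat_t` in the sample payload are made up for illustration.

```python
import json


def truncate_to_buffer(payload: str, buffer_size: int) -> str:
    """Simulate a client that silently drops every byte beyond its tx buffer."""
    return payload.encode("utf-8")[:buffer_size].decode("utf-8", errors="ignore")


def is_valid_json(text: str) -> bool:
    """Return True only if the text parses as complete JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical discovery payload, modeled on the truncated ones in this thread.
# The values after "stat_t" are assumptions, not the real Nuki Hub output.
discovery = json.dumps({
    "dev": {"ids": ["nuki_1bf77d65"], "mf": "Nuki", "mdl": "SmartLock", "name": "Portoncino"},
    "~": "nuki_hub",
    "name": "Portoncino battery voltage",
    "unique_id": "1bf77d65_battery_voltage",
    "dev_cla": "voltage",
    "ent_cat": "diagnostic",
    "stat_t": "~/battery/voltage",
    "state_cla": "measurement",
    "unit_of_meas": "V",
})

small = truncate_to_buffer(discovery, 256)    # old default buffer size
large = truncate_to_buffer(discovery, 6144)   # enlarged buffer size

print(len(discovery), is_valid_json(small), is_valid_json(large))
```

Any strict prefix of a JSON object is invalid JSON, so a consumer such as HA cannot parse the truncated discovery message, while the payload fits unchanged in the larger buffer.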

alexdelprete commented 1 year ago

The default tx buffer size is only 256 bytes. Upped it to 6144, please give it a try.

All good now. And no malformed topics after almost 24h. :)

Please remember to fix the unit of measurement, you closed #72 with 6.11, but I still see db instead of dBm for BT/Wifi RSSI.

technyon commented 1 year ago

I'm confused; I thought it was about being lowercase, so I changed db to dB. Is this not a valid unit?

@rodriguezst Could you check if encrypted MQTT still works with the new library?

alexdelprete commented 1 year ago

You are right, I was still reading it as lower-case...need to wear my eyeglasses more often. :)

rodriguezst commented 1 year ago

I'm confused; I thought it was about being lowercase, so I changed db to dB. Is this not a valid unit?

@rodriguezst Could you check if encrypted MQTT still works with the new library?

Hello @technyon. I have not been following the project lately but I just updated from 6.0 to 7.0-arduino-mqtt-2 and everything seems to be working with encrypted MQTT with the new lib:

image

Ignore the opener's undefined status... the batteries ran out a few days ago and I haven't replaced them :) Thank you!

technyon commented 1 year ago

@rodriguezst Changed from PubSubClient to Arduino MQTT because the former has some memory corruption issues. Thanks for the update

Let's test it a bit longer. If everything works, I'll merge it and prepare a release.

Mincka commented 1 year ago

No malformed topic on 7.0-arduino-mqtt-1 either. Using 7.0-arduino-mqtt-2 now on both locks.

alexdelprete commented 1 year ago

No malformed topic on 7.0-arduino-mqtt-1 either. Using 7.0-arduino-mqtt-2 now on both locks.

I thought we had solved it... :(

image

technyon commented 1 year ago

Well damn. I'm still in favor of using the new library. PubSubClient seems to be unsupported, with no commits for two years.

alexdelprete commented 1 year ago

I agree. It's much better. Since my last report (18h ago), no malformed topics.

And this library is officially supported by the Arduino team.

technyon commented 1 year ago

Let's test it one more day, and I'll do a new release tomorrow.

alexdelprete commented 1 year ago

Let's test it one more day, and I'll do a new release tomorrow.

Agreed. Maybe that malformed topic was an old one I forgot to delete...;)

Mincka commented 1 year ago

Still not seeing this issue on my side, but it was always very random and often appeared only after a few days.

alexdelprete commented 1 year ago

Still not seeing this issue on my side, but it was always very random and often appeared only after a few days.

With the previous library, I saw malformed topics at least every 2 days, sometimes even more often.

I would wait another day (3 days) just to be on the safe side.

alexdelprete commented 1 year ago

Only two minor issues left:

  1. Even if the Nuki Opener is disabled, the topic is created:

image

  2. A small typo:

image

technyon commented 1 year ago

I went ahead and released it as 7.0 ... the new library is better anyway. Fingers crossed the garbled topics are gone.

alexdelprete commented 1 year ago

I went ahead and released it as 7.0 ... the new library is better anyway. Fingers crossed the garbled topics are gone.

Makes perfect sense. Let's see how it goes... :)

alexdelprete commented 1 year ago

Guys, do we consider this malformed (drain)?

image

technyon commented 1 year ago

Can anyone observe messages like in issue #87 or #88 ?

mundschenk-at commented 1 year ago

Yes, both. The RSSI update frequency is up to every other second. In addition, I do still see frequent Mosquitto error messages and there seems to be binary data in the RSSI topics:

Can't decode payload b'1\xce\x01' on nuki/lock/rssi with encoding utf-8 (for <Job HassJobType.Callback <function MqttSensor._prepare_subscribe_topics.<locals>.message_received at 0x7f8358f640>>)
Can't decode payload b'1\xf3\x01' on nukiopener/lock/rssi with encoding utf-8 (for <Job HassJobType.Callback <function MqttSensor._prepare_subscribe_topics.<locals>.message_received at 0x7f835e9900>>)
Can't decode payload b'1\xf4\x01' on nuki/lock/rssi with encoding utf-8 (for <Job HassJobType.Callback <function MqttSensor._prepare_subscribe_topics.<locals>.message_received at 0x7f8358f640>>)
Can't decode payload b'1\xf3\x01' on nuki/lock/rssi with encoding utf-8 (for <Job HassJobType.Callback <function MqttSensor._prepare_subscribe_topics.<locals>.message_received at 0x7f8358f640>>)
Can't decode payload b'\xe0\x001' on nuki/maintenance/wifiRssi with encoding utf-8 (for <Job HassJobType.Callback <function MqttSensor._prepare_subscribe_topics.<locals>.message_received at 0x7f83f453f0>>)

Edit: I indeed see both issues and have commented there.
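
The decode errors above can be reproduced outside of HA. The following is a minimal Python sketch, not HA's actual code: `parse_rssi` is a hypothetical helper, and `b"-67"` is an invented example of a well-formed payload. The other three byte strings are copied from the log lines above.

```python
from typing import Optional


def parse_rssi(payload: bytes) -> Optional[int]:
    """Return the RSSI as an int, or None if the payload is not clean integer text."""
    try:
        return int(payload.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        # Covers both invalid UTF-8 and text that is not an integer.
        return None


samples = [
    b"-67",          # invented well-formed value
    b"1\xce\x01",    # from nuki/lock/rssi
    b"1\xf3\x01",    # from nukiopener/lock/rssi
    b"\xe0\x001",    # from nuki/maintenance/wifiRssi
]

results = [parse_rssi(p) for p in samples]
print(results)
```

In each failing payload a byte that announces a multi-byte UTF-8 sequence (`0xce`, `0xf3`, `0xe0`) is followed by a byte that cannot continue it, so strict decoding fails. That is consistent with stray binary data ending up in the payload rather than with a text encoding mismatch.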

alexdelprete commented 1 year ago

Can anyone observe messages like in issue #87 or #88 ?

Can you check if the new library has an option for persistent connections? It seems to open a new connection every time it needs to communicate with the broker. That is a common problem found also in other projects that communicate with an MQTT broker.

technyon commented 1 year ago

There's not that much to configure:

technyon commented 1 year ago

Actually, not so fast! Persistent connections in MQTT are configured via the clean sessions flag (what a misleading name). Setting it to false makes connections persistent; of course, it defaults to true.

Try this binary with clean sessions set to false:

nuki_hub-7.0-cs-false.zip
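
For reference, this is where the flag lives on the wire. The sketch below builds a bare MQTT 3.1.1 CONNECT packet by hand in Python. It is only an illustration of the protocol layout (no username, password or will), not what ArduinoMqttClient does internally, and the function names and the `nukihub` client ID are assumptions. Note that the flag governs the session state the broker keeps for a client ID across reconnects, rather than the TCP connection itself.

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode the MQTT variable-length 'remaining length' field."""
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80  # continuation bit: more length bytes follow
        out.append(byte)
        if n == 0:
            return bytes(out)


def build_connect(client_id: str, clean_session: bool, keepalive: int = 60) -> bytes:
    """Build a minimal MQTT 3.1.1 CONNECT packet."""
    flags = 0x02 if clean_session else 0x00  # bit 1 of the connect flags byte
    variable_header = (
        b"\x00\x04MQTT"                  # length-prefixed protocol name
        + bytes([0x04])                  # protocol level 4 = MQTT 3.1.1
        + bytes([flags])
        + keepalive.to_bytes(2, "big")
    )
    cid = client_id.encode("utf-8")
    payload = len(cid).to_bytes(2, "big") + cid
    body = variable_header + payload
    return b"\x10" + encode_remaining_length(len(body)) + body


def connect_flags(packet: bytes) -> int:
    """Extract the connect flags byte from a CONNECT packet."""
    i = 1
    while packet[i] & 0x80:   # skip the remaining-length bytes
        i += 1
    i += 1                    # first byte of the variable header
    return packet[i + 7]      # 6 bytes protocol name + 1 byte level


persistent = build_connect("nukihub", clean_session=False)
clean = build_connect("nukihub", clean_session=True)
print(persistent.hex(), hex(connect_flags(persistent)), hex(connect_flags(clean)))
```

A persistent session only helps if the client ID stays the same between connections, because the broker looks up stored session state by client ID.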

Mincka commented 1 year ago

Finally, I also got a buggy one: image

Now running 7.0-cs-false.

alexdelprete commented 1 year ago

Persistent connections in MQTT are configured via the clean sessions flag

Do we use a dynamic clientID? That could be a problem. It would also be good to have an option in the MQTT and Network Config section to define a specific clientID (with a default, maybe) and to choose whether the Clean Session flag is on or off (default off).

ref. https://www.emqx.com/en/blog/mqtt-session

image

I also hope the library takes care of this:

image

alexdelprete commented 1 year ago

Finally, I also got a buggy one:

You had the malformed one with the clean session firmware or with the previous one?

I don't think it's a coincidence RSSI topic is often the malformed one, it's the one that is updated more frequently.

alexdelprete commented 1 year ago

@technyon RSSI is updating every 1-2s, I think this is a problem if the value comes from the Nuki BLE API, because it will drain the battery. If it's the RSSI on the ESP32 side, no battery issue, but anyway it's too chatty, it should be updated like the other values.

I didn't have any malformed topics, I checked before upgrading to the 7.0 cleansession fw.

I can confirm from my EMQX console that the Clean Session flag is false and Session Expiry Interval is 2h, the device ID is static and it's the hostname.

image

Mincka commented 1 year ago

Finally, I also got a buggy one:

You had the malformed one with the clean session firmware or with the previous one?

With the previous one.

technyon commented 1 year ago

There's no battery drain from the bluetooth rssi value. The lock (or opener) sends beacons all the time anyway, the ESP just picks them up. This is actually what is changed by the "energy-saving" setting in the app. Depending on whether you set it to slow or fast, more or fewer beacons are sent, hence the battery drains faster on fast.

alexdelprete commented 1 year ago

the ESP just picks them up

ok, that's what I needed to clear, thanks.

one issue remains: the 1s update is also being picked up by HA through MQTT, and usually it's not recommended to have an entity recorded that updates so frequently. Is there any chance we can have that throttled in some way? An option to align the frequency update to the other sensors?

I think the fact that the RSSI topic is frequently the one being malformed is due to this frequency. It stresses things a little bit too much. :)
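
The throttling idea could look roughly like the sketch below. It is written in Python purely to show the logic; the firmware itself is C++, and the class name, the 60-second interval and the injected clock are all assumptions made for the example, not existing Nuki Hub settings.

```python
import time
from typing import Callable, Optional


class ThrottledPublisher:
    """Forward a value to `publish` at most once per `min_interval_s` seconds."""

    def __init__(
        self,
        publish: Callable[[int], None],
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._publish = publish
        self._min_interval = min_interval_s
        self._clock = clock
        self._last_time: Optional[float] = None

    def update(self, value: int) -> bool:
        """Publish the value if the interval has elapsed; return True if published."""
        now = self._clock()
        if self._last_time is not None and now - self._last_time < self._min_interval:
            return False  # too soon since the last publish, drop this reading
        self._publish(value)
        self._last_time = now
        return True


# Simulated run with a fake clock so the example does not need to sleep.
sent = []
fake_now = [0.0]
publisher = ThrottledPublisher(sent.append, min_interval_s=60.0, clock=lambda: fake_now[0])

for t, rssi in [(0, -60), (1, -61), (2, -59), (60, -62), (61, -63)]:
    fake_now[0] = float(t)
    publisher.update(rssi)

print(sent)
```

With beacons arriving every second, only the first reading and the one 60 seconds later reach the broker, which would bring the RSSI topic in line with the slower sensors.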

Mincka commented 1 year ago

I don't see an issue with this personally. The broker is getting lots of updates per second and can handle this without issues. Since it has no impact on the Nuki battery, throttling mechanism would just add more complexity in the code without being sure that will fix anything, it's just hypothetical.

In the end, what's the impact of having a malformed maintenance or wifi topic once in a while? You just miss an update of a non-critical sensor. We can live with that. We have no proof that it's related to reboots. I have reboots without malformed requests.

mundschenk-at commented 1 year ago

Actually, not so fast! Persistent connections in MQTT are configured via the clean sessions flag (what a misleading name). Setting it to false makes connections persistent; of course, it defaults to true.

Try this binary with clean sessions set to false:

Does not seem to have helped much (if at all), the Mosquitto log still looks the same (lots of connects, including error messages about Bad client <nukihub> sending multiple CONNECT messages.).

mundschenk-at commented 1 year ago

@technyon, while I understand the wish to change to a maintained library, the results with the new one are not good so far. I have created #90 for the new issue of binary data in the payload. It looks like the new library has a buffer corruption problem as well (albeit a different one). For me, the most stable version was https://github.com/technyon/nuki_hub/issues/51#issuecomment-1383137614.

alexdelprete commented 1 year ago

Does not seem to have helped much (if at all), the Mosquitto log still looks the same (lots of connects, including error messages about Bad client <nukihub> sending multiple CONNECT messages.).

That is really strange, I don't have these multiple connect messages warnings/errors. And if the client is using a persistent connection, there shouldn't be. Can you check with your broker that nuki_hub is actually using a persistent connection? I verified with mine and it's using it.

the results with the new one are not good so far

In my setup, this version is the best one so far: no malformed topics, just one binary payload one time, and that's it. No problems apart from that one.

alexdelprete commented 1 year ago

I don't see an issue with this personally. The broker is getting lots of updates per second and can handle this without issues.

Like I wrote above, it's not about the broker; MQTT brokers are designed for heavy loads. The problem is the HA recorder. Some users with an RPi might have issues, since very frequent updates are not a good fit for HA on that kind of setup. I have no issues myself, as HA is running on a pretty good server with SSDs etc., but I'm thinking about other users. They could always exclude the RSSI entity from the recorder, but honestly, I see no value in updating that sensor every second. It should update like the others.

In the end, what's the impact of having a malformed maintenance or wifi topic once in a while? You just miss an update of a non-critical sensor.

The RSSI is not a critical sensor either, so why update it every second?

We have no proof that's related to reboots

I didn't say it causes reboots, I said it's contributing to malformed topics. In my case the RSSI topic is usually the malformed one. And I don't believe in coincidences when they are so frequent.

mundschenk-at commented 1 year ago

Can you check with your broker that nuki_hub is actually using a persistent connection? I verified with mine and it's using it.

How would I do that?

alexdelprete commented 1 year ago

How would I do that?

I don't know with Mosquitto, that's one of the reasons I chose EMQX vs Mosquitto.

With EMQX I can check the Clean Session flag via the admin UI:

image

Maybe with mosquitto there's some cli command or maybe in the logs...

mundschenk-at commented 1 year ago

Looks like it's not possible without storing the client IDs manually on connect. Maybe I should switch broker as well.

alexdelprete commented 1 year ago

Looks like it's not possible without storing the client IDs manually on connect. Maybe I should switch broker as well.

Nuki Hub uses the hostname as client ID when it connects, it's not dynamic. What do they mean by "manually"?

mundschenk-at commented 1 year ago

Looks like it's not possible without storing the client IDs manually on connect. Maybe I should switch broker as well.

Nuki Hub uses the hostname as client ID when it connects, it's not dynamic. What do they mean by "manually"?

That you have to code the logic yourself. See this Stack Overflow post for details: https://stackoverflow.com/questions/9767040/get-a-list-of-connected-client-ids-from-mqtt-client

alexdelprete commented 1 year ago

That you have to code the logic yourself.

Luckily, when I chose what broker to use for my homelab, I spent quite some time doing my research. I will never regret that decision. EMQX v5 is highly recommended if you want to properly manage the broker.