technyon / nuki_hub

Use an ESP32 as a Hub between a NUKI Lock and your smarthome.
MIT License

[9.0-beta/master] No valid Opener config received #438

Closed mundschenk-at closed 2 months ago

mundschenk-at commented 2 months ago

PROBLEM DESCRIPTION

After updating to 9.0 build 10086733497.8.1, the ring detect entities (event and binary sensor) are missing.

REQUESTED INFORMATION

Make sure you have performed every step and checked the applicable boxes before submitting your issue. Thank you!

TO REPRODUCE

Clean retained MQTT messages and update/restart NH.

EXPECTED BEHAVIOUR

The entities should exist.

SCREENSHOTS

N/A

ADDITIONAL CONTEXT

N/A

(Please, remember to close the issue when the problem has been addressed)

iranl commented 2 months ago

Your cnfInfoEnabled shows false, although you showed an image on Discord where the setting in the web config was on. This might be a case of the default not being shown or used correctly when the setting has never been set. Although I don't think this will fix your issue, it is at least something to try.

Can you save your ACL config page (with the publish config setting on) without any changes and check that cnfInfoEnabled is true after?

And report if this changes anything regarding the config mqtt topics and auto discovery (for both lock and opener)?

iranl commented 2 months ago

Confirmed by looking at the code: there are inconsistencies when upgrading to 8.35 and above if the ACL page has not been saved once with publishing of config info enabled.

The setting will look like it is enabled but is actually unset, and in some places in the code it will default to evaluating as false, which causes config topics to be removed and HA entities not to be added.
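To illustrate the kind of inconsistency described above, here is a minimal sketch in Python. It is not the Nuki Hub code: the `Preferences` class, its method names, and the two call sites are hypothetical stand-ins, and only the key name `cnfInfoEnabled` comes from this thread.

```python
# Hypothetical sketch: NOT the actual Nuki Hub implementation.
class Preferences:
    """Tiny stand-in for a key-value settings store where keys may be unset."""

    def __init__(self):
        self._values = {}

    def put_bool(self, key, value):
        self._values[key] = bool(value)

    def get_bool(self, key, default=False):
        # An unset key silently falls back to whatever default the caller passes.
        return self._values.get(key, default)


prefs = Preferences()  # after an upgrade: "cnfInfoEnabled" was never written

# Call site A (for example a settings page) assumes the feature defaults to on.
shown_in_ui = prefs.get_bool("cnfInfoEnabled", default=True)
# Call site B (for example the publishing logic) relies on the implicit default.
used_by_publisher = prefs.get_bool("cnfInfoEnabled")

print(shown_in_ui, used_by_publisher)  # True False

# Saving the page once writes the value explicitly, so both call sites agree.
prefs.put_bool("cnfInfoEnabled", shown_in_ui)
print(prefs.get_bool("cnfInfoEnabled"))  # True
```

This is why saving the page once without changes can matter: it turns an implicit default into a stored value.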

This should at least be mostly fixed in the latest update of PR #436.

I need to have a look at whether this also prevents ring sensors from being created.

mundschenk-at commented 2 months ago

I saved the ACL page as recommended, but that did not immediately fix the issue. Only when I downgraded (?) to 10093253100.1221.1 did a restart (triggered by BLE watchdog) recreate all entities. With this version, the opener can also be controlled again (i.e. it reacts to actions).

mundschenk-at commented 2 months ago

Still an issue with the BLE branch, even when very close with very good BLE RSSI

System Information
Nuki Hub version: 9.00
Nuki Hub build: 10199724409.1256.1
Nuki Hub build type: Release
run: true
confVersion: 900
deviceId: 760860867
deviceIdOp: 760860867
nukiId: ***
nukidOp: ***
mqttbroker: <mqtt>
mqttport: 8883
mqttuser: ***
mqttpass: ***
mqttlog: true
checkupdates: true
websrvena: false
lockena: true
lockpin: 1
mqttpath: nuki
openerena: true
openerpin: 1
openercont: false
mqttoppath: nukiopener
maxkpad: 
opmaxkpad: 
maxtc: 
opmaxtc: 
enabtlprst: false
mqttca: ***
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
hassConfigUrl: 
buffsize: 
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: <gateway>
dnssrv: <DNS>
nwhw: 1
nwwififb: false
rssipb: 10
hostname: <hostname>
nwbestrssi: true
nettmout: 60
restdisc: true
rstbcn: 30
lockStInterval: 3600
tcPerEntry: false
kpPerEntry: false
configInterval: 3600
batInterval: 3600
kpInterval: 3600
kpCntrlEnabled: true
kpInfoEnabled: false
kpPubCode: false
tcCntrlEnabled: false
tcInfoEnabled: false
cnfInfoEnabled: true
regAsApp: false
regOpnAsApp: false
nrRetry: 3
rtryDelay: 100
crdusr: ***
crdpass: ***
disnonjson: true
pubAuth: false
pubdbg: false
prdtimeout: -1
offHybrid: false
hybridTimer: 600
hybridAct: false
hybridRtry: false
hasmac: false
macb0: 
macb1: 
macb2: 
latest: 8.35
tsksznetw: 
tsksznuki: 
authmaxentry: 
kpmaxentry: 
tcmaxentry: 
updMqtt: false
showSecr: false
bleTxPwr: 
recNtwMqttDis: false
MQTT connected: Yes
Lock firmware version: 3.9.5
Lock hardware version: 2.10
Lock paired: Yes
Lock valid PIN set: Yes
Lock has door sensor: No
Lock has keypad: No
Lock ACL (Lock): Allowed
Lock ACL (Unlock): Allowed
Lock ACL (Unlatch): Allowed
Lock ACL (Lock N Go): Allowed
Lock ACL (Lock N Go Unlatch): Allowed
Lock ACL (Full Lock): Allowed
Lock ACL (Fob Action 1): Allowed
Lock ACL (Fob Action 2): Allowed
Lock ACL (Fob Action 3): Allowed
Lock config ACL (Name): Disallowed
Lock config ACL (Latitude): Disallowed
Lock config ACL (Longitude): Disallowed
Lock config ACL (Auto Unlatch): Disallowed
Lock config ACL (Pairing enabled): Allowed
Lock config ACL (Button enabled): Allowed
Lock config ACL (LED flash enabled): Allowed
Lock config ACL (LED brightness): Disallowed
Lock config ACL (Timezone offset): Disallowed
Lock config ACL (DST mode): Disallowed
Lock config ACL (Fob Action 1): Disallowed
Lock config ACL (Fob Action 2): Disallowed
Lock config ACL (Fob Action 3): Disallowed
Lock config ACL (Single Lock): Allowed
Lock config ACL (Advertising Mode): Disallowed
Lock config ACL (Timezone ID): Disallowed
Lock config ACL (Unlocked Position Offset Degrees): Disallowed
Lock config ACL (Locked Position Offset Degrees): Disallowed
Lock config ACL (Single Locked Position Offset Degrees): Disallowed
Lock config ACL (Unlocked To Locked Transition Offset Degrees): Disallowed
Lock config ACL (Lock n Go timeout): Disallowed
Lock config ACL (Single button press action): Disallowed
Lock config ACL (Double button press action): Disallowed
Lock config ACL (Detached cylinder): Disallowed
Lock config ACL (Battery type): Disallowed
Lock config ACL (Automatic battery type detection): Disallowed
Lock config ACL (Unlatch duration): Disallowed
Lock config ACL (Auto lock timeout): Disallowed
Lock config ACL (Auto unlock disabled): Allowed
Lock config ACL (Nightmode enabled): Disallowed
Lock config ACL (Nightmode start time): Disallowed
Lock config ACL (Nightmode end time): Disallowed
Lock config ACL (Nightmode auto lock enabled): Disallowed
Lock config ACL (Nightmode auto unlock disabled): Disallowed
Lock config ACL (Nightmode immediate lock on start): Disallowed
Lock config ACL (Auto lock enabled): Allowed
Lock config ACL (Immediate auto lock enabled): Disallowed
Lock config ACL (Auto update enabled): Disallowed
Opener firmware version: 1.10.1
Opener hardware version: 4.17
Opener paired: Yes
Opener valid PIN set: Yes
Opener has keypad: Yes
Opener ACL (Activate Ring-to-Open): Allowed
Opener ACL (Deactivate Ring-to-Open): Allowed
Opener ACL (Electric Strike Actuation): Allowed
Opener ACL (Activate Continuous Mode): Allowed
Opener ACL (Deactivate Continuous Mode): Allowed
Opener ACL (Fob Action 1): Allowed
Opener ACL (Fob Action 2): Allowed
Opener ACL (Fob Action 3): Allowed
Opener config ACL (Name): Disallowed
Opener config ACL (Latitude): Disallowed
Opener config ACL (Longitude): Disallowed
Opener config ACL (Pairing enabled): Disallowed
Opener config ACL (Button enabled): Allowed
Opener config ACL (LED flash enabled): Allowed
Opener config ACL (Timezone offset): Disallowed
Opener config ACL (DST mode): Disallowed
Opener config ACL (Fob Action 1): Disallowed
Opener config ACL (Fob Action 2): Disallowed
Opener config ACL (Fob Action 3): Disallowed
Opener config ACL (Operating Mode): Disallowed
Opener config ACL (Advertising Mode): Disallowed
Opener config ACL (Timezone ID): Disallowed
Opener config ACL (Intercom ID): Disallowed
Opener config ACL (BUS mode Switch): Disallowed
Opener config ACL (Short Circuit Duration): Disallowed
Opener config ACL (Eletric Strike Delay): Disallowed
Opener config ACL (Random Electric Strike Delay): Disallowed
Opener config ACL (Electric Strike Duration): Disallowed
Opener config ACL (Disable RTO after ring): Disallowed
Opener config ACL (RTO timeout): Disallowed
Opener config ACL (Doorbell suppression): Disallowed
Opener config ACL (Doorbell suppression duration): Disallowed
Opener config ACL (Sound Ring): Disallowed
Opener config ACL (Sound Open): Disallowed
Opener config ACL (Sound RTO): Disallowed
Opener config ACL (Sound CM): Disallowed
Opener config ACL (Sound confirmation): Disallowed
Opener config ACL (Sound level): Allowed
Opener config ACL (Single button press action): Disallowed
Opener config ACL (Double button press action): Disallowed
Opener config ACL (Battery type): Disallowed
Opener config ACL (Automatic battery type detection): Disallowed
Network device: Built-in Wi-Fi
BSSID of AP: <BSSID>
Uptime: 5 minutes
Heap: 41548
Stack watermarks: nw: 8268, nuki: 4800
Restart reason FW: NotApplicable
Restart reason ESP: ESP_RST_POWERON: Reset due to power-on event.

After a manual reboot, the Opener entities showed up. RSSI is not as good as I would have expected (for the Lock it is in the 30s, Opener hasn't changed much, 70s, despite both Opener and Lock being very close together, but the wall is curving slightly away from the position of the NH). Not much improvement over the AtomLite for the Opener, but a lot for the Lock.

I'll try to provide serial logs tomorrow.

iranl commented 2 months ago

I am going to need a usb/serial log (MQTT log is not enough) using a debug binary to figure out what is causing this. Preferably using https://github.com/technyon/nuki_hub/actions/runs/10202945526/artifacts/1765925629

mundschenk-at commented 2 months ago

Yes, I'll try to set up my laptop to record serial logs tomorrow.

iranl commented 2 months ago

If USB serial is too much of a hassle you can wait for the currently open PRs #444 and #445 to be merged.

You should then be able to easily update to a debug binary with WebSerial, which allows you to get as much relevant information using WebSerial as you could using USB serial, without connecting the device to a computer.

iranl commented 2 months ago

@mundschenk-at: Are you still having issues?

Can you provide the serial logs with a debug binary?

You can also do the following:

mundschenk-at commented 2 months ago

Hi @iranl, I've just enabled webserial and it does look like there is still something going on. I'm currently redacting the file, but just to be sure, the "manufacturer data" and UUIDs in the beacons are not in any way sensitive? (I really don't know much about BLE on the protocol level.)

iranl commented 2 months ago

Anything in the beacon is not sensitive, as these values can be read by anyone within BLE range of your opener.

mundschenk-at commented 2 months ago

So I'm now on this build, but it does not seem stable - the MQTT connection appears to get lost every few minutes (with ARDUINO: fail on 0, errno: 9, "Bad file number" in the webserial log), and after that it never correctly reconnects until a reboot.
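For readers unfamiliar with the error in that log line: errno 9 is `EBADF` in the usual C library numbering, and "Bad file number" is the message some C libraries print for it. It typically means an operation was attempted on a file or socket descriptor that is no longer valid. The snippet below is only a lookup sketch on a desktop Python and says nothing about where in the firmware the error originates.

```python
import errno
import os

code = 9  # value reported in the webserial log

# Symbolic name for the numeric code on this platform.
print(errno.errorcode[code])

# Message text is platform-specific and may differ from the ESP32 wording.
print(os.strerror(code))
```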

System Information
------------ NUKI HUB ------------
Version: 9.01-master8
Build: 10429667593.42.1
Build type: Debug
Build date: 2024-08-17
Updater version: 9.01-master8
Updater build: 10429667593.42.1
Updater build date: 2024-08-17
Uptime (min): 0
Config version: 901
Last restart reason FW: NotApplicable
Last restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.
Free heap: 15112
Network task stack high watermark: 8664
Nuki task stack high watermark: 7148

------------ GENERAL SETTINGS ------------
Network task stack size: 12288
Nuki task stack size: 8192
Check for updates: Yes
Latest version: 9.01-master8
Allow update from MQTT: No
Web configurator username: ***
Web configurator password: ***
Web configurator enabled: Yes
Publish debug information enabled: Yes
MQTT log enabled: No
Webserial enabled: Yes
Bootloop protection enabled: No

------------ NETWORK ------------
Network device: Built-in Wi-Fi
Network connected: Yes
IP Address: <IP>
SSID: <SSID>
BSSID of AP: <BSSID>
ESP32 MAC address: <ESP_MAC>

------------ NETWORK SETTINGS ------------
Nuki Hub hostname: nukihub
DHCP enabled: Yes
Fallback to Wi-Fi / Wi-Fi config portal disabled: No
Connect to AP with the best signal enabled: Yes
RSSI Publish interval (s): 10
Restart ESP32 on network disconnect enabled: Yes
Reconnect network on MQTT connection failure enabled: No
MQTT Timeout until restart (s): 60

------------ MQTT ------------
MQTT connected: Yes
MQTT broker address: <BROKER>
MQTT broker port: 8883
MQTT username: ***
MQTT password: ***
MQTT lock base topic: nuki
MQTT opener base topic: nuki
MQTT SSL CA: ***
MQTT SSL CRT: Not set
MQTT SSL Key: Not set

------------ BLUETOOTH ------------
Bluetooth TX power (dB): 9
Bluetooth command nr of retries: 4
Bluetooth command retry delay (ms): 100
Seconds until reboot when no BLE beacons recieved: 60

------------ QUERY / PUBLISH SETTINGS ------------
Lock/Opener state query interval (s): 3600
Publish Nuki device authorization log: No
Max authorization log entries to retrieve: 5
Battery state query interval (s): 3600
Most non-JSON MQTT topics disabled: Yes
Publish Nuki device config: Yes
Config query interval (s): 3600
Publish Keypad info: No
Keypad query interval (s): 3600
Enable Keypad control: Yes
Publish Keypad topic per entry: No
Publish Keypad codes: No
Max keypad entries to retrieve: 10
Publish timecontrol info: No
Keypad query interval (s): 3600
Enable timecontrol control: No
Publish timecontrol topic per entry: No
Max timecontrol entries to retrieve: 10

------------ HOME ASSISTANT ------------
Home Assistant auto discovery enabled: Yes
Home Assistant auto discovery topic: homeassistant/
Nuki Hub configuration URL for HA: http://<IP>

------------ NUKI LOCK ------------
Lock enabled: Yes
Paired: No
Nuki Hub device ID: 760860867
Nuki device ID: ***
Firmware version: 
Hardware version: 
Valid PIN set: -
Has door sensor: No
Has keypad: No
Timecontrol highest entries count: 0
Register as: Bridge

------------ HYBRID MODE ------------
Hybrid mode enabled: No

------------ NUKI LOCK ACL ------------
Lock: Allowed
Unlock: Allowed
Unlatch: Allowed
Lock N Go: Allowed
Lock N Go Unlatch: Allowed
Full Lock: Allowed
Fob Action 1: Allowed
Fob Action 2: Allowed
Fob Action 3: Allowed

------------ NUKI LOCK CONFIG ACL ------------
Name: Disallowed
Latitude: Disallowed
Longitude: Disallowed

Here's the webserial log for the first bootup after updating the FW to the debug binary: nukihub-debug-2024-08-17.log

mundschenk-at commented 2 months ago

I am monitoring, but after powering off the device for 5 seconds or so, the stability issue seems to be gone and the config querying after booting also didn't exhibit any of the issues seen in the attached log. Is it possible @iranl that OTA (or soft reboots in general) does not correctly clear memory/the CPU state, resulting in overall weirdness?

iranl commented 2 months ago

No, that shouldn't be an issue.

A debug build (adds to RAM usage) with webserial on (adds to RAM usage) and both a lock and an opener (two devices add to RAM usage) just seems a bit too much for the limited heap on the esp32dev. This probably causes the crashes.

On the newer variants much more heap is available because of efficiency improvements in esp-idf, although total RAM hasn't really changed in the newer variants.

9.01 includes optimizations to config retrieval and it does seem to successfully retrieve both your lock and opener configs. Is HA discovery updated correctly now?

If you revert to a release build of 9.01 and disable webserial it should be much more stable.

Note that master8 has a bug in the OTA buttons, so to revert to a release build you need to:
- Look up a confirm code on the credentials page (for factory reset or unpair)
- Open http://NUKIHUBIP/autoupdate?master=1&release=1&token=CONFIRMCODE
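As a small convenience, the revert URL above can be assembled like this. The function name and its arguments are made up for illustration; the path and query parameters are copied from the comment, and `NUKIHUBIP` / `CONFIRMCODE` remain placeholders for your own values.

```python
from urllib.parse import urlencode


def revert_to_release_url(host, confirm_code):
    """Build the autoupdate URL quoted in the thread (helper name is hypothetical)."""
    query = urlencode({"master": 1, "release": 1, "token": confirm_code})
    return f"http://{host}/autoupdate?{query}"


print(revert_to_release_url("NUKIHUBIP", "CONFIRMCODE"))
# http://NUKIHUBIP/autoupdate?master=1&release=1&token=CONFIRMCODE
```

Opening the resulting URL in a browser is the manual step described above; the sketch only builds the string and does not contact the device.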

mundschenk-at commented 2 months ago

And it's super flaky again (after another power toggle). I deleted all the topics before and they were not recreated. Webserial only set in while it was querying the opener, so I don't know what exactly happened for the lock.

nukihub-debug-2024-08-17-power-cycle-2.log

Should I flash an S3 instead for my use case (lock & opener)?

mundschenk-at commented 2 months ago

Strangely, it's now even missing the discovery topics for the lock entity of the Smart Lock (still on the debug build). I'll disable webserial again and switch to the non-debug build.

iranl commented 2 months ago

My free heap with esp32dev vs esp32s3 with the same settings (high task sizes to accommodate higher keypad/auth log/timecontrol entries) on a release build of 9.01-master8 is around 45k vs 100k.

With a debug build and webserial, that 45k drops to about 15k on an esp32dev in my situation, and it isn't stable. With an S3 you have a lot more headroom, especially when having both a lock and an opener connected.
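A rough back-of-the-envelope version of those numbers, as a sketch only. The three heap figures are the approximate values quoted above; the S3 projection assumes the debug and webserial overhead is the same on both chips, which was not measured in this thread.

```python
# Approximate free heap figures quoted in the comment above, in bytes.
release_esp32dev = 45_000
release_esp32s3 = 100_000
debug_webserial_esp32dev = 15_000

# Extra RAM consumed by the debug build plus webserial on the esp32dev.
overhead = release_esp32dev - debug_webserial_esp32dev

# ASSUMPTION: the same overhead applies on the S3 (not measured in the thread).
projected_debug_esp32s3 = release_esp32s3 - overhead

print(overhead)                 # 30000
print(projected_debug_esp32s3)  # 70000
```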

technyon commented 2 months ago

@mundschenk-at I've been following this issue, but I'm not aware of everything that was discussed, because it started on discord.

Since it seems only you have this issue, and it's really hard for us to debug it like this, maybe the best option is to start over: Factory reset your opener, flash Nuki Hub to a new ESP, possibly an S3, and try again? I did have some problems with the opener before, and factory resetting it did help at some point. Sorry I can't be of more specific help here, but maybe that's the easiest way of resolving your issues.

mundschenk-at commented 2 months ago

I think this was at least partially the result of my rather unfortunate physical placement of the opener. I had it mounted on the wall, which slightly curved back at that point, and on the other side of the intercom handset. It was nicely unobtrusive, but probably not good for RF communication. I've (temporarily at this point) moved it to the other side of the intercom and taped it to the door frame. While BLE RSSI is still not bad (~ -48 dBm at 1.5 m distance!), it's less bad than before.

I'll do some more tests and then maybe a factory reset of the opener.

mundschenk-at commented 2 months ago

Update: I've updated to the "9.01" FW from today (removal of presence detection) and experimented with querying the config standalone (using nukiopener/lock/query/config). Even with the new positioning, it almost always fails for the "normal" configuration (not the advanced config though), with undefined(255).

I'll try factory resetting the opener next. However, from what I've seen with the debug build earlier (see logfiles further up this thread), the actual config values seem to be retrieved, just some return code fails/is missing?

technyon commented 2 months ago

The presence detection was still there but not active; we had deactivated it using compile-time directives so we could reactivate it just in case. Since this change was accepted without many complaints, I've removed it today. I don't think it'll make a big difference for your problem, though.

mundschenk-at commented 2 months ago

No, that was just to say which version I'm running (since the version string is just 9.01, but there hasn't been an official release yet).

Anyway, I've since factory reset the Opener, but that has not changed anything regarding reading the config, still [14:35:39] Opener config result: undefined(255) every time (or nearly every time).
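To put a number on "every time (or nearly every time)", a saved webserial log could be tallied with a few lines of Python. This is a sketch under assumptions: the regular expression is derived only from the single log line quoted above, other result values are assumed to follow the same `name(code)` shape, and the extra sample lines are invented for the demo.

```python
import re
from collections import Counter

# Pattern based on the one line quoted in this thread:
#   [14:35:39] Opener config result: undefined(255)
LINE_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] Opener config result: (\w+)\((\d+)\)"
)


def count_results(log_text):
    """Count how often each 'Opener config result' value appears in the log."""
    return Counter(match.group(2) for match in LINE_RE.finditer(log_text))


# Illustrative sample; only the first line is taken from the thread.
sample = """\
[14:35:39] Opener config result: undefined(255)
[14:36:41] Some unrelated line
[14:37:02] Opener config result: undefined(255)
"""

print(count_results(sample))
```

Running `count_results` over the contents of a real log file gives a quick success/failure ratio to report back.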