technyon / nuki_hub

Use an ESP32 as a Hub between a NUKI Lock and your smarthome.
MIT License
479 stars, 37 forks

Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic. #143

Closed enkama closed 1 month ago

enkama commented 1 year ago

Hey. So my ESP32 board is constantly restarting/crashing.

The current uptime is only 13 minutes even though I didn't do anything to it for like 23 hours.

NUKI Hub version: 8.16
Restart reason FW: RestartOnDisconnectWatchdog
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.

The board is an AZDelivery ESP32 NodeMCU Module WLAN WiFi Dev Kit C Development Board with CP2102 (ESP32-WROOM-32).

It's powered by USB with 1.8A 5.2V 9W.

Any help with the problem itself or with debugging it would be really appreciated, because I really like the firmware in itself.

technyon commented 1 year ago

It's a bit hard to say what causes the crash without being able to debug. First check some general things:

Although the last reboot was a crash, there's also the restart on disconnect from a previous reboot. Maybe observe if reboots actually come from the ESP disconnection from the MQTT broker ("restart on disconnect"). In that case the ESP restart reason should be "Software reset via esp_restart".
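
For anyone comparing their own dumps, here is a minimal Python sketch of that distinction. It only looks at the two "Restart reason" lines of a pasted system information page; the function name and the returned labels are made up for illustration and are not part of Nuki Hub.

```python
import re


def classify_restart(system_info: str) -> str:
    """Classify the last reboot from a pasted 'System information' dump.

    Hypothetical helper: only the two 'Restart reason' fields are read.
    """
    fw = re.search(r"Restart reason FW:\s*(\S+)", system_info)
    esp = re.search(r"Restart reason ESP:\s*(ESP_RST_\w+)", system_info)
    fw_reason = fw.group(1) if fw else "unknown"
    esp_reason = esp.group(1) if esp else "unknown"

    if esp_reason == "ESP_RST_PANIC":
        # The ESP itself reports an exception/panic: a real crash.
        return f"crash (FW: {fw_reason})"
    if esp_reason == "ESP_RST_SW":
        # esp_restart() was called, e.g. by "restart on disconnect".
        return f"intentional restart (FW: {fw_reason})"
    return f"other ({esp_reason}, FW: {fw_reason})"
```

Note that when the ESP reason is a panic, the FW reason shown next to it may have been recorded before an earlier reboot, so treat it as a hint only.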

enkama commented 1 year ago

Soo. It stayed up for about 2 hours. Then it restarted again.

These were the reasons:
Restart reason FW: UnknownNoRestartRegistered
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.

So I am using another power supply (both were from Echo Dots, which should be high quality). Also trying one USB cable after another. Not fixed yet.

technyon commented 1 year ago

Could you post your "system information" page in full?

Also keep an eye out for the "stack watermarks". If any of those hit 0, you'll have a crash.
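
A rough Python sketch for checking this from a pasted system information page: it parses the "Stack watermarks" line and lists the tasks whose remaining stack is below a threshold. The threshold of 256 is an arbitrary assumption chosen for illustration, not a value from Nuki Hub.

```python
import re

# Assumed warning threshold, in the same units the page reports.
# This is NOT a Nuki Hub constant; pick whatever margin you are comfortable with.
LOW_WATERMARK = 256


def parse_watermarks(line: str) -> dict[str, int]:
    """Turn 'Stack watermarks: nw: 6104, nuki: 568, pd: 240' into a dict."""
    _, _, rest = line.partition("Stack watermarks:")
    return {name: int(value) for name, value in re.findall(r"(\w+):\s*(\d+)", rest)}


def low_watermarks(line: str, threshold: int = LOW_WATERMARK) -> list[str]:
    """Return the task names whose watermark is below the threshold."""
    marks = parse_watermarks(line)
    return sorted(name for name, free in marks.items() if free < threshold)
```

Tracking these values over several dumps is more telling than a single reading, since the watermark only ever goes down between reboots.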

Do you have another ESP to try?

enkama commented 1 year ago

Here is the full System-Information stuff

NUKI Hub version: 8.16
run: true
deviceId: 4075873652
mqttbroker: ***.***.***.**
mqttport: 1883
mqttuser: ***
mqttpass: ***
mqttlog: true
lockena: true
mqttpath: nuki
openerena: true
mqttoppath: nukiopener
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 2
nwhwdt: 26
rssipb: -1
hostname: nukihub
nettmout: -1
restdisc: true
resttmr: -1
rstbcn: -1
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpEnabled: false
regAsApp: false
nrRetry: 0
rtryDelay: 100
crdusr: ***
crdpass: ***
pubauth: false
gpiolck: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
MQTT connected: Yes
Lock firmware version: 3.5.11
Lock hardware version: 11.1
Lock paired: Yes
Lock PIN set: No
Opener firmware version: 1.10.1
Opener hardware version: 4.17
Opener paired: Yes
Opener PIN set: Yes
Network device: Built-in Wifi
Uptime: 22 minutes
Heap: 87580
Stack watermarks: nw: 6104, nuki: 568, pd: 240
Restart reason FW: UnknownNoRestartRegistered
Restart reason ESP: ESP_RST_POWERON: Reset due to power-on event.

Also, I don't have another ESP at home at the moment. Could be worth a try though if I do not find another explanation for what could be causing it.

technyon commented 1 year ago

Could you try to use "Network timer until disconnect" instead of "Restart on disconnect"? I see that the watchdog was triggered before. Restart on disconnect will immediately reboot if there's no connection to the broker; the other option at least gives it a bit of time to reconnect. I think 30 to 60 seconds should be fine.

enkama commented 1 year ago

Sooo.. I changed it but:

NUKI Hub version: 8.16
run: true
deviceId: 4075873652
mqttbroker: ***.***.***.**
mqttport: 1883
mqttuser: ***
mqttpass: ***
mqttlog: true
lockena: true
mqttpath: nuki
openerena: true
mqttoppath: nukiopener
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 2
nwhwdt: 26
rssipb: 60
hostname: nukihub
nettmout: 60
restdisc: false
resttmr: -1
rstbcn: -1
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpEnabled: false
regAsApp: false
nrRetry: 0
rtryDelay: 100
crdusr: ***
crdpass: ***
pubauth: false
gpiolck: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
MQTT connected: Yes
Lock firmware version: 3.5.11
Lock hardware version: 11.1
Lock paired: Yes
Lock PIN set: Yes
Opener firmware version: 1.10.1
Opener hardware version: 4.17
Opener paired: Yes
Opener PIN set: Yes
Network device: Built-in Wifi
Uptime: 7 minutes
Heap: 87776
Stack watermarks: nw: 6104, nuki: 568, pd: 248
Restart reason FW: ConfigurationUpdated
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.

So I guess the only thing left is a new esp32?

technyon commented 1 year ago

It's worth a try.

enkama commented 1 year ago

So yeah.. New one just got here today. Nothing configured yet, just installed the firmware.

DUBEUYEW ESP32

And again this:

NUKI Hub version: 8.17
run: true
deviceId: 2757826553
mqttbroker: 
mqttport: 1883
mqttuser: 
mqttpass: 
mqttlog: false
lockena: true
mqttpath: nuki
openerena: false
mqttoppath: 
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: 
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 2
nwhwdt: 26
rssipb: 60
hostname: nukihub
nettmout: -1
restdisc: false
resttmr: -1
rstbcn: -1
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpEnabled: false
regAsApp: false
nrRetry: 3
rtryDelay: 100
crdusr: 
crdpass: 
pubauth: false
gpiolck: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
MQTT connected: No
Lock firmware version: 
Lock hardware version: 
Lock paired: No
Lock PIN set: -
Network device: Built-in Wifi
Uptime: 53 minutes
Heap: 30792
Stack watermarks: nw: 6184, nuki: 2152, pd: 244
Restart reason FW: NotApplicable
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.
achint-s commented 1 year ago

I’m experiencing this as well. I went from a stable 2 percent a day battery drain to 10% per day in the last week since updating to 8.18. I’m getting these panics every 15 or so minutes now.

NUKI Hub version: 8.18
run: true
deviceId: 1657113107
mqttbroker: *********
mqttport: 1883
mqttuser: ***
mqttpass: ***
mqttlog: false
lockena: true
mqttpath: nuki
openerena: false
mqttoppath: nukiopener
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 1
nwhwdt: 16
rssipb: -1
hostname: nukihub
nettmout: 120
restdisc: true
resttmr: -1
rstbcn: -1
lockStInterval: 7200
configInterval: 9999
batInterval: 7200
kpInterval: 7200
kpEnabled: false
regAsApp: true
nrRetry: 1
rtryDelay: 1000
crdusr: ***
crdpass: ***
pubauth: false
gpiolck: false
pubdbg: false
prdtimeout: -1
hasmac: true
macb0: 95
macb1: 74
macb2: -52
MQTT connected: Yes
Lock firmware version: 
Lock hardware version: 
Lock paired: Yes
Lock PIN set: Yes
Lock has door sensor: No
Lock has keypad: No
Network device: Built-in Wifi
Uptime: 0 minutes
Heap: 90744
Stack watermarks: nw: 6104, nuki: 856, pd: 280
Restart reason FW: NotApplicable
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.
technyon commented 1 year ago

Please provide serial logs.

achint-s commented 1 year ago

Will try and get some logs - just away from home for a bit and I don't believe I can access the serial logs remotely (or can I?).

I am keeping a close eye on this. I did a cold restart and had an uptime of 180 minutes - after which it dropped down and ping-ponged between 15 and 30 minutes. What is fascinating is that the instability seemed to be in the evening once everyone was already home. I'll try and keep the uptime logs running to see whether there's any time related issue over the next few days.

Edit: Is there a reason why lockstateCommandResult is being published on MQTT every 2-3 seconds? I hadn't noticed this in previous releases.

technyon commented 1 year ago

Hi, lockstateCommandResult isn't published every few seconds for me. Are you sure about that?

Getting serial logs remotely is only possible if you've connected your ESP to some computer. MQTT logs would be possible, but of course once the ESP crashes those don't get published anymore.

achint-s commented 1 year ago

> Hi, lockstateCommandResult isn't published every few seconds for me. Are you sure about that?

It's what MQTT Explorer was showing as being updated every few seconds (compared to something like uptime that was only being updated every minute). I can't export the history, but you can see the attached screenshots for samples. Also notice Stats -> Messages: it goes up every few seconds and is up to 7115 (the actual count is higher, I believe; this is only for the duration I left MQTT Explorer running). In comparison, something like lock/state has 151 messages and battery/level has 132 messages.

Screenshot 2023-03-23 at 10 11 04 pm Screenshot 2023-03-23 at 10 12 39 pm
technyon commented 1 year ago

lockstateCommandResult is only updated when the lock state has been queried from the lock. This in turn can be triggered by one of the following conditions:

I'm not sure which of those conditions could be met in your case; your interval is configured to 7200, so that should be fine. Is there anything that keeps writing into the "/lock/query/lockstate" node? (Maybe you can monitor this node.) What it also means is that your lock is constantly being queried, so you should notice a severe drop in battery performance.

achint-s commented 1 year ago

Thanks.. I am still trying to locate the reason why this is happening.

So far, I have eliminated point 2 (timer to requery the state) and 3 (updates made to write 1 to /lock/query/lockstate).

I did briefly run 8.20, but this resulted in many more restarts so I rolled back to 8.18. Perhaps I need to wipe the ESP32 clean and start again when I return home.

I did see this in my Mosquitto debug logs in HA:

2023-03-31 00:30:26: Sending PUBACK to nukihub (m1396, rc0)
2023-03-31 00:30:34: Received PUBLISH from nukihub (d0, q1, r1, m1397, 'nuki/lock/query/lockstateCommandResult', ... (6 bytes))
2023-03-31 00:30:34: Sending PUBACK to nukihub (m1397, rc0)
2023-03-31 00:30:36: Received PUBLISH from nukihub (d0, q1, r1, m1398, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:30:36: Sending PUBACK to nukihub (m1398, rc0)
2023-03-31 00:30:41: Received PUBLISH from nukihub (d0, q1, r1, m1399, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:30:41: Sending PUBACK to nukihub (m1399, rc0)
2023-03-31 00:30:44: Received PUBLISH from nukihub (d0, q1, r1, m1400, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:30:44: Sending PUBACK to nukihub (m1400, rc0)
2023-03-31 00:30:49: Received PUBLISH from nukihub (d0, q1, r1, m1401, 'nuki/maintenance/uptime', ... (2 bytes))
2023-03-31 00:30:49: Sending PUBACK to nukihub (m1401, rc0)
2023-03-31 00:30:50: Received PUBLISH from nukihub (d0, q1, r1, m1402, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:30:50: Sending PUBACK to nukihub (m1402, rc0)
2023-03-31 00:30:55: Received PUBLISH from nukihub (d0, q1, r1, m1403, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:30:55: Sending PUBACK to nukihub (m1403, rc0)
2023-03-31 00:30:58: Received PUBLISH from nukihub (d0, q1, r1, m1404, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:30:58: Sending PUBACK to nukihub (m1404, rc0)
2023-03-31 00:31:08: Received PUBLISH from nukihub (d0, q1, r1, m1405, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:31:08: Sending PUBACK to nukihub (m1405, rc0)
2023-03-31 00:31:12: Received PUBLISH from nukihub (d0, q1, r1, m1406, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:31:12: Sending PUBACK to nukihub (m1406, rc0)
2023-03-31 00:31:16: Received PUBLISH from nukihub (d0, q1, r1, m1407, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:31:16: Sending PUBACK to nukihub (m1407, rc0)
2023-03-31 00:31:19: Received PUBLISH from nukihub (d0, q1, r1, m1408, 'nuki/maintenance/uptime', ... (2 bytes))
2023-03-31 00:31:19: Sending PUBACK to nukihub (m1408, rc0)
2023-03-31 00:31:24: Received PUBLISH from nukihub (d0, q1, r1, m1409, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:31:24: Sending PUBACK to nukihub (m1409, rc0)
2023-03-31 00:31:27: Received PUBLISH from nukihub (d0, q1, r1, m1410, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:31:27: Sending PUBACK to nukihub (m1410, rc0)
2023-03-31 00:31:33: Received PUBLISH from nukihub (d0, q1, r1, m1411, 'nuki/lock/query/lockstateCommandResult', ... (6 bytes))
2023-03-31 00:31:33: Sending PUBACK to nukihub (m1411, rc0)
2023-03-31 00:31:40: Received PUBLISH from nukihub (d0, q1, r1, m1412, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:31:40: Sending PUBACK to nukihub (m1412, rc0)
2023-03-31 00:31:45: Received PUBLISH from nukihub (d0, q1, r1, m1413, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:31:45: Sending PUBACK to nukihub (m1413, rc0)
2023-03-31 00:31:49: Received PUBLISH from nukihub (d0, q1, r1, m1414, 'nuki/maintenance/uptime', ... (2 bytes))
2023-03-31 00:31:49: Sending PUBACK to nukihub (m1414, rc0)
2023-03-31 00:31:50: Received PUBLISH from nukihub (d0, q1, r1, m1415, 'nuki/lock/query/lockstateCommandResult', ... (7 bytes))
2023-03-31 00:31:50: Sending PUBACK to nukihub (m1415, rc0)
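
To put numbers on a log like the one above, a small Python sketch can count the PUBLISH messages per topic and compute the mean gap between them. It assumes the Mosquitto debug line format shown here (timestamp, "Received PUBLISH from <client>", topic in single quotes); the function name is made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as:
# 2023-03-31 00:30:34: Received PUBLISH from nukihub (d0, q1, r1, m1397, 'topic', ... (6 bytes))
PUBLISH_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): Received PUBLISH from (\S+) \(.*?'([^']+)'"
)


def publish_intervals(log_text: str) -> dict:
    """Return {topic: (message_count, mean_gap_seconds or None)}."""
    times = defaultdict(list)
    for line in log_text.splitlines():
        match = PUBLISH_RE.match(line.strip())
        if match:  # PUBACK and other lines are ignored
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            times[match.group(3)].append(stamp)

    stats = {}
    for topic, stamps in times.items():
        gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
        mean_gap = sum(gaps) / len(gaps) if gaps else None
        stats[topic] = (len(stamps), mean_gap)
    return stats
```

A topic that is only meant to update on the configured lock state interval should show a mean gap in the range of minutes, not seconds.
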
alexdelprete commented 1 year ago

I don't see this behaviour in my setup, it takes some minutes between each update of that topic:

image

You might want to try reflashing the device via serial, erasing everything.

enkama commented 1 year ago

Sooo. After some time I tried nuki_hub again. Flashed it, without pairing devices. It ran fine for a whole day. No exceptions. Then I added both my Nuki Opener and my Nuki Lock. Since then I get ESP_RST_PANIC yet again. I guess it won't work for me for some reason.

NUKI Hub version: 8.22
run: true
deviceId: ***
mqttbroker: ***
mqttport: ***
mqttuser: ***
mqttpass: ***
mqttlog: false
lockena: true
mqttpath: nuki
openerena: true
mqttoppath: nukiopener
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 1
rssipb: 60
hostname: nukihub
nettmout: -1
restdisc: false
resttmr: -1
rstbcn: -1
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpEnabled: false
regAsApp: false
nrRetry: 0
rtryDelay: 100
crdusr: ***
crdpass: ***
pubauth: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
MQTT connected: Yes
Lock firmware version: 3.5.12
Lock hardware version: 11.1
Lock paired: Yes
Lock PIN set: Yes
Lock has door sensor: No
Lock has keypad: No
Opener firmware version: 1.10.1
Opener hardware version: 4.17
Opener paired: Yes
Opener PIN set: Yes
Opener has keypad: Yes
Network device: Built-in Wifi
Uptime: 91 minutes
Heap: 86280
Stack watermarks: nw: 6092, nuki: 484, pd: 236
Restart reason FW: NotApplicable
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.
achint-s commented 1 year ago

Hi - I actually wanted to report the same thing. I am having the same issue - and I have tried with a new ESP32 device (different manufacturer) as well and also erasing the previous device and flashing via USB cable.

  1. Would you mind sharing a desensitised version of the Nuki Hub MQTT and Nuki configuration you use yourself? @technyon or @alexdelprete

  2. I am still getting the lockstateCommandResult being updated every 1-3 seconds even with the new board. Makes me wonder if something in Homeassistant is doing the querying.

  3. I did attempt to get some serial logs and saw this whenever an ESP_RST_PANIC is sent: abort() was called at PC 0x401d7edf on core 0 Backtrace: 0x40083d25:0x3ffe2090 0x40093869:0x3ffe20b0 0x400998fd:0x3ffe20d0 0x401d7edf:0x3ffe2150 0x401d7f26:0x3ffe2170 0x401d8101:0x3ffe2190 0x401d81bc:0x3ffe21b0 0x4010a1e5:0x3ffe21d0 0x4010d93a:0x3ffe2210 0x40117f32:0x3ffe2260 0x4011ec8b:0x3ffe22c0 0x4011eac3:0x3ffe2300 0x4011f006:0x3ffe2320 0x4011d269:0x3ffe2340 0x400816a2:0x3ffe2360 0x4010c00e:0x3ffe2380
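
A backtrace like the one in point 3 only becomes readable once the addresses are resolved against the firmware's ELF file. Here is a minimal Python sketch that extracts the PC addresses and builds the corresponding addr2line call. The ELF path `firmware.elf` is a placeholder and has to point at the ELF of the exact build that crashed; the tool name assumes the classic ESP32 Xtensa toolchain (an ESP32-S3 would need the `xtensa-esp32s3-elf-` variant).

```python
import re
import shlex


def backtrace_pcs(log_text: str) -> list[str]:
    """Extract the PC half of each 'PC:SP' pair after 'Backtrace:'."""
    _, found, rest = log_text.partition("Backtrace:")
    if not found:
        return []
    pairs = re.findall(r"(0x[0-9a-fA-F]{8}):(0x[0-9a-fA-F]{8})", rest)
    return [pc for pc, _sp in pairs]


def addr2line_command(
    pcs: list[str],
    elf_path: str = "firmware.elf",            # placeholder path (assumption)
    tool: str = "xtensa-esp32-elf-addr2line",  # assumes the ESP32 Xtensa toolchain
) -> str:
    """Build a shell command that resolves the addresses to functions and lines."""
    return shlex.join([tool, "-pfiaC", "-e", elf_path, *pcs])
```

The resulting command prints one function name and source location per address, which is the kind of information that makes a crash report actionable.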

technyon commented 1 year ago

@achint-s There's now new information that probably explains why your ESP is querying every few seconds: You've enabled "Register as app". If an app queries the status of the lock, it doesn't reset the flag that signals that there's an update. This only happens when you register as a bridge (Register as app disabled). If you don't use a NUKI bridge together with NUKI Hub which is "registered as app", it'll keep querying since that flag doesn't reset (and there's no bridge to reset it either). You should disable "Register as app" and re-pair your lock.

What kind of configuration are you interested in? My MQTT configuration is just the IP address of my broker, nothing too special about that.

achint-s commented 1 year ago

Ah. Interesting! I didn't realise that. Would it be possible to add this to the documentation?

Always thought that app integration was the most flexible (in case I wanted to re-animate the actual Nuki bridge). Will report back after updating the config.

alexdelprete commented 1 year ago

> Would it be possible to add this to the documentation? Always thought that app integration was the most flexible

image

achint-s commented 1 year ago

Hi. I meant the additional benefits of using nuki hub as a bridge instead of app. Including the improved battery life of the lock.

alexdelprete commented 1 year ago

Hi, as stated in the docs, NH is a bridge replacement. As an exception, if for some weird reason a user wants to keep using the Nuki Bridge (goes beyond my imagination why that would be the case, it creates only issues, IMHO), you can pair NH as an app.

image

So from the documentation it's pretty clear what is the main use case of NH: it replaces the Nuki Bridge.

Personally I would remove the pairing as app functionality, it creates only issues and confusion. But it's up to @technyon to ultimately decide.

achint-s commented 1 year ago

Just a quick update. I have unpaired and re-paired as bridge. Unfortunately I am still seeing lockstateCommandResult being published on MQTT every 2-3 seconds. I just cannot find the source of this (I do not have multiple bridges, etc.). Time to do some MQTT logging next. I have no doubt I will see a kernel panic soon enough and reduced battery life! Edit: Just did in 20 minutes of restart.

technyon commented 1 year ago

@achint-s Could you retry, and erase and re-flash the ESP before re-pairing? The ESP generates a random device ID which is used during pairing and changed only when you erase it. This makes sure the lock registers the ESP as a new device, and accepts it as a bridge. I'm guessing a bit here, but if that's the case, I'd add an option to generate a new device ID.

achint-s commented 1 year ago

@technyon This seems to have done the job! Erased -> re-flashed -> re-paired. It appears to be much more stable and there is no more flood of lockstateCommandResult. I also went into Home Assistant and deleted all devices/entities (and also the topics via MQTT Explorer). Monitoring for stability now, but it's the first time I have seen an uptime of 82+ minutes.

technyon commented 1 year ago

OK, thanks. I'll have to give an option then to change the device id ... maybe do it automatically whenever changing the "Register as app" setting.

alexdelprete commented 1 year ago

> give an option then to change the device id ... maybe do it automatically whenever changing the "Register as app" setting.

Jan, I wouldn't put that as a specific option, it could create issues, but as you said, manage it internally when changing the register as app setting.

technyon commented 1 year ago

I think this actually makes sense when unpairing the device. Delete the old device id; a new one will be generated upon restart.

myreczek commented 10 months ago

Is there some solution to this? My M5 Atom Lite is experiencing random resets:

NUKI Hub version: 8.26
run: true
deviceId: 2722894906
deviceIdOp: 2722894906
mqttbroker: 192.168.0.60
mqttport: 1883
mqttuser: ***
mqttpass: ***
mqttlog: false
lockena: true
mqttpath: nuki
openerena: false
mqttoppath: 
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 1
rssipb: 60
hostname: nukihub
nettmout: -1
restdisc: false
rstbcn: -1
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpEnabled: false
accLvl: 
regAsApp: false
nrRetry: 3
rtryDelay: 100
crdusr: 
crdpass: 
pubauth: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
MQTT connected: Yes
Lock firmware version: 3.6.9
Lock hardware version: 2.10
Lock paired: Yes
Lock PIN set: Yes
Lock has door sensor: Yes
Lock has keypad: Yes
Network device: Built-in Wifi
Uptime: 107 minutes
Heap: 93176
Stack watermarks: nw: 6072, nuki: 648, pd: 236
Restart reason FW: NotApplicable
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.
technyon commented 10 months ago

Hi,

how often does this happen?

It's hard to say what causes this. First thing I'd check is the power supply; maybe try a different power supply and/or USB cable.

myreczek commented 10 months ago

> Hi,
>
> how often does this happen?
>
> It's hard to say what causes this. First thing I'd check is the power supply; maybe try a different power supply and/or USB cable.

Several times a day. It's powered using a DIN DC adapter, Mean Well HDR-15-5, with DC wired to the M5 Atom DC pins. It outputs exactly 5.05V.

technyon commented 10 months ago

Just guessing since you're using a DIN adapter. Is the ESP located near or inside the fuse box? ESPs are quite sensitive to electromagnetic interference. If so, it's possible it picks up interference from the power lines in your fusebox.

myreczek commented 10 months ago

> Just guessing since you're using a DIN adapter. Is the ESP located near or inside the fuse box? ESPs are quite sensitive to electromagnetic interference. If so, it's possible it picks up interference from the power lines in your fusebox.

It crossed my mind just after measuring the voltage from the DIN adapter. I'll try to move the ESP a bit.

alexdelprete commented 10 months ago

Can you test with a simple usb power adapter / connection? so you can then exclude something.

fir3drag0n commented 10 months ago

I have the same problem. I use an Atom Lite as well. The ESP is powered by USB directly:

NUKI Hub version: 8.26
run: true
deviceId: 961435431
deviceIdOp: 961435431
mqttbroker: 192.168.1.191
mqttport: 1883
mqttuser: ***
mqttpass: ***
mqttlog: true
lockena: true
mqttpath: nuki
openerena: true
mqttoppath: nukiopener
maxkpad: 1
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 1
rssipb: 60
hostname: nukihub
nettmout: 40
restdisc: false
rstbcn: 15
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpEnabled: true
accLvl: 0
regAsApp: false
nrRetry: 3
rtryDelay: 100
crdusr: 
crdpass: 
pubauth: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
MQTT connected: Yes
Lock firmware version: 3.6.9
Lock hardware version: 11.1
Lock paired: Yes
Lock PIN set: Yes
Lock has door sensor: No
Lock has keypad: Yes
Opener firmware version: 1.10.1
Opener hardware version: 4.17
Opener paired: Yes
Opener PIN set: Yes
Opener has keypad: Yes
Network device: Built-in Wifi
Uptime: 5 minutes
Heap: 81880
Stack watermarks: nw: 6096, nuki: 664, pd: 272
Restart reason FW: RestartOnDisconnectWatchdog
Restart reason ESP: ESP_RST_SW: Software reset via esp_restart.
technyon commented 9 months ago

@fir3drag0n In your case the restart was intended. The ESP was disconnected from the MQTT broker for longer than the specified interval, so it restarted itself.

Xploder commented 6 months ago

I'm also encountering these restarts on both M5 Atom and an older ESP32 devkit

NUKI Hub version: 8.29
run: true
deviceId: ----
deviceIdOp: -----
mqttbroker: 192.168.----
mqttport: 1883
mqttuser: ***
mqttpass: ***
mqttlog: false
lockena: true
mqttpath: nuki
openerena: false
mqttoppath: 
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 1
rssipb: 60
hostname: nukihub_dev
nettmout: -1
restdisc: false
rstbcn: 300
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpEnabled: false
accLvl: 0
regAsApp: false
nrRetry: 1
rtryDelay: 100
crdusr: 
crdpass: 
pubauth: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
MQTT connected: Yes
Lock firmware version: 3.7.7
Lock hardware version: 6.11
Lock paired: Yes
Lock PIN set: No
Lock has door sensor: No
Lock has keypad: No
Network device: Built-in Wifi
Uptime: 52 minutes
Heap: 87892
Stack watermarks: nw: 6104, nuki: 680, pd: 232
Restart reason FW: NotApplicable
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.
Arn0uDz commented 4 months ago

I'm also having this problem. Don't know if it was always happening but I didn't know about it or if it just started during a recent update. I was on 8.33 and it was stable for 8 days then it resets due to exception/panic and a couple hours later it resets again.

I downgraded to 8.32 and am going to keep downgrading to see when it stops happening. I made a custom MQTT sensor for HA with the uptime that I monitor with an automation that notifies me when it resets.
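
The idea behind that kind of monitoring can be sketched in a few lines of Python: the uptime published on `nuki/maintenance/uptime` counts up in minutes, so any sample that is lower than the previous one means the ESP rebooted in between. This is only an illustration of the logic, not the actual HA sensor or automation; the function name is made up.

```python
def detect_resets(uptime_minutes: list[int]) -> list[int]:
    """Return the indexes of samples where the uptime dropped.

    Each returned index marks the first sample received after a reboot.
    """
    resets = []
    for i in range(1, len(uptime_minutes)):
        if uptime_minutes[i] < uptime_minutes[i - 1]:
            resets.append(i)
    return resets
```

Combined with the timestamps of the samples, this gives a reset history that survives the reboots themselves, which the system information page does not.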

Log after the downgrade, sharing for my settings:


run: true
deviceId: ----
deviceIdOp: -----
mqttbroker: 192.168.-----
mqttport: 1883
mqttuser: ***
mqttpass: ***
mqttlog: false
lockena: true
mqttpath: nuki
openerena: false
mqttoppath: 
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: homeassistant
hassConfigUrl: 
dhcpena: true
ipaddr: 
ipsub: 
ipgtw: 
dnssrv: 
nwhw: 7
nwwififb: true
rssipb: 60
hostname: nukihub
nettmout: -1
restdisc: false
rstbcn: -1
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpEnabled: false
accLvl: 
regAsApp: false
nrRetry: 3
rtryDelay: 100
crdusr: 
crdpass: 
pubauth: true
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
MQTT connected: Yes
Lock firmware version: 2.15.3
Lock hardware version: 11.1
Lock paired: Yes
Lock PIN set: Yes
Lock has door sensor: No
Lock has keypad: No
Network device: LilyGO T-ETH-POE
Uptime: 773 minutes
Heap: 111304
Stack watermarks: nw: 6096, nuki: 556, pd: 228
Restart reason FW: OTACompleted
Restart reason ESP: ESP_RST_SW: Software reset via esp_restart.
N3m3515 commented 2 months ago

I have the same problem and already reported it in the Discord.

Nuki Hub version: 8.33
run: true
confVersion: 833
deviceId: 3453607127
deviceIdOp: 3453607127
nukiId:
nukidOp: ***
mqttbroker: 172.16.30.67
mqttport: 1883
mqttuser:
mqttpass:
mqttlog: true
checkupdates: false
lockena: false
lockpin:
mqttpath: nuki
openerena: true
openerpin: 1
openercont: false
mqttoppath: nukiopener
maxkpad:
opmaxkpad:
mqttca:
mqttcrt:
mqttkey:
hassdiscovery: homeassistant
hassConfigUrl:
dhcpena: true
ipaddr:
ipsub:
ipgtw:
dnssrv:
nwhw: 1
nwwififb: false
rssipb: 60
nwbestrssi: true
hostname: nukihub
nettmout: -1
restdisc: false
rstbcn: -1
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpCntrlEnabled: true
aclConfig: true
kpInfoEnabled: false
aclLckOpn:
tcCntrlEnabled: true
tcInfoEnabled: true
accLvl:
regAsApp: false
nrRetry: 0
rtryDelay: 100
crdusr: ***
crdpass: ***
pubAuth: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0:
macb1:
macb2:
latest:
MQTT connected: Yes
Opener firmware version:
Opener hardware version:
Opener paired: No
Opener PIN set: -
Opener has keypad: No
Opener ACL (Activate Ring-to-Open): Allowed
Opener ACL (Deactivate Ring-to-Open): Allowed
Opener ACL (Electric Strike Actuation): Allowed
Opener ACL (Activate Continuous Mode): Allowed
Opener ACL (Deactivate Continuous Mode): Allowed
Opener ACL (Fob Action 1): Allowed
Opener ACL (Fob Action 2): Allowed
Opener ACL (Fob Action 3): Allowed
Network device: Built-in Wi-Fi
BSSID of AP: 18:E8:29:6D:60:C3
Uptime: 0 minutes
Heap: 156616
Stack watermarks: nw: 5512, nuki: 2176, pd: 104
Restart reason FW: NotApplicable
Restart reason ESP: ESP_RST_PANIC: Software reset due to exception/panic.

I also opened a serial console and saw this:

Rebooting...
▒▒▒ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0xc (RTC_SW_CPU_RST),boot:0x8 (SPI_FAST_FLASH_BOOT)
Saved PC:0x403774a2
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3808,len:0x44c
load:0x403c9700,len:0xbd8
load:0x403cc700,len:0x2a80
entry 0x403c98d0
Nuki Hub version 8.33
IP address empty, falling back to DHCP.
IP configuration: DHCP
Hardware detect : 1
Network device: Wi-Fi only
MQTT without TLS.
wm:AutoConnect
wm:Connecting to SAVED AP: BerndAlina
wm:find best RSSI: TRUE
wm:6 networks found
wm:SSID BerndAlina found with RSSI: -59(82.00 %) and BSSID: 18:E8:29:6D:60:C3 and channel: 6
wm:SSID BerndAlina found with RSSI: -68(64.00 %) and BSSID: 18:E8:29:EA:2D:70 and channel: 6
wm:SSID BerndAlina found with RSSI: -86(28.00 %) and BSSID: 18:E8:29:6D:BD:08 and channel: 6
wm:Trying to connect to SSID BerndAlina found with RSSI: -59(82.00 %) and BSSID: 18:E8:29:6D:60:C3 and channel: 6
wm:connectTimeout not set, ESP waitForConnectResult...
wm:AutoConnect: SUCCESS
*wm:STA IP Address: 172.16.10.75
Wi-Fi connected: 172.16.10.75
Host name: nukihub
MQTT Broker: 172.16.30.67:1883
Nuki Lock disabled
Nuki Opener enabled
Device id opener: 3453607127
Lock state interval: 1800 | Battery interval: 1800 | Publish auth data: no
Presence detection timeout (ms): 60000
Attempting MQTT connection
MQTT: Connecting without credentials
MQTT connected
Nuki opener start pairing
Nuki opener paired
Querying opener state: locked
Querying opener battery state: success
Reading opener config. Result: success
Reading opener advanced config. Result: success
Querying opener time control: success
Guru Meditation Error: Core 1 panic'ed (Unhandled debug exception).
Debug exception reason: Stack canary watchpoint triggered (nuki)
Core 1 register dump:
PC : 0x40383f87 PS : 0x00060036 A0 : 0x803822cc A1 : 0x3fcc1c10
A2 : 0x3fc9a610 A3 : 0xb33fffff A4 : 0x0000cdcd A5 : 0x00060023
A6 : 0x00060023 A7 : 0x0000abab A8 : 0xb33fffff A9 : 0xffffffff
A10 : 0x3fc9a5e8 A11 : 0x00000001 A12 : 0x00060021 A13 : 0x3fcc1ce0
A14 : 0x02c9a610 A15 : 0x00ffffff SAR : 0x0000001e EXCCAUSE: 0x00000001
EXCVADDR: 0x00000000 LBEG : 0x40055871 LEND : 0x40055882 LCOUNT : 0xfffffff4

Backtrace: 0x40383f84:0x3fcc1c10 0x403822c9:0x3fcc1c50 0x40380778:0x3fcc1c80 0x4038076e:0xa5a5a5a5 |<-CORRUPTED
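
The important part of this log is the "Stack canary watchpoint triggered (nuki)" line: the name in parentheses is the task whose stack ran out. A short Python sketch to pull that out of a captured serial log, assuming the message wording shown here (the function name and the dictionary keys are made up for illustration):

```python
import re


def panic_summary(serial_log: str):
    """Summarise a 'Guru Meditation Error' from a serial log, or return None."""
    core = re.search(
        r"Guru Meditation Error: Core\s+(\d+) panic'ed \(([^)]+)\)", serial_log
    )
    if not core:
        return None
    # Only present when the panic was caused by a task running out of stack.
    canary = re.search(r"Stack canary watchpoint triggered \(([^)]+)\)", serial_log)
    return {
        "core": int(core.group(1)),
        "exception": core.group(2),
        "overflowed_task": canary.group(1) if canary else None,
    }
```

Knowing which task overflowed tells you which stack would need to be larger, rather than pointing at power supply or Wi-Fi.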

I found this: https://stackoverflow.com/questions/56779459/why-do-i-get-the-debug-exception-reason-stack-canary-watchpoint-triggered-main

Arn0uDz commented 2 months ago

I tried downgrading all the way to 8.27 but it happens on every version. Sometimes it runs for weeks without issue and then all of a sudden crashes twice in the same day.

N3m3515 commented 2 months ago

I have a few hours of normal operation and then a few hours of boot loop with the error.

N3m3515 commented 2 months ago

The new version 8.34 and setting Presence detection timeout to -1 resolved my issues for the most part. I only get maybe one crash every day.

iranl commented 2 months ago

I hope that increasing the stack size for the network and nuki task in 8.34 has mostly fixed these issues. I have a further PR open that increases the stack size even more in regard to updating the nuki lock/opener config.

A reboot every 24h or so is my experience more or less too, which shouldn't be too big of a problem because the hub is up again so quickly. I'm keeping more debugging on these panics on my todo list though.

You should be able to enable presence detection if you need it.

In regards to debugging this issue further it is also useful information for us if enabling presence detection on 8.34 again gives you problems.

N3m3515 commented 2 months ago

I don't use presence detection, but I will turn it on and report if the issue gets worse again.

N3m3515 commented 2 months ago

Seems to be running stable with presence detection turned on for now.

iranl commented 2 months ago

Thanks for the update. Good to know presence detection does not seem to be the problem.

I hope 8.34 and PR #357 further resolve this issue in most cases.

iranl commented 1 month ago

I'm going to close this for now as the original report was over a year ago on a very outdated version of Nuki Hub. Many changes have been made in 8.33/8.34/8.35 that deal with stability. Please open a new issue if you experience frequent (e.g. more than 1-2 times a day) reboots because of exceptions/panic while running 8.34 or 8.35 (set to be released soon)

Arn0uDz commented 1 month ago

@iranl I'm still getting panics with version 8.34 every couple days. When will 8.35 be released? maybe that will fix it.

The most stable version for me was 8.27, on that version I could get 2 to 3 weeks stable without a panic.

iranl commented 1 month ago

@technyon ultimately decides when new versions are released. But imho there is not really anything that stands in the way of a release.

8.35 includes some changes that might have a (positive) effect on stability, but nothing major imho.

8.36 will probably include the move to Arduino Core 3 and ESP-IDF 5.1.4 which could have a significant effect on stability (both negative and positive)

Do note that panics every couple of days are acceptable imho because of the very limited impact (as Nuki Hub will be up again in a couple of seconds). Such panics can and will only be investigated when serial logs are provided which show the backtrace during the crash