Open dkerr64 opened 1 month ago
I've definitely seen mine crash when my HomePod reboots. One time when I was determined to fix it, right after it happened, I started a packet capture and then rebooted the HomePod over and over, but failed to reproduce it. It only happens when you aren't looking for it. Something like a watched pot never boils... "A watched ratgdo never crashes"
Heisenberg's Uncertainty Principle :-)
Adding stack decode...
Exception Cause: 3 [LoadStoreError: Processor internal physical address or data error during load or store]
0x4010110d: umm_malloc_core at umm_malloc.cpp:?
0x4000b2e1: ?? ??:0
0x40101394: malloc at ??:?
0x4024797d: netif_do_set_ipaddr at /Users/jstroud/git/Arduino/tools/sdk/lwip2/builder/lwip2-src/src/core/netif.c:475
0x40247f17: pbuf_copy_partial_pbuf at /Users/jstroud/git/Arduino/tools/sdk/lwip2/builder/lwip2-src/src/core/pbuf.c:1024
0x4024a973: tcp_write at /Users/jstroud/git/Arduino/tools/sdk/lwip2/builder/lwip2-src/src/core/tcp_out.c:718
0x40228541: ClientContext::_consume(unsigned int) at ??:?
0x40229715: client_send_encrypted_(_client_context_t*, unsigned char*, unsigned int) at ??:?
0x40229864: client_decrypt_(_client_context_t*, unsigned char*, unsigned int, unsigned char*, unsigned int*) at ??:?
0x40102864: pp_post at ??:?
0x4010014c: std::function<void (void const*)>::operator()(void const*) const at ??:?
0x40277e19: system_get_sdk_version at ??:?
0x402299e0: client_send_P(_client_context_t*, char const*) at ??:?
0x40229cb6: send_json_response(_client_context_t*, int, unsigned char*, unsigned int) at ??:?
0x40229c22: send_tlv_error_response(_client_context_t*, int, TLVError) at ??:?
0x4022bc64: homekit_server_close_client(homekit_server_t*, _client_context_t*) at ??:?
0x4022d5a4: arduino_homekit_setup at ??:?
0x40204c32: http_parser_execute at ??:?
Adding in crash reported by @donavanbecker ...
Crash information recovered from EEPROM
Crash # 1 at 540949935 ms
Restart reason: 2
Exception (3):
epc1=0x4010110d epc2=0x00000000 epc3=0x00000000 excvaddr=0x400180e9 depc=0x00000000
>>>stack>>>
ctx: cont
sp: 3fff1d20 end: 3fff2080
3fff1d20: 3fff77e4 0000016a 00000100 3fff1e30
3fff1d30: 3fff6d8c 00000000 00000020 40101394
3fff1d40: 3fff3ed4 000000ff 3fff718c 402296b4
3fff1d50: 401033ef 3ffeec80 00000005 40102864
3fff1d60: 00000000 00000000 00000000 401035cc
3fff1d70: 401033ef 3ffeebe0 3fff1da0 3fff1d90
3fff1d80: 0000014c 3fff77e4 3ffe8f10 40101012
3fff1d90: 40277d31 00000000 00000000 3fff1e30
3fff1da0: 3fff77e4 0000016b 3fff6d8c 402298f8
3fff1db0: 00000000 0000016b 3fff76fc 40229bce
3fff1dc0: 50545448 312e312f 30303220 0d4b4f20
3fff1dd0: 6e6f430a 746e6574 7079542d 61203a65
3fff1de0: 696c7070 69746163 702f6e6f 69726961
3fff1df0: 742b676e 0d38766c 6e6f430a 746e6574
3fff1e00: 6e654c2d 3a687467 0d642520 6e6f430a
3fff1e10: 7463656e 3a6e6f69 65656b20 6c612d70
3fff1e20: 0d657669 000a0d0a 3fff76fc 40229b3a
3fff1e30: 000000e4 ffffffff ffffffff ffffffff
3fff1e40: 3fff6d8c 3fff1e30 000000e4 00000068
3fff1e50: 3fff1e01 f43e8d6a 3fff6d8c 3fff7684
3fff1e60: 3fff5bf4 3fff1f64 3fff6d8c 4022bb7c
3fff1e70: 00000002 38334435 36324333 4632362d
3fff1e80: 34342d46 422d3338 2d453944 36423844
3fff1e90: 44363234 32364242 31f60300 bc7f12f2
3fff1ea0: 29a1594e 54967124 66fdbcbe f54ab707
3fff1eb0: 55406a4c afdc65c7 00000053 00000000
3fff1ec0: 00000000 00000000 00000000 00000000
3fff1ed0: 00000000 00000000 00000000 00000000
3fff1ee0: 00000000 00000000 00000000 00000000
3fff1ef0: 00000000 00000000 00000000 00000001
3fff1f00: f231f603 4ebc7f12 2429a159 be549671
3fff1f10: 0766fdbc 4cf54ab7 c755406a 53afdc65
3fff1f20: 0b1b88a1 adb94d0a 74d5cc83 f705c40f
3fff1f30: 8b90b790 be85524d 80541051 1a30788b
3fff1f40: 00000004 00000000 00000219 00000000
3fff1f50: 00000000 00000000 00000000 00000000
3fff1f60: 00000020 00000010 00000000 00000000
3fff1f70: 01f5d706 c8ee1a8c 00000000 00000000
3fff1f80: 3fff693a 00000006 00000020 00000000
3fff1f90: 3fff693f 3fff718c 3fff6d8c 4022d4bc
3fff1fa0: 3fff693f 0000008c 3fff71c4 40204c32
3fff1fb0: 3fff68b4 3fff0e74 00000000 00000000
3fff1fc0: 00000000 00000000 00000000 00000000
3fff1fd0: 3fff6940 00000001 3fff68b4 3fff7215
3fff1fe0: 00000021 00000030 3fff691f 00000000
3fff1ff0: 3fffdad0 0000009e 00000020 3fff0eac
3fff2000: 3fff68b4 3fff68b4 3fff6d8c 4022be35
3fff2010: 0000008c 00000000 3fff5410 4021d254
3fff2020: 0000008c 3fff0c58 3fff0c1c 3fff20dc
3fff2030: 3fffdad0 3fff2930 3fff20b0 3fff20dc
3fff2040: 3fffdad0 3fff441c 3fff718c 4022c199
3fff2050: 3fffdad0 00000000 3fff20b0 4021eadb
3fff2060: 00000000 00000000 00000001 40234168
3fff2070: feefeffe feefeffe 3fffdab0 401007ad
<<<stack<<<
EEPROM space available: 0x007b bytes
Flash CRC OK
Firmware Version: 1.6.0
TGDO: get target door state: 0
>>> [540590377] RATGDO: get light state: On
>>> [540628843] HomeKit: [Client 1073703340] Get Characteristics
>>> [540689374] HomeKit: [Client 1073703340] Get Characteristics
>>> [540749665] HomeKit: [Client 1073703340] Get Characteristics
>>> [540809787] HomeKit: [Client 1073703340] Get Characteristics
>>> [540839818] HomeKit: [Client 1073703340] Get Characteristics
>>> [540839842] RATGDO: get light state: On
>>> [540849669] HomeKit: [Client 1073703340] Get Characteristics
>>> [540849788] RATGDO: get current door state: 0
>>> [540850990] HomeKit: [Client 1073703340] Get Characteristics
>>> [540851000] RATGDO: get light state: On
>>> [540854537] HomeKit: [Client 1073703340] Get Characteristics
>>> [540854548] RATGDO: get current door state: 0
>>> [540870670] HomeKit: [Client 1073703340] Get Characteristics
>>> [540871933] RATGDO: reader completed packet
>>> [540871934] RATGDO: DECODED 0002388B 000000306E511006 42608181
>>> [540871935] RATGDO: PACKET(0x511006 @ 0x2388B) Status - Status: [DoorState Open, Parity 0x8, Obs 1, Lock 0, Light 1]
>>> [540871944] RATGDO: tgt 0 curr 0
>>> [540919461] HomeKit: [Client 1073703340] Get Characteristics
>>> [540919470] RATGDO: get light state: On
>>> [540929431] HomeKit: [Client 1073703340] Get Characteristics
>>> [540929442] RATGDO: get current door state: 0
>>> [540930586] HomeKit: [Client 1073703340] Get Characteristics
>>> [540930603] RATGDO: get light state: On
>>> [540930710] HomeKit: [Client 1073703340] Get Characteristics
>>> [540935424] HomeKit: [Client 1073703340] Get Characteristics
>>> [540935443] RATGDO: get current door state: 0
>>> [540936594] HomeKit: [Client 1073703340] Get Characteristics
>>> [540936611] RATGDO: get light state: On
>>> [540946437] HomeKit: [Client 1073703340] Get Characteristics
>>> [540946454] RATGDO: get current door state: 0
>>> [540947869] HomeKit: [Client 1073703340] Get Characteristics
>>> [540947925] RATGDO: get light state: On
>>> [540949657] HomeKit: [Client 1073703340] List Pairings
which decodes to...
Exception Cause: 3 [LoadStoreError: Processor internal physical address or data error during load or store]
0x4010110d: umm_malloc_core at umm_malloc.cpp:?
0x400180e9: ?? ??:0
0x40101394: malloc at ??:?
0x402296b4: client_send_encrypted_(_client_context_t*, unsigned char*, unsigned int) at ??:?
0x401033ef: rcReachRetryLimit at ??:?
0x40102864: pp_post at ??:?
0x401035cc: rcReachRetryLimit at ??:?
0x401033ef: rcReachRetryLimit at ??:?
0x40101012: umm_free_core at umm_malloc.cpp:?
0x40277d31: system_get_sdk_version at ??:?
0x402298f8: client_send(_client_context_t*, unsigned char*, unsigned int) at ??:?
0x40229bce: send_tlv_response(_client_context_t*, tlv_values_t*) at ??:?
0x40229b3a: send_tlv_response(_client_context_t*, tlv_values_t*) at ??:?
0x4022bb7c: homekit_server_on_pairings(_client_context_t*, unsigned char const*, unsigned int) at ??:?
0x4022d4bc: homekit_server_on_message_complete(http_parser*) at ??:?
0x40204c32: http_parser_execute at ??:?
0x4022be35: homekit_client_process(_client_context_t*) at ??:?
0x4021d254: comms_loop() at ??:?
0x4022c199: homekit_server_process(homekit_server_t*) at ??:?
0x4021eadb: loop at ??:?
0x40234168: loop_wrapper() at core_esp8266_main.cpp:?
0x401007ad: cont_wrapper at ??:?
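One more detail can be read straight out of the raw dump above: the rows from 3fff1dc0 through the first two words of 3fff1e20 are not pointers but ASCII text. The ESP8266 is little-endian, so each 32-bit word has to be byte-swapped before it reads as text. The Python sketch below (the helper names are made up for illustration, and the words are copied verbatim from the dump) does that conversion. The result looks like the HTTP response header template for a pairing TLV reply, with the `%d` for Content-Length still unexpanded, which would fit with `send_tlv_response` being on the stack at the time of the crash.

```python
import struct

# Stack words copied verbatim from the dump, rows 3fff1dc0 through 3fff1e20.
# Only the first two words of the last row are included; the rest of that
# row looks like pointers again.
STACK_WORDS = """
50545448 312e312f 30303220 0d4b4f20
6e6f430a 746e6574 7079542d 61203a65
696c7070 69746163 702f6e6f 69726961
742b676e 0d38766c 6e6f430a 746e6574
6e654c2d 3a687467 0d642520 6e6f430a
7463656e 3a6e6f69 65656b20 6c612d70
0d657669 000a0d0a
"""


def words_to_bytes(text):
    """Turn whitespace-separated 32-bit hex words into little-endian bytes."""
    return b"".join(struct.pack("<I", int(word, 16)) for word in text.split())


def decode_stack_text(text):
    """Decode the words as a NUL-terminated ASCII string."""
    raw = words_to_bytes(text)
    return raw.split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    # repr() keeps the \r\n line endings visible.
    print(repr(decode_stack_text(STACK_WORDS)))
```

The same helper can be pointed at any other run of words in a dump that falls in the printable range (0x20 to 0x7e per byte) to check whether a local buffer was live on the stack.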
We had stormy weather today and I got a bunch of notifications about my HomePod going offline and coming back online. Ultimately a bunch of devices showed "No Response", including the ratgdo. I had to power cycle the HomePod and everything came back. While the HomePod was coming and going, the ratgdo did crash, and it's a very similar crash to some of the ones reported in Discord.
Crashdump: https://gist.github.com/jgstroud/081f5e0ae711776cd4e8e1ffc565a375
Both my ratgdos crashed at the same time after 17 days running. One of them subsequently crashed a second time 3 minutes 28 seconds after its first crash, overwriting the crash log from the first crash. But I believe the first (simultaneous) crashes had the same cause.
The easy one is the second crash... it's the MDNSResponder crash we're familiar with.
The first crash is harder. What was I doing at the time? Well, I had just returned from two weeks away and found that the UPS my AppleTV was connected to had entered some sort of error state. As part of recovering from that I unplugged the AppleTV, and ~5 minutes later plugged it back in. As far as I can tell this triggered some activity on both ratgdos that caused a crash.
Captured log is...