Closed pritchey closed 5 months ago
sounds like a dup of #171 @dkerr64, did you make any headway there?
I have two of the devices: one works (intermittently), the other not so much. I grabbed a ladder and tried reflashing one of them and the issue remains the same: connected over USB it flashes fine, I can set the WiFi no problem, and I can play with the flashing utility, but there is no web interface after reflashing. Before flashing it I grabbed the logs off of it, which don't appear to be of much use:
[ 12577] RATGDO: wifiPower: 20
[ 13589] RATGDO: Registering URI handlers
[ 13590] RATGDO: Register: /rest/events/subscribe
[ 13590] RATGDO: Register: /
[ 13591] RATGDO: Register: /clearcrashlog
[ 13595] RATGDO: Register: /crashlog
[ 13599] RATGDO: Register: /logout
[ 13602] RATGDO: Register: /reboot
[ 13606] RATGDO: Register: /reset
[ 13609] RATGDO: Register: /auth
[ 13613] RATGDO: Register: /setgdo
[ 13616] RATGDO: Register: /status.json
[ 13620] RATGDO: Register: /settings-sliders.svg
[ 13625] RATGDO: Register: /qrcode.svg
[ 13629] RATGDO: Register: /style.css
[ 13633] RATGDO: Register: /functions.js
[ 13637] RATGDO: Register: /favicon.png
[ 13641] RATGDO: Register: /apple-touch-icon.png
[ 13646] RATGDO: Register: /index.html
[ 13650] RATGDO: Register: /garage-car.svg
[ 13654] RATGDO: HTTP server started
[ 13657] RATGDO: RATGDO setup completed
[ 13661] RATGDO: Starting RATGDO Homekit version 1.3.5
[ 13667] RATGDO: SDK:2.2.2-dev(38a443e)/Core:3.2.0-dev=30200000/lwIP:STABLE-2_1_3_RELEASE/glue:1.2-70-g4087efd/BearSSL:b024386
[ 13684] RATGDO: Free HEAP dropped to 23152
IMPROV�IMPROVhttp://192.168.1.131>>>
[ 13699] RATGDO: reader completed packet
[ 13700] RATGDO: DECODED 00000000 0000000000000000 00000000
[ 13701] RATGDO: PACKET(0x0 @ 0x0) UNKNOWN - Unknown: [000]
[ 13704] RATGDO: Support for UNKNOWN packet unimplemented. Ignoring.
[ 15064] RATGDO: Free HEAP dropped to 22600
[ 15724] HomeKit: Got new client: local 192.168.1.131:5556, remote 192.168.1.239:49475
[ 15725] HomeKit: Setting Timeout to 500ms
[ 15727] RATGDO: Free HEAP dropped to 21920
[ 15741] HomeKit: [Client 1073702132] Pair Verify Step 1/2
[ 16381] HomeKit: Free heap: 20888
[ 16382] RATGDO: Free HEAP dropped to 20936
[ 16402] HomeKit: [Client 1073702132] Pair Verify Step 2/2
[ 16405] HomeKit: [Client 1073702132] Found pairing with 79000FB1-0954-4E82-BCF1-C4D279175013
[ 16555] HomeKit: [Client 1073702132] Verification successful, secure session established
[ 16556] HomeKit: Free heap: 21136
[ 16569] HomeKit: [Client 1073702132] Get Accessories
[ 16718] RATGDO: get active: 0
[ 16722] RATGDO: get current door state: 0
[ 16725] RATGDO: get target door state: 0
[ 16862] RATGDO: get obstruction: 0
[ 16865] RATGDO: get current lock state: 0
[ 16868] RATGDO: get target lock state: 0
[ 16879] RATGDO: get light state: Off
[ 17014] HomeKit: [Client 1073702132] Update Characteristics
[ 17078] HomeKit: [Client 1073702132] Update Characteristics
[ 17103] HomeKit: [Client 1073702132] Update Characteristics
[ 17182] HomeKit: [Client 1073702132] Update Characteristics
[ 17207] RATGDO: Free HEAP dropped to 20704
[ 17216] HomeKit: [Client 1073702132] Update Characteristics
[ 17239] HomeKit: [Client 1073702132] Update Characteristics
[ 17353] HomeKit: [Client 1073702132] Update Characteristics
[ 17397] HomeKit: [Client 1073702132] Update Characteristics
[ 17452] HomeKit: [Client 1073702132] Update Characteristics
[ 17487] HomeKit: [Client 1073702132] Update Characteristics
[ 17520] HomeKit: [Client 1073702132] Get Characteristics
[ 17831] RATGDO: Free HEAP dropped to 20552
[ 17889] HomeKit: [Client 1073702132] Get Characteristics
[ 18338] RATGDO: Free HEAP dropped to 20352
[ 18342] HomeKit: [Client 1073702132] Get Characteristics
[ 18538] HomeKit: [Client 1073702132] Get Characteristics
[ 18624] HomeKit: [Client 1073702132] Get Characteristics
[ 18754] HomeKit: [Client 1073702132] Get Characteristics
[ 19027] HomeKit: [Client 1073702132] Get Characteristics
[ 19181] HomeKit: [Client 1073702132] Get Characteristics
[ 19531] RATGDO: Free HEAP dropped to 19992
[ 20436] HomeKit: [Client 1073702132] Get Characteristics
[ 21031] RATGDO: Free HEAP dropped to 19744
[ 21047] RATGDO: Free HEAP dropped to 19632
[ 21087] RATGDO: Free HEAP dropped to 19616
[ 21098] RATGDO: Free HEAP dropped to 19504
[ 22537] HomeKit: [Client 1073702132] Get Characteristics
[ 22642] HomeKit: [Client 1073702132] Get Characteristics
[ 23299] RATGDO: Free HEAP dropped to 19408
[ 23377] RATGDO: Free HEAP dropped to 19296
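For anyone comparing logs between firmware versions, the "Free HEAP dropped to" lines in the log above can be pulled out and summarized. The following is only a rough sketch: the function names are made up for illustration, and it assumes the bracketed number is milliseconds since boot.

```python
import re

# Matches entries such as "[ 13684] RATGDO: Free HEAP dropped to 23152".
# \s* also tolerates a line break between "[" and the timestamp.
HEAP_RE = re.compile(r"\[\s*(\d+)\]\s+RATGDO: Free HEAP dropped to (\d+)")


def heap_low_water_marks(log_text):
    """Return (timestamp, free_bytes) pairs in the order they were logged."""
    return [(int(ts), int(free)) for ts, free in HEAP_RE.findall(log_text)]


def summarize(log_text):
    """Summarize how far the free heap fell over the captured log (hypothetical helper)."""
    marks = heap_low_water_marks(log_text)
    if not marks:
        return None
    return {
        "samples": len(marks),
        "start": marks[0][1],                       # first reported value
        "lowest": min(free for _, free in marks),   # smallest reported value
        "elapsed_ms": marks[-1][0] - marks[0][0],   # assumes timestamps are in ms
    }
```

On the log posted here the first reported value is 23152 and the last is 19296, so the heap shrinks by a little under 4 KB in roughly ten seconds after boot. That alone does not prove a leak, but it is an easy number to compare against a 1.2.1 log.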
You might want to try loading version 1.2.1. We need to figure out what's causing this issue.
I just upgraded from 1.2.1 and mine works fine
I can try downgrading - does the online flash utility let you select a local file to use if you manually download an older version?
Yes, it does.
I can’t use the built-in web interface on the device - that’s part of what isn’t working. I can only use the initial flasher web based app you use when initially configuring the device.
Oh, right. You can't select a different version from the USB installer, and without a web UI it's hard to downgrade. Let me see if we can do something to address this for you.
@jgstroud, can we just add the old versions to the JSON, so that they would then display on the flasher page?
Thank you - standing by.
Sorry, while I'm looking for a solution, can you just make sure it's not a cache issue? Do a hard refresh or maybe try from another browser / device?
@donavanbecker we can, but that would affect everyone. But maybe not a bad idea to just make that the current release for now.
@jgstroud, I am just saying: list all versions in the JSON, and have a "latest" entry that shows as the default.
Tried reflashing it and a fresh browser window (after quitting), and still no web UI.
Ok, well, I just changed the manifest to point to 1.2.1 for now. @pritchey the USB installer should install the tried and true 1.2.1 for now
@donavanbecker, ah, so if I include all of them in the manifest, will the usb flasher allow you to choose?
Awesome - thank you. I’ll try that and confirm.
@jgstroud and I are looking into why the web page is not loading for you. It makes no sense to us. But let's try a few things.
From command line try curl http://<ipaddr>/status.json
and see what comes back. It should be human readable status.
You could also try curl http://<ipaddr>/index.html
but that will be a gzip format, so curl will object that binary output could mess up your terminal... but even that tells us that the web server is responding.
In theory it should be possible to do an OTA update from command line with curl... giving curl a local binary file to upload with a http type of POST. Not for the faint of heart, but doable. I don't have time to test this right now but thought I would mention it.
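The two curl checks above can also be scripted. This is a minimal sketch, not an official tool: `probe` and `looks_gzipped` are hypothetical helper names, and the only device facts used are the `/status.json` and `/index.html` paths mentioned in this thread. It does not attempt the OTA upload, since the upload endpoint is not described here.

```python
import urllib.error
import urllib.request

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def looks_gzipped(body):
    """True if the payload starts with the gzip magic number."""
    return body[:2] == GZIP_MAGIC


def probe(base_url, path, timeout=5.0):
    """Fetch one path and report whether the web server answered at all."""
    url = base_url.rstrip("/") + path
    try:
        # urllib does not decompress responses, so a gzip body stays visible as such.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
            return {"url": url, "ok": True, "status": resp.status,
                    "bytes": len(body), "gzip": looks_gzipped(body)}
    except (urllib.error.URLError, OSError) as exc:
        # Covers connection refused, timeouts, and HTTP error statuses.
        return {"url": url, "ok": False, "error": str(exc)}
```

For example, `probe("http://192.168.1.131", "/status.json")` followed by the same call with `"/index.html"` would show whether the server responds, and whether the index page comes back gzip-compressed as expected.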
Downgraded both; they are back to functioning normally. Just need to remove/re-add to HomeKit and I'm back to square one. Thank you!!
@donavanbecker, ah, so if I include all of them in the manifest, will the usb flasher allow you to choose?
I am not sure 🤔
That was just a thought that I had.
Something to think about though - if a user were to bounce between firmware versions, what's the risk of losing the HomeKit setup? In my case, I did - and to me it's easy peasy to set back up, but for those that are less technical maybe not. So while I would appreciate/benefit (in this specific instance) from being able to select the firmware version I want from the installer, it might cause more issues than needed for others. (And I'm not opposed to using a curl API to downgrade for emergency purposes - it could possibly eliminate the need to get a ladder if it's still accessible over the network).
I agree with you, but I think most of those users will just use OTA. And if they need to re-flash, we should just have something that says you could lose your HomeKit setup when re-flashing.
HomeKit pairing should not be lost if you just change firmware versions. It may be lost if you select the "erase" option when uploading firmware using the USB port.
This is really puzzling. I can see no reason why 1.2.1 works and 1.3.5 doesn't. I'd like to work with you to try and debug. I can't do it right now as I'm traveling, but maybe next week.
Losing HomeKit on "erase" makes sense: I had to downgrade over the USB port, and that did wipe the HomeKit pairing.
Yes, I'm open to providing assistance if I can. 1.2.1 has restored everything back to normal operation. The scenario leading into this (with both of my ratgdo boards - I have 2 garage doors):
This is indeed exactly the same as https://github.com/ratgdo/homekit-ratgdo/issues/171. I’m the user who reported that one on Discord. I had all the same symptoms, and rebooting/restarting didn’t help. It’s not a cache issue, because the UI also fails to load in a private browser window. I was able to load the /crashlog URL, so I sent that over on Discord. I encourage the OP of this issue to see if that URL works and report the crash here too.
@jgstroud I would like to work with @pritchey and @mmotwani to try and debug this, but I would like to do it with the codebase in PR #175 because of a couple of fixes and the additional command-line logging. What is the best way to do this... that is to get the firmware binary to them to test with?
Thanks.
I would love to help debug this in any way I can. My ratgdo is still in this state. However I’m traveling and will be back on Tuesday May 14 evening. Let’s have a conversation then after that.
@dkerr64 we can do a pre-release build or I've also shared binaries in DMs on discord. You can also dump a binary in your own fork
Perhaps the same issue in #178
FWIW I had no issue upgrading to 1.3.5, BUT for some reason it forced a new DHCP lease so the IP addresses on my openers changed. Result was that it appeared like the web interface got lost until I was able to find them again. Seems like some people here are chasing something more serious than that, but I thought I'd mention this "failure" path.
In the same vein, how arduous would multicast-DNS support be, so these things could pick up .local addresses?
They already use mdns as it is an essential element for HomeKit
oh jeez, I knew that, apologies for the brain fart. So all that's required is renaming the ratgdo to something without spaces in it so it can be used as a DNS name in a browser or on command line, and all is well. Thanks!
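As a quick way to check whether a device name will work as a .local hostname, the name can be tested against the usual hostname label rules (letters, digits and hyphens, 1 to 63 characters, no leading or trailing hyphen). This is only a sketch: `to_label` is a hypothetical helper, and how the firmware itself derives its mDNS name from the configured name is an assumption here.

```python
import re

# Hostname label rules: 1-63 chars, letters/digits/hyphens,
# and no hyphen at the start or end.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_label(name):
    """True if `name` can be used as-is in front of ".local"."""
    return LABEL_RE.fullmatch(name) is not None


def to_label(name):
    """Hypothetical helper: suggest a hostname-safe version of a display name."""
    label = re.sub(r"[^A-Za-z0-9-]+", "-", name.strip()).strip("-")
    return label[:63].rstrip("-").lower()
```

With this, a name like "Garage Door Left" fails the check, and `to_label` suggests `garage-door-left`, which could then be tried as `garage-door-left.local` in a browser or with curl.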
Seeing others have success, I tried once again updating from 1.2.1 to 1.3.5. I used the built-in web interface to start the firmware upgrade process, it downloaded/installed 1.3.5 then started the countdown reboot and....nothing. I let it sit for quite a while again and it never came back.
I cycled the power to the garage door openers, still nothing. Running a continuous ping to the IP address it's at results in:
Request timeout for icmp_seq 64
Request timeout for icmp_seq 65
Request timeout for icmp_seq 66
Request timeout for icmp_seq 67
Request timeout for icmp_seq 68
Request timeout for icmp_seq 69
64 bytes from 192.168.1.131: icmp_seq=70 ttl=255 time=80.877 ms
64 bytes from 192.168.1.131: icmp_seq=71 ttl=255 time=99.477 ms
64 bytes from 192.168.1.131: icmp_seq=72 ttl=255 time=24.202 ms
64 bytes from 192.168.1.131: icmp_seq=73 ttl=255 time=247.326 ms
64 bytes from 192.168.1.131: icmp_seq=74 ttl=255 time=164.381 ms
64 bytes from 192.168.1.131: icmp_seq=75 ttl=255 time=150.113 ms
Request timeout for icmp_seq 76
Request timeout for icmp_seq 77
Request timeout for icmp_seq 78
Request timeout for icmp_seq 79
Request timeout for icmp_seq 80
Request timeout for icmp_seq 81
Request timeout for icmp_seq 82
64 bytes from 192.168.1.131: icmp_seq=83 ttl=255 time=244.511 ms
64 bytes from 192.168.1.131: icmp_seq=84 ttl=255 time=54.679 ms
Request timeout for icmp_seq 85
Request timeout for icmp_seq 86
64 bytes from 192.168.1.131: icmp_seq=87 ttl=255 time=24.508 ms
64 bytes from 192.168.1.131: icmp_seq=88 ttl=255 time=32.965 ms
64 bytes from 192.168.1.131: icmp_seq=89 ttl=255 time=160.564 ms
64 bytes from 192.168.1.131: icmp_seq=90 ttl=255 time=69.175 ms
Request timeout for icmp_seq 91
Request timeout for icmp_seq 92
64 bytes from 192.168.1.131: icmp_seq=93 ttl=255 time=29.781 ms
Request timeout for icmp_seq 94
Request timeout for icmp_seq 95
Request timeout for icmp_seq 96
Request timeout for icmp_seq 97
Request timeout for icmp_seq 98
Request timeout for icmp_seq 99
Request timeout for icmp_seq 100
Request timeout for icmp_seq 101
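To put a number on how flaky the connection is, the ping output can be summarized with a few lines of Python. This is a minimal sketch with a made-up function name; it assumes the macOS-style output format shown above ("Request timeout for icmp_seq N" and "icmp_seq=N ttl=... time=... ms").

```python
import re

REPLY_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")
TIMEOUT_RE = re.compile(r"Request timeout for icmp_seq (\d+)")


def ping_stats(text):
    """Count replies and timeouts in captured ping output."""
    times = [float(ms) for _, ms in REPLY_RE.findall(text)]
    lost = len(TIMEOUT_RE.findall(text))
    total = len(times) + lost
    return {
        "sent": total,
        "received": len(times),
        "loss_pct": round(100.0 * lost / total, 1) if total else 0.0,
        "avg_ms": round(sum(times) / len(times), 1) if times else None,
    }
```

For the excerpt above (icmp_seq 64 through 101) that works out to 13 replies out of 38 probes, or about 65.8% packet loss, so the device is on the network but barely reachable.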
Trying to pull up the web interface just shows a continually spinning busy spinner.....eventually I may get a completely blank page with the busy spinner continuing to spin, otherwise I eventually get the "site can't be reached".
Running the script for getting log info just results in this:
./ratgdo_viewlog.sh 192.168.1.131
>>> [ 306944] RATGDO: SSEHandler - Client 192.168.1.156 listening for events on channel 0
which took an excruciatingly long time to get to.
@pritchey are you on Discord? If so I can private message you a binary to try.
Ran into this too, then reinstalled the firmware from the web installer, which, despite being pinned to 1.2.1, installed 1.3.5. I saw these in my crashlog. Not sure if it's related, but hope it helps.
Crash information recovered from EEPROM
Crash # 1 at 611669926 ms
Restart reason: 2
Exception (9):
epc1=0x4024e392 epc2=0x00000000 epc3=0x00000000 excvaddr=0x000f8501 depc=0x00000000
>>>stack>>>
ctx: cont
sp: 3fff2020 end: 3fff2400
3fff2020: 00000000 00000218 40003dcc 00000000
3fff2030: 00000000 40003dcc 00000218 00000000
3fff2040: 00000218 00000000 0000028a 1f7ab906
3fff2050: a31194cc 606f0886 6ceb9826 00000002
3fff2060: 00000000 00000000 3fff2e04 40227a05
3fff2070: 0000028a 00000000 00000123 00000000
3fff2080: 00000000 00000000 00000000 40003dcc
3fff2090: 0000028a 3fff78cc 3fff74cc 40228bfd
3fff20a0: 40003dcc 00000002 00000000 3fff21b0
3fff20b0: 3fff74cc 00000000 00000020 40003dcc
3fff20c0: 3fff74cc 3fff7935 3fff78cc 40228d4c
3fff20d0: 40003dce 40004046 3fff0d5c 00000003
3fff20e0: 00000000 00000123 00000000 00000288
3fff20f0: 40100278 000003ff 3fff2120 3fff2110
3fff2100: 00000278 40003dcc 00000278 00000278
3fff2110: 402770de 00000001 00000004 3fff21b0
3fff2120: 40003dcc 00000297 3fff74cc 40228ec8
3fff2130: 00000000 00000297 3fff7ba4 4022919e
3fff2140: 50545448 312e312f 30303220 0d4b4f20
3fff2150: 6e6f430a 746e6574 7079542d 61203a65
3fff2160: 696c7070 69746163 702f6e6f 69726961
3fff2170: 742b676e 0d38766c 6e6f430a 746e6574
3fff2180: 6e654c2d 3a687467 0d642520 6e6f430a
3fff2190: 7463656e 3a6e6f69 65656b20 6c612d70
3fff21a0: 0d657669 000a0d0a 3fff7ba4 4022910a
3fff21b0: 00000210 ffffffff ffffffff ffffffff
3fff21c0: 3fff74cc 3fff21b0 00000210 00000068
3fff21d0: 3fff2101 7fffffff 3fff74cc 3fff6f8c
3fff21e0: 3fff661c 3fff22e4 3fff74cc 4022b148
3fff21f0: 00000006 39463430 46413332 3042342d
3fff2200: 43342d37 422d3442 2d314136 42353842
3fff2210: 30423830 32333041 f6e9a900 ff270a12
3fff2220: 9ffab2b2 a0bb1964 cae6d074 19632680
3fff2230: 0a42a5e5 8e1fe610 00000035 00000000
3fff2240: 00000000 00000000 00000000 00000000
3fff2250: 00000000 00000000 00000000 00000000
3fff2260: 00000000 00000000 00000000 00000000
3fff2270: 00000000 00000000 00000000 00000001
3fff2280: 12f6e9a9 b2ff270a 649ffab2 74a0bb19
3fff2290: 80cae6d0 e5196326 100a42a5 358e1fe6
3fff22a0: b22e1d62 823bfb0d 990e9c93 b6240369
3fff22b0: c50ac1b8 8499be6d 3c8affec 589d23a5
3fff22c0: 00000004 00000000 00000087 00000000
3fff22d0: 00000000 00000000 00000000 00000000
3fff22e0: 00000020 00000010 00000000 00000000
3fff22f0: 52a205b9 d0ed8a74 00000000 00000000
3fff2300: 3fff6902 00000006 00000020 00000000
3fff2310: 3fff6907 3fff78cc 3fff74cc 4022ca7c
3fff2320: 3fff6907 0000008c 3fff7904 40204c32
3fff2330: 3fff687c 3fff11f4 00000000 00000000
3fff2340: 00000000 00000000 00000000 00000000
3fff2350: 3fff6908 00000001 3fff687c 3fff7955
3fff2360: 00000021 00000030 3fff68e7 00000000
3fff2370: 3fffdad0 0000009e 00000020 3fff122c
3fff2380: 3fff687c 3fff687c 3fff74cc 4022b401
3fff2390: 0000008c 00000000 3fff4d48 4021d16c
3fff23a0: 0000008c 000015b4 3fff6494 3fff245c
3fff23b0: 3fffdad0 3fff2ce0 3fff2430 3fff245c
3fff23c0: 3fffdad0 3fff3e24 3fff78cc 4022b759
3fff23d0: 3fffdad0 00000000 3fff2430 4021e65c
3fff23e0: 00000000 00000000 00000001 4023368c
3fff23f0: feefeffe feefeffe feefeffe
Incomplete stack trace saved!
<<<stack<<<
No more EEPROM space available to save crash information!
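One thing that can be read out of the dump without symbols: some of the stack words are plain ASCII. The ESP8266 is little-endian, so each 32-bit word has to be byte-swapped before reading it as text. The helper below is just a sketch (the function name is made up) that does this for a list of hex words copied from the dump.

```python
import struct


def words_to_text(hex_words):
    """Decode 32-bit little-endian stack words into printable text."""
    raw = b"".join(struct.pack("<I", int(word, 16)) for word in hex_words)
    escapes = {13: "\\r", 10: "\\n"}  # show CR/LF as visible escapes
    return "".join(
        chr(b) if 32 <= b < 127 else escapes.get(b, ".")
        for b in raw
    )
```

Feeding it the first four words at 3fff2140 (`50545448 312e312f 30303220 0d4b4f20`) gives `HTTP/1.1 200 OK\r`, and the following rows continue with what looks like a response header for `Content-Type: application/pairing+tlv8`. That may hint that the crash happened while a HomeKit pairing response was on the stack, but it is only a hint and not a diagnosis.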
We have identified what is causing the problem. We do not know why it causes it, but at least we know what.
We are backing out the change, which has consequences on memory usage so we are also scouring the code to minimize our memory heap usage as much as possible.
I'm not sure why the web installer would have installed 1.3.5. Others and I have used it and it correctly installed 1.2.1. Maybe it's just something cached in your browser reporting the wrong version? As @dkerr64 said, we have a fix. Hope to get it released soon.
Fixed in v1.4.0
Thanks for all the amazing work on this! Any idea why the firmware update is still calling out 1.3.5 as latest?
V1.4 is marked as pre-release. So will not show as available unless you specifically select pre-release in the update dialog box.
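The behaviour described here can be pictured as a filter over the release list. The sketch below is not the firmware's actual update code; it only assumes a list of release records shaped like the ones the GitHub releases API returns (with `tag_name` and `prerelease` fields), ordered newest first. The sample data in the usage note is hypothetical.

```python
def latest_release(releases, include_prerelease=False):
    """Pick the newest release tag, skipping pre-releases unless asked for.

    `releases` is assumed to be ordered newest first.
    """
    candidates = [
        rel for rel in releases
        if include_prerelease or not rel.get("prerelease", False)
    ]
    return candidates[0]["tag_name"] if candidates else None
```

With a list such as `[{"tag_name": "v1.4.0", "prerelease": True}, {"tag_name": "v1.3.5", "prerelease": False}]`, the default call picks `v1.3.5`, and passing `include_prerelease=True` picks `v1.4.0`, which matches what the update dialog shows.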
I updated to the latest (1.3.5), the countdown for the reboot started, hit zero then sat forever. Checking it, the ratgdo appears to function OK (open/close the garage door) but the web interface never loads. I can ping the device over the network so I know it's online and it does still appear in Home app.