ratgdo / homekit-ratgdo

A native HomeKit implementation of a Security+ 2.0 garage door controller based on ratgdo hardware
https://ratgdo.github.io/homekit-ratgdo/
GNU General Public License v3.0
214 stars 21 forks

Updated to 1.3.5, no web interface but otherwise appears to work. #173

Closed: pritchey closed this issue 5 months ago

pritchey commented 6 months ago

I updated to the latest version (1.3.5). The reboot countdown started, hit zero, then sat there forever. On checking, the ratgdo appears to function OK (it opens and closes the garage door), but the web interface never loads. I can ping the device over the network, so I know it's online, and it still appears in the Home app.

jgstroud commented 6 months ago

sounds like a dup of #171 @dkerr64, did you make any headway there?

pritchey commented 6 months ago

I have two of the devices: one works (intermittently), the other not so much. I grabbed a ladder and tried reflashing one of them, and the issue remains the same: connected over USB it flashes fine, I can set the WiFi no problem, and I can play with the flashing utility, but there is no web interface after reflashing. Before flashing it I grabbed the logs off of it, which don't appear to be of much use:

[ 12577] RATGDO: wifiPower: 20
[ 13589] RATGDO: Registering URI handlers
[ 13590] RATGDO: Register: /rest/events/subscribe
[ 13590] RATGDO: Register: /
[ 13591] RATGDO: Register: /clearcrashlog
[ 13595] RATGDO: Register: /crashlog
[ 13599] RATGDO: Register: /logout
[ 13602] RATGDO: Register: /reboot
[ 13606] RATGDO: Register: /reset
[ 13609] RATGDO: Register: /auth
[ 13613] RATGDO: Register: /setgdo
[ 13616] RATGDO: Register: /status.json
[ 13620] RATGDO: Register: /settings-sliders.svg
[ 13625] RATGDO: Register: /qrcode.svg
[ 13629] RATGDO: Register: /style.css
[ 13633] RATGDO: Register: /functions.js
[ 13637] RATGDO: Register: /favicon.png
[ 13641] RATGDO: Register: /apple-touch-icon.png
[ 13646] RATGDO: Register: /index.html
[ 13650] RATGDO: Register: /garage-car.svg
[ 13654] RATGDO: HTTP server started
[ 13657] RATGDO: RATGDO setup completed
[ 13661] RATGDO: Starting RATGDO Homekit version 1.3.5
[ 13667] RATGDO: SDK:2.2.2-dev(38a443e)/Core:3.2.0-dev=30200000/lwIP:STABLE-2_1_3_RELEASE/glue:1.2-70-g4087efd/BearSSL:b024386
[ 13684] RATGDO: Free HEAP dropped to 23152
IMPROV�IMPROVhttp://192.168.1.131>>>
[ 13699] RATGDO: reader completed packet
[ 13700] RATGDO: DECODED 00000000 0000000000000000 00000000
[ 13701] RATGDO: PACKET(0x0 @ 0x0) UNKNOWN - Unknown: [000]
[ 13704] RATGDO: Support for UNKNOWN packet unimplemented. Ignoring.
[ 15064] RATGDO: Free HEAP dropped to 22600
[ 15724] HomeKit: Got new client: local 192.168.1.131:5556, remote 192.168.1.239:49475
[ 15725] HomeKit: Setting Timeout to 500ms
[ 15727] RATGDO: Free HEAP dropped to 21920
[ 15741] HomeKit: [Client 1073702132] Pair Verify Step 1/2
[ 16381] HomeKit: Free heap: 20888
[ 16382] RATGDO: Free HEAP dropped to 20936
[ 16402] HomeKit: [Client 1073702132] Pair Verify Step 2/2
[ 16405] HomeKit: [Client 1073702132] Found pairing with 79000FB1-0954-4E82-BCF1-C4D279175013
[ 16555] HomeKit: [Client 1073702132] Verification successful, secure session established
[ 16556] HomeKit: Free heap: 21136
[ 16569] HomeKit: [Client 1073702132] Get Accessories
[ 16718] RATGDO: get active: 0
[ 16722] RATGDO: get current door state: 0
[ 16725] RATGDO: get target door state: 0
[ 16862] RATGDO: get obstruction: 0
[ 16865] RATGDO: get current lock state: 0
[ 16868] RATGDO: get target lock state: 0
[ 16879] RATGDO: get light state: Off
[ 17014] HomeKit: [Client 1073702132] Update Characteristics
[ 17078] HomeKit: [Client 1073702132] Update Characteristics
[ 17103] HomeKit: [Client 1073702132] Update Characteristics
[ 17182] HomeKit: [Client 1073702132] Update Characteristics
[ 17207] RATGDO: Free HEAP dropped to 20704
[ 17216] HomeKit: [Client 1073702132] Update Characteristics
[ 17239] HomeKit: [Client 1073702132] Update Characteristics
[ 17353] HomeKit: [Client 1073702132] Update Characteristics
[ 17397] HomeKit: [Client 1073702132] Update Characteristics
[ 17452] HomeKit: [Client 1073702132] Update Characteristics
[ 17487] HomeKit: [Client 1073702132] Update Characteristics
[ 17520] HomeKit: [Client 1073702132] Get Characteristics
[ 17831] RATGDO: Free HEAP dropped to 20552
[ 17889] HomeKit: [Client 1073702132] Get Characteristics
[ 18338] RATGDO: Free HEAP dropped to 20352
[ 18342] HomeKit: [Client 1073702132] Get Characteristics
[ 18538] HomeKit: [Client 1073702132] Get Characteristics
[ 18624] HomeKit: [Client 1073702132] Get Characteristics
[ 18754] HomeKit: [Client 1073702132] Get Characteristics
[ 19027] HomeKit: [Client 1073702132] Get Characteristics
[ 19181] HomeKit: [Client 1073702132] Get Characteristics
[ 19531] RATGDO: Free HEAP dropped to 19992
[ 20436] HomeKit: [Client 1073702132] Get Characteristics
[ 21031] RATGDO: Free HEAP dropped to 19744
[ 21047] RATGDO: Free HEAP dropped to 19632
[ 21087] RATGDO: Free HEAP dropped to 19616
[ 21098] RATGDO: Free HEAP dropped to 19504
[ 22537] HomeKit: [Client 1073702132] Get Characteristics
[ 22642] HomeKit: [Client 1073702132] Get Characteristics
[ 23299] RATGDO: Free HEAP dropped to 19408
[ 23377] RATGDO: Free HEAP dropped to 19296
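
The "Free HEAP dropped to" entries in this log are easier to follow as a series. The following is a minimal sketch, not part of the ratgdo tooling, that pulls them out of a saved log; the function names are hypothetical, and treating the bracketed number as milliseconds since boot is an assumption.

```python
import re

# Matches entries such as "[ 13684] RATGDO: Free HEAP dropped to 23152".
# \s* after the bracket also tolerates a line break inside "[ 18754]".
HEAP_RE = re.compile(r"\[\s*(\d+)\]\s+RATGDO: Free HEAP dropped to (\d+)")


def heap_low_water_marks(log_text):
    """Return (timestamp, free_heap_bytes) pairs in the order they appear.

    The timestamp is the bracketed number from the log; its unit is assumed
    to be milliseconds since boot.
    """
    return [(int(ts), int(free)) for ts, free in HEAP_RE.findall(log_text)]


def summarize(samples):
    """Compare the first and last sample; returns None for an empty list."""
    if not samples:
        return None
    (t0, h0), (t1, h1) = samples[0], samples[-1]
    return {
        "start": t0,
        "end": t1,
        "start_heap": h0,
        "end_heap": h1,
        "dropped": h0 - h1,
    }
```

Run against the log above, the first sample is 23152 bytes at 13684 and the last is 19296 bytes at 23377, so the reported low-water mark falls by 3856 bytes over the captured window.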

jgstroud commented 6 months ago

You might want to try loading version 1.2.1. We need to figure out what's causing this issue.

kash04 commented 6 months ago

I just upgraded from 1.2.1 and mine works fine

pritchey commented 6 months ago

I can try downgrading - does the online flash utility let you select a local file to use if you manually download an older version?

kash04 commented 6 months ago

I can try downgrading - does the online flash utility let you select a local file to use if you manually download an older version?

Yes it does. [image]

pritchey commented 6 months ago

I can’t use the built-in web interface on the device - that’s part of what isn’t working. I can only use the initial flasher web based app you use when initially configuring the device.

jgstroud commented 6 months ago

Oh, right. You can't select a different version from the USB installer, and without a web UI it's hard to downgrade. Let me see if we can do something to address this for you.

donavanbecker commented 6 months ago

@jgstroud, can we just add the old versions to the JSON, so that they display on the flasher page?

pritchey commented 6 months ago

Thank you - standing by.

jgstroud commented 6 months ago

Sorry, while I'm looking for a solution, can you just make sure it's not a cache issue? Do a hard refresh or maybe try from another browser / device?

@donavanbecker we can, but that would affect everyone. But maybe not a bad idea to just make that the current release for now.

donavanbecker commented 6 months ago

@jgstroud, I am just saying: list all versions in the JSON, and have a latest that shows as the default.

pritchey commented 6 months ago

Tried reflashing it and a fresh browser window (after quitting), and still no web UI.

jgstroud commented 6 months ago

OK, well, I just changed the manifest to point to 1.2.1 for now. @pritchey, the USB installer should now install the tried-and-true 1.2.1.

@donavanbecker, ah, so if I include all of them in the manifest, will the usb flasher allow you to choose?
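
For readers unfamiliar with the manifest being discussed, here is a rough sketch of what pinning it to one version could look like. It assumes the USB installer is built on ESP Web Tools (the project linked later in this thread) and follows that project's manifest layout as I understand it; the firmware file name, the `build_manifest` helper and the ESP8266 chip family are assumptions for illustration, not taken from the ratgdo repository.

```python
import json


def build_manifest(version, firmware_path, name="homekit-ratgdo"):
    """Build an ESP Web Tools style manifest for a single firmware version."""
    return {
        "name": name,
        "version": version,
        # Ask the user whether to erase the device on a fresh install.
        "new_install_prompt_erase": True,
        "builds": [
            {
                "chipFamily": "ESP8266",  # assumption about the target chip
                "parts": [{"path": firmware_path, "offset": 0}],
            }
        ],
    }


# Hypothetical file name; the real release asset may be named differently.
manifest = build_manifest("1.2.1", "firmware/homekit-ratgdo-v1.2.1.bin")
print(json.dumps(manifest, indent=2))
```

A manifest of this shape describes one version, so offering a choice of versions would probably mean publishing one manifest per version and letting the installer page switch between them.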

pritchey commented 6 months ago

Awesome - thank you. I’ll try that and confirm.

dkerr64 commented 6 months ago

@jgstroud and I are looking into why the web page is not loading for you. It makes no sense to us. But let's try a few things.

From the command line, try curl http://<ipaddr>/status.json and see what comes back. It should be a human-readable status. You could also try curl http://<ipaddr>/index.html, but that is served in gzip format, so curl will object that binary output could mess up your terminal. Even that tells us that the web server is responding.
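
The same two checks can be scripted so that the gzip response does not end up on the terminal. This is a minimal sketch using only the Python standard library; `probe` and `classify_body` are hypothetical helper names, and the only endpoints assumed are the two paths named above.

```python
import json
import urllib.request

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def classify_body(body):
    """Label a response body as 'gzip', 'json' or 'other'."""
    if body.startswith(GZIP_MAGIC):
        return "gzip"
    try:
        json.loads(body.decode("utf-8"))
        return "json"
    except (UnicodeDecodeError, ValueError):
        return "other"


def probe(ip, path, timeout=10.0):
    """Fetch http://<ip><path> and return (status, body_length, body_kind).

    Raises OSError (for example a timeout or refused connection) if the
    web server does not answer.
    """
    with urllib.request.urlopen(f"http://{ip}{path}", timeout=timeout) as resp:
        body = resp.read()
        return resp.status, len(body), classify_body(body)


# Example, to be run by hand with the device's address:
#   print(probe("192.168.1.131", "/status.json"))
#   print(probe("192.168.1.131", "/index.html"))
```

Any result at all, even a gzip body for /index.html, shows that the web server is answering; an exception points to the server not responding.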

dkerr64 commented 6 months ago

In theory it should be possible to do an OTA update from the command line with curl, giving curl a local binary file to upload using an HTTP POST. Not for the faint of heart, but doable. I don't have time to test this right now but thought I would mention it.

pritchey commented 6 months ago

Downgraded both, and they are back to functioning normally. Just need to remove/re-add them to HomeKit and I'm back to square one. Thank you!!

dkerr64 commented 6 months ago

sounds like a dup of #171 @dkerr64, did you make any headway there?

#171 has a crash and a quite different message log from the one reported here. I have made some progress, as documented over there. I have a "next-release" branch on my GitHub. I'll open a PR so you have visibility of the few things in there.

donavanbecker commented 6 months ago

@donavanbecker, ah, so if I include all of them in the manifest, will the usb flasher allow you to choose?

I am not sure 🤔

That was just a thought that I had.

donavanbecker commented 6 months ago

Here: https://github.com/esphome/esp-web-tools/issues/432#issuecomment-1809149855

pritchey commented 6 months ago

Something to think about though - if a user were to bounce between firmware versions, what's the risk to losing HomeKit setup? In my case, I did - and to me it's easy peasy to set back up but for those that are less technical maybe not. So while I would appreciate/benefit (in this specific instance) being able to select the firmware version I want from the installer, it might cause more issues than needed for others. (And I'm not opposed to using a curl API to downgrade for emergency purposes - it could possibly eliminate the need to get a ladder if it's still accessible over the network).

donavanbecker commented 6 months ago

Something to think about though - if a user were to bounce between firmware versions, what's the risk to losing HomeKit setup? In my case, I did - and to me it's easy peasy to set back up but for those that are less technical maybe not. So while I would appreciate/benefit (in this specific instance) being able to select the firmware version I want from the installer, it might cause more issues than needed for others. (And I'm not opposed to using a curl API to downgrade for emergency purposes - it could possibly eliminate the need to get a ladder if it's still accessible over the network).

I agree with you, but I think most of those users will just use OTA. For those who do need to re-flash, we should just have a notice saying that re-flashing could lose the HomeKit setup.

dkerr64 commented 6 months ago

Something to think about though - if a user were to bounce between firmware versions, what's the risk to losing HomeKit setup? In my case, I did - and to me it's easy peasy to set back up but for those that are less technical maybe not. So while I would appreciate/benefit (in this specific instance) being able to select the firmware version I want from the installer, it might cause more issues than needed for others. (And I'm not opposed to using a curl API to downgrade for emergency purposes - it could possibly eliminate the need to get a ladder if it's still accessible over the network).

HomeKit pairings should not be lost if you just change firmware versions. They may be lost if you select the "erase" option when uploading firmware using the USB port.

dkerr64 commented 6 months ago

Downgraded both, and they are back to functioning normally. Just need to remove/re-add them to HomeKit and I'm back to square one. Thank you!!

This is really puzzling. I can see no reason why 1.2.1 works and 1.3.5 doesn't. I'd like to work with you to try and debug. I can't do it right now as I am traveling, but maybe next week.

pritchey commented 6 months ago

Losing HomeKit on "erase" makes sense: I had to downgrade over the USB port, and that did wipe the HomeKit pairing.

Yes, I'm open to providing assistance if I can. 1.2.1 has restored everything back to normal operation. The scenario leading into this (with both of my ratgdo boards; I have 2 garage doors):

  1. Went into the web interface of each board and used the built-in web UI to perform the OTA upgrade from 1.2.1 to the latest 1.3.5.
  2. The upgrade downloaded and installed, then the countdown clock started for the reboot. It hit zero and nothing: the countdown dialog stayed up for quite a while and never went away to display the built-in web UI like normal.
  3. After waiting quite a while, physically power cycled the ratgdo (flipped the circuit breaker in the garage).
  4. Still no web UI. The web browser seems to indicate it's getting "something", since it displays a completely blank page, but it also indicates, based on the progress bar, that the page never fully loads.
  5. On one of the boards, I tried a manual install of the latest firmware over USB. Exact same results.
  6. Opened the ticket here, we all started conversing, and eventually the manifest file was set back to 1.2.1.
  7. Manually installed the 1.2.1 firmware over USB and full capability was restored.

mmotwani commented 6 months ago

This is indeed exactly the same as https://github.com/ratgdo/homekit-ratgdo/issues/171. I’m the user who reported that one on Discord. I had all the same symptoms, and rebooting/restarting didn’t help. It’s not a cache issue, because I get the same problem of the UI not loading in a private browser window. I was able to load the /crashlog URL, so I sent that over on Discord. I encourage the OP of this issue to see if that URL works and report the crash here too.

dkerr64 commented 6 months ago

@jgstroud I would like to work with @pritchey and @mmotwani to try and debug this, but I would like to do it with the codebase in PR #175 because of a couple of fixes and the additional command-line logging. What is the best way to do this... that is to get the firmware binary to them to test with?

Thanks.

mmotwani commented 6 months ago

I would love to help debug this in any way I can. My ratgdo is still in this state. However, I’m traveling and will be back on the evening of Tuesday, May 14. Let’s have a conversation after that.

jgstroud commented 6 months ago

@dkerr64 we can do a pre-release build, or I've also shared binaries in DMs on Discord. You can also dump a binary in your own fork.

jgstroud commented 6 months ago

Perhaps the same issue in #178

cvkline commented 6 months ago

FWIW I had no issue upgrading to 1.3.5, BUT for some reason it forced a new DHCP lease so the IP addresses on my openers changed. Result was that it appeared like the web interface got lost until I was able to find them again. Seems like some people here are chasing something more serious than that, but I thought I'd mention this "failure" path.

In the same vein, how arduous would multicast-DNS support be, so these things could pick up .local addresses?

jgstroud commented 6 months ago

They already use mDNS, as it is an essential element for HomeKit.

cvkline commented 6 months ago

oh jeez, I knew that, apologies for the brain fart. So all that's required is renaming the ratgdo to something without spaces in it so it can be used as a DNS name in a browser or on command line, and all is well. Thanks!
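
To illustrate the renaming point, here is a small sketch that turns a friendly device name into something usable as a .local host name. It is purely illustrative: how the firmware actually derives its mDNS name from the device name is not stated in this thread, and `to_mdns_hostname` is a hypothetical helper.

```python
import re


def to_mdns_hostname(device_name):
    """Turn a device name into a lowercase DNS label plus ".local".

    Runs of characters other than letters, digits and hyphens (spaces
    included) become a single hyphen; the label is capped at 63 characters,
    the DNS limit for one label.
    """
    label = re.sub(r"[^a-z0-9-]+", "-", device_name.lower()).strip("-")
    label = label[:63].rstrip("-")
    if not label:
        raise ValueError("device name has no usable characters")
    return label + ".local"


print(to_mdns_hostname("Garage Door"))
```

A name chosen this way can be typed into a browser or passed to command-line tools without quoting.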

dkerr64 commented 6 months ago

I believe we have traced this problem to our use of IRAM heap, unfortunately. A combination of this and this backs it out.

pritchey commented 6 months ago

Seeing others have success, I tried once again updating from 1.2.1 to 1.3.5. I used the built-in web interface to start the firmware upgrade process, it downloaded/installed 1.3.5 then started the countdown reboot and....nothing. I let it sit for quite a while again and it never came back.

I cycled the power to the garage door openers, still nothing. Running a continuous ping to the IP address it's at results in:

Request timeout for icmp_seq 64
Request timeout for icmp_seq 65
Request timeout for icmp_seq 66
Request timeout for icmp_seq 67
Request timeout for icmp_seq 68
Request timeout for icmp_seq 69
64 bytes from 192.168.1.131: icmp_seq=70 ttl=255 time=80.877 ms
64 bytes from 192.168.1.131: icmp_seq=71 ttl=255 time=99.477 ms
64 bytes from 192.168.1.131: icmp_seq=72 ttl=255 time=24.202 ms
64 bytes from 192.168.1.131: icmp_seq=73 ttl=255 time=247.326 ms
64 bytes from 192.168.1.131: icmp_seq=74 ttl=255 time=164.381 ms
64 bytes from 192.168.1.131: icmp_seq=75 ttl=255 time=150.113 ms
Request timeout for icmp_seq 76
Request timeout for icmp_seq 77
Request timeout for icmp_seq 78
Request timeout for icmp_seq 79
Request timeout for icmp_seq 80
Request timeout for icmp_seq 81
Request timeout for icmp_seq 82
64 bytes from 192.168.1.131: icmp_seq=83 ttl=255 time=244.511 ms
64 bytes from 192.168.1.131: icmp_seq=84 ttl=255 time=54.679 ms
Request timeout for icmp_seq 85
Request timeout for icmp_seq 86
64 bytes from 192.168.1.131: icmp_seq=87 ttl=255 time=24.508 ms
64 bytes from 192.168.1.131: icmp_seq=88 ttl=255 time=32.965 ms
64 bytes from 192.168.1.131: icmp_seq=89 ttl=255 time=160.564 ms
64 bytes from 192.168.1.131: icmp_seq=90 ttl=255 time=69.175 ms
Request timeout for icmp_seq 91
Request timeout for icmp_seq 92
64 bytes from 192.168.1.131: icmp_seq=93 ttl=255 time=29.781 ms
Request timeout for icmp_seq 94
Request timeout for icmp_seq 95
Request timeout for icmp_seq 96
Request timeout for icmp_seq 97
Request timeout for icmp_seq 98
Request timeout for icmp_seq 99
Request timeout for icmp_seq 100
Request timeout for icmp_seq 101
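
A capture like this can be reduced to a loss figure with a few lines of parsing. The sketch below assumes the macOS/BSD ping output format shown above ("Request timeout for icmp_seq N" and "icmp_seq=N ttl=... time=... ms"); `ping_stats` is a hypothetical helper name.

```python
import re

REPLY_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")
TIMEOUT_RE = re.compile(r"Request timeout for icmp_seq (\d+)")


def ping_stats(output):
    """Count replies and timeouts in captured ping output."""
    times = [float(ms) for _seq, ms in REPLY_RE.findall(output)]
    timeouts = len(TIMEOUT_RE.findall(output))
    total = len(times) + timeouts
    return {
        "sent": total,
        "received": len(times),
        "loss_pct": round(100.0 * timeouts / total, 1) if total else 0.0,
        "avg_ms": sum(times) / len(times) if times else None,
    }
```

For the excerpt above (icmp_seq 64 through 101), 13 of the 38 probes were answered, which is roughly 66% packet loss.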

Trying to pull up the web interface just shows a continually spinning busy spinner.....eventually I may get a completely blank page with the busy spinner continuing to spin, otherwise I eventually get the "site can't be reached".

Running the script for getting log info just results in this:

./ratgdo_viewlog.sh 192.168.1.131
>>> [ 306944] RATGDO: SSEHandler - Client 192.168.1.156 listening for events on channel 0

which took an excruciatingly long time to get to.

dkerr64 commented 6 months ago

@pritchey are you on Discord? If so I can private message you a binary to try.

drewcovi commented 6 months ago

Ran into this too, then reinstalled the firmware from the web installer, which, despite being pinned to 1.2.1, installed 1.3.5. I saw these in my crashlog. Not sure if it's related, but I hope it helps.

Crash information recovered from EEPROM
Crash # 1 at 611669926 ms
Restart reason: 2
Exception (9):
epc1=0x4024e392 epc2=0x00000000 epc3=0x00000000 excvaddr=0x000f8501 depc=0x00000000
>>>stack>>>

ctx: cont

sp: 3fff2020 end: 3fff2400
3fff2020: 00000000 00000218 40003dcc 00000000 
3fff2030: 00000000 40003dcc 00000218 00000000 
3fff2040: 00000218 00000000 0000028a 1f7ab906 
3fff2050: a31194cc 606f0886 6ceb9826 00000002 
3fff2060: 00000000 00000000 3fff2e04 40227a05 
3fff2070: 0000028a 00000000 00000123 00000000 
3fff2080: 00000000 00000000 00000000 40003dcc 
3fff2090: 0000028a 3fff78cc 3fff74cc 40228bfd 
3fff20a0: 40003dcc 00000002 00000000 3fff21b0 
3fff20b0: 3fff74cc 00000000 00000020 40003dcc 
3fff20c0: 3fff74cc 3fff7935 3fff78cc 40228d4c 
3fff20d0: 40003dce 40004046 3fff0d5c 00000003 
3fff20e0: 00000000 00000123 00000000 00000288 
3fff20f0: 40100278 000003ff 3fff2120 3fff2110 
3fff2100: 00000278 40003dcc 00000278 00000278 
3fff2110: 402770de 00000001 00000004 3fff21b0 
3fff2120: 40003dcc 00000297 3fff74cc 40228ec8 
3fff2130: 00000000 00000297 3fff7ba4 4022919e 
3fff2140: 50545448 312e312f 30303220 0d4b4f20 
3fff2150: 6e6f430a 746e6574 7079542d 61203a65 
3fff2160: 696c7070 69746163 702f6e6f 69726961 
3fff2170: 742b676e 0d38766c 6e6f430a 746e6574 
3fff2180: 6e654c2d 3a687467 0d642520 6e6f430a 
3fff2190: 7463656e 3a6e6f69 65656b20 6c612d70 
3fff21a0: 0d657669 000a0d0a 3fff7ba4 4022910a 
3fff21b0: 00000210 ffffffff ffffffff ffffffff 
3fff21c0: 3fff74cc 3fff21b0 00000210 00000068 
3fff21d0: 3fff2101 7fffffff 3fff74cc 3fff6f8c 
3fff21e0: 3fff661c 3fff22e4 3fff74cc 4022b148 
3fff21f0: 00000006 39463430 46413332 3042342d 
3fff2200: 43342d37 422d3442 2d314136 42353842 
3fff2210: 30423830 32333041 f6e9a900 ff270a12 
3fff2220: 9ffab2b2 a0bb1964 cae6d074 19632680 
3fff2230: 0a42a5e5 8e1fe610 00000035 00000000 
3fff2240: 00000000 00000000 00000000 00000000 
3fff2250: 00000000 00000000 00000000 00000000 
3fff2260: 00000000 00000000 00000000 00000000 
3fff2270: 00000000 00000000 00000000 00000001 
3fff2280: 12f6e9a9 b2ff270a 649ffab2 74a0bb19 
3fff2290: 80cae6d0 e5196326 100a42a5 358e1fe6 
3fff22a0: b22e1d62 823bfb0d 990e9c93 b6240369 
3fff22b0: c50ac1b8 8499be6d 3c8affec 589d23a5 
3fff22c0: 00000004 00000000 00000087 00000000 
3fff22d0: 00000000 00000000 00000000 00000000 
3fff22e0: 00000020 00000010 00000000 00000000 
3fff22f0: 52a205b9 d0ed8a74 00000000 00000000 
3fff2300: 3fff6902 00000006 00000020 00000000 
3fff2310: 3fff6907 3fff78cc 3fff74cc 4022ca7c 
3fff2320: 3fff6907 0000008c 3fff7904 40204c32 
3fff2330: 3fff687c 3fff11f4 00000000 00000000 
3fff2340: 00000000 00000000 00000000 00000000 
3fff2350: 3fff6908 00000001 3fff687c 3fff7955 
3fff2360: 00000021 00000030 3fff68e7 00000000 
3fff2370: 3fffdad0 0000009e 00000020 3fff122c 
3fff2380: 3fff687c 3fff687c 3fff74cc 4022b401 
3fff2390: 0000008c 00000000 3fff4d48 4021d16c 
3fff23a0: 0000008c 000015b4 3fff6494 3fff245c 
3fff23b0: 3fffdad0 3fff2ce0 3fff2430 3fff245c 
3fff23c0: 3fffdad0 3fff3e24 3fff78cc 4022b759 
3fff23d0: 3fffdad0 00000000 3fff2430 4021e65c 
3fff23e0: 00000000 00000000 00000001 4023368c 
3fff23f0: feefeffe feefeffe feefeffe 
Incomplete stack trace saved!
<<<stack<<<
No more EEPROM space available to save crash information!
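
Some of the stack words in this dump are plain text. Since the ESP8266 is little-endian, each 32-bit word can be unpacked byte by byte to read it. The following is a minimal sketch with hypothetical helper names; it only decodes bytes and does not by itself identify the cause of the crash.

```python
import struct


def words_to_bytes(hex_words):
    """Pack 32-bit hex words into bytes, least significant byte first."""
    return b"".join(struct.pack("<I", int(word, 16)) for word in hex_words)


def printable(data):
    """Render bytes as text, replacing non-printable bytes with '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


# Words copied from the stack rows starting at 3fff2140 in the dump above.
words = (
    "50545448 312e312f 30303220 0d4b4f20 "
    "6e6f430a 746e6574 7079542d 61203a65 "
    "696c7070 69746163 702f6e6f 69726961 "
    "742b676e 0d38766c"
).split()

print(printable(words_to_bytes(words)))
```

The decoded text begins with "HTTP/1.1 200 OK" followed by a "Content-Type: application/pairing+tlv8" header, which suggests a HomeKit response was being built on that stack when the exception occurred.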

dkerr64 commented 5 months ago

We have identified what is causing the problem. We do not know why it causes it, but at least we know what.

We are backing out the change, which has consequences on memory usage so we are also scouring the code to minimize our memory heap usage as much as possible.

jgstroud commented 5 months ago

I'm not sure why the web installer would have installed 1.3.5. Myself and others have used it and it correctly installed 1.2.1. Maybe just something cached in your browser reporting the wrong version? As @dkerr64 said, we have a fix. Hope to get it released soon.

jgstroud commented 5 months ago

Fixed in v1.4.0

drewcovi commented 5 months ago

Thanks for all the amazing work on this! Any idea why the firmware update is still calling out 1.3.5 as latest? [image]

dkerr64 commented 5 months ago

V1.4 is marked as pre-release, so it will not show as available unless you specifically select pre-release in the update dialog box.
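
The behaviour described here can be pictured as a filter over the list of releases. The sketch below assumes release records shaped like the items returned by GitHub's "list releases" API (`tag_name`, `prerelease`, `draft`), ordered newest first; whether the firmware's update check works this way is an assumption, and the tag names in the example are illustrative.

```python
def latest_release(releases, include_prerelease=False):
    """Return the tag of the newest release the user should be offered.

    Drafts are always skipped; pre-releases are skipped unless the user
    opted in. Returns None if nothing qualifies.
    """
    for release in releases:
        if release.get("draft"):
            continue
        if release.get("prerelease") and not include_prerelease:
            continue
        return release["tag_name"]
    return None


# Illustrative data only, newest first.
releases = [
    {"tag_name": "v1.4.0", "prerelease": True, "draft": False},
    {"tag_name": "v1.3.5", "prerelease": False, "draft": False},
]
print(latest_release(releases))
print(latest_release(releases, include_prerelease=True))
```

With the pre-release option off, the newest entry that is not flagged is offered, which matches 1.3.5 still being reported as latest.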
