Closed logjames closed 5 months ago
was able to perform an update over USB to restore it
We've added all kinds of checks to make sure the OTA update is good, but I also hit this or at least a similar issue. Turns out I was getting a WiFi authentication error after the upgrade.
Maybe I shouldn’t have closed the issue. The OTA update didn’t work for me. I needed to update via USB. I am up and running again after redoing the update from a laptop.
I wonder if we should add some additional logging specifically around the OTA update code and save it to a file just before we reboot so we can later look at it.
As of right now there is no way to view activity during updates without a USB cable connected.
Looks like the same thing happened with the 1.4.1 OTA. I will try again with the USB cable later today.
I needed to use the USB update to recover my device. It's now on 1.4.2, updated via USB. I suspect it could be the WiFi issue mentioned, since I am not changing any settings, just using the USB update.
I would like to know whether an OTA to 1.4.2 succeeds if you first downgrade to 1.4.1. Another user who had been hitting this issue consistently said the upgrade to 1.4.2 went smoothly.
Actually, I just made a 1.4.3 build. Give that a try
Same result going to 1.4.3 from 1.4.2…no response after OTA install.
Same for me. 1.3.5 to 1.4.1 went fine, but 1.4.1 to 1.4.3 no response after update
Sorry. I'm stumped on these OTA update failures at the moment. We've added verification to the upload. I just don't know why this happens to some people.
I had the OTA update succeed from 1.2.1 to 1.3.X to 1.4.1 to 1.4.3, but then after a few days, the LED was off and I needed to flash via USB cable.
The reason I first updated via OTA from 1.2.1 to 1.3.X was after a few weeks, it would stop responding and needed a power cycle (unplug and plug back in). Then after a few weeks at 1.3.X, I needed to power cycle again due to non-responsiveness and OTA updated to 1.4.1. The slow update of door closed status made me update to 1.4.3 without a power cycle. Then a few days later, it was non-responsive even through a power cycle and I had to flash via USB cable.
So my only OTA update that eventually failed was without a power cycle right before updating. Maybe this might help track down the problem. (I did not set a reboot interval because it seemed excessive to reboot every few days when it only becomes non-responsive after a few weeks)
Thanks for the update. I have seen a few users report the device going unresponsive after a long period of time. It would be really nice to see if there is anything on the serial console when that happens. I don't know if the code in flash is getting corrupted, or just the WiFi settings, or something else. We've been trying hard to track down exactly what causes the OTA update failures and are hopefully narrowing in on it. It may ultimately be the same root cause, but it's hard to say at the moment.
I just started a viewlog.sh and hopefully will see something if it happens again. Unfortunately the log I downloaded before upgrading the firmware via USB was empty.
I updated today from v1.2.1 to v1.4.3. It appears to be bricked (i.e., no response), but I'll check later today if there is a WiFi configuration issue.
Question though: how would I check if there was a WiFi config issue-- i.e., would I just use the web installer, here: https://ratgdo.github.io/homekit-ratgdo/flash.html?
That page will let you connect to the serial console. From there you can issue a reboot and monitor the serial output.
Sorry for the trouble, but fixing these flash issues is our number 1 priority right now.
It's all good-- I'm just happy you guys have created this hardware and software! Happy to deal with a little trial and error and testing to make it better!
I don't know if I am going through the same issue, but I noticed my RatGDO was unresponsive and the blue LED was not on. I tried a power cycle; the blue LED briefly lit up but didn't stay on, so I ended up reflashing to 1.4.3. It was fine for a few minutes, then shut down (or bricked?) and the blue LED was off. I reflashed again, this time to 1.4.1; so far so good.
might help? crashlog from exactly the situation you described @jgstroud. Not identical to my previous crashlogs
Crash information recovered from EEPROM
Crash # 1 at 400109340 ms
Restart reason: 2
Exception (3):
epc1=0x401070b6 epc2=0x00000000 epc3=0x00000000 excvaddr=0x40012f9a depc=0x00000000
>>>stack>>>
ctx: cont
sp: 3fff1c20 end: 3fff1f80
3fff1c20: 3fff63ec 00000296 00000100 3fff1d30
3fff1c30: 3fff767c 00000000 00000020 401013b8
3fff1c40: 3fff396c 000000ff 3fff7a7c 40228ef8
3fff1c50: 40103413 3ffeead8 50afc405 4010042c
3fff1c60: 00000000 00000000 00000000 401035f0
3fff1c70: 3ffeb9c0 00000000 3fff1ca0 3fff1c90
3fff1c80: 00000278 3fff63ec 3ffe8f10 40101036
3fff1c90: 40277472 7fffffff 0000000d 3fff1d30
3fff1ca0: 3fff63ec 00000297 3fff767c 4022913c
3fff1cb0: 00000000 00000297 3fff7b4c 40229412
3fff1cc0: 50545448 312e312f 30303220 0d4b4f20
3fff1cd0: 6e6f430a 746e6574 7079542d 61203a65
3fff1ce0: 696c7070 69746163 702f6e6f 69726961
3fff1cf0: 742b676e 0d38766c 6e6f430a 746e6574
3fff1d00: 6e654c2d 3a687467 0d642520 6e6f430a
3fff1d10: 7463656e 3a6e6f69 65656b20 6c612d70
3fff1d20: 0d657669 000a0d0a 3fff7b4c 4022937e
3fff1d30: 00000210 3fff65fc 00000020 3fff6a44
3fff1d40: 3fff767c 3fff1d30 00000210 00000068
3fff1d50: 3fff1d00 ebc59422 3fff767c 3fff6a44
3fff1d60: 3fff564c 3fff1e64 3fff767c 4022b3bc
3fff1d70: 0000000f 44313045 36423730 3146452d
3fff1d80: 35342d38 382d3235 2d393132 32354636
3fff1d90: 43303345 45384238 c2ca4600 28c59cbc
3fff1da0: e8f2dfdd 2a566bed 901a15df eecc59fb
3fff1db0: 46011521 179040d0 00000092 00000000
3fff1dc0: 00000000 00000000 00000000 00000000
3fff1dd0: 00000000 00000000 00000000 00000000
3fff1de0: 00000000 00000000 00000000 00000000
3fff1df0: 00000000 00000000 00000000 00000000
3fff1e00: bcc2ca46 dd28c59c ede8f2df df2a566b
3fff1e10: fb901a15 21eecc59 d0460115 92179040
3fff1e20: 8e689290 5388e891 704260b3 177c8f26
3fff1e30: 66d54da6 19f55823 605fc919 923578e3
3fff1e40: 00000004 00000000 00000089 00000000
3fff1e50: 00000000 00000000 00000000 00000000
3fff1e60: 00000020 00000010 00000000 00000000
3fff1e70: 7ce6cc7f 6186313c 00000000 00000000
3fff1e80: 3fff692a 00000006 00000020 00000000
3fff1e90: 3fff692f 3fff7a7c 3fff767c 4022ccf0
3fff1ea0: 3fff692f 0000008c 3fff7ab4 40204c32
3fff1eb0: 3fff68a4 3fff0d7c 00000000 00000000
3fff1ec0: 00000000 00000000 00000000 00000000
3fff1ed0: 3fff6930 00000001 3fff68a4 3fff7b05
3fff1ee0: 00000021 00000030 3fff690f 00000000
3fff1ef0: 3fffdad0 0000009e 00000020 3fff0db4
3fff1f00: 3fff68a4 3fff68a4 3fff767c 4022b675
3fff1f10: 0000008c 00000000 3fff58a8 4021d16c
3fff1f20: 0000008c 3fff0b70 3fff0b34 3fff1fdc
3fff1f30: 3fffdad0 3fff2830 3fff1fb0 3fff1fdc
3fff1f40: 3fffdad0 3fff3edc 3fff7a7c 4022b9cd
3fff1f50: 3fffdad0 00000000 3fff1fb0 4021e87b
3fff1f60: 00000000 00000000 00000001 40233908
3fff1f70: feefeffe feefeffe 3fffdab0 401007d1
<<<stack<<<
EEPROM space available: 0x007b bytes
Firmware Version: 1.4.1
Now flashed to 1.4.3 on the web installer. gonna see if it lasts.
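For anyone reading the crash log above without the firmware ELF at hand: part of the stack is plain ASCII stored as little-endian 32-bit words. The sketch below is a small host-side helper (the name `decode_stack_ascii` is made up for this example, it is not part of the ratgdo tooling) that turns the dump rows into text. Run over the rows from 3fff1cc0 to 3fff1d20, it shows an HTTP response header with `Content-Type: application/pairing+tlv8` sitting on the stack at the time of the crash. That only says what buffer was on the stack; it does not identify the faulting function, which would still need the addresses resolved against the matching firmware build.

```python
import re
import struct

# Matches a dump row such as "3fff1cc0: 50545448 312e312f 30303220 0d4b4f20".
ROW = re.compile(r"\s*[0-9a-fA-F]{8}:\s+((?:[0-9a-fA-F]{8}\s*)+)")


def decode_stack_ascii(dump: str) -> str:
    """Return the stack words as text, with non-printable bytes shown as '.'."""
    raw = bytearray()
    for line in dump.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # skip "sp:", "ctx:", register lines, and so on
        for word in match.group(1).split():
            # Each word is printed as a 32-bit value; the bytes in memory
            # are in little-endian order.
            raw += struct.pack("<I", int(word, 16))
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    sample = (
        "3fff1cc0: 50545448 312e312f 30303220 0d4b4f20\n"
        "3fff1cd0: 6e6f430a 746e6574 7079542d 61203a65\n"
    )
    print(decode_stack_ascii(sample))  # HTTP/1.1 200 OK..Content-Type: a
```

The two dots in the output are the CR and LF bytes between the status line and the first header.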
I did attempt to connect to the serial console as suggested, but immediately after installing 1.4.3 on my ratgdo v2.5 I only get a single blue flash and no output from the console. Since 1.4.3 only fails on power cycle, I used the web installer to flash 1.4.3 and then OTA flashed back to 1.4.1.
Did the same thing here. Flashed back to 1.4.1 since the chip fails on power cycle
TLDR: updated to 1.4.4 after failed 1.4.3 update and all now good.
Longer version: Mine broke (see above) earlier this week with the 1.4.3 update, similar to others' experience. I updated this morning to the 1.4.4 pre-release version and it seemed to work fine after the update, but then I unplugged my computer and re-connected power via USB. It didn't seem to come back... but I might not have waited long enough or maybe used the wrong IP (DOH!)
Following the thought process of someone else above, I tried updating to 1.4.4 and then immediately updating to 1.2.1 (I apparently mis-remembered 1.4.1 as 1.2.1 in my head). That gave me a CRC didn't match error, so that didn't work.
Then updated to 1.4.4 and was more patient this time and it seems to work fine (well, it's been up for 0:00:12:02 anyway).
Note: happy to play around, re-boot, install different firmware, etc. if you need other things tested. Let me (and probably others) know! Thank you for your efforts!
We have found a bug that was overwriting flash memory that we shouldn't have been. We're working on a fix. I suspect that the "OTA bricked my ratgdo" is not quite accurate... the flash had already been corrupted and a reboot even without an OTA upgrade would have failed.
Everything seemed fine because the overwritten part of the flash is in the first 4KB, which is reserved for the bootloader, and of course that only ever runs at boot. The bug was in the HomeKit server code: it was writing to (actually erasing) a part of flash that it shouldn't, while the part that should have been erased was not. This likely also led to problems with HomeKit server operations.
Time will tell if we have finally found the root cause of our stability problems.
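To make the failure mode concrete, here is a minimal sketch of the kind of range check that catches an erase landing in the reserved region. It is illustrative only and is not the fix that went into the firmware: the function names are hypothetical, the 4096-byte sector size is an assumption for the sketch, and the only detail taken from the explanation above is that the first 4KB is reserved for the bootloader.

```python
SECTOR_SIZE = 4096   # erase granularity assumed for this sketch
RESERVED_END = 4096  # first 4KB, reserved for the bootloader per the comment above


def sectors_touched(offset: int, length: int) -> range:
    """Sector indices that an erase of [offset, offset + length) would hit."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // SECTOR_SIZE
    last = (offset + length - 1) // SECTOR_SIZE
    return range(first, last + 1)


def is_safe_erase(offset: int, length: int) -> bool:
    """True if the erase stays clear of the reserved bootloader region."""
    # Erases work on whole sectors, so even a small request inside
    # sector 0 would wipe the entire reserved 4KB.
    first_sector = sectors_touched(offset, length).start
    return first_sector * SECTOR_SIZE >= RESERVED_END


if __name__ == "__main__":
    print(is_safe_erase(0, 4096))     # False: this is the bootloader sector
    print(is_safe_erase(4096, 4096))  # True: starts right after the reserved region
```

A wrong offset passed to an erase call fails silently until the next boot, which fits the observation that devices kept running and only failed on reboot or power cycle.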
This is so exciting. Thank you for tracking this down!
Fixed in 1.5.0
Works here
Performed the OTA update, waited about 5 minutes, and the device is no longer responsive.
After waiting, I attempted to power cycle by disconnecting and then reconnecting. No Change