openshwprojects / OpenBK7231T_App

Open source firmware (Tasmota/Esphome replacement) for BK7231T, BK7231N, BL2028N, T34, XR809, W800/W801, W600/W601 and BL602
https://openbekeniot.github.io/webapp/devicesList.html

Stability issues [BK7231T Hangs around 3.03 AM] #617

Open NielsPiersma opened 1 year ago

NielsPiersma commented 1 year ago

I have 40 BK7231T devices in the office. Roughly 16 are set up for ceiling lights, six are set up for switching desks, and the others are lying in our meeting room, dormant but powered on and connected.

Almost every night at 3.03 AM, about 3 to 6 plugs go offline (except the dormant ones in the meeting room). Disconnecting and reconnecting the plug resolves the issue. But I need help finding an actual root cause.

1) The devices halt at random, but only the ones that are connected and have a load. The devices I am not actively using stay connected.
2) It almost always happens at 3.03 (it could be 3.00 AM, but the reports show 3.03).
3) They are connected to different UniFi APs.
4) I've updated the firmware daily for the last week, with no improvement.
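One way to check whether the drops really cluster around a fixed time of day is to bucket the offline events by hour and minute. The snippet below is only a minimal sketch: the function name and the sample timestamps are hypothetical and are not taken from the actual reports, and it assumes the monitoring system can export offline events as ISO 8601 strings.

```python
from collections import Counter
from datetime import datetime


def offline_minutes_histogram(timestamps):
    """Count offline events per (hour, minute) of the day."""
    counts = Counter()
    for ts in timestamps:
        t = datetime.fromisoformat(ts)
        counts[(t.hour, t.minute)] += 1
    return counts


# Hypothetical sample data, for illustration only.
sample_events = [
    "2023-01-16T03:03:10",
    "2023-01-17T03:03:45",
    "2023-01-18T03:04:02",
    "2023-01-18T14:20:00",
]

histogram = offline_minutes_histogram(sample_events)
for (hour, minute), n in histogram.most_common(3):
    print(f"{hour:02d}:{minute:02d} -> {n} event(s)")
```

A sharp peak at one minute of the day points towards something scheduled (on the network or elsewhere) rather than a random firmware hang.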

For now, I'll schedule a reboot of the devices at 02:00 pm and see if that mitigates the issue, but it would be better if the devices didn't stop at all.

Niels

openshwprojects commented 1 year ago

Hey, what kind of device is it? Maybe it has a low-quality power supply and you need a PowerSave command in the short startup command? If not, you can try the release that was tested many times and is deemed very stable:

Build on Nov 25 2022 10:36:03 version 1.15.77

Or in general, try to narrow down, maybe one of the releases has a stability loss?

NielsPiersma commented 1 year ago

Thanks, I'll try the power save first. The build from Nov 25 2022 10:36:03 (version 1.15.77) lacks a lot of features I need for management, so that won't be a feasible solution.

Niels

openshwprojects commented 1 year ago

It would also be much easier if you managed to narrow it down to the commit that breaks things for you. I also remember there was a small stability issue caused by logging. I am not sure when it was, a few weeks ago? @btsimonh solved that. Or it was a month ago...

Which version do you have @NielsPiersma ?

NielsPiersma commented 1 year ago

Guys (and girls), I likely found the root cause. Our UniFi network controller runs channel optimization at precisely 03.00 am. It is too big of a coincidence that some BK devices lose their connections and don't reconnect at 03.04 am.

I'll disable nightly optimization and see if they stay connected.
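To put a number on the "too big of a coincidence" reasoning, the offline events can be compared against the controller's schedule. This is a rough sketch under stated assumptions: the function name, the five-minute window, and the sample timestamps are all hypothetical, and it does not handle a schedule window that wraps past midnight.

```python
from datetime import datetime, timedelta


def fraction_after_schedule(timestamps, sched_hour, sched_minute, window_minutes):
    """Return the fraction of events that fall within `window_minutes`
    after the scheduled time on the same calendar day."""
    if not timestamps:
        return 0.0
    hits = 0
    for ts in timestamps:
        t = datetime.fromisoformat(ts)
        start = t.replace(hour=sched_hour, minute=sched_minute,
                          second=0, microsecond=0)
        if start <= t <= start + timedelta(minutes=window_minutes):
            hits += 1
    return hits / len(timestamps)


# Hypothetical sample data, for illustration only.
drops = [
    "2023-01-16T03:03:10",
    "2023-01-17T03:03:45",
    "2023-01-18T03:04:02",
    "2023-01-18T14:20:00",
]

# Assumed schedule: optimization starts at 03:00, 5-minute window.
ratio = fraction_after_schedule(drops, 3, 0, 5)
print(f"{ratio:.0%} of drops occurred within 5 minutes of 03:00")
```

If nearly all drops land inside such a narrow window, the scheduled job is a strong suspect, and disabling it is a cheap way to confirm.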

https://community.home-assistant.io/t/how-unifi-nightly-optimizations-ruined-my-life/455825

Niels

[image: AP excluded from Channel Optimization]

openshwprojects commented 1 year ago

It sounds like a non-OBK issue. Did turning off optimizations help?

NielsPiersma commented 1 year ago

@openshwprojects, yes, turning it off resolved the issue. I wouldn't necessarily say it is not an OBK issue, though. Other connected devices don't have issues with Channel Optimization and reconnect fine after a new channel with less interference is selected.

Wifi Optimization is used by multiple vendors, such as Cisco, Aruba, UniFi, and Axiros.

For us, disabling this feature works fine. Many home users will not have this issue anyway.

I would recommend having a look, but no rush at this moment.

Niels

openshwprojects commented 1 year ago

@NielsPiersma ok ok, I still have your Counter request on TODO list as well. I was busy with WEMO and BL602 OTA and ioBroker fixes (now we are REALLY Tasmota compatible, very high compatibility level), but you probably saw it in commit log anyway, right? Very busy days, but progress is extremely quick.

Being able to run firmware on Windows (with MQTT) is a game changer.

NielsPiersma commented 1 year ago

No worries. I am fully aware all is work in progress. We know what is causing it and how to resolve it. For now that is fine. Focus on those other things. This is a minor issue.

At least we know it is there and will save others time when they are confronted with it.

Cheers and keep up the good work.

Niels



jf-wm commented 1 year ago

I happened to see NielsPiersma's solution to his problem, as I also have an OBK device that crashes at night at irregular intervals. For me, too, this problem is apparently triggered by the Wifi channel optimization of my router (FritzBox). However, a Tasmota device (ESP8266EX) right next to the OBK is not affected by Wifi channel optimization, it has never crashed.

NielsPiersma commented 1 year ago

@jf-wm, as far as I can diagnose at this moment, it is the optimization process that confuses the OBK device so that it does not reconnect. In my setup, esp82xx devices are not affected.

Thanks for confirming this.

Niels

openshwprojects commented 1 year ago

@NielsPiersma what about using Ping Watchdog with scriptable event to force device reboot?
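For readers unfamiliar with the idea, the general logic of a ping watchdog can be sketched as follows. This is not OBK's Ping Watchdog implementation or its command syntax. It is only an illustration in Python, and the class name, the `max_failures` threshold, and the simulated ping replies are all hypothetical.

```python
class PingWatchdog:
    """Call `on_timeout` after `max_failures` consecutive failed pings."""

    def __init__(self, ping, on_timeout, max_failures=3):
        self.ping = ping              # callable returning True on a reply
        self.on_timeout = on_timeout  # callable run when the limit is hit
        self.max_failures = max_failures
        self.failures = 0

    def tick(self):
        """Run one check; return True if the timeout action fired."""
        if self.ping():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failures = 0
            self.on_timeout()
            return True
        return False


# Simulated ping results: one reply, three misses, then a reply again.
replies = iter([True, False, False, False, True])
reboots = []

watchdog = PingWatchdog(
    ping=lambda: next(replies),
    on_timeout=lambda: reboots.append("reboot"),  # stand-in for a real reboot
    max_failures=3,
)

for _ in range(5):
    watchdog.tick()

print(f"reboot requested {len(reboots)} time(s)")
```

Requiring several consecutive failures avoids rebooting on a single lost packet, while still recovering a device that silently dropped off the network after a channel change.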