openshwprojects / OpenBK7231T_App

Open source firmware (Tasmota/Esphome replacement) for BK7231T, BK7231N, BL2028N, T34, XR809, W800/W801, W600/W601 and BL602
https://openbekeniot.github.io/webapp/devicesList.html

BK7231N with the BL0937 energy metering smart plug freezes after a few hours and needs a hard reboot #203

Open yipperr opened 1 year ago

yipperr commented 1 year ago

Firstly, thank you for this project. I didn't think this would be possible when I opened this smart plug and saw a Tuya CB2S module with a Chinese Beken SoC on it. On to the problem at hand: it is very consistent, so it should be repeatable for anyone. I have the smart plug configured to report voltage, current and power to Home Assistant via MQTT, and it always crashes a few hours in.

I don't know how to retrieve logs when this happens, as the web UI and web app are both unresponsive and the MQTT broker gets no updates from the smart plug either. After a power cycle it resumes normal operation. The plug doesn't respond to ping while it is in this bricked state, although the wireless router shows the plug as still associated over WiFi.

If there is a way to get logs in this state, let me know and I will retrieve them.

thanks for your time

DSchndr commented 1 year ago

It seems like the app crashes somewhere, since it has a rudimentary watchdog that reboots it when WiFi is missing, among other things (well, apparently it does not work when the code hangs).

If memory usage grows over time, there is probably a memory leak. Does it?

Since everything in it is probably mains-referenced, my best bet would be to wire up UART2 (@openshwprojects does it print logs on it?) as well as 3.3 V (DO NOT CONNECT MAINS) and let it run until it crashes and hopefully spits out why...

valeklubomir commented 1 year ago

Hello. I am new to this project. I found a smart plug with a BL0937 and observed the same behaviour with wires connected and an independent power supply. The log showed the following:

Info:MAIN:Time 2539, free 89456, MQTT 1, bWifi 1, secondsWithNoPing 1, socks 2/38
sta: 1, softap: 0, b/g/n
sta:rssi=-42,ssid=DAPB,bssid=1e:74:0c:c3:f1:08 ,channel=1,cipher_type:CCMP
Info:MAIN:Time 2540, free 89456, MQTT 1, bWifi 1, secondsWithNoPing 1, socks 2/38
Info:GEN:dhcp=0 ip=192.168.64.184 gate=192.168.64.1 mask=255.255.255.0 mac=38:1f:8d:1b:2b:e4
Info:MAIN:Time 2541, free 89456, MQTT 1, bWifi 1, secondsWithNoPing 1, socks 2/38

sm_deauth_handler
sm_deauth_handler reason=15,vif=0
sm_disconnect_process
me_set_ps_disable:840 0 0 1 0 11

And then the device was frozen. I am looking into this handler.

Lubomir
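One way to check the memory leak question raised earlier is to track the `free` value from the `Info:MAIN` heartbeat lines over a long capture. The following is a minimal host-side sketch, not part of the firmware. It assumes that `Time` is uptime in seconds and `free` is free heap in bytes; the helper names `heap_samples` and `heap_slope` are made up for illustration.

```python
import re

# Matches the heartbeat lines quoted above, e.g.
# "Info:MAIN:Time 2539, free 89456, MQTT 1, ..."
# Assumption: "Time" is uptime in seconds, "free" is free heap in bytes.
MAIN_RE = re.compile(r"Info:MAIN:Time (\d+), free (\d+)")

def heap_samples(log_text):
    """Return a list of (time, free) integer pairs found in the log text."""
    return [(int(t), int(f)) for t, f in MAIN_RE.findall(log_text)]

def heap_slope(samples):
    """Average change in `free` per unit of `Time`, first to last sample.

    A clearly negative value over a long run would hint at a leak;
    a value at or near zero would not.
    """
    if len(samples) < 2:
        return 0.0
    (t0, f0), (t1, f1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (f1 - f0) / (t1 - t0)

# The three heartbeat lines from the log excerpt in this comment.
log = (
    "Info:MAIN:Time 2539, free 89456, MQTT 1, bWifi 1, secondsWithNoPing 1, socks 2/38\n"
    "Info:MAIN:Time 2540, free 89456, MQTT 1, bWifi 1, secondsWithNoPing 1, socks 2/38\n"
    "Info:MAIN:Time 2541, free 89456, MQTT 1, bWifi 1, secondsWithNoPing 1, socks 2/38\n"
)
samples = heap_samples(log)
print(samples)              # [(2539, 89456), (2540, 89456), (2541, 89456)]
print(heap_slope(samples))  # 0.0
```

In this excerpt the free value is identical across the three samples right before the freeze, so these lines alone show no sign of a leak. Three seconds is of course far too short to rule one out; a capture over hours would be needed.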

openshwprojects commented 1 year ago

@yipperr does it still happen with the latest build, which includes the great help and changes made by @valeklubomir? And with the LWIP update?

valeklubomir commented 1 year ago

@openshwprojects There is one last pending pull request for OpenBK7231N (not yet applied), which also helps a bit. With this and my last pull requests to OpenBK7231T_App (already applied), it should be stable enough. My forked repositories. My device has been running for 1 day, 13 hours, 19 minutes. Built on Oct 3 2022 20:36:38.

MQTT State: connected RES: 0(ERR_OK)
MQTT ErrMsg:
MQTT Stats: CONN: 1 PUB: 72582 RECV: 64261 ERR: 5
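For a rough sense of scale, the counters above can be related to the reported uptime. This is only back-of-the-envelope arithmetic: it assumes the counters were read at about the same moment as the uptime and were not reset in between, which the thread does not confirm.

```python
# Uptime reported in the comment: 1 day, 13 hours, 19 minutes.
uptime_s = 1 * 86400 + 13 * 3600 + 19 * 60

# Counters copied from the "MQTT Stats" line.
published = 72582
received = 64261
errors = 5

print(uptime_s)                               # 134340
print(round(published / uptime_s, 2))         # publishes per second
print(round(errors / (uptime_s / 3600), 2))   # errors per hour
print(round(100 * errors / published, 4))     # errors as a percentage of publishes
```

That works out to roughly one publish every two seconds, with errors well below one per hour.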

I will keep it running in this state for a few more days. Then I will reassemble the device, connect it to mains AC and test for a few more days. I am also working on a new device, a BK7231N + BL0942 on a customized PCB, and am examining the pin connections.

openshwprojects commented 1 year ago

@valeklubomir sorry, I must have missed that. Those are good changes, thanks.

If you are researching a new device, could you make a short teardown and photos and post them on our forums? https://www.elektroda.com/rtvforum/forum507.html

Have you seen my already done teardowns on this forum? Maybe they can help a bit. I've done a lot of them already.

valeklubomir commented 1 year ago

@openshwprojects I did not document teardowns of either device, but I can still write up a summary. All your teardowns helped me get started with this project. There were some bumps in the road along the way.

yipperr commented 1 year ago

@openshwprojects @valeklubomir

I apologize for the delay in replying. I am flashing the latest build as of writing and will report back with test results if there is any difference.

coraldrum commented 1 year ago

I have the same issue. The version is 1.14.13. I've tested some releases but after a few hours it needs a hard reboot.

mariusbach commented 1 year ago

I have the same issue with BK7231T and BL0942. Freezes, needs hard reboot. No ping or anything possible. Running on 1.14.52.

mariusbach commented 1 year ago

One more data point: so far the issue has only happened on plugs with the BL0942 driver enabled (I only have these). I hadn't enabled Periodic Statistics and NTP before, so the bug is present without them being enabled.

I have one plug where the power metering doesn't work (I guess it's a hardware failure, because 3 other identical plugs report power and energy), but this one has stayed connected for a few days.

edit: Now this one also lost connection twice on the same day.

So I guess it's somewhere in the BL-driver? Not sure anymore where to look or how to document the hangup.

I'm on 1.14.88 now, still the same.

valeklubomir commented 1 year ago

I am trying to bring up more devices to monitor their behaviour; 5 are running at the moment. Unfortunately I can only connect 1 device to the UART logger, the others are monitored by network logging only. One device crashed today after 2 days of operation (no UART log). It was hard frozen, and the net log did not show anything. Over the past week there were also times when a device lost its connection to MQTT or the web UI while the device itself was still working, because the button was able to toggle the relay. I am setting up more devices: 8 new smart switches and smart power monitors (Mini Smart Switch or Aubess power monitor switch). Now I have to wait until some hint is caught. So far I have determined that it may be an issue with the device being forced to reconnect to WiFi. Many functions work perfectly when started for the first time, but when forced to shut down and restart, they do not work without errors.
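For devices without a UART logger attached, a host-side reachability probe can at least timestamp when a plug stops answering on the network. The sketch below is an assumption-laden illustration, not project tooling: it assumes the web UI listens on TCP port 80, and `is_reachable` and `watch` are hypothetical helper names. It cannot tell a hard freeze apart from a network-only dropout; the hardware button test mentioned above is still needed for that.

```python
import socket
import time

def is_reachable(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, unreachable hosts
        return False

def watch(host, checks, interval=10.0, port=80, probe=is_reachable):
    """Probe `checks` times; return a (timestamp, state) pair per state change."""
    changes = []
    last = None
    for i in range(checks):
        state = probe(host, port)
        if state != last:
            changes.append((time.time(), state))
            last = state
        if i + 1 < checks:
            time.sleep(interval)
    return changes

if __name__ == "__main__":
    # IP address taken from the log excerpt earlier in the thread.
    for stamp, up in watch("192.168.64.184", checks=3, interval=1.0):
        print(time.strftime("%H:%M:%S", time.localtime(stamp)), "up" if up else "down")
```

Logging only the transitions keeps the output short over multi-day runs, so the moment of a freeze is easy to line up with the MQTT and net logs.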

mariusbach commented 1 year ago

It seems to become more frequent lately (I update to all releases as they become available): today one of the plugs had to be pulled from the wall socket 3 times after hanging up. There's no other way to reset it. I admit I didn't test via the button if the device itself is still running, will do next time.

mariusbach commented 1 year ago

The one plug with no drivers enabled and only using MQTT on 1.14.105 is quite stable. The ones with drivers crash as before.

mariusbach commented 1 year ago

I didn't test via the button if the device itself is still running, will do next time.

For me, with a BK7231T plug/socket, the device freezes in such a way that the hardware button does not do anything.

mariusbach commented 1 year ago

I've been on version 1.14.124 for a few days now and the freezing stopped on all devices. Was there a fix somewhere in the changes vs. 1.14.105?

Perhaps it's too early to call it fixed, but I haven't experienced any lockups anymore. I don't use the extra watchdog ping in the menu, if that is interesting to know.

Again, just to note, I'm using a smart socket with BK7231T and BL0942.

valeklubomir commented 1 year ago

I have 8 devices running (BK7231N/T, BL0937/BL0942, PWM). Some of them have been stable for 6 days. One device froze every day, with absolutely no reaction to the button. Today it died completely: the 230 V PSU is broken. Powered from a lab supply the module works, but the BK7231N and relay have high consumption, higher than previously. Before the idle counter was implemented, the module with relay drew 88 mA; now the whole module draws 160 mA. If the idle counter was implemented instead of placing the CPU into low power mode, this could cause trouble, because the 230 V PSU is not designed to supply that much current. The device I use has a BP2525 PWM controller IC, which is rated for 150 mA.
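To put the numbers from this comment side by side, here is a small arithmetic sketch. It takes the 88 mA, 160 mA and 150 mA figures as reported, and assumes the two measurements are comparable, which the comment does not fully establish. `headroom_ma` is a made-up helper name.

```python
def headroom_ma(rated_ma, measured_ma):
    """Remaining margin in mA; a negative value means the rating is exceeded."""
    return rated_ma - measured_ma

before_ma = 88    # module with relay, before the idle counter change
after_ma = 160    # whole module, after the change
rated_ma = 150    # BP2525 rating as stated in the comment

print(after_ma - before_ma)                                 # 72
print(round(100 * (after_ma - before_ma) / before_ma, 1))   # 81.8
print(headroom_ma(rated_ma, before_ma))                     # 62
print(headroom_ma(rated_ma, after_ma))                      # -10
```

On these figures the draw rose by about 82 percent and moved from 62 mA under the stated rating to 10 mA over it, which would be consistent with the PSU failure described.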