visualapproach / WiFi-remote-for-Bestway-Lay-Z-SPA

Hack - ESP8266 as WiFi remote control for Bestway Lay-Z spa Helsinki
GNU General Public License v3.0

WiFi not working most of the time / ESP restarting each few seconds #709

Closed nr001 closed 6 hours ago

nr001 commented 4 months ago

Describe the bug
I built the project (everything is soldered to the pump unit now), but I couldn't get a stable WiFi connection. Sometimes the web interface can be accessed, but most of the time it doesn't work. When I ping the ESP I always get errors. I thought the problem was the WiFi connection, but even with the device 2 meters from the router the connection still isn't reliable. I then checked the router's log and saw that the connection drops about every 15 seconds. The terminal in Visual Studio Code shows that the ESP8266 restarts about every 15 seconds for some reason. So I guess WiFi itself isn't the problem; for some reason the ESP is constantly restarting.

Terminal log:
Start
WiFi > using WiFi configuration with SSID "mywifinetwork"
WiFi > Trying to connect ...got IP: 192.168.1.88
start NTP
............WS
IRamheap 6216
IRamheap 4968
startmqtt
Failed to read mqtt.json. Using defaults.
192.168.1.88
End of setup()
37016
[unreadable bytes]
rf cal sector: 1020
freq trace enable 0
rf[112] : 0
[unreadable bytes]
Start
[unreadable bytes]
rf cal sector: 1020
freq trace enable 0
rf[112] : 0
[unreadable bytes]
Start
[unreadable bytes]
rf cal sector: 1020
freq trace enable 0
rf[112] : 0
[unreadable bytes]
Start
WiFi > using WiFi configuration with SSID "mywifinetwork"
WiFi > Trying to connect ...got IP: 192.168.1.88
start NTP
............WS
IRamheap 6216
IRamheap 4968
startmqtt
Failed to read mqtt.json. Using defaults.
192.168.1.88
End of setup()
37016
[unreadable bytes]
rf cal sector: 1020
freq trace enable 0
rf[112] : 0
[unreadable bytes]
Start
WiFi > using WiFi configuration with SSID "mywifinetwork"
WiFi > Trying to connect ...got IP: 192.168.1.88
start NTP
............WS
IRamheap 6216
IRamheap 4968
startmqtt
Failed to read mqtt.json. Using defaults.
192.168.1.88
End of setup()
37016
[unreadable bytes]
rf cal sector: 1020
freq trace enable 0
rf[112] : 0
[unreadable bytes]
Start

Expected behavior
Working WiFi; the ESP does not reset itself every ~15 seconds.

Hardware (please complete the following information):

Software (please complete the following information):

Additional context
I'm trying to do some debugging. None of the following helped:
- Changing the CPU speed (80 / 160 MHz)
- WiFi.setAutoReconnect(false);
- Commenting out all ESP.restart() lines in the code
- Powering the ESP directly from USB

What did help is commenting out the second line here (of course, the pump cannot be controlled then). With that change no more restarts occur, so the problem comes from here:
// Fiddle with the pump computer
bwc->loop();

I looked deeper into this function; inside it, the restarts are caused by this line:

cio->handleToggles(); //transmits to cio if serial received from dsp

Again, if this line is commented out, the ESP no longer restarts. Any help is much appreciated. I guess this could be the same problem as in these threads:
https://github.com/visualapproach/WiFi-remote-for-Bestway-Lay-Z-SPA/discussions/672
https://github.com/visualapproach/WiFi-remote-for-Bestway-Lay-Z-SPA/discussions/466
https://github.com/visualapproach/WiFi-remote-for-Bestway-Lay-Z-SPA/issues/504

nr001 commented 4 months ago

Update: I did some more debugging and finally narrowed the restart problem down. During debugging I suddenly got the following output on the serial monitor:

startmqtt
Failed to read mqtt.json. Using defaults.
192.168.1.88
End of setup()
37016

--------------- CUT HERE FOR EXCEPTION DECODER ---------------

Soft WDT reset

Exception (3): epc1=0x401039e0 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000

LoadStoreError: Processor internal physical address or data error during load or store
epc1=0x401039e0 in wDev_MacTim1Arm at ??:?

>>>stack>>>

ctx: cont
sp: 3fff1d10 end: 3fff1f70 offset: 0160
3fff1e70:  00000000 00001f89 14fdf3b6 401085ac
3fff1e80:  00000007 00000000 ffffff76 40107c7a
3fff1e90:  00000001 000000ff 4010845b 00000001
3fff1ea0:  00000001 40108454 00000001 402427f4
3fff1eb0:  401083dc 00000020 3fff2a74 40242468
3fff1ec0:  401083dc 401082d4 401081d4 4022c161
3fff1ed0:  3fff2a74 3fff2a7f 3fff2a7f 3ffefbd4
3fff1ee0:  00000000 3fff0e78 3ffef78c 3ffefbd4
3fff1ef0:  00000000 00000000 3ffef78c 4021c62b
3fff1f00:  401081d4 3ffefa44 401081d4 4021b7b4
3fff1f10:  00000000 000f000f 00000000 00000000
3fff1f20:  00000000 5801a8c0 feefeffe feefeffe
3fff1f30:  401081d4 feefeffe feefeffe 3ffefbd4
3fff1f40:  3fffdad0 00000000 3ffefba8 3ffefbd4
3fff1f50:  3fffdad0 00000000 3ffefba8 40230f2c
3fff1f60:  feefeffe feefeffe 3fffdab0 401014a1
<<<stack<<<

0x40107c7a in EspSoftwareSerial::UARTBase::write(unsigned char const*, unsigned int, EspSoftwareSerial::Parity) at ??:?
0x402427f4 in DSP_4W::setSerialReceived(bool) at ??:?
0x40242468 in CIO_4W::setSerialReceived(bool) at ??:?
0x4022c161 in BWC::loop() at ??:?
0x4021c62b in loop at ??:?
0x4021b7b4 in setup at ??:?
0x40230f2c in loop_wrapper() at core_esp8266_main.cpp:?
0x401014a1 in cont_wrapper at ??:?

--------------- CUT HERE FOR EXCEPTION DECODER ---------------
[unreadable bytes]
rf cal sector: 1020
freq trace enable 0
rf[112] : 0
[unreadable bytes]
Start
WiFi > using WiFi configuration with SSID "mywifinetwork"
WiFi > Trying to connect ...got IP: 192.168.1.88
start NTP
............WS
IRamheap 6240
IRamheap 4992
startmqtt

I continued debugging and found that the following function in CIO_4W.cpp is causing the problem:
/* bwc is telling us that it's okay by dsp to transmit */
void CIO_4W::setSerialReceived(bool txok)
{
    /* Don't forget to reset after transmitting */
    _readyToTransmit = txok;
}

If I comment out the line in the function, the restart problem is gone. Any idea what the problem is here and how to fix it?
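One possible reading of why commenting out that assignment stops the restarts: if the flag is never set, any transmit path guarded by it never runs at all. The sketch below illustrates that gating pattern only. `TxGate` and its members are hypothetical stand-ins written for this illustration, not the project's actual `CIO_4W` code, and the "serial write" is replaced by appending to a vector.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a CIO-like class; NOT the project's real code.
class TxGate {
public:
    // The owner calls this to say "it is okay to transmit now".
    void setSerialReceived(bool txok) { _readyToTransmit = txok; }

    // Called from the main loop: transmits at most once per permission.
    // Returns true if a transmission happened.
    bool handleToggles(const std::vector<std::uint8_t>& payload) {
        if (!_readyToTransmit) return false;  // never passes if the flag is never set
        _readyToTransmit = false;             // reset after transmitting
        _sent.push_back(payload);             // stand-in for the real serial write
        return true;
    }

    std::size_t transmissions() const { return _sent.size(); }

private:
    bool _readyToTransmit = false;
    std::vector<std::vector<std::uint8_t>> _sent;
};
```

If the real code follows a similar pattern, removing the assignment would disable transmission entirely, so the crash could just as well sit in the write path as in the assignment itself.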

visualapproach commented 4 months ago

Some bugs, such as memory and interrupt problems, are hard to trace, and both are quite possible here. The log says software WDT, which indicates that something took too long before releasing control to the OS. Then it complains about a memory operation. Memory errors (out of memory, corruption, etc.) can seem to disappear when you make an insignificant change to the code, which leads you to suspect a certain line. I'm not saying it isn't that code, but it's a simple instruction that in itself doesn't take any time in this context. If you want, you can add an ESP.wdtFeed() before the instruction. If it helps, it may be a solution, OR we just moved the real issue to where it doesn't hurt as much. The crashes could also be caused by peripherals, like a hanging signal line or too many state changes. Or just plain bugs in my code. But it seems to work for several other people, so it must be a sneaky bug in that case.
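As a rough illustration of the watchdog suggestion: the ESP8266 Arduino core exposes the feed call as `ESP.wdtFeed()`. The sketch below wraps it so that long-running work feeds the watchdog at regular intervals. `feedWatchdog`, `runInChunks` and `g_feedCount` are names made up for this example, and the non-Arduino branch is only an assumption so the logic can be tried on a PC.

```cpp
#include <cstddef>

#ifdef ARDUINO
#include <Arduino.h>
// On the ESP8266 Arduino core, ESP.wdtFeed() resets the watchdog timer.
inline void feedWatchdog() { ESP.wdtFeed(); }
#else
// Host build (assumption, for experimenting off-device): just count the feeds.
static std::size_t g_feedCount = 0;
inline void feedWatchdog() { ++g_feedCount; }
#endif

// Hypothetical helper: run `total` work items and feed the watchdog after
// every `chunk` items, so no long stretch of work runs without a feed.
template <typename Work>
void runInChunks(std::size_t total, std::size_t chunk, Work work) {
    if (chunk == 0) chunk = 1;  // avoid a modulo by zero
    for (std::size_t i = 0; i < total; ++i) {
        work(i);
        if ((i + 1) % chunk == 0) feedWatchdog();
    }
}
```

As noted above, feeding the watchdog can hide a problem rather than fix it, so this is a diagnostic aid at best.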

nr001 commented 4 months ago

Ok, thx for your answer. I tried most of the available versions starting from the most recent one. The first that i could get working was 3.4.0. Here everything works fine except that nothing happens if i press any button on the pump display.

visualapproach commented 3 months ago

See if latest dev branch is working for you

nr001 commented 3 months ago

Sorry for the delay, took a while to be able to try this.

Anyway i installed 4.3.1 now and it's working much better now, thx a lot for this. WiFi connection seems to be a lot more stable now!

Sometimes the pump stops for a few seconds and then automatically starts again. I need more experience with this version before I can say anything more reliable.

nr001 commented 2 months ago

Just installed v4.4.1. With this one the WiFi problems seem to be back. The ESP seems to be restarting every few seconds.

visualapproach commented 2 months ago

Sorry, I can't tell why. It works for me and at least a few others. I uploaded a new dev version now that took care of WDT resets after wifi drop outs. Don't know if that will help but worth a try.

nr001 commented 2 months ago

Thx, I tried the latest dev branch, with the following outcome: for about the first hour after installing, the ESP was again constantly rebooting every few seconds. I then left it running without touching or changing anything and just waited. After some time it became "stable", with no reboots for quite a while. Later the frequent reboots started again.

See the bootlog.txt file below:
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:34:44"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:38:43"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:40:19"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:41:11"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:42:35"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:43:27"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:45:08"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:45:51"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:47:05"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:48:55"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:49:38"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:50:27"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:50:58"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:52:05"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:52:47"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:53:28"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:54:05"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:54:36"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:55:12"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:56:41"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:57:15"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:58:10"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 12:58:46"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:00:03"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:01:07"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:01:56"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:02:30"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:04:03"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:04:45"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:05:31"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:06:24"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:08:18"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:09:30"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:11:04"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:11:45"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:12:51"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:13:29"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:15:55"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:16:33"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:17:02"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:18:19"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:19:13"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:19:53"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:21:58"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:22:58"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:23:28"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:25:17"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:26:00"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:26:49"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:27:57"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:29:19"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:30:20"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 13:53:40"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 15:35:06"}
{"BOOTINFO":"Software Watchdog 2024-07-10 16:06:06"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:00:37"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:05:13"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:13:17"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:16:56"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:18:00"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:19:33"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:21:20"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:23:34"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:25:01"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:26:04"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:26:34"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:27:57"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:28:30"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:30:10"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:30:43"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:31:20"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:32:49"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:33:33"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:34:49"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:35:24"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:36:20"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:37:28"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:38:09"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:39:03"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:40:06"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:41:30"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:42:21"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:42:53"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:43:46"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:45:09"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:46:57"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:48:10"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:49:29"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:50:05"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:50:35"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:51:11"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:52:10"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:53:13"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:53:55"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:56:32"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:57:07"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:57:49"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 17:58:54"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:00:49"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:02:04"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:02:35"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:04:45"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:06:55"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:12:20"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:14:21"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:14:56"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:15:38"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:16:30"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:17:00"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:18:34"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:22:59"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:31:04"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:32:15"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:33:06"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:37:50"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:39:20"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:40:03"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:42:51"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:46:27"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 18:50:05"}
{"BOOTINFO":"Hardware Watchdog 2024-07-10 19:05:25"}
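A log like this is easier to reason about when summarized. The following is a minimal sketch, assuming each entry has exactly the `{"BOOTINFO":"<reason> YYYY-MM-DD HH:MM:SS"}` shape shown above; it counts reset reasons and computes the seconds between consecutive boots. `BootEntry`, `parseBootLine`, `countReasons` and `gapsBetweenBoots` are names invented for this example and are not part of the firmware.

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct BootEntry {
    std::string reason;  // e.g. "Hardware Watchdog"
    long long seconds;   // timestamp as seconds since 1970-01-01, no time zone applied
};

// Days since 1970-01-01 for a Gregorian date (well-known civil-date formula).
long long daysFromCivil(int y, int m, int d) {
    y -= m <= 2;
    const long long era = (y >= 0 ? y : y - 399) / 400;
    const long long yoe = y - era * 400;
    const long long doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    const long long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

// Parse one bootlog entry; returns false if the line does not match the assumed shape.
bool parseBootLine(const std::string& line, BootEntry& out) {
    const std::string key = "\"BOOTINFO\":\"";
    const std::size_t start = line.find(key);
    if (start == std::string::npos) return false;
    const std::size_t valueStart = start + key.size();
    const std::size_t valueEnd = line.find('"', valueStart);
    if (valueEnd == std::string::npos) return false;
    const std::string value = line.substr(valueStart, valueEnd - valueStart);
    // The value ends with " YYYY-MM-DD HH:MM:SS": one space plus 19 characters.
    if (value.size() < 21) return false;
    const std::string stamp = value.substr(value.size() - 19);
    int y, mo, d, h, mi, s;
    if (std::sscanf(stamp.c_str(), "%d-%d-%d %d:%d:%d", &y, &mo, &d, &h, &mi, &s) != 6)
        return false;
    out.reason = value.substr(0, value.size() - 20);
    out.seconds = daysFromCivil(y, mo, d) * 86400LL + h * 3600 + mi * 60 + s;
    return true;
}

// How often each reset reason occurs.
std::map<std::string, int> countReasons(const std::vector<BootEntry>& entries) {
    std::map<std::string, int> counts;
    for (const BootEntry& e : entries) ++counts[e.reason];
    return counts;
}

// Seconds between each boot and the one before it.
std::vector<long long> gapsBetweenBoots(const std::vector<BootEntry>& entries) {
    std::vector<long long> gaps;
    for (std::size_t i = 1; i < entries.size(); ++i)
        gaps.push_back(entries[i].seconds - entries[i - 1].seconds);
    return gaps;
}
```

Feeding the file to these helpers line by line would show, for instance, how the gaps between resets change during the day.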

visualapproach commented 2 months ago

can you post a screenshot of main page?

nr001 commented 2 months ago

[Screenshot of the main page] Of course, here it is. Firmware: 2024-07-09-2306

visualapproach commented 2 months ago

The NTP seems to have failed. It says 1970... Maybe that's where to find the problem

visualapproach commented 2 months ago

Also the rssi is a bit low. Could possibly be a factor.

nr001 commented 2 months ago

> The NTP seems to have failed. It says 1970... Maybe that's where to find the problem

I think that's because of the frequent reboots. After a start it usually takes a few seconds to get the time via NTP, but currently the ESP is restarting every ~15 seconds.
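A date in 1970 usually just means the clock still holds a value near the Unix epoch, i.e. it has not been set yet. Below is a minimal sketch of how such a check could look. The cutoff of 2020-01-01 is an arbitrary assumption, and `clockLooksSynced` and `waitForClock` are hypothetical helpers for illustration, not functions from the firmware.

```cpp
#include <ctime>

// Assumption: anything before 2020-01-01 00:00:00 UTC means "clock never set".
constexpr std::time_t kEarliestPlausibleTime = 1577836800;

inline bool clockLooksSynced(std::time_t now) {
    return now >= kEarliestPlausibleTime;
}

// Poll a time source up to maxAttempts times and report whether a plausible
// time showed up. `readClock` is any callable returning std::time_t (for
// example a lambda wrapping time(nullptr)); `pause` is called between attempts.
template <typename ReadClock, typename Pause>
bool waitForClock(ReadClock readClock, Pause pause, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (clockLooksSynced(readClock())) return true;
        pause();
    }
    return false;
}
```

If the device reboots before such a wait can finish, the page would keep showing 1970 even with a working NTP server, which matches the explanation above.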

visualapproach commented 2 months ago

Sometimes it works if you make a clean build in platformio. Also restart your router.

nr001 commented 2 months ago

Can you explain what has been changed from firmware "2024-07-15-1308" in comparison to the master branch version "2024-07-18-0900" of today?

I installed 2024-07-18-0900 today, and everything seems to be working smoothly now. I have had no restarts for ~25 minutes (earlier they came every ~15 seconds), and the web interface also seems much more responsive. I will check this further, but already a big thanks for your work!

visualapproach commented 2 months ago

You are welcome 😃

image

nr001 commented 2 months ago

It's crazy. I thought it was working now, as everything ran fine for some hours. Now I'm back to constant restarts every ~30 seconds.

nr001 commented 2 months ago

Is there a way to 'deactivate' the NTP getting time stuff completely? I'd like to see if this causes the problems

visualapproach commented 2 months ago

Better to run it on the bench and see the serial monitor

nr001 commented 2 months ago

The problem is that when running without powering the pump from 230 V, the resets don't seem to happen. Plugging in USB while the pump is powered is probably not a good idea, so I cannot really debug this way.

visualapproach commented 2 months ago

Check if you have any shorts between solder joints. Flux can sometimes make a connection.

github-actions[bot] commented 1 week ago

Stale issue message