Open jclsn opened 10 months ago
I would suspect WiFi issues (channel overlap, interference, etc). The new Arduino core used in 0.14+ for ESP8266 is more susceptible to WiFi issues. Try scanning for least used channel and force your AP to use that channel. Some settings like BSS transition and/or fast roaming are known to cause issues as well.
Same issue with NodeMCU. 14.0 works fine. The upgrade breaks WiFi connectivity (using a UniFi network).
I will describe the issue I had with one of ESP8266 devices which was sitting 30cm from access point and was unreachable over WiFi. There was another ESP32-S2 with similar issues though that one was far from any AP (>10m outdoor).
With 0.13.x the ESP was reachable but was occasionally dropping in and out of WiFi. That was observable in UniFi controller (I use Ubiquiti UniFi with several APs and switches). The drops were rare and didn't last long.
After I updated it to 0.14 (somewhere in July 2023) I immediately noticed that the device was visible and connected to the network, but I was unable to connect to the UI. It would just stall while loading. I went ahead and purchased another AP (UAP-AC-M) and installed it about 30cm from the device, as it was in an awkward spot where the WiFi signal was poor. It didn't help. I have several other WLED devices, one of them being the sync master. Whenever the sync master sent a notification packet, the problematic device picked it up immediately, and all timed presets were triggering normally. That was a clear indication that the device itself was receiving the WiFi signal (including NTP responses), and when the UI did load I could see signal strength at an almost perfect 100%. As it was an outdoor device (which participated in the Christmas display successfully), it wasn't until last week that I finally delved into the problem and solved it. Not by changing WLED code but by reconfiguring WiFi.
It turned out that I had to give the APs permanent channel allocations, dispersed as far as possible (the two APs on the same channel had to be physically furthest from each other). I also disabled BSS transitions and fast roaming, though these two didn't seem to have any real effect in my set-up.
After all my APs were set to fixed channels (1, 6 and 11) and APs using the same channel were separated as far as possible, all connectivity problems went away. Immediately.
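The reasoning behind the 1/6/11 plan can be sketched in a few lines of Python. This is only an illustration, assuming 22 MHz-wide channels centred at 2407 + 5·n MHz for channels 1-13; the AP names are hypothetical.

```python
from __future__ import annotations

CHANNEL_WIDTH_MHZ = 22  # assumption: classic 2.4 GHz channel width


def center_mhz(channel: int) -> int:
    """Centre frequency of a 2.4 GHz channel (valid for channels 1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError(f"unsupported channel: {channel}")
    return 2407 + 5 * channel


def overlaps(a: int, b: int) -> bool:
    """True if the frequency ranges of the two channels overlap."""
    return abs(center_mhz(a) - center_mhz(b)) < CHANNEL_WIDTH_MHZ


def conflicts(ap_channels: dict[str, int]) -> list[tuple[str, str]]:
    """Pairs of APs whose channels overlap (including identical channels)."""
    names = sorted(ap_channels)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if overlaps(ap_channels[x], ap_channels[y])
    ]


# Hypothetical AP names on a 1/6/11 plan with one reused channel.
plan = {"ap-garage": 1, "ap-hall": 6, "ap-attic": 11, "ap-garden": 1}
print(conflicts(plan))
```

With four APs and only three non-overlapping channels, one pair always shares a channel, which is why the same-channel APs were placed as far apart as possible.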
I have the same or a similar problem and it is definitely not the same as the 14.0 Wifi issues.
With 14.0 I had connection problems that I thought came from a new device (which I had not used with WLED < 14), but maybe it's the new Arduino core then. Anyway I changed my Wifi settings to less crowded channels and it helped to some extent.
However, immediately after the update to 14.1 the device (now hanging on the wall controlling an LED strip, 50cm to the AP, working flawlessly so far) became unresponsive. The UI would load slowly or not at all. I needed to change something in LED settings and it was impossible to get there.
In the browser I was able to directly open the update section after a reboot and install 14.0. It took pretty long but eventually worked. After reboot everything worked again. So my problem is definitely something in 14.1.
My device is NodeMCU v2 (ESP8266)
For list of changes and possible causes see changelog and explanations in #3685.
As far as I can see there were no changes (except an option to force G mode on ESP8266) to networking. And the change of Arduino core happened somewhere in between 0.14.0-b1 and 0.14.0-b2 and not in 0.14.1.
@sl1txdvd what you are describing was exactly my case. Solved by WiFi reconfiguration. Unfortunately I do not know if we have any option to change WiFi operation from within WLED as we use standard approach which hasn't changed in a few years.
As I said, that solved my problems introduced by 14.0. Which works for me. But it does not solve the problems introduced by 14.1. As you say, Arduino core change was in 14.0. The same problem would not return in 14.1 then under the exact same circumstances while 14.0 is not affected. It's a different issue.
FWIW I just tried:
14.0: working fine
14.0 to 14.1-b2: fast flash, fast reboot
14.1-b2: generally working, two timeouts
14.1-b2 to 14.1-b3: slow flash, one failed, the other was "successful" according to the UI
14.1-b3: device does not work at all, had to flash via USB
Back to 14.0, everything's fine.
Unfortunately I have no resolution for you but to stay on 0.14 or earlier. FYI 0.15 builds on top of 0.14.1.
@sl1txdvd Would you mind sending me a copy of your configuration, from the 'backup configuration' feature? I haven't been able to replicate your issue here on my test setup. I'd like to try with your specific settings.
another one with the same issue. also tried a reflash but still failed. 4 other devices running 14.1 are ok. but one doesnt like it.
@willmmiles I try to attach the config here. I changed the ssid of my wifi. I run a strip with 60 WS2812b LED on a 2.4A driver if that's relevant.
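For anyone else sharing a configuration backup, here is a minimal sketch of scrubbing identifying values from the JSON before attaching it. The key names in `SENSITIVE_KEYS` and the sample structure are assumptions, not a description of WLED's actual backup format; check your own file and adjust.

```python
import json

# Assumption: keys that may hold identifying values; adjust to your backup.
SENSITIVE_KEYS = {"ssid", "psk", "pwd"}


def redact(node, keys=SENSITIVE_KEYS, placeholder="REDACTED"):
    """Return a copy of parsed JSON with sensitive string values replaced."""
    if isinstance(node, dict):
        return {
            k: placeholder
            if k in keys and isinstance(v, str)
            else redact(v, keys, placeholder)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [redact(item, keys, placeholder) for item in node]
    return node


# Hypothetical excerpt, for illustration only.
sample = {"nw": {"ins": [{"ssid": "my-home-wifi", "ip": [0, 0, 0, 0]}]}}
print(json.dumps(redact(sample)))
```

The function walks the whole structure, so it does not depend on where in the file the keys appear, and it leaves the original object untouched.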
Thanks, I think that's helped pin it down, I've been able to reproduce the OTA issue here.
@sl1txdvd does anything change if you use DHCP?
0.14.1 is buggy No Wireless
As of 0.14.1 the ESP's with WLED are no longer able to connect to my UniFi AP's. They are configured to broadcast 2.4 and 5GHz
Reverting to 0.13.3 fixes the issue
Could this be a problem with UniFi APs in general? Or perhaps with having multiple UniFi APs?
Because I am seeing the same thing with versions >13.1 on all ESP8266s. And I also have UniFi APs: 2 older UAP-AC. Edit: The APs are roughly 14 meters apart, on channels 6 and 11 with the same SSID.
Although I managed one good build with 14.0 for an ESP8266. But I cannot reproduce the build now for some reason.
Running UniFi too. I don’t recall any issues on 0.14.0, but I’ve just reverted to it from 0.14.1 because of this thread, so I’ll take note from here on.
In my situation the ESP32 just disappears from my network after approximately 10 hours. It does not create its own network even after I changed it to create it on network disconnect and not just on boot w/o network.
There are certainly a lot of UniFi networks in this thread. I have 4 APs in my network. Two of them are definitely in range.
Bonus: I have three WLED devices, which should all be ESP32s. Only one is experiencing issues on 0.14.1 (it is now downgraded to 0.14.0). Scratch that - I just checked my UniFi logs, and they suggest the other two 0.14.1 devices are having problems too. They reconnect shortly afterwards, though, while my problem device doesn't.
UniFi logs. In this view, wled-light-1-6 is my now downgraded device. wled-light-50-2 and wled-light-5-3 are "fine" on 0.14.1, as in, they reconnect. It does not seem that either of the devices had any issues prior to January 14th, when 0.14.1 was released. IIRC, I updated on the 15th.
Dumping more info because I have a working/not working situation, so maybe something is useful.
0.14.1, now: 0.14.0
0.14.1
0.14.1
This issue is not isolated to UniFi, I am experiencing the same and I am running eero (6 Pro x2).
No issues with esp32 however all esp8266's required rolling back to 0.14.0.
Same here with mikrotik network
Same bug here, WLED 14.1 is unusable using a hw-622 based relay board with official binary:
version 14.1:
root@ap-01:~# ping 192.168.1.170
PING 192.168.1.170 (192.168.1.170): 56 data bytes
64 bytes from 192.168.1.170: seq=0 ttl=255 time=191.526 ms
64 bytes from 192.168.1.170: seq=3 ttl=255 time=28.830 ms
64 bytes from 192.168.1.170: seq=4 ttl=255 time=100.870 ms
64 bytes from 192.168.1.170: seq=5 ttl=255 time=155.209 ms
64 bytes from 192.168.1.170: seq=6 ttl=255 time=205.415 ms
64 bytes from 192.168.1.170: seq=7 ttl=255 time=44.909 ms
64 bytes from 192.168.1.170: seq=9 ttl=255 time=456.499 ms
64 bytes from 192.168.1.170: seq=11 ttl=255 time=96.057 ms
64 bytes from 192.168.1.170: seq=12 ttl=255 time=169.681 ms
64 bytes from 192.168.1.170: seq=13 ttl=255 time=222.849 ms
64 bytes from 192.168.1.170: seq=14 ttl=255 time=275.832 ms
64 bytes from 192.168.1.170: seq=17 ttl=255 time=80.973 ms
64 bytes from 192.168.1.170: seq=18 ttl=255 time=133.357 ms
64 bytes from 192.168.1.170: seq=19 ttl=255 time=186.415 ms
^C
--- 192.168.1.170 ping statistics ---
28 packets transmitted, 14 packets received, 50% packet loss
round-trip min/avg/max = 28.830/167.744/456.499 ms
version 14.0:
root@ap-01:~# ping 192.168.1.170
PING 192.168.1.170 (192.168.1.170): 56 data bytes
64 bytes from 192.168.1.170: seq=0 ttl=255 time=5.305 ms
64 bytes from 192.168.1.170: seq=1 ttl=255 time=3.813 ms
64 bytes from 192.168.1.170: seq=2 ttl=255 time=3.317 ms
64 bytes from 192.168.1.170: seq=3 ttl=255 time=4.153 ms
64 bytes from 192.168.1.170: seq=4 ttl=255 time=2.788 ms
64 bytes from 192.168.1.170: seq=5 ttl=255 time=10.441 ms
64 bytes from 192.168.1.170: seq=6 ttl=255 time=4.197 ms
64 bytes from 192.168.1.170: seq=7 ttl=255 time=14.618 ms
64 bytes from 192.168.1.170: seq=8 ttl=255 time=2.988 ms
64 bytes from 192.168.1.170: seq=9 ttl=255 time=22.101 ms
64 bytes from 192.168.1.170: seq=10 ttl=255 time=11.045 ms
64 bytes from 192.168.1.170: seq=11 ttl=255 time=5.834 ms
64 bytes from 192.168.1.170: seq=12 ttl=255 time=5.880 ms
64 bytes from 192.168.1.170: seq=13 ttl=255 time=3.900 ms
64 bytes from 192.168.1.170: seq=14 ttl=255 time=90.145 ms
64 bytes from 192.168.1.170: seq=15 ttl=255 time=4.071 ms
64 bytes from 192.168.1.170: seq=16 ttl=255 time=6.595 ms
64 bytes from 192.168.1.170: seq=17 ttl=255 time=9.376 ms
^C
--- 192.168.1.170 ping statistics ---
18 packets transmitted, 18 packets received, 0% packet loss
round-trip min/avg/max = 2.788/11.698/90.145 ms
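To compare captures like the two above across firmware versions, a small parser can pull out the dropped sequence numbers and the loss rate. This is a rough sketch that assumes the `seq=... ttl=... time=... ms` line format shown in this thread; other ping implementations print different fields (for example `icmp_seq=`), so the pattern would need adjusting.

```python
import re

# Matches reply lines in the format shown in this thread.
REPLY_RE = re.compile(r"seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def summarize(ping_output: str, transmitted: int) -> dict:
    """Summarize replies found in ping output, given the transmitted count."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    replies = [(int(s), float(t)) for s, t in REPLY_RE.findall(ping_output)]
    seen = {seq for seq, _ in replies}
    times = [t for _, t in replies]
    return {
        "received": len(replies),
        "loss_pct": round(100 * (transmitted - len(seen)) / transmitted, 1),
        # Gaps are only detectable up to the last reply that arrived.
        "missing_seq": [s for s in range(max(seen) + 1) if s not in seen] if seen else [],
        "avg_ms": round(sum(times) / len(times), 3) if times else None,
    }


# First three reply lines of the 14.1 capture, with 5 packets assumed sent.
excerpt = """\
64 bytes from 192.168.1.170: seq=0 ttl=255 time=191.526 ms
64 bytes from 192.168.1.170: seq=3 ttl=255 time=28.830 ms
64 bytes from 192.168.1.170: seq=4 ttl=255 time=100.870 ms
"""
print(summarize(excerpt, transmitted=5))
```

The transmitted count is passed in separately because packets lost after the last reply leave no trace in the reply lines.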
Chiming in to say I've lost contact with my now 0.14.0 device. So for those reporting 0.14.0 to also be an issue, I'm onboard there. I still can't say at present why my two 0.14.1's aren't showing the same inability to stay online. :/
I've just observed this issue with a D1 Mini NodeMCU based on the ESP8266-12F. Fine on 0.14.0, unusable on 0.14.1. The device connects to my Unifi U6-Pro, but I can get close to no data through. It works intermittently for a few seconds, then nothing at all, this then repeats. It appears that turning off Fast BSS Transition helps a little, but it does not fully resolve the issue.
Interestingly, I have another non-Mini NodeMCU which initially showed the same issue on a Unifi U6-Mesh, but that cleared up after power cycling the device and the access point. I still see frequent reconnects, about every 30 minutes, but it's usable.
Chiming in. All my WLED's were on 13.3 - no issues.
Updated all of them to 14.1 today
They now ALL fail 100%. All offline. An unplug/plug will have them connect for a few minutes, but then they all drop off again shortly.
Reflashing these will be a nightmare where some are located. :\
@roninniagara If you've enabled the self-hosted AP, you could disable your own WiFi for a while, reboot them, and then flash them via a phone/laptop. I dread this situation :O
Please read the whole thread but pay attention to this post: https://github.com/esp8266/Arduino/issues/8950#issuecomment-1872329949
good post - but also 2 friends of mine updated using different boards and had to also roll back (we popped 14.1 on them to test)
same result.
And I'm sure lots of "set it and forget it" boards would have the same issue but won't be updated at all.
The ones i'm using have been stable for a VERY long time. zero drops. they also have antennas that clip on to improve signal.
Multiple people in here with various boards all having the exact same issue. I don't buy the "it's weak antenna" reasoning when those same boards worked fine for a very long time, then failed after update, to only work again perfectly once rolled back.
edit: taking a peek at other issues open, seems like MANY others are having many of the same issues with 14.1 - not just this thread.
Please read the whole thread but pay attention to this post: esp8266/Arduino#8950 (comment)
confused, the entire thread is full of people telling you they are having problem solely with 0.14.1, yet you keep pointing out wifi issues ?
There were no changes to networking in 0.14.1, but we did switch the ESP8266 platform for 0.14.0, as required by the newer NeoPixelBus (which needs a newer C++ compiler). As mentioned in the thread linked above, older cores (platform) allowed faulty hardware to perform adequately (don't ask me why, as IDK), while the same hardware may have issues with the newer core.
@doronazl there were no changes to wifi implementation in 0.14.1 compared to 0.14.0. Not a single one.
Anyone wanting to help, please revert 0.14.1 commit by commit and find the commit that causes issues with your particular set-up.
There may not have been any INTENDED changes, but the fact is something broke. I can 100% confirm the issue is with the latest version. I ran into this problem yesterday, ran around in circles, and finally decided to roll back just to test. I have 7 devices running WLED; I rolled 6 of them back to 0.14 and left one on 0.14.1 on purpose. Guess what: the one I left went offline, while the others are staying online, rock solid.
I'd love to help, but I'd need specific instructions as I don't know much about GitHub and coding. Or maybe someone else more in the know can help instead.
You will need a local clone of the WLED repository, then perform git checkout <sha> and compile using PlatformIO. <sha> will be the commit's SHA hash visible in GitHub.
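Testing every commit one by one is not strictly necessary; a binary search over the commit list needs far fewer builds. The sketch below shows the idea in Python. It assumes an oldest-to-newest list in which the first commit is known good, the last is known bad, and everything after the first bad commit is also bad. The `is_bad` callback is hypothetical and stands for the manual step of checking out a commit, building, flashing and watching the WiFi behaviour. `git bisect` automates the same search natively.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an oldest-to-newest commit list for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that every commit
    after the first bad one is bad as well.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Illustration with 40 made-up commit names; "c27" introduces the problem.
history = [f"c{i}" for i in range(40)]
tested = []


def simulated_check(commit: str) -> bool:
    tested.append(commit)
    return int(commit[1:]) >= 27


print(first_bad(history, simulated_check), "found after", len(tested), "builds")
```

The catch for this particular issue is that some devices only drop after many hours, so each "good" verdict needs a long enough soak test to be trustworthy.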
FYI I have 10 ESP8266 devices (ESP01 and Wemos D1 mini) all performing flawlessly and connecting w/o issues, using Ubiquiti UniFi network consisting of 5 APs (UAP-AC-Lite, UAP-AC-LR, UAP-AC-M) and two additional Apple Airport Express (2nd gen and 1st gen (draft N)).
Maybe esp32s are having issues? Mine are all esp32 wroom 38u, with unifi udm pro, usw24, 5x u6LR
My ESP32 count is at about 20 and ESP32-S2 at 5 and ESP32-C3 at 2. No issues with WiFi. My total wireless client count is beyond 120 with multiple VLANs and same SSIDs on 2.4 and 5GHz.
All with wled or in general?
Those are only WLED devices. Other ESPs have various firmwares, not counted above.
Then great, a solution exists 🤞 Can't wait till it's found. BTW, in my case it would show as offline in the WLED-native app, but when looking at UniFi it would show as online. Not sure if that matters.
Just checked on my 2x Athom.tech RGB controllers (where one of them drops permanently on 0.14.x). Inside they're ESP32-WROOM-32E, and they do not have any pigtail connector for an external antenna. So I can't even test if adding an antenna would help. :/
I reverted my one dropping device from 0.14.1 to 0.14.0, and then after 2 days of connectivity it just dropped out and didn't come back.
So I'm actually more partial to bad wifi signals being a factor here, than any change in firmware. My experience on this device was that it was solid on 0.14.0, then bad on 0.14.1, and now it remains bad on 0.14.0. Solid here being, I didn't have to reboot it to get it online, I am unaware of any short drops in connections.
I'll try the "leave AP on always" as I can see it can host an AP while also being connected to my own Wifi. Then when it drops out again, I should be able to connect to it and maybe see what's up.
All my esp32 installed with external antenna, so don't think its a signal problem, especially when a version rollback fixes the problem 🫤
After I tried one beta at a time and crashed with 0.14.1-b3, I had to flash 0.14.0 via USB. I kept having problems though, so I did a factory reset and everything was back to normal.
I tried to reproduce the issue but couldn't. I tried 0.14.0 to 0.14.1 and also 0.14.0 to 0.14.1-b1, b2 and b3. The controls load slower from b1 on, but so far I have not seen anything like the problems I had when I upgraded to 0.14.1 the first time.
I suspect that certain configurations get corrupt during the upgrade. It may be related to segments in presets, at least I saw something odd when I tried to reproduce the issue.
If somebody with this issue has a backup of their 0.14.0 config files from before the upgrade broke their WLED that would probably be helpful to look into.
All my esp32 installed with external antenna, so don't think its a signal problem, especially when a version rollback fixes the problem
It may still be a wifi configuration problem. ESPs do not support BSS transitions (fast roaming) and other fancy stuff.
A. Not using any of that. They connect to specifically created 2.4ghz ssids of the nearest AP. B. The fact is the moment i went back to 0.14.0, it stopped going offline, that is the main factor here. 0 connection issues right now, if i upgrade to 0.14.1 the problem comes back.
In a hail mary, I'm trying to run 0.14.1 on my problem device, and seeing that the problem persists (it disconnected after 24 hours 26 minutes) and then resetting everything to defaults and reconfiguring. I've done so now, so we'll see if it keeps disconnecting after 12-24 hours.
Update: No dice.
Update: I'm retrying the variant where I enable the local AP on the device, so that when it does disconnect I can ascertain if it's even responsive at that point. I previously had it running with a local AP "if disconnected", but that didn't come up. If I can connect and f.ex. scan networks, we know Wifi works - if that AP also dies, then I'm not sure what to do (sans figuring out a debug port).
I did this - and I found that I cannot connect to the exposed AP. So while the ESP32 isn't on my own wifi, I too can't connect to it. There is a pushbutton on the Athom Controller, which still works, even in this state, so I can turn the relay on/off.
So the chip is operating, but the wifi bits are "dead" - except for broadcasting the SSID.
I rebooted the device, and now I can connect to the AP w/o issues.
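To pin down when a device drops off and whether it rebooted in between, a small poller on another machine can log timestamps. This sketch assumes the device answers on WLED's JSON info endpoint (`/json/info`) and that the response carries `ver`, `uptime` and a `wifi` object with `rssi` and `channel`; treat those field names as assumptions and verify them against your firmware. The host address is whatever your device uses.

```python
import http.client
import json
import time
import urllib.request


def extract(info: dict) -> dict:
    """Pick the fields of interest from a parsed info response."""
    wifi = info.get("wifi") or {}
    return {
        "ver": info.get("ver"),
        "uptime_s": info.get("uptime"),
        "rssi": wifi.get("rssi"),
        "channel": wifi.get("channel"),
    }


def poll_once(host: str, timeout: float = 5.0):
    """Return the extracted fields, or None if the device did not answer."""
    url = f"http://{host}/json/info"  # assumed endpoint path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return extract(json.load(resp))
    except (OSError, ValueError, http.client.HTTPException):
        return None


def watch(host: str, interval_s: float = 60.0, max_polls: int = 1440) -> None:
    """Print one timestamped line per poll; None means unreachable."""
    for _ in range(max_polls):
        print(time.strftime("%Y-%m-%d %H:%M:%S"), poll_once(host))
        time.sleep(interval_s)
```

Calling `watch` with the device's address and leaving it running gives a log where an uptime that resets to a small value points at a crash or reboot, while a run of `None` with a steadily growing uptime afterwards points at a WiFi stall.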
Possibly the same/similar issue here. Single UniFi AP, 3 WLED ESP32's working flawlessly on 14.0, now on 14.1 can connect fine but sync completely broken. The instance list on each sometimes empty, sometimes 1 or 2 of the other WLED controllers will show momentarily. No changes to AP location. Same subnet and gateway. Reverting to 14.0 fixes the issue.
@blazoncek
As far as I can see there were no changes (except an option to force G mode on ESP8266) to networking. And the change of Arduino core happened somewhere in between 0.14.0-b1 and 0.14.0-b2 and not in 0.14.1.
Can you point me in the direction of the core change? .. I checked the diff in wled for these versions, but I can't find anything hinting at this change. Maybe I don't know what I'm looking for :|
Mike.
Look at the changelog. The change was in the platformio.ini file.
EDIT: The last ESP8266 core change was https://github.com/Aircoookie/WLED/commit/c04c73bbd7f448312c368e33773e510d4e8324cf
Oh shoot. I was looking at 0.14.1-b2/3. My bad. :/
Thanks!
Update: I found these dependencies that were updated as a result of this change: espressif8266@4.2.0 (side note: versioning is not at all confusing here). It in turn updates the Arduino core to 3.1.2 (as the comment says here). But I did find one noteworthy change:
It seems a bit of the error handling in esp8266 was changed, notably in cores/esp8266/core_esp8266_postmortem.cpp. If any of these have unfortunate side effects, I suppose the chip could stall entirely? It does look like this code only runs when the chip is already stopping or resetting, so maybe it has no effect on this issue. I'm always skeptical of these types of changes, because they are so deep in the core of the system and therefore could have far-reaching side effects :/.
I'm also curious about the downgrade of the Arduino core. Isn't it odd to go from 4.1.0 to 3.1.2 (judging from the names in the platformio.ini file in this project)?
I'm also experimenting a bit on my own network with WiFi specifically, to see if other combinations of settings on the UniFi end have any effect. I'm still unsure if this has anything to do with WiFi at all, as we have these different scenarios, all on the same firmware:
0.14.1's running just fine
0.14.1 which disconnects after 12-24 hours - and this device is the only one of the three with clear LoS to an AP
0.14.1's which disconnect "immediately" (within minutes)
My best bet is some type of exception or fault. But I'll have to see if I can attach a serial port somehow to my device, which unfortunately is screwed in place :/.
For anyone interested. I've forked this project and am tweaking with build params to see what sticks. My first test is reverting the arduino core change. I've hacked the github workflow to produce binaries so that I don't have to setup a local environment.
You're welcome to, at your own peril of course, try out these builds. The binaries are available as artifacts of each of these actions.
What happened?
After the upgrade to 14.1 WLED wasn't accessible anymore. Erasing the flash and flashing 14.0 solved the issue. Another upgrade to 14.1 resulted in broken WiFi again. If you flash 14.1 manually, you can connect to the WLED-AP, but after you configure the router connection, things break.
To Reproduce Bug
Install 14.1 and connect to an existing network
Expected Behavior
WLED should connect
Install Method
Binary from WLED.me
What version of WLED?
14.1
Which microcontroller/board are you seeing the problem on?
ESP8266
Relevant log/trace output
No response
Anything else?
No response