esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1

Adafruit neopixel library doesn't work well with 2.4.0 #4219

Closed judge2005 closed 6 years ago

judge2005 commented 6 years ago

Basic Infos

Hardware

Hardware: ESP8285
Core Version: 2.4.0

Description

I recently updated to 2.4.0 from 2.3.0 and I noticed that my neopixels started glitching. I put a probe on the data line and sometimes a state is held for too long in the bit stream - i.e. a zero bit or a one bit is held for a longer period of time than it should be (I made sure that the same values were being written to the neopixels each time I updated them, so the bit streams should have remained identical).

To confirm, I downgraded to 2.3.0 and the problem went away.

The neopixel library disables interrupts with noInterrupts() before writing the bit stream. Perhaps this is no longer working? Or perhaps there are other background tasks in 2.4.0 that noInterrupts() does not prevent?

Settings in IDE

Module: Generic ESP8285
Flash Size: 1MB
CPU Frequency: 80MHz
Flash Mode: Can't specify this for ESP8285
Flash Frequency: Can't specify this for ESP8285
Upload Using: SERIAL
Reset Method: nodemcu

Sketch

Sorry, my sketch is massive. I haven't tried creating a specific testcase since there is an obvious difference in running using either 2.3.0 or 2.4.0.

Makuna commented 6 years ago

The Adafruit library support for esp8266 is based on my original code for my library (NeoPixelBus). That original work (bitbang method) is not stable and as you have found, no longer works reliably. There is no solution other than having Adafruit update their library to use hardware features of the esp8266.

As an alternative, you can use NeoPixelBus (check library manager or github). It was updated when this was discovered (18 months ago?) to use two alternate hardware features: DMA and the UART. Both have pin limitations (each hardware solution uses a different pin, and only that one pin).

Further, I believe FastLed has a staged "PULL" request that also uses DMA features; but I have not tracked if it has been merged in yet.

judge2005 commented 6 years ago

OK. Unfortunately the PCBs I just had made use GPIO0 for the NeoPixels, so I am out of luck. Just curious: What is the change that has made bit-banging stop working? i.e. what is in 2.4.0 that is not in 2.3.0 or that doesn't work properly in 2.3.0?

Just to document here: DMA uses GPIO3 (aka RX0) and UART uses GPIO2 (aka TX1). Obviously, both of these are used for serial comms/flashing too. I guess RX0 is the least likely to be in use generally.

I might try bit-banging in timer0.

ondabeach commented 6 years ago

@judge2005, if your application is commercial I'd stay with 2.3. 2.4 has a lot of kinks to be ironed out yet from what I've seen.

Makuna commented 6 years ago

@judge2005 Well, it doesn't really work reliably even on 2.3.0. The original problem was found on that version. It just seems to be more sensitive on 2.4.0 than on earlier versions.

At the core, the issue is that an interrupt required to keep WiFi operational cannot be blocked (by any normal means), so there is always a chance of your code being interrupted. This interrupt is really short, but it is enough to cause problems if it happens in the middle of sending the time-sensitive data out by bitbang. Blocking it by severe means can cause the esp8266 to reset, so that's not an option.
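
To make the timing sensitivity concrete, here is a minimal host-side sketch in plain C++ (it is not ESP8266 code and not taken from any library). It assumes nominal WS2812B-style timings of roughly 0.4 us high for a 0 bit and 0.8 us high for a 1 bit, with the pixel deciding at the midpoint; those numbers and the names `decodeHighPulse` and `decodeStretchedZero` are illustrative assumptions, so check the datasheet for your part.

```cpp
#include <cstdint>

// Assumed nominal WS2812B-style timings in nanoseconds (not from the thread).
constexpr uint32_t kT0HighNs = 400;  // high time for a 0 bit
constexpr uint32_t kT1HighNs = 800;  // high time for a 1 bit
// Assumed decision point: halfway between the two nominal high times.
constexpr uint32_t kThresholdNs = (kT0HighNs + kT1HighNs) / 2;

// How a pixel would interpret a high pulse of the given width under this model.
constexpr int decodeHighPulse(uint32_t highNs) {
    return highNs > kThresholdNs ? 1 : 0;
}

// A 0 bit whose high phase is stretched because the bit-bang loop was
// interrupted for interruptNs nanoseconds while the line was high.
constexpr int decodeStretchedZero(uint32_t interruptNs) {
    return decodeHighPulse(kT0HighNs + interruptNs);
}
```

Under these assumed numbers, `decodeStretchedZero(0)` is 0 but `decodeStretchedZero(300)` is 1: an interruption of a few hundred nanoseconds while the line is high is enough to turn a 0 bit into a 1.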

Now, why you may not see it as often as others is due to several factors.
1) How often you update your NeoPixels. The more data you try to send, the worse the problem. With one pixel that you update once a second, you may not see anything. With 10 pixels that you update once a second, you may see an issue once or twice a day. And this is with minimal WiFi traffic.
2) How much WiFi traffic you have. If you don't send and receive much, and your WiFi doesn't have a bunch of other devices on it broadcasting, you may not see it as often. With heavy traffic you may see it every time you update the pixel state.
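
The first factor can be put into rough numbers. The sketch below is a back-of-the-envelope calculation in plain C++, assuming 24 bits per pixel and an 800 kHz bit rate (1.25 us per bit); the function names are hypothetical. It only estimates how long the data line is being bit-banged per second, and says nothing about how often the WiFi interrupt fires.

```cpp
#include <cstdint>

// Assumptions: 24 bits per pixel, 800 kHz data rate (1.25 us per bit).
constexpr uint32_t kBitsPerPixel = 24;
constexpr uint32_t kBitTimeNs = 1250;

// Time spent bit-banging one full update of the strip, in microseconds.
constexpr uint32_t frameTimeUs(uint32_t pixels) {
    return pixels * kBitsPerPixel * kBitTimeNs / 1000;
}

// Microseconds in each second during which an interrupt could land
// inside the bit stream and corrupt it.
constexpr uint32_t exposureUsPerSecond(uint32_t pixels, uint32_t updatesPerSecond) {
    return frameTimeUs(pixels) * updatesPerSecond;
}
```

With these assumptions, one pixel updated once a second is exposed for 30 us out of every second, and ten pixels for 300 us, which is consistent with larger strips and faster update rates glitching more often.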

And BTW, I tried using Timer0 (I was the one who exposed it for Arduino on Esp8266) and you run into the same problem.

judge2005 commented 6 years ago

@Makuna Interesting. I have a small number of neopixels (tested up to 10), minimal WiFi traffic and maybe 2 or 3 updates per second. Performance is good enough with 2.3.0, but with 2.4.0, one NeoPixel updated once a second shows glitches roughly every 10 seconds. A very significant difference.

Thanks for letting me know about timer0. That saves me some wasted time.

What is the situation with the ESP32? I.e. is it easier to get real-time performance? There are also some clocked NeoPixels, do you have any experience with those?

ondabeach commented 6 years ago

@judge2005, as far as I know there is nothing that can be done about this. If the interrupt is fired while the WiFi is busy, e.g. sending a UDP packet, then the WiFi stack will crash, which often causes a reset.

I have a sketch that uses softserial and I get the same thing. I get around it by counting the number of failed UDP packet sends; after 10 fails in a row, which tells me that the WiFi stack has crashed, I reboot the esp. Because WiFi devices can auto reconnect, the momentary loss of WiFi gets handled automatically and everything just keeps going. This works well enough for commercial deployment as I can live with the occasional lost UDP packet. This is what it looks like:

...
    UDP.println(HWinStr);
    // Count failed sends; any successful send resets the counter.
    if (!UDP.endPacket()) {
      errCount++;
    } else {
      errCount = 0;
    }
}
// Ten failures in a row: assume the WiFi stack has crashed and reboot.
if (errCount >= 10) {
  ESP.restart();
}
...
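
The same counting logic can be pulled into a small helper that can be exercised on a desktop without any WiFi hardware. This is only a sketch: the class name `SendFailureWatchdog` and its methods are hypothetical and are not part of the ESP8266 core or of the sketch quoted above.

```cpp
// Hypothetical helper: tracks consecutive send failures.
class SendFailureWatchdog {
public:
    explicit SendFailureWatchdog(unsigned threshold) : threshold_(threshold) {}

    // Record the outcome of one send attempt.
    // Returns true once the number of consecutive failures reaches the threshold.
    bool record(bool sendOk) {
        if (sendOk) {
            failures_ = 0;   // any success clears the streak
        } else {
            ++failures_;
        }
        return failures_ >= threshold_;
    }

    unsigned failures() const { return failures_; }

private:
    unsigned threshold_;
    unsigned failures_ = 0;
};
```

In a sketch this would be used along the lines of `if (watchdog.record(UDP.endPacket())) ESP.restart();`, with the threshold set to 10 to match the snippet above.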

Makuna commented 6 years ago

@judge2005 esp32 is a different beast. Bitbang is also problematic but much less so. Many find it stable if you use less than about 30 pixels. There is a very nice hardware feature on the Esp32 named RMT, which allows for almost arbitrary pin selection and has 8 channels! But the current low level support is problematic as it expects your buffer that it feeds by its interrupt to be the same as the RMT hardware needs, which is VERY verbose for NeoPixels (each NeoPixel bit to send requires 4 bytes in the RMT buffer). I have an open bug to have it changed and it has been marked for implementation; but it still is not done. So, I have yet to add the hardware support for Esp32 to my library.
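
To give a feel for how verbose that is, the following plain C++ sketch compares the size of the raw colour data with the size of an RMT-format buffer, using the figure of 4 bytes per NeoPixel bit quoted above and assuming 3 colour bytes (24 bits) per pixel. The function names are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t kColorBitsPerPixel = 24;  // assumes 3 colour bytes per pixel
constexpr std::size_t kRmtBytesPerBit = 4;      // figure quoted in the comment above

// Bytes needed to hold the raw colour data for the strip.
constexpr std::size_t pixelBufferBytes(std::size_t pixels) {
    return pixels * kColorBitsPerPixel / 8;
}

// Bytes needed if every bit must be pre-expanded into RMT format.
constexpr std::size_t rmtBufferBytes(std::size_t pixels) {
    return pixels * kColorBitsPerPixel * kRmtBytesPerBit;
}
```

For 30 pixels that is 90 bytes of colour data against 2880 bytes of RMT buffer, a 32x expansion under these assumptions.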

Hiddenvision commented 6 years ago

I use two separate sets of 60 pixels and have never noticed any issues using 2.3 on ESP8266, even while running Web requests, WebSockets, and acting as both STA & AP, with the ESP clocked at 160MHz.

I'm using the WS2812 lib from https://github.com/kitesurfer1404/WS2812FX

I've not had the chance to try them with 2.4. I just lost 40k of space updating to 2.4, so I may have to revert.

But I WANT the NEW _client connect timeout feature, I'm gonna have to paste that bit into 2.3 libs I thinks.

judge2005 commented 6 years ago

I'm still curious what the difference is between 2.3.0 and 2.4.0 that causes the issue to be so much more visible in 2.4.0. Basically I'm wondering a couple of things:

  1. Maybe I can modify 2.4.0 to operate more like 2.3.0 in this regard.
  2. Perhaps there is an issue in 2.4.0 that is either causing more unmaskable interrupts, or is spending more time in an unmaskable interrupt handler.

I guess I will have to write a test sketch to try to narrow it down.

timkay commented 6 years ago

@judge2005 Have you made any progress on this issue?

judge2005 commented 6 years ago

I tried the latest release (2.4.1, not sure) and didn’t see any problems at all.

I still need to run a full test of my application, but so far it looks promising.

On Jul 15, 2018, at 10:56 AM, Timothy Kay notifications@github.com wrote:

@judge2005 Have you made any progress on this issue?


timkay commented 6 years ago

I found that the Adafruit library doesn't work at all well on ESP8266 (really I'm using a NodeMCU with ESP8285). I switched to NeoPixelBus, and it's rock solid. This library offers several different methods, including DMA, UART, and bit bang. I'm doing OTA updates and running 256 NeoPixels, and it's rock solid.

More recently, I wrote my own library that is dramatically smaller and simpler = easy to port, which I did to Renesas RX. I don't see any issues such as those mentioned in this thread.

devyte commented 6 years ago

Closing this in view of previous comment.