Makuna / NeoPixelBus

An Arduino NeoPixel support library supporting a large variety of individually addressable LEDs. Please refer to the Wiki for more details. Please use the GitHub Discussions to ask questions as the GitHub Issues feature is used for bug tracking.
GNU Lesser General Public License v3.0

Add dithering with constant bus output #539

Open swifty99 opened 2 years ago

swifty99 commented 2 years ago

NOTE: If you are seeking help or have questions, this is NOT the place to do it. For questions and support, jump on Gitter and ask away.

Is your feature request related to a problem? Please describe. For 8-bit PWM LEDs, which covers every addressable LED strip I know of, it is very hard or impossible to do accurate color mixing at low brightness levels. The PWM chips also do not incorporate any gamma conversion. The PWM controllers in the addressable chips operate linearly: doubling the control value doubles the absolute light output. However, this is not the way our eyes work. Many solutions have been developed to address this. Gamma conversion in computer displays is the most common one; a perceptual quantizer, as used in Dolby Vision displays, is the fancier one.

Describe the solution you'd like A project called Fadecandy solved many of these problems. Internally it increased the resolution by dithering. Dithering means the LED is turned off and on, overlaying a PWM with a larger timescale on the whole LED. With this blinking the resolution can be increased by 2 to 4 bits. I have used Fadecandy in the past and adapted this approach to Arduino libraries. It looks super smooth and nice, and even low-brightness color temperatures can be mixed. Unfortunately the project seems to be abandoned.
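A minimal sketch of the temporal dithering idea described above, assuming the target brightness is held as a 16-bit value (8 integer bits plus 8 fractional bits) and the LED only accepts 8 bits. `DitherChannel` and `nextSubframe` are hypothetical names for illustration; this is not Fadecandy's or NeoPixelBus's actual code.

```cpp
#include <cstdint>

// Hypothetical helper, not part of NeoPixelBus or Fadecandy.
// Carries the quantization error of one color channel from sub-frame to
// sub-frame so the average of the emitted 8-bit values approaches the target.
struct DitherChannel {
    uint16_t residual = 0;  // carried error, in 1/256ths of one 8-bit step

    // target16: desired level with 8 integer bits and 8 fractional bits.
    uint8_t nextSubframe(uint16_t target16) {
        uint32_t sum = static_cast<uint32_t>(target16) + residual;
        uint32_t out = sum >> 8;                        // whole 8-bit steps to emit now
        if (out > 255) out = 255;                       // clamp at full brightness
        residual = static_cast<uint16_t>(sum & 0xFF);   // keep only the fractional part
        return static_cast<uint8_t>(out);
    }
};
```

For a target of 0x0140 (1.25 in 8-bit steps), four consecutive calls return 1, 1, 1, 2, which averages to 1.25 over the four sub-frames.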

What is the downside: Due to the need for quite high refresh rates, a maximum of 64 LEDs per line should be used. To address more than 64 LEDs, several lines need to be driven in parallel using DMA.
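A rough way to see where a per-line LED limit comes from, assuming an 800 kbit/s WS2812-class protocol (1.25 µs per bit, 24 bits per RGB pixel) and a 50 µs latch gap. Both timing constants are assumptions for illustration; the latch time in particular differs between LED types, so check the datasheet of the actual part.

```cpp
#include <cstdint>

// Assumed timing for an 800 kbit/s WS2812-class strip (not taken from the thread).
constexpr double kBitTimeUs   = 1.25;  // one bit at 800 kbit/s
constexpr double kLatchTimeUs = 50.0;  // assumed idle time needed between updates

// Longest line (in pixels) that can be fully rewritten `updatesPerSecond` times.
constexpr uint32_t maxPixelsPerLine(double updatesPerSecond, uint32_t bitsPerPixel = 24) {
    return (1000000.0 / updatesPerSecond <= kLatchTimeUs)
        ? 0
        : static_cast<uint32_t>((1000000.0 / updatesPerSecond - kLatchTimeUs) /
                                (kBitTimeUs * bitsPerPixel));
}
```

Under these assumptions `maxPixelsPerLine(400.0)` gives 81 RGB pixels, so a 64-LED limit leaves some headroom at a 400 Hz update rate; at 120 updates per second the same formula allows 276 pixels.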

Describe alternatives you've considered So far I use my own code, which however lacks all the add-on features of something like WLED. So I will definitely move to NeoPixelBus because it is great. Improving the color range would be awesome.

Additional context I would be glad to help improve LED color science of NeoPixelBus.

Makuna commented 2 years ago

NOTE, there are 16 bit per color element LEDs out now like the UCS8903 and UCS8904, and my library supports them.

I don't know what you mean by lookups. Could you be more specific as to what you mean, or what this pertains to?

Gamma is already present. See https://github.com/Makuna/NeoPixelBus/wiki/NeoGamma-object and there is also an example demonstrating its use. But this will reduce the actual count of discrete levels below the 8-bit 256 levels to fit the gamma curve.
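The loss of discrete levels can be checked with a small sketch. This assumes a plain power-law curve with an exponent of 2.2, which is an illustrative choice and not necessarily the curve NeoGamma uses; `gamma8` and `distinctOutputLevels` are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>
#include <set>

// Assumed power-law gamma from 8 bits to 8 bits (exponent is an assumption).
uint8_t gamma8(uint8_t in, double gamma = 2.2) {
    return static_cast<uint8_t>(std::lround(std::pow(in / 255.0, gamma) * 255.0));
}

// Counts how many different output values the 256 possible inputs produce.
int distinctOutputLevels(double gamma = 2.2) {
    std::set<uint8_t> seen;
    for (int i = 0; i < 256; ++i) {
        seen.insert(gamma8(static_cast<uint8_t>(i), gamma));
    }
    return static_cast<int>(seen.size());
}
```

With this curve the darkest inputs collapse together: `gamma8(10)` is still 0, so the lowest control values all produce the same output, and `distinctOutputLevels()` comes out below 256.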

I am unlikely to add a complete end-to-end dithering solution (it has been brought up many times), as dithering is a complex solution that is not universally achievable on platforms with poorly controlled timing, like those built on top of the Arduino platform.
To dither with 4 bits, this means you are updating the strip(s) 4 times per "frame". A frame represents a single state in an animation, like a single frame of a video. 30fps means the need to run at 120 strip updates per second to account for 4 dithering updates per frame. To achieve a correct dithered color, each of these "updates" needs to be "visible" for the same amount of time during that frame's display time. 30fps = 33.3333ms per frame, 8.3333ms per update. So a single update needs to be displayed (stay present in the LEDs) for close to 8ms (not much longer or shorter), or otherwise the dithered color doesn't match the expectation. A lost frame or two every once in a while isn't the concern; consistently poor timing is, otherwise why do dithering at all. Achieving this timing within the author's sketch, while leaving time to process other work, is a solution the author needs to provide. Just as today how often Show() is called is left up to the author to manage, the same would apply to the timing of any dithering calls to Show(). There is no universal Arduino scheduling that isn't done by timer interrupts, and those can't be used to call Show(): ISRs need to be short, Show() is not short enough in some cases, and when it is, it will call hardware APIs that can't be used inside interrupts.
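The timing argument can be made concrete with a small sketch. It assumes the perceived level is simply the time-weighted mean of the sub-frame levels, which is a simplification; `updateIntervalMs` and `perceivedLevel` are hypothetical helper names.

```cpp
#include <cstdint>

// Time each sub-frame update should stay on the LEDs, in milliseconds.
constexpr double updateIntervalMs(double framesPerSecond, int updatesPerFrame) {
    return 1000.0 / framesPerSecond / updatesPerFrame;
}

// Simplified model: the eye sees the time-weighted mean of the sub-frame levels.
// levels[i] is the 8-bit value shown, shownMs[i] how long it actually stayed on.
double perceivedLevel(const uint8_t* levels, const double* shownMs, int count) {
    double weighted = 0.0;
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        weighted += levels[i] * shownMs[i];
        total += shownMs[i];
    }
    return total > 0.0 ? weighted / total : 0.0;
}
```

`updateIntervalMs(30, 4)` gives about 8.33 ms. With sub-frame levels 1, 1, 1, 2 shown for equal times the mean is 1.25; if the last sub-frame stays on twice as long as the others, the mean rises to 1.4, which is the kind of color error that inconsistent timing produces.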

With this said, this just leaves the actual work (calculations) to help "dither" a color for a sub-frame Show().

embedded-creations commented 2 years ago

Good timing for me to be browsing the issues here. I'm also a big fan of Fadecandy, and want to add support for Fadecandy-style dithering in WLED. I would limit this support to the ESP32 - in the future possibly the ESP8266 though I'm not that familiar with it - as with a DMA-based I2S Parallel driver and multitasking it should be possible to refresh the LEDs with a consistent frame rate.

"I don't know what you mean by lookups."

I believe he's referring to Fadecandy's "Gamma and color correction with per-channel lookup tables". (Search for "lookup" in the old README for a bit more info)

@Makuna would you want to add a dithering feature that was limited to ESP32 and I2S parallel output? If not it may be easier to solve this problem in another class outside of NeoPixelBus, and focus on using NeoPixelBus for shifting out the pixels.

@swifty99 If you want to join the WLED Discord, I've been discussing some related topics in the #2d channel, and that's probably the best place to continue the discussion on the parts of this feature request that are WLED-specific.

Makuna commented 2 years ago

The timing is definitely not something NeoPixelBus would support. As I commented above, this is left to the sketch author to manage per their platform choice, as it is very platform specific. @embedded-creations NOTE, RTOS (multitasking on ESP32) has a task switch tick of 1ms. So the moment at which your task gets to call Show() will vary by this amount. The more tasks there are and the more they do, the worse this variance can get for a single show thread. A well designed and managed sketch/library like WLED is the place to do it.

LookUp: The current gamma support has both table and equation implementations. If you need a per-color-element table, it is easy enough to do if you look at the implementation in NeoGamma.h and NeoGamma.cpp. But that's another 512 bytes you are going to consume. Note, back when I put this in, sensor readings of many then-current strips showed the variance per color really wasn't that far off the curve; this is really an issue of which LEDs are used, and separate driver chips and LEDs were the issue. Also, the table really needs to be user supplied if you are going to that much detail of per-channel curves. If this specific feature is required, then a separate issue for this specific request should be created for tracking purposes and I can provide that. This part can thus be removed from this issue.
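A minimal sketch of what a per-color-element lookup could look like, assuming three user-supplied 256-entry tables. All names here (`ChannelTable`, `PerChannelCorrection`, `makeTable`) are hypothetical and do not exist in NeoGamma.h; the gamma-plus-gain table builder is only one possible way to fill such a table.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

using ChannelTable = std::array<uint8_t, 256>;

struct Rgb {
    uint8_t r, g, b;
};

// Hypothetical container: three tables take 768 bytes,
// 512 more than a single table shared by all channels.
struct PerChannelCorrection {
    ChannelTable red, green, blue;

    Rgb apply(Rgb in) const {
        return {red[in.r], green[in.g], blue[in.b]};
    }
};

// Example table builder (an assumption): power-law gamma plus a channel gain.
ChannelTable makeTable(double gamma, double gain) {
    ChannelTable t{};
    for (int i = 0; i < 256; ++i) {
        double v = std::pow(i / 255.0, gamma) * gain * 255.0;
        if (v > 255.0) v = 255.0;
        if (v < 0.0) v = 0.0;
        t[i] = static_cast<uint8_t>(std::lround(v));
    }
    return t;
}
```

A sketch would build the three tables once, for example from measured curves of the LEDs in use, and pass every color through `apply()` before writing it to the strip.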

Gamma: Already present other than the above Lookup.

This leaves this issue to track Dithering. I will change the title soon to be more clear as to what this is tracking.

swifty99 commented 2 years ago

Hello everyone,

thanks for taking this seriously. First I would like to clarify what I did not describe completely: dithering with 2 bits would indeed require a refresh rate 4 times higher than the display rate. 4 bits would need updates 16 times as often. The implementation I did years ago on Arduino does not scale. It is limited to 64 LEDs and does not make sense anymore. Fadecandy used the Teensy platform with a good DMA architecture. This is not available on most Arduino processors. Dithering needs super accurate timing, as you mentioned, so an imprecise call to show() would mess it up. show() needs to be called by an ISR (or start an ISR) with DMA output. This is not possible for all processors supported by NeoPixelBus. However, I believe a lot of people would use a DMA-capable platform to increase the color range of cheap pixel strips.
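The scaling stated above (each extra bit of resolution doubles the number of sub-frames) written out as a small sketch. The function names are hypothetical and the 30 fps figure is taken from the earlier comment.

```cpp
#include <cstdint>

// Sub-frames needed per displayed frame for a given number of extra bits.
constexpr uint32_t subframesForBits(unsigned extraBits) {
    return 1u << extraBits;
}

// Strip updates per second needed to show `fps` frames with that dithering depth.
constexpr uint32_t stripUpdatesPerSecond(uint32_t fps, unsigned extraBits) {
    return fps * subframesForBits(extraBits);
}
```

At 30 fps, 2 extra bits need 120 strip updates per second and 4 extra bits need 480.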

16 bit is maybe somewhat of an overkill. I have built RGBW 3channel controllers with 16 bits; the lower brightness levels are rarely used, even on very powerful lights. It uses a lot of bandwidth just to apply a gamma curve, which runs pretty decently on 12 bits (and should be in the LED HW anyway). Gamma on 8 bits makes color mixing even more frustrating, as at low brightness different control values very often lead to the same output, hence no change and the bad colors stay.

As for lookup: Long term I want to use my lights with a perceptual quantizer. I assume this could be done with lookups. This would also need a reference/calibration to absolute brightness levels and at the very least 10 bits of linear brightness levels. Maybe this goes too far, so changing the topic to dithering would fit.

To sum it up: 8 bit with gamma leads nowhere, and 8-bit color generally is super imprecise if you do more than rainbow effects. More accurate colors are possible with cheap HW. I would be super happy to help implement dithering in this library, as it is really good.

embedded-creations commented 2 years ago

@swifty99 Makuna likes to keep Issues on topic here, so I created an issue in WLED where we can discuss the other parts of your feature request that won't be incorporated into NeoPixelBus:

https://github.com/Aircoookie/WLED/issues/2416

embedded-creations commented 2 years ago

I somewhat agree that 8 bit -> 8 bit gamma goes nowhere, but it can be better than nothing in avoiding washed out colors.

The color correction (handled outside of NeoPixelBus) could map an 8 bit color onto a 16 bit color space (or 12 bit, but it needs to be stored in a multiple of 8 bits for efficiency, so 16 bits would be used).
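A minimal sketch of such a mapping, assuming a plain power-law gamma with an exponent of 2.2 applied from 8-bit input to 16-bit output. `gamma16` is a hypothetical name and the exponent is an illustrative choice.

```cpp
#include <cmath>
#include <cstdint>

// Assumed power-law gamma from an 8-bit input to a 16-bit output.
uint16_t gamma16(uint8_t in, double gamma = 2.2) {
    return static_cast<uint16_t>(std::lround(std::pow(in / 255.0, gamma) * 65535.0));
}
```

With 16 bits of output, low inputs that an 8-bit table would round to zero keep a usable value (`gamma16(10)` is well above 0), and the extra precision can then be spent by the dithering step.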

I'm trying to picture how dithering would be a part of NeoPixelBus, if color correction and calling Show() is handled by the application. Perhaps we store our 8 bit source colors in a NeoBuffer object, and use a custom shader object to apply color correction and dithering, and write an 8 bit subframe to the LEDs?

Even if we're sitting in a tight loop calling Render() and Show(), without double (or more than 2x) buffering and DMA-based output I don't think most platforms can keep a high enough frame rate to take advantage of dithering.

"The timing is definitely not something the NeoPixelBus would support. As I commented above, this is left to the sketch author to manage per their platform choice as it is very platform specific. ... A well designed and managed sketch/library like WLED is the place to do it."

I don't agree with this, as if the interface with the hardware peripheral including interrupt handling is inside of NeoPixelBus, there's only a limited amount of control the sketch has in keeping consistent timing. At a minimum, there should be double buffering of pixel data, so the next sub-frame's data can be queued up by the sketch and ready to send out as soon as the previous transfer is complete. The queue would be most effectively handled by an ISR or in a hardware queue if the peripheral supports it, and that would need to be implemented in the NeoPixelBus method, not in a sketch.

If there's just double and not more than 2x buffering, each new sub-frame still needs to be calculated in roughly 2.5ms (I'm assuming the 1/400FPS used by Fadecandy). That's some pretty tight timing required for the sketch, and there's likely to be some dropped/late frames when switching tasks or using other interrupts.

If we only implement dithering for specific methods, then we can integrate dithering and multiple buffering into the method itself. At the cost of more memory, for a 4x sub-frame dithering implementation we could store 4x sub-frames that are continuously refreshed to the pixels, and another 4x sub-frames that are filled by the sketch. Upon calling Show(), the method switches which 4x sub-frames are being refreshed to the LEDs, and the sketch is free to fill up the other 4x again, without time pressure.
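A rough sketch of the two-banks-of-sub-frames idea, assuming 4 sub-frames per bank. `SubframeBank` and its members are hypothetical and are not part of any NeoPixelBus method; a real implementation would also have to make the bank switch safe against the output interrupt, for example by only switching at a sub-frame boundary inside a critical section.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; none of these names exist in NeoPixelBus.
class SubframeBank {
public:
    static constexpr std::size_t kSubframes = 4;

    explicit SubframeBank(std::size_t bytesPerSubframe) {
        for (auto& bank : banks_) {
            for (auto& sub : bank) sub.assign(bytesPerSubframe, 0);
        }
    }

    // Sub-frame of the bank the sketch may fill at its own pace.
    std::vector<uint8_t>& back(std::size_t subframe) {
        return banks_[1 - front_][subframe];
    }

    // Called by the sketch once all back sub-frames are filled.
    void show() { front_ = 1 - front_; }

    // Called by the output driver each time a transfer completes.
    const std::vector<uint8_t>& nextToSend() {
        const std::vector<uint8_t>& sub = banks_[front_][cursor_];
        cursor_ = (cursor_ + 1) % kSubframes;
        return sub;
    }

private:
    std::array<std::array<std::vector<uint8_t>, kSubframes>, 2> banks_;
    std::size_t front_ = 0;   // bank currently being refreshed to the LEDs
    std::size_t cursor_ = 0;  // next sub-frame within the front bank
};
```

The driver keeps cycling through the front bank on its own, so the sketch only has a deadline of one full animation frame for filling the back bank, at the cost of eight sub-frame buffers of memory.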

@Makuna I'm guessing this isn't something you want to integrate into NeoPixelBus, but I don't see how it could be done otherwise, without the sketch author needing to design their whole sketch around the timing for dithering. I'm open to other ideas though.

swifty99 commented 2 years ago

I do not have a complete understanding of how NeoPixelBus is structured internally. If there is a good (written) starting point, please let me know. I agree with @embedded-creations, it has to be done at bus level and not in the application. Timing is hairy, not many people would benefit.

How could the SW layout be organized (with my limited knowledge, this is very drafty)? Here are my thoughts:

A call to show() would update the requested color (from WLED or whatever) per pixel at 8-bit level. Nothing more than copying or pointing to memory.

A to-be-defined lookup() would convert the memory to a 16-bit space with gamma or whatever, and create a new 16-bit memory buffer. A double buffer may be needed, as reading and writing could happen at the same time, which could cause flicker. Timing could be less strict, but should keep within the update rate. It could be called by show().

A to-be-defined timed ditherISR() would be called about every 8ms or less; for every additional bit of dithering this time is halved. This is where the dithering takes place. One half of a double buffer is filled with the actual output and marked as ready. If it is not called on time, low-brightness LEDs (the dithered ones) will flicker. Fadecandy has a nice implementation for storing dithered "parts".

A DMA ISR, which is definitely processor dependent, outputs the other half of the double buffer. After the output is done it switches the double buffer and marks the other half as to be filled. In the past I was not able to sync these two well enough and used a triple buffer: one in use, one being written, one ready to use. The DMA and dither ISRs could be combined into one if computing power is high enough.
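The triple-buffer handoff mentioned above can be sketched as follows. This is a single-threaded model for illustration only: `TripleBuffer` and its methods are hypothetical names, and on real hardware the index swaps would have to be atomic or protected by a critical section, since they are shared between the dither code and the DMA interrupt.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch: one buffer being sent, one ready, one being written.
class TripleBuffer {
public:
    explicit TripleBuffer(std::size_t bytes) {
        for (auto& b : buffers_) b.assign(bytes, 0);
    }

    // Producer side (dither step): buffer that may be written right now.
    std::vector<uint8_t>& writeBuffer() { return buffers_[filling_]; }

    // Producer side: hand the written buffer over as the newest ready one.
    void publish() {
        std::swap(filling_, ready_);
        fresh_ = true;
    }

    // Consumer side (DMA done): take the newest buffer if there is one,
    // otherwise keep re-sending the current one.
    const std::vector<uint8_t>& acquire() {
        if (fresh_) {
            std::swap(sending_, ready_);
            fresh_ = false;
        }
        return buffers_[sending_];
    }

private:
    std::array<std::vector<uint8_t>, 3> buffers_;
    int sending_ = 0;
    int ready_ = 1;
    int filling_ = 2;
    bool fresh_ = false;  // true when ready_ holds data not yet sent
};
```

Neither side ever waits for the other: a slow producer causes the last sub-frame to be repeated, and a fast producer simply overwrites the ready buffer before it is picked up.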

What HW could deliver: On 8-bit AVR, no way, at least for more than 64 LEDs. I think I had 50 running and was at every limit of RAM and CPU. I think these are not used for new projects anyway. On ARM, if there is DMA, ideally multiport. For more than a few LEDs, parallel DMA is a must. The ATSAMD21G18A has 12 DMA channels. On ESP32 (my favorite at the moment), as far as I know there is some kind of DMA output already. Not the best implementation I think, but possible.

What are your thoughts?

Makuna commented 2 years ago

@swifty99 Constant output to support dithering is beyond the intent of this library at this time. If FadeCandy does what you need, why not branch it and update/fix it?

NOTE: On ESP32 I already support RMT output (the buffers for DMA are too small for most users, so I don't support it directly) and I2S (DMA supported). DMA support on ARM is inconsistent across the chips; while I have a test version written, it doesn't work on other chips that support DMA, so I haven't merged it in yet.

Here is my take (for future me as well as others to understand):

swifty99 commented 2 years ago

Thanks for your thoughts.

I do not believe Fadecandy is "fixable". It is a one-off solution for a certain HW/SW combination. The color science and implementation, however, are impressive. NeoPixelBus has the interface to a lot of HW LED types and CPU platforms and works stably. Kudos for that. I have not gone through the tedious work of maintaining a stable driver across changing IDEs. Using the Arduino IDE in the old days was horrible enough, with libraries breaking every day. It makes perfect sense, then, not to include delicate SW solutions which might break easily. Still, I believe the outcome would be worth it. I will build some WLED projects next year. Maybe I will find some time to do a dithering demo SW.

Cheers