DeqingSun / ch55xduino

An Arduino-like programming API for the CH55X
GNU Lesser General Public License v2.1
449 stars 87 forks source link

Refactor WS2812 code to support other frequencies #101

Closed serisman closed 1 year ago

serisman commented 1 year ago

I re-wrote/optimized the WS2812 bitbang asm code to reduce its size a bit and allow it to support other clock frequencies.

Some more testing might be warranted, but the cycle timing follows the requirements specified by the following deep dive articles:

I have tested this @ 24 MHz, 16 MHz, and 12 MHz on CH551, but it should support (in theory) up to 56 MHz (for the CH559), and down to 6 MHz (or 4 MHz were that possible with these MCUs) whenever support for that clock speed is added in wiring.c and boards.txt.

I also added the 12 MHz option for CH551 in boards.txt that was already there for the CH552. Seems to work ok with my limited testing.

serisman commented 1 year ago

@DeqingSun - Did you have any feedback on this PR?

I have done some additional testing (@ 24 MHz, 16 MHz, and 12 MHz on a CH551), and so far haven't found any issues with all the various WS2811, WS2812, WS2812B, and clones thereof that I have.

DeqingSun commented 1 year ago

sorry I was busy these days and I have no access to hardware. The introduction of DELAY is awesome, thank you.

One thing I may want to mention is to add 12M support for ch552 as well.

Also it seems the bit0 high and bit1 low is pretty short. I'm worrying such a short pulse may be a problem with high speed clock and high cable capacitance situation, the short pulse may not be catched by the target chip?

serisman commented 1 year ago

sorry I was busy these days and I have no access to hardware. The introduction of DELAY is awesome, thank you.

No problem. I mostly just wanted to make sure you saw the PR in case you weren't notified.

One thing I may want to mention is to add 12M support for ch552 as well.

CH552 should already have 12M support. I copied it's config in boards.txt for CH551.

Also it seems the bit0 high and bit1 low is pretty short. I'm worrying such a short pulse may be a problem with high speed clock and high cable capacitance situation, the short pulse may not be catched by the target chip?

bit0 high needs to be extremely short. That is the most significant timing constraint for these LEDs. It can't be more than about 500 nS. We could slow it down a bit for the higher clock speeds, but that makes the code a bit messier. I haven't seen any issues so far, but granted I've only tested with about 15cm of cable. I believe the first LED (and each one after) will buffer the signal anyway, so I think it's just a potential issue for the length of cable to the first LED. According to the first article I linked too, high pulses of at least 62.5 nS were still picked up, so we might be ok already.

bit1 low doesn't seem to have any particular timing requirements. Again, making it longer (for faster CPU clocks) might help with cable issues, but introduces messier code. And, again, it looks like low pulses of at least 62.5 nS were still honored, so we might be ok already.

serisman commented 1 year ago

Also it seems the bit0 high and bit1 low is pretty short. I'm worrying such a short pulse may be a problem with high speed clock and high cable capacitance situation, the short pulse may not be catched by the target chip?

bit0 high needs to be extremely short. That is the most significant timing constraint for these LEDs. It can't be more than about 500 nS. We could slow it down a bit for the higher clock speeds, but that makes the code a bit messier. I haven't seen any issues so far, but granted I've only tested with about 15cm of cable. I believe the first LED (and each one after) will buffer the signal anyway, so I think it's just a potential issue for the length of cable to the first LED. According to the first article I linked too, high pulses of at least 62.5 nS were still picked up, so we might be ok already.

bit1 low doesn't seem to have any particular timing requirements. Again, making it longer (for faster CPU clocks) might help with cable issues, but introduces messier code. And, again, it looks like low pulses of at least 62.5 nS were still honored, so we might be ok already.

Just checked, and currently the new code is using 4 cycles for bit0 high, and (at least) 7 cycles for bit1 low. @12/16MHz this is 333/250 nS and 584/437 nS which should be totally fine. @24 MHz this is 166.6 nS and 292 nS, which is still probably fine. @56 MHz, we might be going a bit too fast with 71 nS and 125 nS. So, maybe introduce some additional delays just for 56 MHz?

I was trying to make the code adapt to all sorts of frequencies automatically regardless of what boards.txt actually allows. I'll have to think of if there is an easy way to introduce some additional delays for the faster frequencies without loosing too much of that benefit.

DeqingSun commented 1 year ago

I'll test the code with CH559 at 56M and my favorite SM16703P when I get a chance. The SM16703P was the only chip (or brand) that can output full high when brightness is set to 255, as far as I tested.

I'm not an expert with preprocessor. But I was wondering, is it possible to change:

           "    mov r1,#"_STR(WS2812_DELAY)"        \n" // [2,2]  delay a bit
           "bitDelay$:                              \n"
           "    djnz r1,bitDelay$                   \n" // [2,2/4|5|6]

into something like "WS2812_DELAY", while we declare WS2812_DELAY with different number of nop in the h file? So the main code will be clean and precise at the same time.

serisman commented 1 year ago

I'll test the code with CH559 at 56M and my favorite SM16703P when I get a chance. The SM16703P was the only chip (or brand) that can output full high when brightness is set to 255, as far as I tested.

Sounds good. I don't have that particular LED driver, and I don't have a spare CH559 board at the moment, to that testing is most welcome!

I'm not an expert with preprocessor.

Yeah, me either. I burned quite a few hours trying to get it to do the delays in a better fashion, but ended up having to give in and implement the less than ideal WS2812_DELAY.h file to get it to actually work.

We probably don't need any more nop `s, but maybe could have a conditional bit0 high delay loop and bit1 low delay loop for some of the fastest frequencies.

DeqingSun commented 1 year ago

@serisman

I would say there are 2 issues I can see for now.

the ".even \n" // [bytes, cycles] was removed. It is necessary to make the branching timing predictable.

Referring to NeoPixels Revealed: How to (not need to) generate precisely timed signals , making

bit '0' high:       4
bit '1' low:        7

violated

There is a minimum width of a 0-bit pulse (T0H) to ensure it is detected
There is a minimum time the level must stay low between consecutive bits (TxL), which ensures that the chip sees  separate bits rather than one long bit.

For CPU speed equal or higher than 24M, the timing would be out of spec.

So a conditional nop/loop selection may be a better choice. I'll check the best way to do it.