mrcodetastic / ESP32-HUB75-MatrixPanel-DMA

An Adafruit GFX Compatible Library for the ESP32, ESP32-S2, ESP32-S3 to drive HUB75 LED matrix panels using DMA for high refresh rates. Supports panel chaining.
MIT License

Clash with WIFI - Adafruit Matrix Portal S3 #570

Closed netmindz closed 8 months ago

netmindz commented 8 months ago

When I add this library to WLED, I see real issues with the network stack. Ping replies are intermittent, and when a reply does arrive the latency is very high, around 5000 ms or more.

If I switch to the SPIRAM_FRAMEBUFFER build flag, network functions return to normal, but I get no output when double buffering is disabled, and with it enabled begin() fails.

mrcodetastic commented 8 months ago

Does reducing the clock frequency help?

The reality is that this library uses the DMA engine quite heavily, and there may not be enough bandwidth (on the SoC) for all the other stuff your application is trying to do.

netmindz commented 8 months ago

Yeah, that seems to fix it. Thank you. Interesting that, despite the chip being dual-core, high refresh rates impact code running on a different core.

netmindz commented 8 months ago

Possibly slightly too quick to say fixed. I reverted the change and didn't immediately see the WiFi issue, so there might be other factors, and I possibly didn't run it for long enough to confirm the fix.

Now have unit in state unable to download

mrcodetastic commented 8 months ago

> Now have unit in state unable to download

What does this mean? On boot?

netmindz commented 8 months ago

Had an issue where it wouldn't boot unless the USB serial monitor was connected. In trying to fix that, I then got into a state that wouldn't let me download a new image.

Luckily, I found a way to force my way around that.

As the issue is a little intermittent, testing is tricky, but last night it was looking like lowering the clock frequency was helping.

I'll try and run some more tests with different values and see if there is a correlation

mrcodetastic commented 8 months ago

Check the GPIOs you are using. You could be using a bootstrapping pin that's impacting the ability to boot etc.

netmindz commented 8 months ago

Nah, it's definitely the config relating to USB serial. The S3 is very weird about how it works. Getting the option of Serial without preventing boot until it is connected is tricky. I'm back in now, so I'll test the I2S rate over the weekend.

netmindz commented 8 months ago

https://thingpulse.com/usb-settings-for-logging-with-the-esp32-s3-in-platformio/

mrcodetastic commented 8 months ago

Thanks for this. Very interesting.

Do you know of a good guide on how to use JTAG debugging? I would love to be able to run a sketch and see the changes at a hardware register level, breakpoints etc.

I have never been able to get it to work.

netmindz commented 7 months ago

Gone back and tested again. I'm still having the issue, and the change of speed did not help. Going to do some more digging.

mozzhead164 commented 4 months ago

Did you get anywhere with this @netmindz? I'm having similar issues with an ESP32-S3 (MatrixPortal): intermittent WiFi connections and dropping in and out of the MQTT broker.

BlueAndi commented 2 months ago

I have serious WiFi connection problems with the ESP32-S3 MatrixPortal too. If I don't start the MatrixPanel_I2S_DMA lib, i.e. I don't call MatrixPanel_I2S_DMA::begin() in my app, the problem is gone.

@netmindz @mozzhead164 Any news or hints?

@mrcodetastic Any advice what I could try?

netmindz commented 2 months ago

I was really struggling and nothing I did seemed to help. There is an issue I opened on GitHub describing some of the problems I saw.

I then left the project I was working on, and when I came back to it everything seemed to be fine, until it wasn't.

I am unsure what exactly triggers it, but it's super annoying that it's back again.

netmindz commented 2 months ago

https://www.esp32.com/viewtopic.php?t=4770

netmindz commented 2 months ago

https://github.com/mrcodetastic/ESP32-HUB75-MatrixPanel-DMA/discussions/258

mrcodetastic commented 2 months ago

Can you try hacking around these lines of the file gdma_lcd_parallel16.cpp and overriding _div_num with something larger?

Set it to something like 126, which will result in a VERY flickery 1.2 MHz output. If the issue still persists, then it must be related to electrical noise, as opposed to the WiFi module being starved of DMA bandwidth or something similar (I think this is unlikely, as it has never been an issue on older ESPs).

Note: The _div_num divides the 160 MHz clock signal.
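To make the divider arithmetic concrete, here is a minimal sketch. It assumes, as stated above, that the output clock is simply the 160 MHz source divided by _div_num, and it ignores any further dividers the peripheral may apply. The helper name `lcd_clock_hz` is hypothetical and not part of the library.

```cpp
#include <cstdint>

// Assumption taken from the thread: the LCD peripheral is fed from a
// 160 MHz source clock and _div_num divides it directly.
constexpr std::uint32_t kLcdSourceClockHz = 160000000u;

// Hypothetical helper (not a library function): resulting output clock
// in Hz for a given divider, using integer division.
constexpr std::uint32_t lcd_clock_hz(std::uint32_t div_num) {
    return kLcdSourceClockHz / div_num;
}
```

Under these assumptions a divider of 126 gives about 1.27 MHz (the "1.2 MHz" mentioned above), 20 gives 8 MHz, 10 gives 16 MHz, and 7 gives about 22.9 MHz.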

netmindz commented 2 months ago

Changing that to 126 makes it very flickery, but the network response is still poor

PING 192.168.178.117 (192.168.178.117) 56(84) bytes of data.
64 bytes from 192.168.178.117: icmp_seq=1 ttl=255 time=1308 ms
64 bytes from 192.168.178.117: icmp_seq=2 ttl=255 time=2054 ms
64 bytes from 192.168.178.117: icmp_seq=3 ttl=255 time=2204 ms
64 bytes from 192.168.178.117: icmp_seq=4 ttl=255 time=1733 ms
64 bytes from 192.168.178.117: icmp_seq=5 ttl=255 time=1967 ms
64 bytes from 192.168.178.117: icmp_seq=6 ttl=255 time=2829 ms
64 bytes from 192.168.178.117: icmp_seq=7 ttl=255 time=2813 ms
64 bytes from 192.168.178.117: icmp_seq=8 ttl=255 time=2994 ms
64 bytes from 192.168.178.117: icmp_seq=10 ttl=255 time=2710 ms
64 bytes from 192.168.178.117: icmp_seq=11 ttl=255 time=2117 ms
64 bytes from 192.168.178.117: icmp_seq=12 ttl=255 time=1587 ms
64 bytes from 192.168.178.117: icmp_seq=13 ttl=255 time=1652 ms
64 bytes from 192.168.178.117: icmp_seq=14 ttl=255 time=1693 ms
64 bytes from 192.168.178.117: icmp_seq=15 ttl=255 time=1338 ms
64 bytes from 192.168.178.117: icmp_seq=16 ttl=255 time=2579 ms
64 bytes from 192.168.178.117: icmp_seq=17 ttl=255 time=2550 ms
64 bytes from 192.168.178.117: icmp_seq=18 ttl=255 time=2122 ms
^C
--- 192.168.178.117 ping statistics ---
20 packets transmitted, 17 received, 15% packet loss, time 19145ms
rtt min/avg/max/mdev = 1308.327/2132.315/2994.434/521.775 ms, pipe 3

netmindz commented 2 months ago

For comparison - ping without this driver running

PING 192.168.178.117 (192.168.178.117) 56(84) bytes of data.
64 bytes from 192.168.178.117: icmp_seq=1 ttl=255 time=536 ms
64 bytes from 192.168.178.117: icmp_seq=2 ttl=255 time=163 ms
64 bytes from 192.168.178.117: icmp_seq=3 ttl=255 time=56.5 ms
64 bytes from 192.168.178.117: icmp_seq=4 ttl=255 time=165 ms
64 bytes from 192.168.178.117: icmp_seq=5 ttl=255 time=5.33 ms
64 bytes from 192.168.178.117: icmp_seq=6 ttl=255 time=26.7 ms
64 bytes from 192.168.178.117: icmp_seq=7 ttl=255 time=300 ms
64 bytes from 192.168.178.117: icmp_seq=8 ttl=255 time=262 ms
64 bytes from 192.168.178.117: icmp_seq=9 ttl=255 time=12.3 ms
64 bytes from 192.168.178.117: icmp_seq=10 ttl=255 time=64.6 ms
64 bytes from 192.168.178.117: icmp_seq=11 ttl=255 time=289 ms
64 bytes from 192.168.178.117: icmp_seq=12 ttl=255 time=44.8 ms
64 bytes from 192.168.178.117: icmp_seq=13 ttl=255 time=112 ms
64 bytes from 192.168.178.117: icmp_seq=14 ttl=255 time=155 ms
64 bytes from 192.168.178.117: icmp_seq=15 ttl=255 time=34.7 ms
^C
--- 192.168.178.117 ping statistics ---
15 packets transmitted, 15 received, 0% packet loss, time 14017ms
rtt min/avg/max/mdev = 5.330/148.368/535.882/141.502 ms

BlueAndi commented 2 months ago

> Can you try hacking around these lines of the file gdma_lcd_parallel16.cpp and overriding _div_num with something larger?
>
> Set it to something like 126, which will result in a VERY flickery 1.2 MHz output. If the issue still persists, then it must be related to electrical noise, as opposed to the WiFi module being starved of DMA bandwidth or something similar (I think this is unlikely, as it has never been an issue on older ESPs).
>
> Note: The _div_num divides the 160 MHz clock signal.

I changed it to 80 by setting _div_num and removing the if/else construct. This definitely improved it. To me it seems like a load problem: if I decrease _div_num (increasing the frequency), the connection throughput gets worse.

@netmindz Did you only change the value but keep the if/else part after it?
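As a rough illustration of the load argument, the following sketch estimates how much data the DMA engine has to stream per second for a given divider. It assumes that one 16-bit word is clocked out on every output clock cycle (as the file name gdma_lcd_parallel16.cpp suggests) and that the output clock is 160 MHz divided by _div_num. The helper name `dma_bytes_per_second` is hypothetical, and the result is an estimate under those assumptions, not a measurement.

```cpp
#include <cstdint>

// Assumptions (not confirmed by the library source in this thread):
//  - the output clock is a 160 MHz source divided by _div_num
//  - one 16-bit (2-byte) word is transferred per output clock cycle
constexpr std::uint64_t kSourceClockHz = 160000000ull;
constexpr std::uint64_t kBytesPerClock = 2ull;

// Hypothetical helper: estimated continuous DMA data rate in bytes/second.
constexpr std::uint64_t dma_bytes_per_second(std::uint64_t div_num) {
    return (kSourceClockHz / div_num) * kBytesPerClock;
}
```

With these assumptions, a divider of 80 corresponds to roughly 4 MB/s of continuous DMA traffic, 20 to 16 MB/s, and 10 to 32 MB/s, which is consistent with lower dividers leaving less headroom for other bus users.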

BlueAndi commented 2 months ago

My ping times with _div_num = 80 range from 70 ms to 290 ms.

BlueAndi commented 2 months ago

@mrcodetastic I have now found a setup that works well for me, using a lower I2S clock speed plus an additional change to _div_num.

const HUB75_I2S_CFG             Display::MATRIX_CFG  =
{
    CONFIG_LED_MATRIX_WIDTH,    /* Panel width */
    CONFIG_LED_MATRIX_HEIGHT,   /* Panel height */
    CONFIG_HUB75_CHAIN_LENGTH,  /* Chain length */
    I2S_PINS,                   /* Pin mapping */
    CONFIG_HUB75_DRIVER,        /* Driver */
    false,                      /* Use DMA double buffer */
    HUB75_I2S_CFG::HZ_8M,       /* I2S clock speed */
    DEFAULT_LAT_BLANKING,       /* How many clock cycles to blank OE before/after LAT signal change. */
    CONFIG_HUB75_CLOCK_PHASE,   /* Clock phase */
    60U,                        /* Min. refresh/scan rate */
    8U                          /* Pixel color depth bits, e.g. 8 bits means 8 bit per color, therefore 24 bit for RGB. */
};

/* gdma_lcd_parallel16.cpp (library source, locally modified) */
auto freq     = (_cfg.bus_freq);
auto _div_num = 20;  // <--- my additional change: 160 MHz / 20 = 8 MHz
if (freq <= 10000000L) {
    // keep the default of 20 set above
} else if (freq < 20000000L) {
    _div_num = 10; // 16 MHz
} else {
    _div_num = 7;  // 22 MHz --- likely to have noise without a good connection
}

Unfortunately, I still need to hack the sources. Is there any other way?

mrcodetastic commented 2 months ago

I don't expose the div_num directly but can put a compile time override in that file at some point.

For now, you'll need to hack the source file. To be honest, this library hardly changes in any case so having your own static copy of this library with custom modifications shouldn't be an issue.

BlueAndi commented 2 months ago

> I don't expose the div_num directly but can put a compile time override in that file at some point.
>
> For now, you'll need to hack the source file. To be honest, this library hardly changes in any case so having your own static copy of this library with custom modifications shouldn't be an issue.

Would be nice to see in the future a compile time option, but as you suggested, I will fork for now and adapt it. Thanks for pointing out the possibility to knock down the issue!

mrcodetastic commented 2 months ago

> Would be nice to see in the future a compile time option, but as you suggested, I will fork for now and adapt it. Thanks for pointing out the possibility to knock down the issue!

I've added the compile time define override S3_LCD_DIV_NUM

So compile with -DS3_LCD_DIV_NUM=80 for example.
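For readers unfamiliar with this kind of override, here is a minimal sketch of how a compile-time define can take precedence over a frequency-based default. It illustrates the general pattern only and is not the library's actual code: the function name `select_div_num` is hypothetical, and the fallback values are copied from the modified snippet posted earlier in this thread.

```cpp
// Sketch of a compile-time override pattern (illustrative only).
// If S3_LCD_DIV_NUM is defined on the compiler command line, e.g. with
// -DS3_LCD_DIV_NUM=80, it wins; otherwise the divider is derived from the
// requested bus frequency using the values quoted earlier in the thread.
constexpr int select_div_num(long bus_freq_hz) {
#ifdef S3_LCD_DIV_NUM
    return S3_LCD_DIV_NUM;                    // explicit override
#else
    return (bus_freq_hz <= 10000000L) ? 20    // 160 MHz / 20 = 8 MHz
         : (bus_freq_hz <  20000000L) ? 10    // 160 MHz / 10 = 16 MHz
                                      : 7;    // 160 MHz / 7, about 22.9 MHz
#endif
}
```

In a PlatformIO project the define would typically go into build_flags; with the Arduino IDE it has to be passed through the compiler options in some other way.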

BlueAndi commented 2 months ago

@mrcodetastic Successfully tested, thanks for the very quick support!

netmindz commented 2 months ago

80 is very flickery, but 20 has no noticeable flicker and gives acceptable network performance. It is still reduced, but at an acceptable level.