olikraus / u8g2

U8glib library for monochrome displays, version 2

u8g2 and ChRT #971

Closed bk4nt closed 4 years ago

bk4nt commented 5 years ago

Hello,

So far, I haven't managed to get u8g2 working with ChRt, located here: https://github.com/greiman/ChRt

After minutes, or seconds, with other tasks running (I assume they are clean), my U8G2_SSD1327_WS_128X128_F_4W_HW_SPI display gets slightly, then increasingly corrupted and ends up blank. Is this a known issue? Is it difficult to debug?

I'm using a Teensy 3.2 with the Arduino IDE. I had no issues with basic Arduino code, then ran into the first big troubles with a non-preemptive Scheduler and other tasks running alongside. Now the crashes come much quicker with ChRt: first pixels randomly turn on, then sometimes the screen is inverted, later there is full corruption, then a black screen forever.

For more than 6 hours now, I have been using this much more basic driver (plus exactly the same other tasks I ran with u8g2 loaded), and I haven't noticed similar issues, crashes, or any pixels randomly turning on: https://github.com/hexaguin/SSD1327

My SSD1327 is on a shared SPI interface. But I'm using a semaphore to lock the SPI whilst it is busy. I'll try again with u8g2, but I had already tried u8g2 with a semaphore for SPI access.

Best regards

Edit: I'm now back to u8g2 again, with a semaphore. I forced the SPI to a higher 24M speed (in U8x8lib.cpp, under the 4 WIRE HARDWARE SPI section). After this, the buffer was pushed in some 17ms instead of 30ms. That thread doesn't have the highest priority. With this higher SPI speed, the display of a blinking "Hello !" was perfect for 30 minutes. Then it crashed again (black screen, but the data still gets pushed out to the SPI port).

I saw a yield() and some delay()s in the u8g2 source code. Can u8g2 code time out somewhere while the thread is inactive? I'm now testing with "chSemWait(&spiFree); chSysLock(); u8g2.sendBuffer(); chSysUnlock(); chSemSignal(&spiFree);" but my RT threads get affected by this... The 8k buffer is now pushed out in 1358us to 2573us.

Best regards

olikraus commented 5 years ago

Difficult to say. U8g2 has been used on many systems, so I guess it should be clean regarding memory issues. With SPI, I just call the Arduino functions.

bk4nt commented 5 years ago

It could be related to task switching. It worked much better while I used chSysLock();/chSysUnlock(); but only tweaking task priorities and a semaphore didn't help.

I tested many things inside U8x8lib.cpp, removing and adding lines, and still saw my display going away. Now I'm also trying to tweak task priorities and SPI speed (another task is using that same SPI port).

I kept the semaphore and forced the SPI speed to 24M:

SPI.beginTransaction(SPISettings(/*u8x8->bus_clock*/ 24000000, MSBFIRST, internal_spi_mode));

I also added minimal Tx buffering:

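data = (uint8_t *)arg_ptr;
if (arg_int == 1) {
    // Single byte: send it directly.
    SPI.transfer((uint8_t)*data);
} else if (arg_int <= 32) {
    // Short block: copy to a scratch buffer and send it in one call.
    uint8_t buffer[32];
    memcpy(buffer, data, arg_int);
    SPI.transfer(buffer, arg_int);
} else {
    // Longer block: fall back to byte-by-byte transfers.
    while (arg_int > 0) {
        SPI.transfer((uint8_t)*data);
        data++;
        arg_int--;
    }
}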

So I went down from min/max transfer times of 25702us/27392us to 10098us/13512us. The display has remained clean for half an hour. But I think I'll have to wait some hours now to see if this really helps.
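The buffering above only covers blocks of up to 32 bytes and falls back to single-byte transfers for anything longer. As a rough sketch of how the same idea could be extended to any length, the block below sends data in fixed-size chunks through a scratch buffer. MockSpi and sendChunked are hypothetical names used for illustration only; the mock overwrites the buffer it is given, which is how the in-place Arduino SPI.transfer(buffer, size) behaves and why a copy is needed to keep the frame data intact.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for an in-place bulk SPI transfer.
// It records what was sent and overwrites the caller's buffer, the way
// received bytes would replace transmitted ones.
struct MockSpi {
    std::vector<std::uint8_t> sent;
    int calls = 0;

    void transfer(std::uint8_t* buf, std::size_t count) {
        ++calls;
        for (std::size_t i = 0; i < count; ++i) {
            sent.push_back(buf[i]);
            buf[i] = 0xFF;  // simulate received data clobbering the buffer
        }
    }
};

// Send `len` bytes in chunks of at most ChunkSize bytes. Each chunk is
// copied to a scratch buffer first so the source data is never modified.
template <std::size_t ChunkSize = 32>
void sendChunked(MockSpi& spi, const std::uint8_t* data, std::size_t len) {
    std::uint8_t scratch[ChunkSize];
    while (len > 0) {
        const std::size_t n = (len < ChunkSize) ? len : ChunkSize;
        std::memcpy(scratch, data, n);
        spi.transfer(scratch, n);
        data += n;
        len -= n;
    }
}
```

With a 70-byte payload this issues three bulk calls (32, 32 and 6 bytes) instead of 70 single-byte calls. Note that this only shortens the time the bus is busy; it does not by itself prevent another task from using the bus in the middle of a transfer.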

olikraus commented 5 years ago

Hmmm, I wonder how you take care of the CS signal. Basically, you must not interrupt the u8g2 SPI transfer as long as the CS of your display is low.

In u8g2, the transfer start will pull CS low until the transfer end message.

olikraus commented 5 years ago

Here are my thoughts: The SPI transfer itself is handled by Arduino SPI functions, however the chip select signal (CS) is under u8g2 control. You can study this here: https://github.com/olikraus/u8g2/blob/master/csrc/u8x8_byte.c#L150 There will always be a U8X8_MSG_BYTE_START_TRANSFER message, followed by a U8X8_MSG_BYTE_END_TRANSFER message. In between these messages, there might be multiple SPI transfers. However: The SPI transfers between these messages must not be interrupted (altered) without considering the CS signal for the display, otherwise the display will receive extra data which might affect the content.

In your case (HW SPI Arduino), the actual code is here: https://github.com/olikraus/u8g2/blob/master/cppsrc/U8x8lib.cpp#L761 This means, in case of an Arduino HAL, the U8X8_MSG_BYTE_START_TRANSFER message will issue the BeginTransaction and the U8X8_MSG_BYTE_END_TRANSFER will issue the EndTransaction. Additionally, the CS signal is pulled low (or high, depending on the display): https://github.com/olikraus/u8g2/blob/master/cppsrc/U8x8lib.cpp#L787

In between these two messages, multiple writes (SPI.transfer) may happen, but again, you must not interrupt these writes.

For u8g2/u8x8 it is ensured, that for each START_TRANSFER a corresponding END_TRANSFER will follow. Additionally, these transfers are not nested. Unfortunately neither U8g2/U8x8 nor Arduino SPI provide a documented/reliable way to detect whether the code is in transfer or not.

So, the only way to solve this problem is to modify the original u8g2 code:

If you want to occupy the SPI resource (semaphore), it needs to happen here: https://github.com/olikraus/u8g2/blob/master/cppsrc/U8x8lib.cpp#L762 You need to block the u8g2 library until SPI becomes available, then occupy the SPI and release it here: https://github.com/olikraus/u8g2/blob/master/cppsrc/U8x8lib.cpp#L800

Another option might be to disallow SPI interrupts when calling u8g2.begin() and u8g2.sendBuffer(). Actually, these two functions should be the only functions that talk to the display.
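To illustrate the first option, here is a minimal host-side sketch of a byte handler that takes the bus lock on the start-transfer message and releases it on the end-transfer message, so the lock is held for exactly as long as CS is active. Everything in it is an assumption made for illustration: BinarySemaphore stands in for the RTOS semaphore (with ChRt this would presumably be the chSemWait(&spiFree) / chSemSignal(&spiFree) pair already used in this thread), MockSpiBus replaces the real SPI port, and ByteMsg and LockedByteDriver are hypothetical names that only mirror the U8X8_MSG_BYTE_START_TRANSFER / U8X8_MSG_BYTE_END_TRANSFER sequence. It is not the actual u8g2 code.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Host-side stand-in for an RTOS binary semaphore guarding the SPI bus.
class BinarySemaphore {
public:
    void wait() {  // block until the bus is free, then take it
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return free_; });
        free_ = false;
    }
    bool tryWait() {  // non-blocking variant, handy for checks
        std::lock_guard<std::mutex> lk(m_);
        if (!free_) return false;
        free_ = false;
        return true;
    }
    void signal() {  // give the bus back and wake one waiter
        { std::lock_guard<std::mutex> lk(m_); free_ = true; }
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool free_ = true;
};

// Mock shared bus: logs each byte with the chip select active at that time.
struct MockSpiBus {
    int activeCs = -1;  // -1 means no device selected
    std::vector<std::pair<int, std::uint8_t>> log;
    void transfer(std::uint8_t b) { log.emplace_back(activeCs, b); }
};

// Hypothetical message set mirroring start / send / end of a transfer.
enum class ByteMsg { StartTransfer, Send, EndTransfer };

class LockedByteDriver {
public:
    LockedByteDriver(MockSpiBus& bus, BinarySemaphore& lock, int csPin)
        : bus_(bus), lock_(lock), cs_(csPin) {}

    bool handle(ByteMsg msg, const std::uint8_t* data = nullptr, std::size_t len = 0) {
        switch (msg) {
        case ByteMsg::StartTransfer:
            lock_.wait();         // occupy the bus before asserting CS
            bus_.activeCs = cs_;
            inTransfer_ = true;
            return true;
        case ByteMsg::Send:
            if (!inTransfer_) return false;  // refuse writes outside a transfer
            for (std::size_t i = 0; i < len; ++i) bus_.transfer(data[i]);
            return true;
        case ByteMsg::EndTransfer:
            if (!inTransfer_) return false;
            bus_.activeCs = -1;   // release CS first, then the bus
            inTransfer_ = false;
            lock_.signal();
            return true;
        }
        return false;
    }

    bool inTransfer() const { return inTransfer_; }

private:
    MockSpiBus& bus_;
    BinarySemaphore& lock_;
    int cs_;
    bool inTransfer_ = false;
};
```

A second driver for another device on the same bus, built with the same BinarySemaphore, would block in its own StartTransfer until the display's EndTransfer has run. That only helps if every piece of code touching the bus goes through the same lock, including code inside third-party libraries.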

bk4nt commented 5 years ago

Thanks for all those inputs.

With the changes described above (faster SPI speed, buffered u8g2 output, higher priority for display handling), I got my display working nicely for several hours. I have now removed those changes BUT kept the semaphore to lock the SPI access, and again saw pixels popping on... when I power up a distant Arduino with an RF head.

"... without considering the CS signal for the display, otherwise the display will receive extra data which might affect the content."

I'm almost sure this is the issue. This is also why I had already added a semaphore. The other task accessing the SPI is based on RF24 RF24network; I'm also locking its SPI access with the same semaphore, or so I thought. Currently, I have the semaphore placed around u8g2.begin(); displayRotate(); u8g2.sendBuffer(); or whatever sends data out to the display. It is also placed around my RF24 RF24network function calls.

So I'll now have to dig into RF24 RF24network to check where any extra SPI access could occur.

My poor workaround (faster SPI speed, buffered u8g2 output, higher priority for display handling) seems only to free up the SPI port faster, hiding the real issue: some concurrent SPI access in the background.

Edit... It seems easy to locate there too. Thanks also for that nicely written code :-)

    #if defined (RF24_SPI_TRANSACTIONS)
    _SPI.beginTransaction(SPISettings(RF24_SPI_SPEED, MSBFIRST, SPI_MODE0));
    #endif
    csn(LOW);
  }

bk4nt commented 5 years ago

Hello, I think it is fixed now.

Best regards