stm32duino / Arduino_Core_STM32

STM32 core support for Arduino
https://github.com/stm32duino/Arduino_Core_STM32/wiki
Other
2.86k stars 980 forks source link

Slow SPI speed with SPI library #898

Closed stas2z closed 4 years ago

stas2z commented 4 years ago

Im trying to work with several SPI LCD displays and found SPI transfer made by SPI library is slower then for example ESP32 SPI library running the same code. It looks like SPI working at lower speed than initialized.

Im not completely sure, but seems like i found a problem causing this issue (at least it can be one of it)

All SPI library methods use calls to internal spi_transfer routine, which also calls HAL_SPI_TransmitReceive to do the job. But LCDs are not answer to master, so seems like its waiting for reply timeout. As a workaround ive changed all (except on which called directly with answer buffer) spi_transfer calls to spi_send calls and it improved throughput dramatically (flickering i experience is almost gone).

here is my diff

diff --git a/libraries/SPI/src/SPI.cpp b/libraries/SPI/src/SPI.cpp
index e162338e..e3d0a74e 100644
--- a/libraries/SPI/src/SPI.cpp
+++ b/libraries/SPI/src/SPI.cpp
@@ -278,7 +278,7 @@ byte SPIClass::transfer(uint8_t _pin, uint8_t data, SPITransferMode _mode)
     digitalWrite(_pin, LOW);
   }

-  spi_transfer(&_spi, &data, &rx_buffer, sizeof(uint8_t), SPI_TRANSFER_TIMEOUT);
+  spi_send(&_spi, &data, sizeof(uint8_t), SPI_TRANSFER_TIMEOUT);

   if ((_pin != CS_PIN_CONTROLLED_BY_USER) && (_mode == SPI_LAST) && (_spi.pin_ssel == NC)) {
     digitalWrite(_pin, HIGH);
@@ -330,7 +330,7 @@ uint16_t SPIClass::transfer16(uint8_t _pin, uint16_t data, SPITransferMode _mode
     digitalWrite(_pin, LOW);
   }

-  spi_transfer(&_spi, (uint8_t *)&data, (uint8_t *)&rx_buffer, sizeof(uint16_t), SPI_TRANSFER_TIMEOUT);
+  spi_send(&_spi, (uint8_t *)&data, sizeof(uint16_t), SPI_TRANSFER_TIMEOUT);

   if ((_pin != CS_PIN_CONTROLLED_BY_USER) && (_mode == SPI_LAST) && (_spi.pin_ssel == NC)) {
     digitalWrite(_pin, HIGH);
@@ -379,7 +379,7 @@ void SPIClass::transfer(uint8_t _pin, void *_buf, size_t _count, SPITransferMode
     digitalWrite(_pin, LOW);
   }

-  spi_transfer(&_spi, ((uint8_t *)_buf), ((uint8_t *)_buf), _count, SPI_TRANSFER_TIMEOUT);
+  spi_send(&_spi, ((uint8_t *)_buf), _count, SPI_TRANSFER_TIMEOUT);

   if ((_pin != CS_PIN_CONTROLLED_BY_USER) && (_mode == SPI_LAST) && (_spi.pin_ssel == NC)) {
     digitalWrite(_pin, HIGH);
stas2z commented 4 years ago

By the way, related issue exists with curren spi implementation - SPI.transfer(buffer, size) will modify your buffer unexpectedly.

stas2z commented 4 years ago

Seems like nobody cares.

Yes, i understand that my workaround can't be a solution (thats why ive not made a PR), but speed issue is huge, using HAL_SPI_Transmit instead of HAL_SPI_TransmitRecieve boosts SPI transmit speed about 4+ times. It's a pity, but in the current state its preffered to use atmega328p for SPI LCD cuz it's working faster than my stm32 F4 (401&405)

stas2z commented 4 years ago

here is my suggestion, it doubles SPI speed (but at least twice slower than transmit w/o recieve)

spi_status_e spi_transfer(spi_t *obj, uint8_t *tx_buffer, uint8_t *rx_buffer,
                          uint16_t len, uint32_t Timeout) {
  spi_status_e ret = SPI_OK;
  uint32_t tickstart;
  SPI_TypeDef *_SPI = obj->handle.Instance;

  if ((obj == NULL) || (len == 0)) {
    return SPI_ERROR;
  }
  tickstart = HAL_GetTick();
  while (len--) {
    while (!LL_SPI_IsActiveFlag_TXE(_SPI))
      ;
    LL_SPI_TransmitData8(_SPI, *tx_buffer++);
    while (!LL_SPI_IsActiveFlag_RXNE(_SPI))
      ;
    *rx_buffer++ = LL_SPI_ReceiveData8(_SPI);
    if ((((HAL_GetTick() - tickstart) >= Timeout) &&
         ((Timeout != HAL_MAX_DELAY))) ||
        (Timeout == 0U)) {
      ret = SPI_TIMEOUT;
      break;
    }
  }
  return ret;
}

what do you think?

fpistm commented 4 years ago

Hi @stas2z this will be checked, no worries. This is already an opened issue: #257 Anyway, this will require some times to properly invest this. The SPI is functional, slow but functional.

stas2z commented 4 years ago

probably we need an extra flag for spisettings to skip recieve if slave is never answer? it will double (4x atm, but with optimized spi_transfer it will double) spi transactions speed for silent slaves like SPI LCDs and will not break default behaviour and will be compatible with multiple slaves on the same bus usage case

stas2z commented 4 years ago

@fpistm a little offtopic, which tool do you use to format a code for this core? my ide formatted it with clang-format and i don't want to reformat it back manually in case of making pull requests etc

fpistm commented 4 years ago

https://github.com/stm32duino/wiki/wiki/Astyle

ABOSTM commented 4 years ago

Hi @stas2z, I review your PR Thanks. By the way I have 2 remarks:

stas2z commented 4 years ago

Im using my own library for st7735 and ili9341 (only init and setting orientation different) written from scratch and sure i use buffered transfers to speed up lcd operations also im using fastest spi clock available for my mcu (84mhz/2 for f401 currently)