combined read/write methods for an improvement in performance

eringerli commented 2 years ago

Code

If the code doesn't run on an 8bit-AVR, the data is prepared by combining the reads and writes into one buffer, which is then transfered as one transaction on the bus. A constant decides if the buffer is allocated on the stack or from the heap, further reducing the overhead for typical small reads and writes without limiting the transactions in size. This is a tradeof between RAM and CPU: on the 'bigger' arduino platforms the CPU is a lot faster than the SPI, so using more RAM/CPU cycles and then letting the DMA do its work is the way to go.

I tested it on an ESP32 on a 10Mhz bus: The overhead for an transaction was about 0.8us/1.0us (begin/end), by combining reads and writes into a newly allocated buffer it rises to 1.0us/1.0us (begin/end). But as most of the typical transactions are combined like getting the register contents from a chip (1 byte to send, multiple to read), combining cuts out 1.8us but adds 0.2us, which results in a net gain of 1.6us per transaction. All in all this results in a gain in typical usecases of around 15%.

Logic analyser trace

These are two traces of the data read out of a LIS3MDL over SPI with 10Mhz data rate. The difference is clear: with these changes a transaction takes only 85% of the time as without.

Before

Screenshot_20220418_184113

After

Screenshot_20220418_185123

eringerli commented 2 years ago

I reorganised my repo, so this pull-request ist a copy of #87

ladyada commented 2 years ago

ok lets split this again into a PR for begin/end transaction changes and then the buffering changes!

eringerli commented 2 years ago

I submitted a seperate pull request for the transaction management (#90). I will rebase this one if you decide to accept #90.

eringerli commented 2 years ago

A huge gain in performace is already in #90: instead of reading/writing each byte individually in a loop, do it in one go. This PR adds the combination of reading/writing to it, which removes the additional round trip of transation management in between.

ladyada commented 2 years ago

ok can ya keep this PR to just the transaction+CS assert/desassert changes? i am going very carefully :)

eringerli commented 2 years ago

see #90 :wink:

ladyada commented 2 years ago

that PR still has both CS / begin transaction changes mixed with the transfer-buffering changes. it'd be easier for us to do one at a time

eringerli commented 2 years ago

This is because the history looks like this: Screenshot_20220421_062914

PR #90 is for the branch streamlined-transaction-management, PR #88 is for the branch read-write-combine, which is based on streamlined-transaction-management.

It is too much work for me too back port the combining stuff without the CS/transaction stuff, as this just makes no sense. I suggest first merging #90, after that #88.

eringerli commented 2 years ago

The buffering stuff looks like this:

    size_t lenBuffer = prefix_len + len;
    if (lenBuffer < minBufferSizeToMalloc) {
      uint8_t tmpBuffer[lenBuffer];

      memcpy(tmpBuffer, prefix_buffer, len);
      memcpy(tmpBuffer + prefix_len, buffer, len);

      write_and_read(tmpBuffer, lenBuffer);
    } else {
      auto tmpBuffer = new uint8_t[lenBuffer];

      memcpy(tmpBuffer, prefix_buffer, len);
      memcpy(tmpBuffer + prefix_len, buffer, len);

      write_and_read(tmpBuffer, lenBuffer);

      delete buffer;
    }

The CS/transaction like this:

  beginTransactionWithAssertingCS();
  transfer(buffer, len);
  endTransactionWithDeassertingCS();

So no, #90 doesn't contain any buffering changes or changes which impact the memory footprint of Adafruit_BusIO.

eringerli commented 2 years ago

@ladyada Did you have the chance to look into this? Is there still something which blocks this PR?

ladyada commented 2 years ago

id like the two conceptual changes: buffering the tx's / changing the CS pin and beginTransaction, to be seperated into two PRs so they can be reviewed and merged sepearately in case we need to revert changes!

eringerli commented 2 years ago

As I said before, it already is. This PR just has the commit for transaction/CS management (3c18659) too, as it is based on it. I cannot split it further. If you merge the other one, I can rebase this PR on master again, but not before. The buffering changes are in 0caf969.

ladyada commented 2 years ago

not sure whats up with the two PRs but we see changes to spi data/transactions in https://github.com/adafruit/Adafruit_BusIO/pull/90/files

maybe check the branches?

ladyada commented 2 years ago

(and data changes in this one too)

eringerli commented 2 years ago

Yes, #90 is about transactions, #88 is about data changes. The numbers are bit confusing, as #88 is based on #90, but that was just lazy me that forcepushed and rewrote a PR. I split the transaction stuff out and made a new PR, which got number 90

ladyada commented 2 years ago

right, please start with a pr that is just changing CS and begintransaction in all locations and we'll go from there

eringerli commented 2 years ago

Hehe, that's kind of the point of refacturing, isn't it? 😉

ladyada commented 2 years ago

correct, and refactoring is super fun and often introduces bugs because so much has changed!

eringerli commented 2 years ago

Should I split the other PR further into logical commits? Would that be a sensible compromise?

adafruit / Adafruit_BusIO