Closed alx-uta closed 1 year ago
Interesting idea
extra parameter for writeReg an readReg is not needed as _hwspi is available in both functions.. Extra peam might be cause of slower swspi.
Need to check increase of codebase, esp AVR
Drawback is that it "breaks" the architectures of keeping the SPI commands only in lowest level.
From design point I would create a (private) writeReg16 an a readReg16 function. That would allow this optimization too, does not break the architecture, and would make (public) 16 bit functions simpler.
Had a look into the datasheet this morning. There exist a sequential write mode which allows to write multiple bytes after each other. (read idem). This would be ideal for the 16 bit interface and should be tested extensively.
In the proposal the SW SPI got (1) two extra calls and (2) extra parameter which slows things down. Other functions have the same overhead when SW SPI is used.
bool MCP23S17::pinMode16(uint16_t value)
{
MCP23S17::beginSpiTransaction(); <<<<<<<<< when using SW SPI this call is added.
writeReg(MCP23S17_DDR_A, value >> 8, !_hwSPI); <<< extra parameter
writeReg(MCP23S17_DDR_B, value & 0xFF, !_hwSPI); <<< extra parameter
MCP23S17::endSpiTransaction(); <<<<<<<<< when using SW SPI this call is added.
_error = MCP23S17_OK;
return true;
}
The proposed solution spreads SPI.beginTransaction()
calls over the code on different levels.
The function call is extracted from writeReg()
for 16 bit but still in it for 8 bit. (readReg idem).
Although the performance gain is substantial this makes the code harder to debug, so there is a red flag.
The possible gain is still very interesting, and we need to investigate how it can be coded while keeping the architecture clean. Having a readReg16() and writeReg16() is imho the way as that would "localize all the problems"
beginTransaction()
The proposed solution spreads
SPI.beginTransaction()
calls over the code on different levels. The function call is extracted fromwriteReg()
for 16 bit but still in it for 8 bit. (readReg idem). Although the performance gain is substantial this makes the code harder to debug, so there is a red flag.The possible gain is still very interesting, and we need to investigate how it can be coded while keeping the architecture clean. Having a readReg16() and writeReg16() is imho the way as that would "localize all the problems"
This is a good idea, and it should keep the code cleaner.
I'll do those changes and see how it look :)
You need to merge the latest master branch of the library first as I have merged the other PR and created a new release
Note: the registers for the 16 bit calls are always N and N+1 (see registers.h file). I think you should not make a loop out of it as it only affects 2 registers.
You need to merge the latest master branch of the library first as I have merged the other PR and created a new release
Sounds good, thx.
I'll let you know when is ready.
Note: the registers for the 16 bit calls are always N and N+1 (see registers.h file). I think you should not make a loop out of it as it only affects 2 registers.
I'm not exactly sure what loop, sorry.
Also, I think that I can see a possible problem with the Software SPI.
Now, instead of a single if() else..
we have a specific condition at the top, one in the middle and a second one at the end.
Test results at 10 MHz using Software SPI:
Using master:
true
HWSPI: 0
time in microseconds
TEST digitalWrite(0, value): 62.88
TEST digitalWrite(pin, value): 45.63
TEST digitalRead(pin): 31.87
TEST write8(port, mask): 34.00
TEST read8(port): 34.00
TEST write16(mask): 32.50
TEST read16(): 33.00
VAL1: 0
VAL8: 170
VAL16: 43690
done...
Wrote 285360 bytes (155944 compressed) at 0x00010000 in 2.6 seconds (effective 887.9 kbit/s)...
Using the current branch:
true
HWSPI: 0
time in microseconds
TEST digitalWrite(0, value): 62.81
TEST digitalWrite(pin, value): 45.63
TEST digitalRead(pin): 31.87
TEST write8(port, mask): 33.50
TEST read8(port): 34.00
TEST write16(mask): 35.50
TEST read16(): 35.50
VAL1: 0
VAL8: 170
VAL16: 43690
Wrote 285776 bytes (156133 compressed) at 0x00010000 in 2.6 seconds (effective 890.3 kbit/s)...
Test results at 10 MHz using Hardware SPI: Using master:
true
HWSPI: 1
time in microseconds
TEST digitalWrite(0, value): 35.31
TEST digitalWrite(pin, value): 25.69
TEST digitalRead(pin): 18.06
TEST write8(port, mask): 20.00
TEST read8(port): 20.50
TEST write16(mask): 18.50
TEST read16(): 18.50
VAL1: 0
VAL8: 170
VAL16: 43690
Using the current branch:
true
HWSPI: 1
time in microseconds
TEST digitalWrite(0, value): 35.38
TEST digitalWrite(pin, value): 34.56
TEST digitalRead(pin): 18.12
TEST write8(port, mask): 19.50
TEST read8(port): 20.50
TEST write16(mask): 16.50
TEST read16(): 15.50
VAL1: 0
VAL8: 170
VAL16: 43690
Think I explained myself not well enough,
This is what I meant, that a writeReg16() exists side by side with the writeReg() See picture below:
Think I explained myself not well enough,
This is what I meant, that a writeReg16() exists side by side with the writeReg()
Sorry, I'm confused, just had a long day.
I thought that you wanted to have the writeReg16
in the private:
section of MCP23S17.h with writeReg
.
as in:
private:
// access to low level registers (just make these two functions public).
// USE WITH CARE !!!
bool writeReg(uint8_t reg, uint8_t value);
bool writeReg16(uint8_t reg, uint8_t value);
uint8_t readReg(uint8_t reg);
uint8_t readReg16(uint8_t reg);
Is the problem the fact that we have beginSpiTransaction()
with the public pinMode16()
? :)
@alx-uta
I created a develop branch with the readReg16() and writeReg16() side by side with the 8 bit low level functions. Can you give it a try with your hardware?
The code is more within my architectural ideas, all low level SPI hidden for the user. (and the 8 bits interface stays nearly compatible with MCP23S08 too). All 16 bit functions now call readReg16() or writeReg16(), please note it is a single call with one register. Remarks welcome,
Tests here shows reasonable gains for my ESP32 for the 16 bit calls (tested without hardware)
Sounds good, I'll try to have a look and run a few tests!
Thanks
Replaced by PR #24
After running multiple tests and trying to figure out what else I could improve for the data read I realised that we could avoid multiple SPI channels when reading or writing all the ports.
I'm not exactly sure why the software SPI went up because nothing really changed for that one.
It looks like now 4 MHz can compete with the old 10MHz.