Closed alx-uta closed 1 year ago
is this toggle needed? as you are still in the same SPI transaction, I assume it is not.
// Toggle CS only if there are more channels to read
if (i < numChannels - 1) {
digitalWrite(_select, LOW);
digitalWrite(_select, HIGH);
}
(update) if it is needed, it could be unconditional?
What is the performance loss, if readADC()
is implemented as a wrapper around readADCMulti()
(core code is identical so expect it is not very much)
I'll be more than happy to do any adjustments, mainly because my programming skills in C/C++ are a bit limited.
Thanks for this enhancement. Good opportunity to improve your skills :) Reading all the channels at once is faster (I know), and it might be interesting to add it.
First thing is please add in the demo sketch the ratio of improvement. ==> time (1.0*analogRead) / (1.0*time multiRead()
)
Looking at the output this varies roughly from a 0.25 - 3 which makes me suspicious as I would expect roughly a constant factor.
Have to review the code. you're using ESP32?
(pending post, too much windows open error :)
Something is pretty interesting in your log.
Some faster SPI speeds are slower in analogRead()
, it seems to happen when speed is not a whole divider of FCPU.
is this toggle needed? as you are still in the same SPI transaction, I assume it is not.
// Toggle CS only if there are more channels to read if (i < numChannels - 1) { digitalWrite(_select, LOW); digitalWrite(_select, HIGH); }
(update) if it is needed, it could be unconditional?
We need to toggle the CS line, if not it won't read the next channel :(. I only added the condition so we won't do a toggle unless is required.
readADCMulti
I'll have a look and I'll run a few tests.
I only added the condition so we won't do a toggle unless is required.
but as it is required for all channels you want to read, so it could be done unconditionally.
I'll be more than happy to do any adjustments, mainly because my programming skills in C/C++ are a bit limited.
Thanks for this enhancement. Good opportunity to improve your skills :) Reading all the channels at once is faster (I know), and it might be interesting to add it.
First thing is please add in the demo sketch the ratio of improvement. ==> time (1.0_analogRead) / (1.0_time multiRead()) Looking at the output this varies roughly from a 0.25 - 3 which makes me suspicious as I would expect roughly a constant factor.
Have to review the code. you're using ESP32?
(pending post, too much windows open error :)
Thanks. I'm mainly a Python/PHP dev, and I only used C/C++ with Arduino and ESP32.
I'll try to update it later today, if not it will be during the week.
Yes, I'm using ESP32 WROVER.
variant for the toggle, at begin and at end of loop.
void MCP_ADC::readADCMultiple(uint8_t channels[], uint8_t numChannels, int16_t readings[]) {
if (_hwSPI) {
mySPI->beginTransaction(_spi_settings);
}
for (uint8_t i = 0; i < numChannels; i++) {
digitalWrite(_select, LOW);
uint8_t data[3] = {0, 0, 0};
uint8_t bytes = buildRequest(channels[i], true, data);
if (_hwSPI) {
for (uint8_t b = 0; b < bytes; b++) {
data[b] = mySPI->transfer(data[b]);
}
} else {
for (uint8_t b = 0; b < bytes; b++) {
data[b] = swSPI_transfer(data[b]);
}
}
if (bytes == 2) {
readings[i] = ((256 * data[0] + data[1]) & _maxValue);
} else {
readings[i] = ((256 * data[1] + data[2]) & _maxValue);
}
digitalWrite(_select, HIGH);
}
}
if (_hwSPI) {
mySPI->endTransaction();
}
}
I only added the condition so we won't do a toggle unless is required.
but as it is required for all channels you want to read, so it could be done unconditionally.
This is required for all except for the last one, mainly because after the last channel read there's nothing else there to read.
But, if we'll find out that there's no improvement in the read speed even if we have the extra toggle, we could just remove it.
variant for the toggle, at begin and at end of loop.
void MCP_ADC::readADCMultiple(uint8_t channels[], uint8_t numChannels, int16_t readings[]) { if (_hwSPI) { mySPI->beginTransaction(_spi_settings); } for (uint8_t i = 0; i < numChannels; i++) { digitalWrite(_select, LOW); uint8_t data[3] = {0, 0, 0}; uint8_t bytes = buildRequest(channels[i], true, data); if (_hwSPI) { for (uint8_t b = 0; b < bytes; b++) { data[b] = mySPI->transfer(data[b]); } } else { for (uint8_t b = 0; b < bytes; b++) { data[b] = swSPI_transfer(data[b]); } } if (bytes == 2) { readings[i] = ((256 * data[0] + data[1]) & _maxValue); } else { readings[i] = ((256 * data[1] + data[2]) & _maxValue); } digitalWrite(_select, HIGH); } } if (_hwSPI) { mySPI->endTransaction(); } }
This should work. I don't have that much experience with SPI. Do you think it will be safe to end the spi transaction while the CS line is off?
Do you think it will be safe to end the spi transaction while the CS line is off?
That thought underlies the code so yes. the communication is during the transfer function. But the test with real hardware is the ultimate test.
Thanks for looking at this :)
Just did some tests and I added the new results.
Personally, I think that anyone who needs to read more than one channel should use .analogReadMultiple(). It doesn't make sense to read a channel one by one and add extra delay due to the SPI transition start/end.
I think that anyone who needs to read more than one channel should use .analogReadMultiple(). It doesn't make sense to read a channel one by one and add extra delay due to the SPI transition start/end.
Makes sense, e.g when the timestamps should be as simultaneous as possible. The user might have arguments to do individual calls, design/architecture readability erc, so keeping that option is important.
The readDACmultiple()
call also allows to read a single channel n times or two channels multiple times alternatingly etc. So it has broader application than at first sight.
The performance numbers are impressive, Might be interesting to add the gain for 2, 3, 4, 5, 6, 7 and 8 channels in the performance sketch. E.g with a fixed SPI speed of 8 MHz.
I expect it will show gain for all cases.
The
readDACmultiple()
call also allows to read a single channel n times or two channels multiple times alternatingly etc. So it has broader application than at first sight.
That's actually a really good point. I never thought of it as doing a double call and get the average reading.
Might be interesting to add the gain for 2, 3.4.5.6, 7 and 8 channels in the performance sketch.
I'll have a look and I'll try to update it
This is quite interesting.
mcp28.analogRead() 2: 42
mcp28.analogReadMultiple() 2: 27
analogRead() time / analogReadMultiple() time 1
mcp28.analogRead() 3: 59
mcp28.analogReadMultiple() 3: 34
analogRead() time / analogReadMultiple() time 1
mcp28.analogRead() 4: 78
mcp28.analogReadMultiple() 4: 41
analogRead() time / analogReadMultiple() time 1
mcp28.analogRead() 5: 98
mcp28.analogReadMultiple() 5: 49
analogRead() time / analogReadMultiple() time 2
mcp28.analogRead() 6: 118
mcp28.analogReadMultiple() 6: 55
analogRead() time / analogReadMultiple() time 2
mcp28.analogRead() 7: 137
mcp28.analogReadMultiple() 7: 62
analogRead() time / analogReadMultiple() time 2
mcp28.analogRead() 8: 157
mcp28.analogReadMultiple() 8: 70
analogRead() time / analogReadMultiple() time 2
Based on this we can see that's not actually the reading time that's slow, but the SPI start/end communication.
mcp28.analogRead() 7: 137
mcp28.analogReadMultiple() 7: 62
analogRead() time / analogReadMultiple() time 2
mcp28.analogRead() 8: 157
mcp28.analogReadMultiple() 8: 70
analogRead() time / analogReadMultiple() time 2
(note: the figures above are platform dependent, still they are very informative) The delta for an extra analogRead is about 20 us The delta for an extra channel in analogReadMulti() is about 8 us ==> so every transaction initialization takes about 12 us
Created a table of the comparison
to be saved as readADC_vs_readADCMultiple.md
### Comparing multiple **readADC()** calls with one **readADCMultiple()** call
- ESP32
- source https://github.com/RobTillaart/MCP_ADC/pull/11#issuecomment-1676461195
- figures are indicative and platform dependent.
| function | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | <- channels
|:--------------------:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|
| analogRead | | 42 | 59 | 78 | 98 | 118 | 137 | 157 | (us)
| analogReadMultiple | | 27 | 34 | 41 | 49 | 55 | 62 | 70 | (us)
| ratio | | 1.56 | 1.73 | 1.90 | 2.00 | 2.15 | 2.21 | 2.24 |
TODO AVR UNO
Think the code looks good to merge into master.
The remaining remarks are minor and I'll need to check if there is other "low hanging fruit" I can include in the release.
Think the code looks good to merge into master.
The remaining remarks are minor and I'll need to check if there is other "low hanging fruit" I can include in the release.
Thanks.
I'll try to go thru the comments and do any changes.
Raw output performance test on AVR / UNO
MCP3208_performance\MCP3208_performance.ino
MCP_ADC_LIB_VERSION: 0.1.9
ADC CHAN MAXVALUE
mcp28 8 4095
Timing in micros().
***************************************
1000000
mcp28.analogRead() 8z: 512
mcp28.analogReadMultiple() 8x: 368
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 468
mcp28.deltaRead() 8x: 928
2000000
mcp28.analogRead() 8z: 392
mcp28.analogReadMultiple() 8x: 272
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 368
mcp28.deltaRead() 8x: 740
3000000
mcp28.analogRead() 8z: 408
mcp28.analogReadMultiple() 8x: 272
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 368
mcp28.deltaRead() 8x: 740
4000000
mcp28.analogRead() 8z: 352
mcp28.analogReadMultiple() 8x: 224
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 320
mcp28.deltaRead() 8x: 644
5000000
mcp28.analogRead() 8z: 356
mcp28.analogReadMultiple() 8x: 224
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 320
mcp28.deltaRead() 8x: 644
6000000
mcp28.analogRead() 8z: 356
mcp28.analogReadMultiple() 8x: 224
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 324
mcp28.deltaRead() 8x: 644
7000000
mcp28.analogRead() 8z: 356
mcp28.analogReadMultiple() 8x: 224
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 324
mcp28.deltaRead() 8x: 644
8000000
mcp28.analogRead() 8z: 336
mcp28.analogReadMultiple() 8x: 200
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 296
mcp28.deltaRead() 8x: 596
9000000
mcp28.analogRead() 8z: 332
mcp28.analogReadMultiple() 8x: 200
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 300
mcp28.deltaRead() 8x: 596
10000000
mcp28.analogRead() 8z: 328
mcp28.analogReadMultiple() 8x: 200
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 304
mcp28.deltaRead() 8x: 588
11000000
mcp28.analogRead() 8z: 328
mcp28.analogReadMultiple() 8x: 208
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 304
mcp28.deltaRead() 8x: 592
12000000
mcp28.analogRead() 8z: 328
mcp28.analogReadMultiple() 8x: 208
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 304
mcp28.deltaRead() 8x: 592
13000000
mcp28.analogRead() 8z: 328
mcp28.analogReadMultiple() 8x: 208
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 296
mcp28.deltaRead() 8x: 588
14000000
mcp28.analogRead() 8z: 328
mcp28.analogReadMultiple() 8x: 200
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 296
mcp28.deltaRead() 8x: 588
15000000
mcp28.analogRead() 8z: 324
mcp28.analogReadMultiple() 8x: 200
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 300
mcp28.deltaRead() 8x: 596
16000000
mcp28.analogRead() 8z: 328
mcp28.analogReadMultiple() 8x: 200
analogRead() time / analogReadMultiple() time 1
mcp28.differentialRead() 8x: 300
mcp28.deltaRead() 8x: 596
***************************************
8000000
mcp28.analogRead() 1: 40
mcp28.analogReadMultiple() 1: 32
analogRead() time / analogReadMultiple() time 1.25
mcp28.analogRead() 2: 76
mcp28.analogReadMultiple() 2: 60
analogRead() time / analogReadMultiple() time 1.27
mcp28.analogRead() 3: 112
mcp28.analogReadMultiple() 3: 88
analogRead() time / analogReadMultiple() time 1.27
mcp28.analogRead() 4: 148
mcp28.analogReadMultiple() 4: 104
analogRead() time / analogReadMultiple() time 1.42
mcp28.analogRead() 5: 184
mcp28.analogReadMultiple() 5: 128
analogRead() time / analogReadMultiple() time 1.44
mcp28.analogRead() 6: 220
mcp28.analogReadMultiple() 6: 152
analogRead() time / analogReadMultiple() time 1.45
mcp28.analogRead() 7: 264
mcp28.analogReadMultiple() 7: 176
analogRead() time / analogReadMultiple() time 1.50
mcp28.analogRead() 8: 292
mcp28.analogReadMultiple() 8: 200
analogRead() time / analogReadMultiple() time 1.46
@alx-uta Created a develop branch as preparation for the new release 0.2.0 . Besides the analogReadMultiple() it includes maintenance updates, documentation etc.
@alx-uta
Apparently I did not think and test well enough, the single flag for differentialRead got lost. I'm going to fix this in the upcoming release by reverting the analogADC() code. The multiple code will stay of course.
int16_t MCP_ADC::readADC(uint8_t channel, bool single)
{
if (channel >= _channels) return 0;
int16_t reading[1];
MCP_ADC::readADCMultiple(&channel, 1, reading);
return reading[0];
}
Raw output performance test on AVR / UNO
The results are close, compared to the esp32 where we can see the performance gain.
Created a develop branch as preparation for the new release 0.2.0 . Besides the analogReadMultiple() it includes maintenance updates, documentation etc.
That sounds good, I guess you'll also need to update the examples.
Apparently I did not think and test well enough, the single flag for differentialRead got lost. I'm going to fix this in the upcoming release by reverting the analogADC() code. The multiple code will stay of course.
Ah, sorry about that, I missed it when I moved the code.
0.2.0 released
0.2.0 released
Thanks again for this great library.
I had a bit of free time, and I decided to see if I could read the data faster. This was mainly because I have 2x MCP23S17 and 2x MCP3208 on my PCB.
It looks like if we'll read all the channels inside the same SPI transaction the data read is faster in most of the cases. If possible, can you please have a look and see if this is something you'd add to the library?
I'll be more than happy to do any adjustments, mainly because my programming skills in C/C++ are a bit limited.
Test results: