CuriousScientist0 / ADS1256

An Arduino-compatible library for the ADS1256 24-bit analog-to-digital converter.
MIT License
31 stars 11 forks source link

High-speed differential measurements are incorrect #12

Closed BenjaminPelletier closed 2 months ago

BenjaminPelletier commented 2 months ago

Attempting to cycleDifferential at high speed results in incorrect values. Specifically, the measurement from the differential pair is repeated as the value for the second differential pair, and the same value (I'm not sure which) is repeated for both the third and fourth differential pairs.

To reproduce, use the first sketch below. Physically tie each AINx to ground, except tie AIN0 to 3.3V. I used an Arduino Uno. Here is a sample of the output I observed:

Print while reading: 5541145    -15 1   38  
Print after reading: 5541193    5541193 42  42   BAD!
Print while reading: 5541176    3   -17 43  
Print after reading: 5541257    5541257 7   7    BAD!
Print while reading: 5541230    -26 4   51  
Print after reading: 5541253    5541253 -34 -34  BAD!
Print while reading: 5541269    64  18  55  
Print after reading: 5541168    5541168 25  25   BAD!
Print while reading: 5541142    6   21  -26 
Print after reading: 5541173    5541173 32  32   BAD!

Each line is printing the 4 differential values from a full cycle of cycleDifferential. Note:

  1. Each line prefaced with "Print while reading" is reasonable -- each value is usually different, and the first channel is around 5 million ADC counts (corresponding to 3.3V).
  2. The first and second, then third and fourth, values are always identical in the lines prefaced with "Print after reading". It is implausible that these represent the true values as the second differential pair is connected directly to ground, yet we observe exactly the same value as the first channel (which is not connected to ground) every time.

I strongly suspect this is due to incorrect timing or handshakes in the acquisition of differential measurements from the ADS1256. The difference between the two lines is that "Print while reading" writes content to the serial port after each channel's measurement while "Print after reading" waits until all 4 channels' measurements have been captured before writing to the serial port. This means there is an additional delay in "Print while reading" that is not present in "Print after reading". Since the latter fails while the former succeeds, I suspect that a manual delay is not set to be long enough in the library, or else an handshake signal is interpreted incorrectly.

If I use the second sketch below to figure out what additional delay fixes the issue, the answer appears to be "at least 112 microseconds":

114us:  0   bad out of 1000; last measurement:  5541083 23  29  30
113us:  0   bad out of 1000; last measurement:  5541238 -9  33  33
112us:  0   bad out of 1000; last measurement:  5541410 15  9   53
111us:  786 bad out of 1000; last measurement:  5541413 5541413 20  60
110us:  894 bad out of 1000; last measurement:  5541300 5541300 22  22
109us:  795 bad out of 1000; last measurement:  5541234 5541234 11  11
108us:  479 bad out of 1000; last measurement:  5541321 9   9   10
107us:  635 bad out of 1000; last measurement:  5541225 -26 -26 48
106us:  870 bad out of 1000; last measurement:  5541365 5541365 -13 -13
105us:  991 bad out of 1000; last measurement:  5541329 5541329 3   3
104us:  995 bad out of 1000; last measurement:  5541120 5541120 14  14
103us:  994 bad out of 1000; last measurement:  5541259 5541259 13  13
102us:  998 bad out of 1000; last measurement:  5541403 5541403 38  38
101us:  996 bad out of 1000; last measurement:  67  7   7   30
100us:  996 bad out of 1000; last measurement:  5541168 5541168 24  24
99us:   997 bad out of 1000; last measurement:  5541288 5541288 -15 -15
98us:   997 bad out of 1000; last measurement:  69  -71 -71 41
97us:   998 bad out of 1000; last measurement:  36  6   6   70
96us:   998 bad out of 1000; last measurement:  5541149 5541149 8   8
95us:   998 bad out of 1000; last measurement:  130 -7  -7  18

Demo sketch:

#include <ADS1256.h>

const uint8_t ADC_DRDY = 2;
const uint8_t ADC_RESET = 0;
const uint8_t ADC_PDWN = 8;
const uint8_t ADC_CS = 10;
const float ADC_VREF = 5.0f;
ADS1256 adc(ADC_DRDY, ADC_RESET, ADC_PDWN, ADC_CS, ADC_VREF);

const uint8_t N_CHANNELS = 4;

void setup()
{
  Serial.begin(115200);

  while (!Serial)
  {
    ; //Wait until the serial becomes available
  }

  Serial.println("Initializing ADC1256...");
  adc.InitializeADC();
  Serial.println("Initialized.");

  adc.setPGA(PGA_1);
  adc.setDRATE(DRATE_30000SPS);
}

bool print_while_reading = true;
void loop()
{
  Serial.print(print_while_reading ? "Print while reading: " : "Print after reading: ");
  int32_t values[N_CHANNELS];
  for (uint8_t channel = 0; channel < N_CHANNELS; channel++)
  {
    values[channel] = (adc.cycleDifferential() << 8) / 256;
    if (print_while_reading) {
      Serial.print(values[channel]);
      Serial.print("\t");
    }
  }
  if (!print_while_reading) {
    for (uint8_t channel = 0; channel < N_CHANNELS; channel++) {
      Serial.print(values[channel]);
      Serial.print("\t");
    }
  }
  if (values[0] == values[1] && values[2] == values[3]) {
    Serial.print(" BAD!");
  }
  Serial.println();
  delay(1000);
  print_while_reading = !print_while_reading;
}

Additional delay evaluation sketch:

#include <ADS1256.h>

const uint8_t ADC_DRDY = 2;
const uint8_t ADC_RESET = 0;
const uint8_t ADC_PDWN = 8;
const uint8_t ADC_CS = 10;
const float ADC_VREF = 5.0f;
ADS1256 adc(ADC_DRDY, ADC_RESET, ADC_PDWN, ADC_CS, ADC_VREF);

const uint8_t N_CHANNELS = 4;
const uint16_t ADDITIONAL_DELAY_US = 0;

void setup()
{
  Serial.begin(115200);

  while (!Serial)
  {
    ; //Wait until the serial becomes available
  }

  Serial.println("Initializing ADC1256...");
  adc.InitializeADC();
  Serial.println("Initialized.");

  adc.setPGA(PGA_1);
  adc.setDRATE(DRATE_30000SPS);

  Serial.println("Configured.");
  delay(1000);
}

#define fix24bit(value) ((value) & (1l << 23) ? (value) - 0x1000000 : value)

void loop()
{
  const int32_t THRESHOLD = 100000;
  for (uint16_t dt = 115; dt >= 85; dt -= 1) {
    int32_t value1, value2, value3, value4;
    uint16_t n_total = 0;
    uint16_t n_bad = 0;

    for (uint16_t i = 0; i < 1000; i++) {
      value1 = adc.cycleDifferential();
      delayMicroseconds(dt);
      value2 = adc.cycleDifferential();
      delayMicroseconds(dt);
      value3 = adc.cycleDifferential();
      delayMicroseconds(dt);
      value4 = adc.cycleDifferential();
      delayMicroseconds(dt);
      n_total++;
      value1 = fix24bit(value1);
      value2 = fix24bit(value2);
      value3 = fix24bit(value3);
      value4 = fix24bit(value4);
      if (value1 < THRESHOLD || value2 > THRESHOLD || value3 > THRESHOLD || value4 > THRESHOLD) {
        n_bad++;
      }
    }

    Serial.print(dt);
    Serial.print("us:\t");
    Serial.print(n_bad);
    Serial.print("\tbad out of ");
    Serial.print(n_total);
    Serial.print("; last measurement:\t");
    Serial.print(value1);
    Serial.print("\t");
    Serial.print(value2);
    Serial.print("\t");
    Serial.print(value3);
    Serial.print("\t");
    Serial.print(value4);
    Serial.println();
  }

  delay(10000);
}
CuriousScientist0 commented 2 months ago

Wow, thank you for the fantastic explanation!

Coincidentally, I have a follower on Instagram who noticed the same erratic behaviour and reported it to me very recently. At high acquisition speeds, he got the exact same symptoms as you presented here. I tried to reproduce it, but sometimes it occurred, sometimes not, so I just thought it might be poor wiring or floating inputs. I was wrong.

It is interesting that this issue happens because there should be a firm handshake signal when the new acquisition can start which is the DRDY pin's status change. I carefully followed the datasheet of the chip and double-checked the timings with a logic analyser.

So, coming back to the code you presented, it seems that I need to introduce an additional delay between two readings while cycling the differential inputs. This is where the handshake (DRDY) pin should drive the subsequent readings, but seemingly it does not do it. The code is waiting before switching the MUX to the new channel and when the DRDY goes low, it lets the acquisition run. Then it waits again for the DRDY... and so on... With your modification, it seems that you added a little extra time after the acquisition is done and before the code starts to wait for the DRDY before changing the MUX to the new channel.

I put a delayMicroseconds(120); line directly before the waitForDRDY(); in the cycleDifferential() function and now it seems that the values are stable.

Do you think that I should do it differently, or I could just insert this small delay directly in the cycleDifferential() function?

BenjaminPelletier commented 2 months ago

It does look like you are very meticulous (more so than me; I think we're both good stereotypes of engineer and scientist) so I am looking forward to seeing why this happens. I'm planning to look into it too, but wanted to document the repro steps before moving onto that :)

I think the 120 us delay should practically work, but I feel like that's just the empirical solution and I'm interested in knowing why that's needed (plus, I'd expect adding a delay like that to decrease the achieved sample rate and that's very undesirable). Like you say, I would expect the chip to work exactly like the datasheet says, so I wouldn't expect that additional delay to be necessary if the implementation were perfect. If it is necessary despite a perfect implementation, then the next bug report would probably be to Texas Instruments, but I don't know that I've ever experienced a hardware bug myself before.

Thanks for the library, and especially the excellent documentation :)

CuriousScientist0 commented 2 months ago

Thank you!

Well, based on the details you provided in your comments, I think you are even more meticulous than me. I am really thankful that you took the time to try to work on the solution and provide a very good step-by-step guide!

I have the same feeling about adding a manual delay. It feels very empirical. I am not a big fan of "magic numbers" because they are usually just a temporary fix, but they don't solve the root cause of the issue.

I will run a few more tests with a logic analyser and oscilloscope to be 100% sure, but I already did it so many times when I wrote the library. But it is of course possible that I overlooked something. If it still does not lead to anything, then I will contact TI and ask them about their opinion.

I am very happy to hear that you liked the documentation. It saves a lot of time in the long run because I can just tell people to read it. I think I documented most of the things down to the tiniest details.

CuriousScientist0 commented 2 months ago

So, I could not resist and quickly hooked up a test system with a logic analyser. I monitored the DRDY, SCLK, DIN (MOSI) and DOUT (MISO) pins.

The signals below are captured from the code that has no delay. If you see the DRDY signals, it is rather random. It "misfires" several times and this seems to cause the erroneous behaviour. NoDelay_CycleDifferential_c

If you look at this image below, you can see that it is much more homogeneous and the DRDY has a very periodic signal. This signal is produced with the 120 us delay before the waitForDRDY() function. This seems to make everything correct. 120usDelay_CycleDifferential_c

Now I need to figure out if it is my code or the ADS1256!?

BenjaminPelletier commented 2 months ago

That's fantastically useful. I'll see if I can reproduce with an oscilloscope to see if it might be an electrical noise thing versus a digital logic thing (presumably the latter, but perhaps worth ruling things out)

BenjaminPelletier commented 2 months ago

Hmm, at first glance, it seems like the ADS1256 is only being queried twice for a cycle of "four" channel measurements. Below, red is SCLK and blue is DRDY. I'm running a 4-channel read cycle every second, and there does not appear to be any SCLK movement outside this period. The sketch I'm using is below as well.

image

...and this is after setting the ADDITIONAL_DELAY_US constant to 120:

image

It looks like the library is behaving as if DRDY went low when it did not go low electrically. Here's a zoomed in view, back to no additional delay (current bad behavior):

image

Continuing the investigation :)

Trial sketch:

#include <ADS1256.h>

const uint8_t ADC_DRDY = 2;
const uint8_t ADC_RESET = 0;
const uint8_t ADC_PDWN = 8;
const uint8_t ADC_CS = 10;
const float ADC_VREF = 5.0f;
ADS1256 adc(ADC_DRDY, ADC_RESET, ADC_PDWN, ADC_CS, ADC_VREF);

const uint8_t N_CHANNELS = 4;
const uint16_t ADDITIONAL_DELAY_US = 0;

void setup()
{
  Serial.begin(115200);

  while (!Serial)
  {
    ; //Wait until the serial becomes available
  }

  Serial.println("Initializing ADC1256...");
  adc.InitializeADC();
  Serial.println("Initialized.");

  adc.setPGA(PGA_1);
  adc.setDRATE(DRATE_30000SPS);

  Serial.println("Configured.");
  delay(1000);
}

void loop()
{
  const int32_t THRESHOLD = 100000;
  int32_t value1, value2, value3, value4;
  uint16_t n_total = 0;
  uint16_t n_bad = 0;

  for (uint16_t i = 0; i < 1; i++) {
    value1 = adc.cycleDifferential();
    delayMicroseconds(ADDITIONAL_DELAY_US);
    value2 = adc.cycleDifferential();
    delayMicroseconds(ADDITIONAL_DELAY_US);
    value3 = adc.cycleDifferential();
    delayMicroseconds(ADDITIONAL_DELAY_US);
    value4 = adc.cycleDifferential();
    delayMicroseconds(ADDITIONAL_DELAY_US);
    n_total++;
    if (value1 < THRESHOLD || value2 > THRESHOLD || value3 > THRESHOLD || value4 > THRESHOLD) {
      n_bad++;
    }
  }

  Serial.print(n_bad);
  Serial.print("\tbad out of ");
  Serial.print(n_total);
  Serial.print("; last measurement:\t");
  Serial.print(value1);
  Serial.print("\t");
  Serial.print(value2);
  Serial.print("\t");
  Serial.print(value3);
  Serial.print("\t");
  Serial.print(value4);
  Serial.println();

  delay(1000);
}
CuriousScientist0 commented 2 months ago

Hmm. After rereading the datasheet, I might have some clue about the issue. It would explain at least the fact that the erroneous behaviour is only present at high sampling speeds.

On page 20, table 13 mentions the settling time. And on page 21, table 14 mentions the MUX cycling throughput.

It somehow feels that this could be the key, however, I might misinterpret it. At high sampling speeds, the conversion (basically, 1 cycle) is performed so fast that when the code is about to reach the MUX changing part of the code again, the settling time has not yet passed. And then this can cause the second channel to return the first channel's value.

I zoomed in the chart here and there, and I found parts where the time from WREG to WREG (t19) is less than 228.6 us which is t19 at 30 ksps. Also, the DRDY is still high when the next WREG is written, although this should not happen.

Screenshot 2024-09-09 001648

I believe that adding the extra delay helps both the MUX to settle and the DRDY to properly settle at the desired status.

On the other hand, as you also mentioned, it messes up the sampling frequency. The throughput goes down to roughly 1.8 kHz, instead of 4.3 kHz. After a little sleep, I will try to dig deeper.

BenjaminPelletier commented 2 months ago

I think the DRDY handshake is being misinterpreted. Below is what I think is a zoomed-in trace of two calls to cycleDifferential with my annotations:

image

I believe the primary problem is the bold text. At this point, DRDY falls and therefore marks DRDY_value as true, but then DRDY is, in fact, not ready at the time waitForDRDY() is later called. The solution should be to set DRDY_value to false just before the first possible moment it could potentially indicate true readiness for the next call. I have a draft PR that I think does this close to correctly, but I think you may need to refine the locations I've set DRDY_value to false.