sparkfun / SparkFun_DataLogger

Documentation and firmware for the SparkFun DataLogger IoT line of products.
https://docs.sparkfun.com/SparkFun_DataLogger/

NAU7802 Inconsistent Readings on Device Reset/Reboot #22

Closed bherbruck closed 8 months ago

bherbruck commented 9 months ago

Description

Issue Summary: Our IoT data logger setup, integrating the SparkFun Qwiic Scale - NAU7802, is experiencing significant variations in sensor readings immediately after the device undergoes a reset or reboot. This challenge persists regardless of the power supply or load cell used and is not mitigated by the fact that calibration remains intact. Notably, the required taring process post-reboot is not feasible for our headless, production-environment devices, leading to unreliable data collection and compromising data integrity and analysis.

Expected Behavior: Sensor readings should be consistent and accurate before and after device resets or reboots, with minimal variation that adheres to the expected tolerance levels specified in the sensor's datasheet, without the need for manual intervention such as taring.

Actual Behavior: Post reset or reboot, readings from the NAU7802 vary significantly, despite the calibration being intact. The issue occurs even with no load cell connected and without any change in the measured weight. For example, readings can exhibit a substantial margin of variation (e.g., more than 10% variation) immediately following a device reset, indicating a need for taring that cannot be addressed in our headless device configuration.

Steps to Reproduce

  1. Power on the device and allow it to stabilize for a few minutes.
  2. Record baseline readings from the NAU7802 for a known weight (or with no load to observe variance).
  3. Reset or reboot the device without altering the setup.
  4. Once the device is back online, immediately take new readings from the NAU7802.
  5. Note the significant variations in readings, indicating a need for taring which is not feasible.

Environment

Additional Information

bherbruck commented 9 months ago

Here is a visualization of the readings with a constant 1kg test weight over 30 minutes with a wake/sleep time of 120/10 seconds

image

gigapod commented 9 months ago

Hi @bherbruck,

Okay - we'll take a look at this. We might have something set wrong in our driver (which uses the same logic/code as the qwiic/open log driver).

And thank you for the very detailed information - very helpful on our end.

-Kirk

bherbruck commented 9 months ago

@gigapod I just confirmed this is also the case when using the SparkFun Arduino library on other esp32-thing devices. Maybe it is my unit.

gigapod commented 9 months ago

We think the issue might be that the calibration and zero point values that are calculated when you run those functions from the DataLogger menu system for the NAU7802 are not saved internally. So when you restart the board, or the system wakes up from sleep, those settings are not restored and the system runs in an initial/uncalibrated mode.

If this is the cause, it's an easy fix, but we need to verify this (probably tomorrow when we get some hardware to check against).

One way you could verify this now is to run the loadcell calibration and zero-point functions. Then (before restarting/sleeping the system) in the menu system, select the "Save Settings" option, and from there select the "Save Settings" function - this will force a save. At this point, we believe everything will be stable when you restart or wake the system.

bherbruck commented 9 months ago

> One way you could verify this now is to run the loadcell calibration and zero-point functions. Then (before restarting/sleeping the system) in the menu system, select the "Save Settings" option, and from there select the "Save Settings" function - this will force a save. At this point, we believe everything will be stable when you restart or wake the system.

I still have the issue after trying this.

Interestingly, I also have the issue with hard-coded calibration values on all my SparkFun Thing Plus boards.

This may be my hardware or a firmware issue with the SparkFun Qwiic Scale NAU7802 Arduino Library.

I do not have this issue with the same setup when using an HX711.

gigapod commented 9 months ago

Okay - thanks for trying this. We'll continue to troubleshoot. ... If we don't see the issue, we'll get a new board sent to you.

-Kirk

bherbruck commented 9 months ago

UPDATE: It might be in the firmware; maybe moving this issue to that repo would be more useful. I just tested on another Qwiic Scale and I'm having the same results. The raw readings from the scale appear to be "randomly" offset. The relative readings remain the same (for example, a "zero" of -6g gives 994g with the load).
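The observation above (a shifted "zero" with an unchanged difference between loaded and unloaded readings) is what a constant offset in the raw counts would produce. Here is a minimal sketch, assuming the usual linear conversion of raw counts to weight (raw minus a stored zero offset, divided by a calibration factor). The struct, function names, and numbers are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical calibration values, used only for illustration.
struct ScaleCal {
    std::int32_t zeroOffset;   // raw ADC count recorded when the scale was zeroed
    float calibrationFactor;   // raw counts per gram
};

// Linear conversion from a raw reading to grams.
inline float toGrams(std::int32_t raw, const ScaleCal &cal) {
    return static_cast<float>(raw - cal.zeroOffset) / cal.calibrationFactor;
}

// Difference between a loaded and an unloaded reading taken in the same
// wake cycle. A constant raw offset shared by both readings cancels out.
inline float netGrams(std::int32_t rawLoaded, std::int32_t rawEmpty, const ScaleCal &cal) {
    return toGrams(rawLoaded, cal) - toGrams(rawEmpty, cal);
}
```

With a calibration factor of 400 counts per gram, a raw shift of -2400 counts moves both the empty and the loaded reading by -6g, so the absolute values change while the difference between them does not.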

gigapod commented 9 months ago

Hi -

For clarification - by firmware, do you mean the firmware for the Qwiic Scale?

bherbruck commented 9 months ago

I mean the library here: SparkFun Qwiic Scale NAU7802 Arduino Library.

PaulZC commented 9 months ago

Hi @bherbruck / @gigapod ,

I can replicate the same issue on OpenLog Artemis. I've been updating the OLA firmware and took the opportunity to see if I could debug this issue. It's a strange one... I have made progress and can minimise the 'steps' on power-up, but I don't have a magic cure. I'm still seeing steps at about the 0.5% to 1% level using a 100g test weight on a kitchen scale load cell.

Here's a snippet of OLA data. It is powering off the Qwiic bus, sleeping for 5 seconds, waking, powering on, taking a measurement using an average of 4 readings:

01/01/2000,00:05:29.78,99.08,0.206,
01/01/2000,00:05:34.78,100.59,0.206,
01/01/2000,00:05:39.78,99.51,0.205,
01/01/2000,00:05:44.79,99.84,0.205,
01/01/2000,00:05:49.79,100.59,0.205,
01/01/2000,00:05:54.79,100.67,0.205,
01/01/2000,00:05:59.78,98.93,0.205,
01/01/2000,00:06:04.79,99.43,0.205,
01/01/2000,00:06:09.79,99.36,0.205,
01/01/2000,00:06:14.79,99.50,0.205,
01/01/2000,00:06:19.78,99.61,0.205,
01/01/2000,00:06:24.78,99.19,0.205,
01/01/2000,00:06:29.79,98.95,0.205,
01/01/2000,00:06:34.78,99.30,0.205,
01/01/2000,00:06:39.79,99.05,0.205,
01/01/2000,00:06:44.79,98.65,0.205,
01/01/2000,00:06:49.79,99.30,0.204,
01/01/2000,00:06:54.79,99.55,0.204,
01/01/2000,00:06:59.78,98.97,0.204,
01/01/2000,00:07:04.78,98.87,0.204,
01/01/2000,00:07:09.80,99.49,0.204,
01/01/2000,00:07:14.78,99.91,0.204,

Ignore the right-most column. That's just the sample rate. The 100g weight is being read at roughly the +/- 1% level.
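To put a number on the "+/- 1%" estimate, the spread of the logged weights can be summarised directly. This is a small host-side sketch, not part of any firmware. The `Spread` struct and the `summarise` and `olaLog` functions are hypothetical names; the values are the third column of the excerpt above.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct Spread {
    double min;
    double max;
    double mean;
    double peakToPeakPercent;  // (max - min) as a percentage of the mean
};

// Summarise a non-empty series of weight readings.
inline Spread summarise(const std::vector<double> &w) {
    const auto mm = std::minmax_element(w.begin(), w.end());
    const double mean = std::accumulate(w.begin(), w.end(), 0.0) / w.size();
    return {*mm.first, *mm.second, mean, 100.0 * (*mm.second - *mm.first) / mean};
}

// The 22 weight values (third column) from the log excerpt above.
inline std::vector<double> olaLog() {
    return {99.08, 100.59, 99.51, 99.84, 100.59, 100.67, 98.93, 99.43,
            99.36, 99.50,  99.61, 99.19, 98.95,  99.30,  99.05, 98.65,
            99.30, 99.55,  98.97, 98.87, 99.49,  99.91};
}
```

For this excerpt the minimum is 98.65 and the maximum is 100.67, a peak-to-peak spread of about 2% of the mean, which is consistent with the rough +/- 1% figure.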

Things I discovered along the way:

By default, the NAU7802 library performs a calibrateAFE on each begin. This tells the NAU7802 to do its own internal calibration using a built-in voltage source. The offset seen is stored in a register and then deducted from each measurement by the chip itself - not the library. I suspect this causes the biggest steps in the data on each wake. Skipping the calibrateAFE, by calling .begin(Wire, false), seems to help, but it is not a full cure.

By default, the NAU7802 library sets the internal LDO regulator to 3.3V. That doesn't make much sense: the Qwiic bus runs at 3.3V, so there's no headroom. Setting the LDO to 3.0V seems to help, but again it is not a full cure.

The above data was taken with: calibrateAFE disabled; and the LDO set to 3.0V instead of 3.3V.

I'll be publishing the updated OLA firmware tomorrow. It might be useful in helping to further diagnose this issue:

Menu: Configure NAU7802 Load Cell Amplifier

Scale calibration factor: 391.130005
Scale zero offset: 227770
Weight currently on scale: 100.06

1) Sensor Logging: Enabled
2) Zero scale
3) Calibration weight: 100.0
4) Calibrate scale
5) Number of decimal places: 2
6) Average number of readings to take per weight read: 4
7) Gain: 128
8) Sample rate: 80
9) LDO voltage: 3.0
x) Exit

gigapod commented 9 months ago

Thanks @PaulZC - this is helpful.

The datalogger was not persisting the Zero and Cal values for the NAU7802 across restarts, so I have enabled that in a test environment I have. It might have helped slightly, but there was still noise.

Let me couple this with the adjustments you noted above and see what results I get.

An option that zeros the NAU at startup could be added, but this won't work for a good slice of use cases (and seems like the wrong approach).

gigapod commented 9 months ago

@bherbruck @PaulZC -

I took @PaulZC's suggestions and increased the sample counts used when interacting with the NAU7802, and I do see improvements over samples and over sleep/wake events. It's still noisy at the 4th significant digit in the readings I'm seeing, but improved. This could also be my test setup (load cell screwed to a 2x4 :).

Below are some output samples - zero point across sleep/wake events, and with mass on the load cell across sleep/wake events.

I'll work to get a "preview" of this firmware posted next week so it can be tested / evaluated in the field (one last issue to resolve before this).

zero point after wake

Screenshot 2024-02-23 at 5 09 55 PM

zero point after wake

Screenshot 2024-02-23 at 5 10 11 PM

with "12 unit" mass after wake

Screenshot 2024-02-23 at 5 10 27 PM

with "12 unit" mass after wake and then jumps into menu system

Screenshot 2024-02-23 at 5 11 01 PM

PaulZC commented 9 months ago

I think I might have found the root cause... In this logic analyzer screenshot, Channel 2 is AVDD - the output of the NAU7802's internal LDO which powers the bridge. After being enabled, it takes about 0.2s to ramp to voltage on our SEN-15242. And another 0.2s if you change the voltage. The 'glitch' at ~4.77s is me instructing the LDO to change from 3.3V to 3.0V. The library currently doesn't wait for the LDO to stabilize; during .begin it sets the LDO voltage to 3.3V and then - almost immediately - tries to perform a calibrateAFE. I suspect the voltage change during the calibration could be causing all kinds of badness.

image

More testing to do...

PaulZC commented 9 months ago

Nah, that wasn't it... Another red herring... I think I've proven that the internal calibration (calibrateAFE) is causing the steps. But I'm struggling to minimise them. Disabling the calibrateAFE does remove the steps, but I think it leaves the chip more vulnerable to temperature changes.

PaulZC commented 9 months ago

OK. More results for you! I've thrown every trick I can think of at this. Increasing the number of readings to average definitely helps. The steps are definitely caused by calibrateAFE (called by begin by default) - the NAU7802 sets its internal offset register to a slightly different value after each calibration and this appears as the steps in the data after waking from sleep.

The thing I've had most success with is using External offset calibration. This is the equivalent of a true Tare / Zero. It reads the actual voltage from the strain gauge - not the internal reference - and loads that into the NAU7802's own offset register. I've modified the NAU7802 library to make that possible - it can now read the 24-bit signed offsets and 32-bit gains if needed, and lets you select External calibration. The library zero offset then reads as close to zero, because the NAU7802 offset register is doing all the work.

But, despite all of this, I'm still seeing variations at about the 1% level. Here's data from my kitchen scale, with and without a 100g weight: wake every 5 seconds, restore the saved offset register and library offset/gain, sample, average 20 readings at 80Hz, gain 128, LDO 3.0V.

Menu: Configure NAU7802 Load Cell Amplifier

Scale calibration factor: 390.269989
Scale zero offset: 34
Scale offset register: -8162734
Weight currently on scale: -0.68

1) Sensor Logging: Enabled
2) Zero scale
3) Calibrate scale
4) Calibration weight: 100.0
5) Number of decimal places: 2
6) Average number of readings to take per weight read: 20
7) Gain: 128
8) Sample rate: 80
9) LDO voltage: 3.0
10) Calibration mode: External
x) Exit

Menu: Configure Attached Devices
1) NAU7802 Weight Sensor (0x2A)
x) Exit

Menu: Main Menu
x) Return to logging

01/01/2000,00:00:42.42,-0.73,
01/01/2000,00:00:42.87,-0.52,
01/01/2000,00:00:47.98,-0.79,
01/01/2000,00:00:52.97,-0.97,
01/01/2000,00:00:57.98,-1.40,
01/01/2000,00:01:02.98,-1.22,
01/01/2000,00:01:07.97,-0.87,
01/01/2000,00:01:12.98,-0.57,
01/01/2000,00:01:17.98,-0.43,
01/01/2000,00:01:22.99,-0.36,
01/01/2000,00:01:27.98,99.72,
01/01/2000,00:01:32.99,99.53,
01/01/2000,00:01:37.98,99.66,
01/01/2000,00:01:42.98,99.72,
01/01/2000,00:01:47.98,99.93,
01/01/2000,00:01:52.98,99.87,
01/01/2000,00:01:57.98,99.98,
01/01/2000,00:02:02.98,99.99,
01/01/2000,00:02:07.99,100.00,
01/01/2000,00:02:12.98,99.96,
01/01/2000,00:02:17.97,-0.25,
01/01/2000,00:02:22.98,-0.29,
01/01/2000,00:02:27.98,-0.44,
01/01/2000,00:02:32.98,-0.21,
01/01/2000,00:02:37.99,-0.36,
01/01/2000,00:02:42.99,-0.46,
01/01/2000,00:02:47.98,-0.08,
01/01/2000,00:02:52.99,-0.06,
01/01/2000,00:02:57.98,-0.17,
01/01/2000,00:03:02.99,-0.20,
01/01/2000,00:03:07.99,99.65,
01/01/2000,00:03:12.99,99.43,
01/01/2000,00:03:17.99,99.61,
01/01/2000,00:03:23.00,99.44,
01/01/2000,00:03:27.99,100.0,
01/01/2000,00:03:32.98,99.91,
01/01/2000,00:03:37.99,99.95,
01/01/2000,00:03:42.99,99.99,
01/01/2000,00:03:47.99,99.99,
01/01/2000,00:03:53.00,99.94,
01/01/2000,00:03:57.99,100.11,
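As an illustration of the averaging step mentioned above (20 readings per weight value in this run), here is a minimal sketch. `averageRaw` and the `readRaw` callback are hypothetical names; the callback stands in for whatever driver call returns one raw ADC sample, and this is not the library's own averaging code.

```cpp
#include <cstdint>
#include <functional>

// Average `count` raw readings. `readRaw` is a hypothetical callback that
// returns one raw ADC sample each time it is called.
inline std::int32_t averageRaw(const std::function<std::int32_t()> &readRaw, int count) {
    if (count <= 0) return 0;  // nothing to average
    std::int64_t sum = 0;      // 64-bit accumulator so many 24-bit samples cannot overflow
    for (int i = 0; i < count; ++i) {
        sum += readRaw();
    }
    return static_cast<std::int32_t>(sum / count);
}
```

Averaging reduces sample-to-sample noise, but it cannot remove a step that is applied equally to every sample after a wake, which would fit the results reported here: more averaging helps, yet the variation does not disappear.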

I'll release updated versions of the NAU7802 library and the OLA firmware tomorrow. But both release_candidate branches on GitHub are up to date if you want to take a peek.

Enjoy! Paul

PaulZC commented 9 months ago

The updated OLA firmware just went live. The most relevant changes for the NAU7802 are here. You will need to pick your way through this... I went overboard and added: support for none, internal or external calibration; gain; sample rate; and LDO voltage. The external calibration is not a magic fix but, I think, does give improved performance over internal calibration. You need to decide if you want to add gain, sample rate, and LDO voltage as properties too. If you don't allow the LDO voltage to be changed, I'd recommend setting it to 3.0V manually (begin sets it to 3.3V).

Other relevant bits of code are here and here.

gigapod commented 9 months ago

Thanks @PaulZC for all this work over the weekend!

I've taken this updated logic and added it to the DataLogger framework - for a 195g weight, I'm seeing a variation of about +/- 0.6 - around a 0.5% range. And the values are the same across sleep/wake events.

Some more work to do - after a cal or zero cal, the values are not correct until a restart ... I just need to reinit the device after the cal calls. But I should be in a position in a couple of days to post a preview with this update in it for testing.

PaulZC commented 9 months ago

Sorry, I forgot to share more important bits of code:

Here is where the scale is zero'd. This is the only place where the External calibration is performed - if selected.
Here is where the scale calibration weight is applied.
Here is where the gain is set. If the code is using Internal calibration, it is re-calibrated after the gain is changed. This will introduce a step in the data.
Here is the sample rate change. Ditto on internal calibration.
Here is the LDO change. Ditto on internal calibration.
If the calibration mode is changed, everything gets reset here.

PaulZC commented 9 months ago

There is more information in the NAU7802 Library Release Notes and the OpenLog Artemis Release Notes.

Enjoy! Paul

gigapod commented 8 months ago

Fixed: The fix for this is included in v1.2

1.2 Pre Release