CircuitSetup / Expandable-6-Channel-ESP32-Energy-Meter

Hardware & Software documentation for the CircuitSetup Expandable 6 Channel ESP32 Energy Meter. Works with ESPHome and Home Assistant.
https://circuitsetup.us/product/expandable-6-channel-esp32-energy-meter/
MIT License

Different voltage readings for connected voltage inputs #88

Open mmallozzi opened 2 years ago

mmallozzi commented 2 years ago

I'm in the US with split phase 120V/240V AC power. I just set up my first expandable meter, starting with one of my subpanels where all I want to measure right now are a few 240V HVAC appliances which seem to be balanced (all newly wired with 2 wires plus ground, so no neutrals), so I figured I'd be okay with one AC transformer. So I have not cut the jumpers to enable separate transformers (I'll end up doing that for the meter on my main panel). If I understand correctly, that means they are hardwired together, and I should get the same exact reading for V1 and V2. However, I am seeing ~117V for V1 and ~127V for V2. The voltage calibration values in EmonESP are the same for both.

CircuitSetup commented 11 months ago

I am also wondering whether the 4 voltage readings are actually necessary to read 6 power measurements. Would it be possible to just use the L1 and L2 voltages measurements from one of the two boards for the purposes of the entire system? I am guessing not as each chip will independently read voltage & current and calculate power by multiplying the two?

They can't be internally mapped to each other, no. You could do the calculation with an ESPHome lambda, but that would add additional overhead to the ESP32 (which may be okay, but depends largely on how much data you're collecting).

Regardless, you don't need to output all 4 voltages for the power calculations to be done on the metering chips. That happens independently of the ESP32.

descipher commented 11 months ago

Also, I'm not sure if this matters or not since I haven't tested it, but the Application Note indicates the offset calibration should be done before the gain calibration. I believe you have the offset running after the gain in your code.

atm90e32_calibration_flow

I missed that order statement, will change it in case it does matter.

descipher commented 11 months ago

I am also wondering whether the 4 voltage readings are actually necessary to read 6 power measurements. Would it be possible to just use the L1 and L2 voltages measurements from one of the two boards for the purposes of the entire system? I am guessing not as each chip will independently read voltage & current and calculate power by multiplying the two?

They can't be internally mapped to each other, no. You could do the calculation with an ESPHome lambda, but that would add additional overhead to the ESP32 (which may be okay, but depends largely on how much data you're collecting).

Regardless, you don't need to output all 4 voltages for the power calculations to be done on the metering chips. That happens independently of the ESP32.

I can concur that you only need to set the voltage/current gain on the L1 and L2 phase instances for a chip to calculate split phase correctly; the two voltages can be measured from any board. You would need to set the right gains for the input CTs and LINE voltage for the specific circuit. Keep in mind that a voltage can drop significantly if sourced at the end of a heavily loaded conductor. Since a chip has the VA feed sent to all phases, for the highest level of accuracy you want to match it with the corresponding LINE+Voltage. E.g. if it's sent a VA of L1's voltage, then keep the CTs from L1 on that chip.

descipher commented 11 months ago

I am glad to make any changes you require but am unsure I understood your instructions. Below is a simplified version of the YAML where the vn_cal are used.

Main Board
IC1 phase a, b, c use v1_cal
IC2 phase a, b, c use v2_cal

Addon Board
IC1 phase a, b, c use v3_cal
IC2 phase a, b, c use v4_cal

@alexruffell The information I am looking for is whether there are voltage reads coming from an unexpected source register. For example, we read UrmsA and get some other data back that is not the target. That would tell me there are SPI read problems that are not being validated, and thus we need to do something to ensure we are getting back the correct register address data. To do this we only set gain on the specific voltage inputs we expect to be accurate; reads of any other inputs will default to a gain of 0x8000 and will be obviously out of range when calculated.

This temporary test yaml has the substitute variable lines commented out on the non-target phases per chip.

#IC1
  - platform: atm90e32
    id: chip1
    cs_pin: 5
    phase_a:
      voltage:
        name: "L1 V"
        id: ic1Volts
      current:
        name: "L1 A"
        id: ct1Amps
        filters:
          - multiply: 4
      power:
        name: "L1 W"
        id: ct1Watts
        filters:
          - multiply: 4
      gain_voltage: ${v1_cal}
      gain_ct: ${current_cal_ct1}
    phase_b:
      current:
        name: "${appliance1} L1 A" #Dryer
        id: ct2Amps
      power:
        name: "${appliance1} L1 W" #Dryer
        id: ct2Watts
      #gain_voltage: ${v1_cal}
      gain_ct: ${current_cal_ct2}
    phase_c:
      current:
        name: "${appliance2} L1 A" #Oven
        id: ct3Amps
      power:
        name: "${appliance2} L1 W" #Oven
        id: ct3Watts
      #gain_voltage: ${v1_cal}
      gain_ct: ${current_cal_ct3}
    frequency:
      name: "L1 Hz"
      device_class: frequency
    line_frequency: 60Hz
    gain_pga: 1X

#IC2
  - platform: atm90e32
    id: chip2
    cs_pin: 4
    phase_a:
      current:
        name: "${appliance2} L2 A" #Oven
        id: ct4Amps
      power:
        name: "${appliance2} L2 W" #Oven
        id: ct4Watts
      #gain_voltage: ${v2_cal}
      gain_ct: ${current_cal_ct4}
    phase_b:
      current:
        name: "${appliance1} L2 A" #Dryer
        id: ct5Amps
      power:
        name: "${appliance1} L2 W" #Dryer
        id: ct5Watts
      #gain_voltage: ${v2_cal}
      gain_ct: ${current_cal_ct5}
    phase_c:
      voltage:
        name: "L2 V"
        id: ic2Volts
      current:
        name: "L2 A"
        id: ct6Amps
        filters:
          - multiply: 4
      power:
        name: "L2 W"
        id: ct6Watts
        filters:
          - multiply: 4
      gain_voltage: ${v2_cal}
      gain_ct: ${current_cal_ct6}
    frequency:
      name: "L2 Hz"
      device_class: frequency
    line_frequency: 60Hz
    gain_pga: 1X

#IC1 AddOn
  - platform: atm90e32
    id: chip3
    cs_pin: 0
    phase_a:
      voltage:
        name: "L1 AO V"
        id: ic3Volts
      current:
        name: "CT7 Amps"
        id: ct7Amps
      power:
        name: "CT7 Watts"
        id: ct7Watts
      gain_voltage: ${v3_cal}
      gain_ct: ${current_cal_ct7}
    phase_b:
      current:
        name: "${hvac1} L1 A" #Downstairs
        id: ct8Amps
      power:
        name: "${hvac1} L1 W" #Downstairs
        id: ct8Watts
      #gain_voltage: ${v3_cal}
      gain_ct: ${current_cal_ct8}
    phase_c:
      current:
        name: "${hvac2} L1 A" #Upstairs
        id: ct9Amps
      power:
        name: "${hvac2} L1 W" #Upstairs
        id: ct9Watts
      #gain_voltage: ${v3_cal}
      gain_ct: ${current_cal_ct9}
    line_frequency: 60Hz
    gain_pga: 1X

#IC2 AddOn
  - platform: atm90e32
    id: chip4
    cs_pin: 16
    phase_a:
      current:
        name: "${hvac2} L2 A" #Upstairs
        id: ct10Amps
      power:
        name: "${hvac2} L2 W" #Upstairs
        id: ct10Watts
      #gain_voltage: ${v4_cal}
      gain_ct: ${current_cal_ct10}
    phase_b:
      current:
        name: "${hvac1} L2 A" #Downstairs
        id: ct11Amps
      power:
        name: "${hvac1} L2 W" #Downstairs
        id: ct11Watts
      #gain_voltage: ${v4_cal}
      gain_ct: ${current_cal_ct11}
    phase_c:
      voltage:
        name: "L2 AO V"
        id: ic4Volts
      current:
        name: "CT12 Amps"
        id: ct12Amps
      power:
        name: "CT12 Watts"
        id: ct12Watts
      gain_voltage: ${v4_cal}
      gain_ct: ${current_cal_ct12}
    line_frequency: 60Hz
    gain_pga: 1X

descipher commented 11 months ago

I did a test on one of my test rigs: I fed a 6vac L1 source signal into two phase inputs on a single chip, which are networked using 1% divider resistors. I logged the output of that measurement and had a consistent phase input variation of 0.25vac +- 0.01v on the inputs before calibrating the gain of each input phase. After calibration there is never a deviation of more than +- 0.01v in the logs. The code in my repo provides significant voltage accuracy improvements using 10 consecutive averaging reads and the offset calibrations.

CircuitSetup commented 11 months ago

@descipher That's awesome! Thanks so much for taking the time to write and test your code. If you think it's ready for the ESPHome beta, definitely open a pull request.

alexruffell commented 11 months ago

@descipher I commented out the gain_voltage as per your example code, but forgot to change the update rate from 10s to the 3s I use for testing (first red arrow), then I changed it to 3s (second red arrow).

image

While the measurements appear to be quite close, the graph seems all over the place. Not sure how to interpret this...

Another view with a narrower timespan:

image

The gaps are where I flashed the ESP.

descipher commented 11 months ago

There are some interesting things going on now. We now know that all reads are the intended registers and nothing weird is going on with the SPI reads. Note that my recent code checks every read or write for SPI errors and logs them, so check the logs via the ESPHome control panel when you have time.

One important note is that in the last third of the first chart you posted we can see that L1 AO V and L2 AO V are synced, and that's very odd, since you should not see that with those phases. L1 and L2 are normally different voltage values and should not act as a synced offset value. I would expect that if they were on the same phase. The same thing occurs in the middle of the second chart. I am suspicious that maybe we are observing an out-of-order SPI init callback where the slave CS line is not what we think it is on init. We need to rule that out somehow. Can you audit your test system and verify that L1 and L2 are connected the way you expect them to be connected? Can you post a close-up photo of the CS jumpers on the boards?

alexruffell commented 11 months ago

@descipher The voltage reference for both inputs on both boards is still from the main transformer as you had requested a few posts back. I can revert to using the two separate transformers fed by the same outlet to see what happens.

CircuitSetup commented 11 months ago

@descipher The voltage reference for both inputs on both boards is still from the main transformer as you had requested a few posts back. I can revert to using the two separate transformers fed by the same outlet to see what happens.

If they're fed by the same outlet, then they're reading the same phase, which would defeat the purpose of reading 2 voltages.

alexruffell commented 11 months ago

@CircuitSetup The system has been on my workbench for a year... I have not installed it because once I do so, it will be hard to do any troubleshooting to eliminate the issue discussed in this thread. Also, powering everything with one phase should make it easier to calibrate, as I should be seeing the same measurement on all voltage inputs. Once it is installed it will be fed by two phases which will both vary randomly.

@descipher BTW, do you recommend I modify my system to use 2 identical transformers? The main issue is the size but assuming I can fit them, should I do so? The image below shows both the large and small transformers I am using. The small one is only for the other phase voltage reference.

image

EDIT: The large transformer's secondary is wired in series so 12V ac @ 1.2A

Cougar commented 11 months ago

I think it is much easier to do (offset) calibration and voltage difference troubleshooting when both inputs are fed from the same source. As long as the same source gives different readings, there is no point in assuming that different sources are magically more precise.

descipher commented 11 months ago

@alexruffell Right! Forgot we did that. That explains the phases being in sync, so that's good. The reason the read values are close is the averaging function that's now active. These results are not bad. I have changed the way polling data retrieval works: the registers are now read and stored locally, and during a poll the API gets the last stored value. That allows a faster API call, avoiding blocking warnings unless the API itself is the delay. The local refresh is being tested at 500ms. I would like to run it faster to see what a reasonable local load/rate would be. One drawback to this method is that when the refresh is slow, like at 500ms, the point in time of the read value can differ slightly from the point in time of the API call. I am seeing that in your charts. One thing I noticed is that there are read time deltas between the two boards and their separate chips. Each SPI slave will have a small read point-in-time difference, which will give us a different value on the same phase. This is normal and to be expected.

image

descipher commented 11 months ago

@alexruffell I would recommend that the transformers be as close to equal as possible.

alexruffell commented 11 months ago

@descipher I measured the output of the two transformers while connected to the system:

9V transformer outputs 14.5Vac
12V transformer outputs 15.5Vac

Given there is only 1Vac difference, is that good enough to be left as is? I reviewed the space I have and fitting two of these transformers is not going to work so I'd have to find replacements (which I am struggling to do - mouser seems to only have this brand/model. The small one was recovered from a device).

Edit:

I am puzzled... I am now back to using both transformers and tried to tweak the calibration to get L1 and L2 closer. Now L1 is swinging up and down, as can be seen below. Also, once again this swing appears to be around 0.5V. When looking at the voltage with the DMM (see pic below), it also swings up and down but by less than 100mV. This swing may be due to what @Cougar discussed earlier in this thread (power consumption of the ESP32 when TXing). The other transformer only swings 30mV.

image image image

descipher commented 11 months ago

15.46vac into a 20:1 divider will yield an input vrms of 773mv, which is outside the ATM90E32 datasheet's specified range of 0-750mv. You need an input voltage of no more than 12vac to keep it from overflowing the ADC input voltage divider network circuit. The scope tells us the 15.46 vrms is very noisy; not sure what that power pack is doing, but it's ugly...

Correction: the datasheet specs a 720mv maximum.
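The divider arithmetic in this comment can be checked with a few lines of Python. This is only a sketch: the 20:1 ratio and the 720mv limit are the figures quoted in this thread, and the function and constant names are made up for illustration.

```python
def adc_input_mv(transformer_vrms: float, divider_ratio: float) -> float:
    """RMS voltage in mV seen at the metering IC input after the divider."""
    return transformer_vrms / divider_ratio * 1000.0


def max_transformer_vrms(adc_limit_mv: float, divider_ratio: float) -> float:
    """Largest transformer RMS voltage that stays within the input limit."""
    return adc_limit_mv / 1000.0 * divider_ratio


DIVIDER_RATIO = 20.0   # 20:1 divider, as quoted in the thread
ADC_LIMIT_MV = 720.0   # corrected datasheet maximum, as quoted in the thread

measured_mv = adc_input_mv(15.46, DIVIDER_RATIO)            # about 773 mV
ceiling_vrms = max_transformer_vrms(ADC_LIMIT_MV, DIVIDER_RATIO)  # about 14.4 V

print(f"input: {measured_mv:.0f} mV, over limit: {measured_mv > ADC_LIMIT_MV}")
print(f"transformer ceiling: {ceiling_vrms:.1f} Vac")
```

With these numbers the hard ceiling works out to roughly 14.4Vac, so the 12vac recommendation above leaves some headroom for mains fluctuation.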

Cougar commented 11 months ago

You can check how good the transformer output is using just a resistor as a load.

The ESP load is anything but stable. To feed the ESP and measure a stable AC voltage you need a big enough transformer and big capacitors on the DC side to keep the fluctuation below the error margin.

My 9V and 6W (0.67A) transformers were not good enough for that, so I changed the board DC power to a separate DC power supply and use the transformers only for AC voltage measurement, as I described in https://github.com/CircuitSetup/Expandable-6-Channel-ESP32-Energy-Meter/issues/88#issuecomment-1257067335

descipher commented 11 months ago

@CircuitSetup Did a correction to the voltage and current offsets calcs, the application guide defines reading the full 32bit measurement value, I was reading only the upper 16 bits with read16(). Please feel free to test this.
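To illustrate why reading only the upper 16 bits matters, here is a minimal sketch of combining two 16-bit words into one 32-bit value. It assumes the measurement is split into a high word and a low word; the function name and the sample values are made up, and this is not the component's actual read code.

```python
def combine_words(high_word: int, low_word: int) -> int:
    """Join two 16-bit register words into a single 32-bit value."""
    if not (0 <= high_word <= 0xFFFF and 0 <= low_word <= 0xFFFF):
        raise ValueError("each word must fit in 16 bits")
    return (high_word << 16) | low_word


# Made-up example words, not real register contents.
high, low = 0x1234, 0xABCD

full_value = combine_words(high, low)   # full 32-bit measurement
upper_only = high << 16                 # what a 16-bit-only read preserves
lost_bits = full_value - upper_only     # detail dropped by the 16-bit read

print(hex(full_value), hex(lost_bits))
```

Whatever sits in the low word is simply missing from an upper-16-bit read, which is the precision the correction above recovers for the offset calculation.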

alexruffell commented 11 months ago

I am replacing the larger transformer with an IF-14-20 which outputs 10V ac 1.4A when both secondary windings are connected in parallel. This will require a minor PCB hack as the one I am using has two 6Vac windings connected in series.

The current 12Vac nominal output is pretty much all the time over 15Vac (input max with the 20:1 divider):

image

I might be able to make the modification tomorrow.

alexruffell commented 10 months ago

I installed the IF-14-20 transformer which has a nominal output of 10Vac. My DMM measures around 12.7Vac. This is a histogram over a brief period of time so I expect it to be a wider range due to line fluctuations:

image

Both L1 tend to still jump up and down and by different amounts but that might just be the load of the ESP along with time delays and the added math. Just a guess. The two L2 look pretty good as they both track my DMM well.

@Cougar - I've been trying to replicate what you did in Grafana to graph the difference but due to the samples not having the same timestamp, I can't get it to calculate a difference. How did you do your graphs?

@descipher I spent some time trying to calibrate the voltage measurements further and I think I can't get it any better:

image

I caught the screenshot at a good moment, it isn't always as close.

Is there any room for improvement, or should I just call it done and observe over a few days? I'd feel better if both L1 looked more like the 2 L2 (tracking each other closely) but other than implementing @Cougar 's hardware mod, I don't know what else to try. The caps I added to assist with load on 3.3V and 5V rails on the ESP do not seem to do much but I guess I should take a second look given all the changes.

descipher commented 10 months ago

@alexruffell This is an output of 15s samples with only L1 fed into my Power Meter 1 design using a 6vac BLOCK transformer. Calibrations were done for 1% resistors on a 10:1 divider net.

phase_a_voltage_cal: '4470'
phase_c_voltage_cal: '4479'

image

The variance is +-0.02v between the phase a and phase c inputs of the single IC. There will always be some noise. The other design possibility for voltage sensing is to use a directly coupled AC line voltage divider. I agree that the small time delta between API calls does cause some variation in read values when the data is callback polled over separate SPI interfaces.

Cougar commented 10 months ago

@Cougar - I've been trying to replicate what you did in Grafana to graph the difference but due to the samples not having the same timestamp, I can't get it to calculate a difference. How did you do your graphs?

Here is my Flux query for that:

import "join"

left = from(bucket: "emon")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["measurement"] == "EMON-3 IC1 LineA Voltage")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
  |> keep(columns: ["_time", "_start", "_stop", "_field", "_value", "_measurement"])
  |> group(columns: ["_time", "_value"], mode: "except")

right = from(bucket: "emon")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["measurement"] == "EMON-3 IC2 LineA Voltage")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
  |> keep(columns: ["_time", "_start", "_stop", "_field", "_value", "_measurement"])
  |> group(columns: ["_time", "_value"], mode: "except")

join.time(method: "full", left: left, right: right, as: (l, r) => ({l with f2: r._value}))
  |> map(fn: (r) => ({ r with _value: (r.f2 - r._value) / r.f2 }))
  |> drop(columns: ["f2"])
  |> filter(fn: (r) => r._value > -0.01 and r._value < 0.01)

I take the readings of two sensors (with one-second updates) into the left and right tables, then join them by time, calculate the difference and divide by one reading to get a percentage. I filtered out differences bigger than 1% (0.01) just to ignore random peaks and not change the graph's mean calculation. In the graph the unit is percent (0.00-1.00) and I set max and min to 1% (-0.01 to 0.01). If you remove the division part (/ r.f2) then you get the absolute difference value.
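For anyone who wants to try the same idea outside Grafana, the steps above can be sketched in plain Python. This is an illustration, not a translation of the Flux query: it keeps only windows where both sensors have data (the query uses a full join), and the function names and sample readings are made up.

```python
from collections import defaultdict
from statistics import mean


def window_means(samples, window_s):
    """Average (timestamp, value) samples into windows keyed by window start."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[int(ts // window_s) * window_s].append(value)
    return {start: mean(values) for start, values in buckets.items()}


def relative_difference(left, right, window_s=1.0, limit=0.01):
    """Per-window (right - left) / right, dropping outliers beyond +-limit."""
    left_w = window_means(left, window_s)
    right_w = window_means(right, window_s)
    result = {}
    for start in sorted(left_w.keys() & right_w.keys()):
        diff = (right_w[start] - left_w[start]) / right_w[start]
        if -limit < diff < limit:
            result[start] = diff
    return result


# Made-up readings whose timestamps do not line up exactly.
ic1 = [(0.1, 120.0), (0.6, 120.2), (1.2, 120.4), (2.3, 118.0)]
ic2 = [(0.3, 120.3), (1.4, 120.1), (2.6, 120.0)]

diffs = relative_difference(ic1, ic2)
for start, diff in diffs.items():
    print(f"window {start:.0f}s: {diff * 100:+.3f}%")
```

Averaging into shared windows first is what makes the subtraction possible when the two series never share an exact timestamp. The third window is dropped here because its difference exceeds 1%.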

I haven't tried the latest code with changed calibration yet but will do soon. I have three extension modules too, so I can connect up to 8 ATM90E32 chips to one AC input at the same time and compare all of them while powering ESP and ATM90E32 separately from external 3.3V DC.

CircuitSetup commented 10 months ago

@CircuitSetup Did a correction to the voltage and current offsets calcs, the application guide defines reading the full 32bit measurement value, I was reading only the upper 16 bits with read16(). Please feel free to test this.

Oh good, I didn't catch that. Thanks so much for modifying everything! Definitely open a pull request so it gets added to the next dev release.

descipher commented 10 months ago

@CircuitSetup Did a correction to the voltage and current offsets calcs, the application guide defines reading the full 32bit measurement value, I was reading only the upper 16 bits with read16(). Please feel free to test this.

Oh good, I didn't catch that. Thanks so much for modifying everything! Definitely open a pull request so it gets added to the next dev release.

Thanks, will do. I would like to do some more testing first. I sometimes see a weird nuance where the voltage is recorded as 0, which is likely to be real during upstream AC relay load switching, but I need to be certain that it is true by setting up parallel sensors synced up in time for data correlation.

Cougar commented 10 months ago

When I first found the issue, I used 2022.6.0. Unfortunately even the docker image from that time is not usable any more. It used an obsolete PIO Core which no longer works, so it is not possible to easily compile and test this version any more.

The oldest one that I can run is 2022.11.1. With this version I don't see these changing differences between ICs any more. I now have 3 extension boards (8 ICs in total) connected to the same AC source, without the DC rectifier on the first board.

There are constant offsets between ICs and lines, which probably come from the resistors and are fine. The standard deviation of any voltage reading difference is around 0.015 V, which looks better than the datasheet shows. This is what 12h of measurements look like, even if I reset ESPHome every 5 min like before.

Screenshot 2023-08-27 at 16-57-33 EMON - Dashboards - Grafana

descipher commented 10 months ago

@Cougar thanks for the tests, I think we are on the right track. The code changes I made will continuously process sensor data locally in the background every 200ms with averaging over 5 samples. When an HA polling request comes in, it supplies the collected values from local memory stores instead of collecting a polled value directly from the IC. This reduces lag time between polling callbacks and component processing time, as there are no waits other than outside factors like network delay etc. It appears to have a positive result so far.
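The background-refresh-plus-cache pattern described in this comment can be sketched as follows. This is a hypothetical Python illustration, not the ESPHome component code: the class, its method names, and the fake readings are all made up, and a real implementation would call `refresh()` from a timer.

```python
from collections import deque


class CachedSensor:
    """Keeps the latest (optionally averaged) reading in local memory."""

    def __init__(self, read_fn, samples=5):
        self._read_fn = read_fn                # hypothetical bus read callback
        self._window = deque(maxlen=samples)   # last N raw readings
        self.value = None

    def refresh(self):
        """Fast local update (every 200 ms in the comment above)."""
        self._window.append(self._read_fn())
        self.value = sum(self._window) / len(self._window)

    def poll(self):
        """Slow API request: returns the stored value without touching the bus."""
        return self.value


# Fake readings standing in for register reads.
readings = iter([120.0, 120.5, 121.0, 120.5, 120.0, 119.0])
sensor = CachedSensor(lambda: next(readings), samples=5)

for _ in range(5):
    sensor.refresh()
snapshot_after_five = sensor.poll()   # mean of the first five readings

sensor.refresh()
snapshot_after_six = sensor.poll()    # oldest reading has dropped out

print(snapshot_after_five, snapshot_after_six)
```

Setting `samples=1` keeps the cached-polling behaviour while turning the averaging off, which separates the two ideas being discussed here.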

Cougar commented 10 months ago

Your changes are great, and some of them make a lot of sense and make the code more readable.

However, I'm skeptical about averaging readings, even if it makes the result more "smooth". It can't remove the offset anyway. Personally I would not use it (or I would make it optional). Averaging can be done by the user with additional filters in ESPHome if needed. This is probably a good idea anyway when updating data every minute or so, to get more meaningful results.

Also, IC already calculates average over 16 cycles which is 320 ms at 50 Hz or 266 ms at 60 Hz. Data collection at 200 ms interval doesn't make sense here.
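The window lengths quoted above follow directly from the line frequency. A quick check, taking the 16-cycle figure from this comment as given (the function name is illustrative):

```python
def averaging_window_ms(cycles: int, line_frequency_hz: float) -> float:
    """Duration in ms of a fixed number of mains cycles."""
    return cycles / line_frequency_hz * 1000.0


window_50hz = averaging_window_ms(16, 50.0)   # about 320 ms
window_60hz = averaging_window_ms(16, 60.0)   # about 266.7 ms

# Number of 200 ms local reads that fall inside one 60 Hz averaging window.
reads_per_window_60hz = window_60hz / 200.0   # about 1.33

print(f"{window_50hz:.0f} ms, {window_60hz:.1f} ms, {reads_per_window_60hz:.2f}")
```

At a 200 ms interval, consecutive reads land inside overlapping averaging windows, so they are not independent samples.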

IMHO, the positive effect should come from better (or more correct) IC initialization, error checking (and faster?) SPI, reading full register, or independent data collection and sending by ESP.

It would still be good to find out the real cause. I'm not sure if it is worth the time and effort if it is good enough already, but one thing I haven't re-tested is using multiple ESPHome devices.

One of my initial hypotheses was that the value depends on the timing between SPI commands and the IC's internal processing. This could be the reason for the offset jump on an ESPHome device that got rebooted, and it could also be why I don't see it on the setup with 4 boards but one ESP.

So, next I plan to split the one ESP and 4 board setup into 2 ESPs with 2 boards each and see whether it changes anything, to hopefully eliminate this possibility.

alexruffell commented 10 months ago

I concur on not having averaging within the component as I would expect raw data from it, and I would then pick the filter that best fits my needs in ESPHome. Also, the readings seemed more "stable" (ie followed each other more closely) without the additional math. My biggest issue has always been that 1ch randomly gets what seems to be a 0.5V error. At first I thought it was just one of them having the issue but then I noticed the error jumping around between channels. I believe the issue has not been resolved but it is harder to see on the graphs after the averaging was added. I am not 100% certain and unfortunately I have no time to mess with it right now :(

Is there any way to sync up the readings? Simultaneous sampling, or close to it? While this may be overkill in normal use, it would help in troubleshooting. Mains fluctuate a lot and trying to calibrate and validate readings is hard if they are not taken all at the same time.

Edit: Regarding the averaging, if I understand correctly it happens 5x faster (every 200ms) than the data is sent to HA (1 sample / sec). If so, could the averaged 1S/s value be considered a more accurate raw sample? Hope this makes sense...

descipher commented 10 months ago

IC already calculates average over 16 cycles which is 320 ms at 50 Hz or 266 ms at 60 Hz. Data collection at 200 ms interval doesn't make sense here.

Missed that completely. The application sheet did not discuss that, but the datasheet certainly did. The local avg calc needs to be removed; it provides no value. What does provide value is the quick polling from local memory, which has no additional delays.

Do you think adding the ability to set the SPI clock rate from the external yaml config function has value?

Did some more research on the calibrations. When they specify Ub=Uc=Un, Ua=0, Ia=0, you have to have a 0 input level for both the current and the voltage in order to set a valid voltage calibration on Ua. The design of the circuit we use does not allow this, so I need to make it optional in the code for any design that could use it.

CircuitSetup commented 10 months ago

Do you think adding the ability to set the SPI clock rate from the external yaml config function has value?

I don't think this would be necessary since you'd want the fastest rate possible. Slowing it down would only produce slightly less accurate results.

Did some more research on the calibrations. When they specify Ub=Uc=Un, Ua=0, Ia=0, you have to have a 0 input level for both the current and the voltage in order to set a valid voltage calibration on Ua. The design of the circuit we use does not allow this, so I need to make it optional in the code for any design that could use it.

I honestly forgot about this for the voltage. The only way around it is to power the ESP32 separately on startup. I think it is more applicable to the current channels anyway.