Enhancement suggestion - Align Production and consumption data with tesla app and gateway data

jasonacox / Powerwall-Dashboard

Grafana Monitoring Dashboard for Tesla Solar and Powerwall Systems

MIT License

299 stars 64 forks source link

Enhancement suggestion - Align Production and consumption data with tesla app and gateway data #87

Closed BuongiornoTexas closed 9 months ago

BuongiornoTexas commented 2 years ago

Hi @jasonacox and others who may be interested,

I've been looking at the various kWh production and consumption metrics, and I think there may be value in moving to a new method for calculating these.

The current approach is to integrate instantaneous power values over intervals e.g. influxdb generates home load kWh data using integral(home)/1000/3600, and the grafana configuration uses similar queries for for kWh data.

As far as I can tell, the integrals are approximations of the cumulative data available in the following aggregate metering fields:

site_energy_exported
site_energy_imported
load_energy_imported 
load_energy_exported # Should be zero, but a misconfigured meter can result in load exports. Ask me how I know :-(
solar_energy_exported
solar_energy_imported # small values accumulate on this - I suspect from inverter start up and shut down
battery_energy_exported
battery_energy_imported

The advantage to using these measurements is that total power over any interval is the difference between the last reading in the interval and the last reading of the previous interval, or for a long interval, approximately the difference between the first and last reading in the interval - which is a radically simpler calculation than the integral (which also suffers numerical drift - see below) and appears to also be the same data source as the Tesla app (so it is easy to do reconciliation and debugging).

I've put together a proof of concept for the grafana power meters, where I've added a second reading based on differences from the cumulative data:

This was my consumption up to about 1:15pm today. My corresponding app readings were 7.5, 15.9, 5.5 (from solar abs and solar percentage), 1.4, 6.3 and 1.8. The close match between the app readings and the difference readings has been pretty consistent in my testing - differences appear to be tied to when the Tesla database updates (15min-ish) and possible small rounding errors.

The grafana queries for home usage by integral and difference look like:

Similar queries apply for the other meters. If you want to try it your self, the data will only go back three days as I'm taking values from the raw data set, which is only retained 3 days.

I'd like to propose switching to usage based on the cumulative data based on the following benefits:

It aligns usage data with the Tesla app and the data reported from the gateway.
It's much simpler to calculate power usage over an interval and over multiple intervals (use Difference() or Difference(Last()) queries.
I think it may allow elimination of the hourly, daily and monthly cqs, as these can be rebuilt extremely quickly from the cumulative data.
More efficient calculations - calculating production over a long period is the difference of two numbers rather the integral of many short periods (which induces the drift seen above).
It is much more robust to downtime in the logging system - if your database is out for many hours or days, the cumulatives keep accumulating on the gateway and can be pulled in after the database comes back up (may result in odd reporting spikes, but totals remain correct).

If you think this is worth following up, I'm happy to take on the first pass PR - let me know what you think. If there is no interest, I'll let this drop, as I don't want to be maintaining a parallel fork.

One final thing to note - I'm not sure that it will be possible to preserve historical data. Assuming @mcbirse is successful with his historical import, this may not be an issue, and another alternative would be to export all data ahead of making changes, doing appropriate transforms and reimporting.

jasonacox commented 2 years ago

Hi @BuongiornoTexas - brilliant work here! I like the approach and need to think about it more (long day and I'm in a mental void right now :). However, whatever we do needs to somehow preserve the historic data. Many in the community have nearly a year worth of data that helps give a solar-year view of production for future comparison. The delta is minor, but like you, anything we can do to get even more accurate data is worth exploring. I'm probably saying things that don't make sense, but could we alter the CQs to use the diff method for the downsampled data?

mcbirse commented 2 years ago

One final thing to note - I'm not sure that it will be possible to preserve historical data. Assuming @mcbirse is successful with his historical import, this may not be an issue, and another alternative would be to export all data ahead of making changes, doing appropriate transforms and reimporting.

Interesting idea. However with this change, I am wondering how you could then extrapolate kwh, daily, monthly data from autogen for specific time periods?

I will 100% have the historical import program functioning.

This change would possibly affect how I am handling updates to the InfluxDB after import of data from Tesla cloud though?

At the moment, after importing data I am running the SELECT queries based on the integral calculation from autogen into kwh, then daily, monthly, etc. and it is working well. The analysis data graphs will be updated based on the data imported from Tesla cloud.

I'm close to finishing, but also adding some features to the program, such as auto-detecting data gaps (>5m) so this can be filled with the historical data from Tesla cloud (thereby eliminating the need to manually identify missing data or gaps), and safety measures, i.e. ability to reverse/delete anything that was imported from Tesla cloud (and this then requires another run of integral SELECT queries as the data in autogen will change).

jasonacox commented 2 years ago

One thought... is "Tesla cloud" accurate data? I can see us use that data for historic gaps or when data collection wasn't working correctly, but otherwise, I'm not convinced that the sample rate in the Cloud is as good as what we are doing. I say that because I find that the data we are collecting directly seem to match closer to the data usage reported by my power utility (SCE) than the Tesla cloud data. (more research needed here)

I could be wrong on this, just want to highlight that we may be getting more precise data (sample rate) than the Tesla cloud and therefore we would prefer our own samples over the cloud data except when we are missing data (which is what I believe @mcbirse is doing here https://github.com/jasonacox/Powerwall-Dashboard/issues/12).

Now, I will say, that what @BuongiornoTexas is proposing is more about how we compute kWh over durations from our sample data for highest accuracy. The logic behind using the gateway recorded total kWh makes sense as in theory it should be building that integral with high frequency samples, but I do want to validate that so that we aren't actually moving to something less accurate.

BuongiornoTexas commented 2 years ago

I'll hit both of your responses in various orders, and probably in several posts.

long day and I'm in a mental void right now :

Most important things first - I get that! Not far off starting dinner, wine and relaxing for the evening. Will knock a few of these off for you two to think about during my overnight.

I will 100% have the historical import program functioning.

Excellent! I'm expecting to rely on it after I break my own datasets in the future (100% probability given my past history!).

A quick query - if we added the eight variables I listed above to autogen, could your import program bring those values in? I suspect the data should be available, as Tesla is likely to be quite interested in cumulative performance of their systems.

However, whatever we do needs to somehow preserve the historic data.

Fair enough - I think we can do something that will work - later post, as there are some kinks around missing data.

The delta is minor,

Unfortunately, the error in the integrals is likely to get larger over time (just the nature of the numerical approximations in these calculations) - e.g. the difference between the integral and cumulative home usage rises to 3kW over 3 days with my data. I'm pretty sure the cumulative method is self correcting between intervals (error in one interval is fixed in the the next). The worst cumulative error should remain relatively constant and small over time. I'm doing some testing on this, and so far it has held.

However with this change, I am wondering how you could then extrapolate kwh, daily, monthly data

It's the joy of the series being cumulative. If I want to know power consumption between the 1st and the 30th of September, I literally collect the series values at two the time points closest to the start and end of the month and take the difference. Likewise for any given day - I take the difference between the first and last reading of the day.

The one thing it took me a while to get my head around - when you gather data for an interval, you don't take the mean or any other representative value over the period - you take the last value as that represents total cumulative consumption/generation to the end of the interval. I can put together some data based versions if my text isn't making sense (I know what I want to say, but I'm pretty sure my terminology is not quite right).

Ugh -Alright - maybe a long post as the starter.

BuongiornoTexas commented 2 years ago

Validation and logic thoughts:

The logic behind using the gateway recorded total kWh makes sense as in theory it should be building that integral with high frequency samples, but I do want to validate that so that we aren't actually moving to something less accurate.

Agreed - Although I'm not sure how to do that with high accuracy! We are already playing with a hand tied behind our back because we can't sample as frequently as the Tesla, and we are relying on data it has generated! The best tests would be using a independent meters where we can - my thinking on this is independent clamp meters, inverter generation data and utility metering data (gold standard!). Ironically, I started down this path because I wanted to be able to compare inverter and tesla metering data so that I could stop going out in the rain to read the inverter without a wifi connection. Now I might need to do it to validate the Tesla readings.

therefore we would prefer our own samples over the cloud data

Agreed again - if we go down the cumulative kWh path, I would suggest:

Use cumulative going forward.
Reconstruct cumulative from our historical sampled data where available
Fill any gaps with Tesla cloud data per @mcbirse #12.

I say that because I find that the data we are collecting directly seem to match closer to the data usage reported by my power utility (SCE) than the Tesla cloud data. (more research needed here)

Agreed again. I'm much more interested in matching utility data than Tesla's internal math. I'm hoping that a) Tesla does things the same way as the utilities and b) per your point, that using local data will allow us to match the utility meter relatively well. I'm assuming the cloud data is at lower resolution, so can't be useful for high quality metering - @mcbirse - can you comment on the frequency of the cloud data points for this issue?

I've been doing a bit of digging on this as well - the Tesla records cumulative data in (at least) two different modes. The aggregates seem to do net metering on an instantaneous basis (i.e. netted out over all phases), where as the CT meters in vitals seem to record total export over the phases exporting and total import over the phases importing. I don't think this helps, but it does confirm it won't be straighttfoward to work out what is going on!

And that's my limit for today - thoughts on reconstructing data and modifying existing queries tomorrow.

BuongiornoTexas commented 2 years ago

Implementation detail for later follow up if needed: There is an offset between the gateway cumulative values and the app reported lifetime values. I'm guessing that this is a combination of factory testing, site commissioning and any period of incorrectly configured meters during the site setup period.

Assuming the cumulative works as described above, this will be a constant offset for each reading.

If we want to present dashboard values that align with the app, we will either need the user to provide manual calibration data or figure out an automated value. Possibly using the historical data extraction process?

BuongiornoTexas commented 2 years ago

Spent a lot of my spare time today chasing a water leak, so not much progress on this.

Onto implementation and historic data conversion.

Still all in my head, so there will be wrinkles. I'll go through this with solar production to home as the sample variable.

The first thing we will need to do is add the cumulative variable to the autogen retention policy. We could possibly a lower resolution retention policy if we are happy reporting power on say 5 or 10 minute intervals - but I'll go with adding to the autogen for now.


CREATE CONTINUOUS QUERY cq_autogen ... SELECT LAST(solar_energy_exported)/1000 as solar_cumul ... GROUP BY time(1m) ... FILL(???)

So at the end of this, each 1m interval should contain the value of total lifetime solar export by the end of that minute. Not sure what fill should be - linear, null and previous all seem to be contenders.

The kwh, daily and monthlies are generated with:


CREATE CONTINUOUS QUERY cq_kwh ON powerwall RESAMPLE EVERY 1m BEGIN SELECT DIFFERENCE(LAST(solar_cumul)) as solar ... FROM autogen.http GROUP BY time(1h), month, year tz('America/Los_Angeles') END

CREATE CONTINUOUS QUERY cq_daily ON powerwall RESAMPLE EVERY 1m BEGIN SELECT DIFFERENCE(LAST(solar_cumul) as solar ... FROM autogen.http GROUP BY time(1d), month, year tz('America/Los_Angeles') END

CREATE CONTINUOUS QUERY cq_monthly ON powerwall RESAMPLE EVERY 1m BEGIN SELECT DIFFERENCE(LAST(solar_cumul) as solar ... FROM autogen.http GROUP BY time(365d), month, year tz('America/Los_Angeles') END

I've edited the query from the current code - should the monthly be grouping by 365d, or should this be monthly, and do we really need resampling at 1min for daily and monthly?

I'm also assuming sampling from autogen will be more efficient than retaining the sums - we may want to test this, as both are workable alternatives.

All of this works on my understanding that Difference(Last()) when combined with GROUP by Time calculates the value in each interval as (last value in current interval) - (last value in previous interval).

We apply similar queries in grafana.


# For time series, either keep current form time series based on kwh, daily RPs, or:

SELECT difference(last("solar_cumul")) FROM "autogen"."http" WHERE $timeFilter GROUP by time(1m) fill(linear) tz('America/Los_Angeles')

# DIFFERENCE without LAST and GROUP by time generates one value (last value in time range) - (first value in time range)
# This is a difference between grafana and InfluxQL (differences sequential values) - see https://grafana.com/docs/grafana/latest/panels/calculation-types/

SELECT difference("solar_cumul")  FROM "autogen"."http" WHERE $timeFilter tz('America/Los_Angeles')

but could we alter the CQs to use the diff method for the downsampled data?

Yes - see above. May or may not be more efficient than generating directly from the cumulatives. May be worth testing to see which is better, or just go with least change.

With luck, I'll do one more post only tomorrow on reconstructing cumulative curves from historical autogen data.

mcbirse commented 2 years ago

@mcbirse - can you comment on the frequency of the cloud data points for this issue?

The power usage data from Tesla cloud is available in 5 minute intervals. Here's an example response:

{
    "timestamp": "2022-04-18T12:10:00+10:00",
    "solar_power": 7522,
    "battery_power": -4750,
    "grid_power": -1675.8333333333333,
    "grid_services_power": 0,
    "generator_power": 0
}

From this I am calculating the appropriate field values for import into autogen.http as below.

# Calculate power usage values (TODO: check 'home' power usage calculation is correct during VPP event - NOTE: response includes 'grid_services_power')
home = d['solar_power'] + d['battery_power'] + d['grid_power']
solar = d['solar_power']
from_pw = d['battery_power'] if d['battery_power'] > 0 else 0
to_pw = -d['battery_power'] if d['battery_power'] < 0 else 0
from_grid = d['grid_power'] if d['grid_power'] > 0 else 0
to_grid = -d['grid_power'] if d['grid_power'] < 0 else 0

I have a note about checking the 'home' usage calc is correct during a VPP event - which I hope @cwagz will be able to help me with as I know he has some data on that due to his VPP event on 17th August posted here: #65 😊

I was doing some further checking tonight of what other data is available from the cloud. It is possible to get daily energy use, as below, which may be a more accurate way to rebuild the analysis data rather than the integral SELECT queries perhaps. I'm guessing this would therefore match exactly with the Tesla app (although as mentioned, do we even know that is accurate anyway?)

{
    "timestamp": "2022-04-24T01:00:00+10:00",
    "solar_energy_exported": 41407.67546323687,
    "generator_energy_exported": 0,
    "grid_energy_imported": 178.30743105325382,
    "grid_services_energy_imported": 0,
    "grid_services_energy_exported": 0,
    "grid_energy_exported_from_solar": 28398.867932646535,
    "grid_energy_exported_from_generator": 0,
    "grid_energy_exported_from_battery": 0,
    "battery_energy_exported": 8130,
    "battery_energy_imported_from_grid": 0,
    "battery_energy_imported_from_solar": 7990,
    "battery_energy_imported_from_generator": 0,
    "consumer_energy_imported_from_grid": 178.30743105325382,
    "consumer_energy_imported_from_solar": 5018.807530590333,
    "consumer_energy_imported_from_battery": 8130,
    "consumer_energy_imported_from_generator": 0
}

A note on this:

I've edited the query from the current code - should the monthly be grouping by 365d, or should this be monthly,

I was wondering the same thing, and discovered this is a limitation of the version of InfluxDB the dashboard is using. Version 1.8 does not support GROUP BY (1mo) (this was added in InfluxDB v2 onwards).

So that query is relying on tagging the data by the month and year instead, and what you end up with is one data point per year (not sure how the date of that data point is chosen), like below, which can be used to identify the monthly values.

Not really a big issue I believe?

BuongiornoTexas commented 2 years ago

@mcbirse - that energy import/export data is the kind of thing I was hoping to see. It's also a more granular breakdown than I was expecting. If this is available on 5 minute intervals, it may well be acceptable for many/most peoples requirements for power consumption and generation logging. I'd fall into this camp - I like to see instant power at high frequency, but the kWh loads shouldn't change nearly as fast, and I'd be happy enough with 5 minute updates.

All three of us have also touched on accuracy, so its probably worth figuring out what we mean by it and where it matters to us (I know I have a tendency to get caught up on interesting problems, which aren't always the important ones).

For me, accurate is matching my utilities metering data. So I'm hoping that Tesla uses similar calculation methods and generates data that follows what the meter is reporting fairly closely (slow drift is fine, as long as it is consistent and easy to correct periodically). This is the important one for me. Which means taking lots of meter readings and cross checking the Tesla.

@jasonacox raises two more points relating to the Tesla Cloud data.

The local gateway will have higher frequency data than the cloud. So we'll definitely prefer it for data where frequency matters. But the data above suggests we can get more detail at lower frequency for some things - so maybe we'll find a use for this data where frequency is less important.
The gateway and Tesla Cloud data may not match. I think this is well worth checking. I'd be both surprised and unhappy if the differences were anything more than offsets for cumulative data (this kind of data really should be consistent over time with maybe a consistent difference). Point data is a bit harder - the cloud data might be a representative value (mean), a sampled value or an instant reading - it'll be hard to test that, and harder to validate if the Tesla is using more data points than we have available!

Finally there is the accuracy of our calculations vs data available from the Tesla - I'd tend to the view that I would expect the gateway would be doing more accurate calculations than we can achieve - which is why I suggested the cumulative approach vs using influx integrals. But, I also think it is worth doing our calculations to check that the Tesla data and our understanding are consistent.

Finallly, finally there's alignment with the App - as this is a black box with at best +/- 0.1 kW/kWh precision, I'm happy with preferring data from any other source, and will take any reasonable match we get to the app as a happy bonus.

mcbirse commented 2 years ago

@mcbirse - that energy import/export data is the kind of thing I was hoping to see. It's also a more granular breakdown than I was expecting. If this is available on 5 minute intervals, it may well be acceptable for many/most peoples requirements for power consumption and generation logging.

I did mention, unfortunately, that the "import/export" data from Tesla cloud is available daily only (not per 5 minute). Only the power usage data is available per 5 minutes (which is what I am using to populate autogen.http fields, and it works very well / closely matches what you would normally see logged in the dashboard per my post in #12 where I showed some images of the graphs for comparison).

However, for the purpose of importing historical data at least, this daily import/export data from Tesla could still be useful to populate the daily & monthly retention policies in InfluxDB and more accurate than the integral sampled from autogen (note: kwh however is hourly, so would still need to be generated from autogen). I could use this perhaps, but it is more work and also at the moment it is rather moot as I believe there are no queries at all of the daily or monthly retention policies in the dashboard.

BuongiornoTexas commented 2 years ago

I did mention, unfortunately, that the "import/export" data from Tesla cloud is available daily only (not per 5 minute).

My apologies - I misinterpreted your post to mean daily values updated every 5 minutes. That's a shame, as it would have made pulling consistent cumulatives easy.

But, I think we can still do a pretty reasonable job of rebuilding the historical cumulative values using the data we have available. My rough first thoughts on the process for each cumulative - for solar in this example:

Use @mcbirse process to fill historical gaps in autogen and kwh.
Find the oldest cumulative data point/timestamp when we started to logging cumulatives: (first_solar_date, first_local_solar_cumul).
Working backwards from this value day by day in the history:
- Assuming we trust them, get the start of day and end of day cumulative values from the tesla cloud (day_start_solar_cumul, day_end_solar_cumul), and calculate:
```
expected_production = day_end_solar_cumul - day_start_solar_cumul
# for partial days at start and of record set, use:
expected_production = first_local_solar_cumul - day_start_solar_cumul
expected_production = day_end_solar_cumul - earliest_record_solar_cumul
```
- Use the historical data and integral to create kwh data at whatever interval we want to keep for each day/partial day.
- Sum the integral kwh for the period (integral_sum).
- Rescale each interval record in the day by: expected_production / integral_sum.
- Create the cumulative values from the start day value and doing a cumulative sum on the interval values we have generated (or work backwards from the end of the day - the result should be the same either way).
- Write the synthetic cumulative to the database.

This is a messy process, but does two things:

Preserves the shape of the historical production data.
Scales it to match the Telsa cumulative values on a day by day basis - eliminating long term differences between our logged data and the tesla data.

If we decide that the Tesla cloud cumulatives are not trustworthy, then we eliminate the scaling part and the history filling becomes less messy.

This change would possibly affect how I am handling updates to the InfluxDB after import of data from Tesla cloud though?

Based on the on your most recent posts, I don't think it changes anything you are doing right now. If we adopt the cumulative approach, we will need to add a few more elements as outlined above. I assume we would run your tool as a utility as an intial fix and then periodically to catch any future data gaps?

OK, I think that's the last of my brain dump. I'll leave it to @jasonacox and the community to decide if it is worth trying this enhancement.

cwagz commented 2 years ago

I have a note about checking the 'home' usage calc is correct during a VPP event - which I hope @cwagz will be able to help me with as I know he has some data on that due to his VPP event on 17th August posted here: #65 😊

@mcbirse - I would be glad to help. Just let me know what to do. I am not sure there will be any further VPP events this year. We had about 8 total during the recent heatwave.

jasonacox commented 2 years ago

@BuongiornoTexas thanks for the research here! Increasing accuracy would be a great enhancement. Even if we can't verify that it actually delivers on that promise, it would be good to allow users to select something that better matches the Tesla app/gateway data.

My proposal is that we take an incremental approach to the new methodology instead of a replacement approach. By that, I mean we allow for BOTH methods to compute and represent kWh. This would allow for users to continue as-is, switch to the new, or use both.

What would be the way to proceed? What would be needed?

Creating and using new database/series data and additional CQs for InfluxDB
New (separate) dashboard.json that uses the new methodology/data - we could even present a combined dashboard that compares the differences.
Data backfill which converts historic data into the new methodology (and of course, even backfill gaps in the existing datastore).

What do you think? What am I missing (I know I must be missing something)? 😊

BuongiornoTexas commented 2 years ago

My proposal is that we take an incremental approach to the new methodology instead of a replacement approach. By that, I mean we allow for BOTH methods

What do you think? What am I missing (I know I must be missing something)?

Sounds good - this is also where my thinking was going (I hadn't got as far as parallel dashboards). At a high level, I don't think anything obvious is missing.

If you are happy, I will put together a PR addressing the new data series and the parallel dashboard. It may take a while, as I'll need to get to grips with the TICK stack, and I'm rusty on shell scripting and git. We'll see if that throws up any issues, and we can look at integrating historical cloud data later - I'm expecting we will be well behind Michael's work on this!

@mcbirse - I'm sure you've already seen, but I came across this post #https://github.com/influxdata/influxdb/issues/3904#issuecomment-268918613 and which references this project #https://github.com/hgomez/influxdb - both of which may be useful in your data history recovery.

mcbirse commented 2 years ago

@mcbirse - I would be glad to help. Just let me know what to do. I am not sure there will be any further VPP events this year. We had about 8 total during the recent heatwave.

Thanks @cwagz - much appreciated. The Tesla history program will have debug and test options so we will be able to run it for that day and simply output the responses to a file instead of importing data.

My reason for needing to check this is because of the grid_services_power value in the response and not knowing exactly how Tesla allocates the power amounts. If this field is just used to show the component of power that was used for grid services, then we should be good and the calculated amount for home usage would be correct.

@mcbirse - I'm sure you've already seen, but I came across this post #influxdata/influxdb#3904 (comment) and which references this project #https://github.com/hgomez/influxdb - both of which may be useful in your data history recovery.

Thanks @BuongiornoTexas - this discussion seems relevant to fixing our incorrect month tags. I had an idea about this using a similar concept which I'm still to look into, but am pretty sure I can write a tool for this if there is interest for it.

BuongiornoTexas commented 9 months ago

I've now got slightly more than a year of data, and can confirm the following:

The powerwall dashboard integral based on the Telsa data does drift from values taken from my utility metering.
The drift is small, but appears to have a consistent bias. For example, the long term error between the dashboard export calculation and the metered value has increased steadily from -0.1% to 0.8% over a 14 month period. At the end of the period, total export stands at about 10MWh, so this is not a problem with small numbers. I see similar effects on my off peak import, with a magnified error dues to smaller total imports (~500kWh).
The error is only an issue for my self consumption value calculations, as I use the utility metering as definitive for export. I'm also happy with the error, as the bias tends to slightly inflate the value of self consumption (and the cumulative error is not large enough to really matter in these calcs).

I'm pretty sure it will be possible to dig out better data from the powerwall API, but I've got other projects chewing my time. So I'm going to close this as a) a real but not major issue and b) requiring significant additional work to address. As and when someone has the time (maybe even future me), they will need to dig into the API, develop and test a better reconciliation calculation and prepare a PR to address the issue.

Closing as will not implement.

BuongiornoTexas commented 9 months ago

Whoops - closed in previous comment.

jasonacox commented 9 months ago

Thanks, @BuongiornoTexas - This is great research and useful for anyone wanting understanding the error. And of course, happy to have anyone help improve.