jasonacox / Powerwall-Dashboard

Grafana Monitoring Dashboard for Tesla Solar and Powerwall Systems
MIT License

Feature request: heatmap chart #78

Open matthew-pakulski opened 1 year ago

matthew-pakulski commented 1 year ago

I found someone using a heatmap to chart solar data over time. Thought something like this would be a great addition to the dashboard.

image https://enlighten.enphaseenergy.com/pv/public_systems/3J9a932010/overview

jasonacox commented 1 year ago

Thanks @matthew-pakulski ! We have the "Solar Year" graph, but the heat map is very cool. Do you know if this is a Grafana panel? If you can find the panel JSON for this, please post it! Thanks!

youzer-name commented 1 year ago

This may not be possible using InfluxDB and InfluxQL. It might require using Flux as the query language since InfluxQL can't do anything like "datepart" to get the month or hour from the row data. Doing the heatmap would require having an 'hour' column on each row and a numeric 'month' column. Right now we only have the year and an alpha 'month' column.

I ran into this issue when I tried to make a chart like the one that shows the past month of home use (or solar, or powerwall use, etc), but I wanted it to show a year's data with one bar showing the total for each month.

I was able to get close...

image

I can use this query:

SELECT sum("home") FROM "daily"."http" WHERE ("month" != '') AND $timeFilter GROUP BY "month" tz('America/New_York')

The problem is I don't see any way to sort it in the correct order. Influx returns the totals sorted alphabetically by month and I can't see how to re-order it in Grafana.

Maybe we could update the Telegraf -> InfluxDB process to also store the month number, but a little Googling suggests that isn't at all simple to do. The date processor in Telegraf appears to only output the month as a three-letter string (e.g. "Sep"). There may be a Telegraf plugin that would allow this, but it might be more trouble than figuring out how to make this work with Flux queries.

Then again, between typing the last paragraph and this one I spent about an hour trying to figure out Flux. I was able to get a query to return the month numbers, but it throws an error when trying to sum the values. Google search results say this might be a limitation of running it on the Raspberry Pi. I'm giving up for now.

This would be so much easier with a more standard SQL-type database (like MySQL). I know my way around those. InfluxDB seems to have so many limitations.
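For example, in MySQL something like this would give one correctly-ordered row per month (assuming the daily kWh data were in a table called daily with a DATETIME column called datetime - those names are just placeholders, not something we have today):

SELECT DATE_FORMAT(datetime, '%Y-%m') AS month,
       SUM(home) AS home_kwh
FROM daily
GROUP BY DATE_FORMAT(datetime, '%Y-%m')
ORDER BY month;

Because the grouping key is 'YYYY-MM', it sorts chronologically instead of alphabetically.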

jasonacox commented 1 year ago

Thanks for trying @youzer-name !

I may not be following everything, but looking at the heat map, it is basically just time series data with the quanta set to "per day" and each block colored based on that daily value. This is essentially the same thing we have in our "Solar Year" graph:

image

That graph is based on the daily kWh data built using an InfluxDB continuous query:

CQ:

CREATE CONTINUOUS QUERY cq_kwh ON powerwall RESAMPLE EVERY 1m 
BEGIN 
    SELECT 
        integral(home)/1000/3600 AS home, 
        integral(solar)/1000/3600 AS solar, 
        integral(from_pw)/1000/3600 AS from_pw, 
        integral(to_pw)/1000/3600 AS to_pw, 
        integral(from_grid)/1000/3600 AS from_grid, 
        integral(to_grid)/1000/3600 AS to_grid 
            INTO powerwall.kwh.:MEASUREMENT 
    FROM autogen.http GROUP BY time(1h), month, year tz('America/Los_Angeles') 
END

Grafana Query:

SELECT sum("solar") 
    FROM "kwh"."http" WHERE ("solar" > 0) 
        AND time >= 1632362261667ms and time <= now() 
        GROUP BY time(1d) fill(0) tz('America/Los_Angeles')
SELECT sum("home") 
    FROM "kwh"."http" WHERE ("home" > 0) 
        AND time >= 1632362261667ms and time <= now() 
        GROUP BY time(1d) fill(0) tz('America/Los_Angeles')

mcbirse commented 1 year ago

One thing to note here, with the CQs and how data is being stored in autogen, kwh, etc., is that all of the data points are being tagged with a "month" and "year" value.

This is created from the GROUP BY clause of the CQ:

FROM autogen.http GROUP BY time(1h), month, year tz('America/Los_Angeles')

A comma after the "time" period is used to add tags to the data. I've only learnt this recently while working on the Tesla history data import program, as I need to replicate the behaviour with the imported data.

As @youzer-name's example query shows, the data is being grouped by the tag values rather than by specific time periods. If it could be done by time period instead, maybe it could work?
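Something like this would be the time-period version (untested), but InfluxQL's GROUP BY time() only accepts fixed durations, so the buckets won't line up with calendar months:

SELECT sum("home") FROM "daily"."http" WHERE $timeFilter GROUP BY time(30d) fill(0) tz('America/New_York')

A time(30d) bucket drifts against the real month boundaries, which is probably part of why the month/year tags were added in the first place.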

The month tag will be a three-letter abbreviation of the month name, e.g. "Sep", which is also why it won't sort well.

I've also noticed what seems to be an issue with the tagging of data by "month", "year", etc., in that the tags appear to be applied in UTC regardless of a tz('your/timezone') clause in the query.

Here's an example, where you can see the "month" tag of my data changing over at 10am, since my timezone is +10 (at the moment) 😞

image

@youzer-name I'm curious, do you see the same timezone issue with tagging by "month" "year" in your data?

jasonacox commented 1 year ago

Wow @mcbirse - that is odd. Can you share the full query? I ran some tests against InfluxDB and it seems to honor timezone (tz). I wonder if this is a Grafana 9 issue, or did you see this before?

mcbirse commented 1 year ago

@jasonacox I don't think this is a Grafana issue (actually haven't got around to upgrading my production server to 9 yet!)

You can see the issue if you just query the InfluxDB directly.

Three query examples below:

  1. The first uses times in my timezone (for the start of Sep - note the month tag shows "Aug")
  2. The second is the same query, but with times in UTC and tz() added, just to confirm the same data is returned
  3. The third shows the data returned from 9:55am to 10:05am - you can see this is where the month tag changes over to "Sep"
InfluxDB shell version: 1.8.10
> SELECT * FROM autogen.http WHERE time >= '2022-09-01T00:00:00+10:00' AND time <= '2022-09-01T00:05:00+10:00'
name: http
time                from_grid          from_pw            home               host month percentage         solar              to_grid            to_pw url year
----                ---------          -------            ----               ---- ----- ----------         -----              -------            ----- --- ----
1661954400000000000 7.636363636363637  316.3636363636364  318.34090909090907      Aug   56.326253953908726 -8.818181818181818 1.0909090909090908 0         2022
1661954460000000000 3.909090909090909  317.27272727272725 313.6818181818182       Aug   56.281066425666516 -8.909090909090908 7.909090909090909  0         2022
1661954520000000000 8.363636363636363  302.72727272727275 299.6818181818182       Aug   56.23587889742431  -8.727272727272727 2.272727272727273  0         2022
1661954580000000000 14.090909090909092 294.54545454545456 296.8863636363636       Aug   56.19069136918211  -8.818181818181818 1.9090909090909092 0         2022
1661954640000000000 9                  298.1818181818182  303.77272727272725      Aug   56.1455038409399   -8.636363636363637 2.5454545454545454 0         2022
1661954700000000000 9.818181818181818  300                299.52272727272725      Aug   56.09278505799066  -8.636363636363637 0.6363636363636364 0         2022
> SELECT * FROM autogen.http WHERE time >= '2022-08-31T14:00:00Z' AND time <= '2022-08-31T14:05:00Z' tz('Australia/Sydney')
name: http
time                from_grid          from_pw            home               host month percentage         solar              to_grid            to_pw url year
----                ---------          -------            ----               ---- ----- ----------         -----              -------            ----- --- ----
1661954400000000000 7.636363636363637  316.3636363636364  318.34090909090907      Aug   56.326253953908726 -8.818181818181818 1.0909090909090908 0         2022
1661954460000000000 3.909090909090909  317.27272727272725 313.6818181818182       Aug   56.281066425666516 -8.909090909090908 7.909090909090909  0         2022
1661954520000000000 8.363636363636363  302.72727272727275 299.6818181818182       Aug   56.23587889742431  -8.727272727272727 2.272727272727273  0         2022
1661954580000000000 14.090909090909092 294.54545454545456 296.8863636363636       Aug   56.19069136918211  -8.818181818181818 1.9090909090909092 0         2022
1661954640000000000 9                  298.1818181818182  303.77272727272725      Aug   56.1455038409399   -8.636363636363637 2.5454545454545454 0         2022
1661954700000000000 9.818181818181818  300                299.52272727272725      Aug   56.09278505799066  -8.636363636363637 0.6363636363636364 0         2022
> SELECT * FROM autogen.http WHERE time >= '2022-09-01T09:55:00+10:00' AND time <= '2022-09-01T10:05:00+10:00'
name: http
time                from_grid          from_pw home               host month percentage        solar             to_grid            to_pw              url year
----                ---------          ------- ----               ---- ----- ----------        -----             -------            -----              --- ----
1661990100000000000 0.6363636363636364 0       392.74999999999983      Aug   65.5294472059045  5994              631.9090909090909  4960.909090909091      2022
1661990160000000000 0                  0       408.56818181818164      Aug   66.09429130893206 5899.909090909091 582.2727272727273  4904.545454545455      2022
1661990220000000000 2.272727272727273  0       372.4318181818182       Aug   66.58382286488929 6012.727272727273 1212.5454545454545 4420.909090909091      2022
1661990280000000000 0                  0       387.5681818181818       Aug   67.15619822262389 7179.272727272727 1797.3636363636363 4992.727272727273      2022
1661990340000000000 4.363636363636363  0       381.74999999999994      Aug   67.72857358035849 6794.363636363636 1478.8181818181818 4934.545454545455      2022
1661990400000000000 0                  0       387.1136363636362       Sep   68.28588642867902 6869.181818181818 1483.8181818181818 4992.727272727273      2022
1661990460000000000 2.5454545454545454 0       375.24999999999983      Sep   68.79048049405031 5788.454545454545 822.9090909090909  4585.454545454545      2022
1661990520000000000 5.909090909090909  0       389.9318181818181       Sep   69.33273083295677 5745.818181818182 814.1818181818181  4557.272727272727      2022
1661990580000000000 6.545454545454546  0       368.81818181818176      Sep   69.80719987949993 4919.727272727273 289.6363636363636  4255.454545454545      2022
1661990640000000000 9.909090909090908  0       369.70454545454544      Sep   70.2440126525079  3786.181818181818 9.545454545454545  3417.2727272727275     2022
1661990700000000000 5                  0       375.6590909090908       Sep   70.65070040668775 5927.909090909091 1192.2727272727273 4366.363636363636      2022

Because of this issue, grouping by the month/year tags seems completely useless, unless you live in the UTC timezone and don't have daylight saving!

Still curious to know if other people see the same issue here?

@jasonacox Can you do a similar check on your data? I think your timezone would be -7 from UTC.
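Something like this should show it - the window spans UTC midnight at the start of Sep, which would be 5pm on Aug 31 your time. If the month tag flips from "Aug" to "Sep" there, you're seeing the same issue:

SELECT * FROM autogen.http WHERE time >= '2022-08-31T23:50:00Z' AND time <= '2022-09-01T00:10:00Z' tz('America/Los_Angeles')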

If no one else is seeing this problem, then there could be an issue with my setup somewhere....

mcbirse commented 1 year ago

Moved this to issue #80, as I may have discovered it is a problem with telegraf.conf.

youzer-name commented 1 year ago

After a lot of trial and error (mostly error) I was able to copy all of my InfluxDB data to MySQL and get a query of solar production by month. Still no luck on the heatmap because I'm not sure that a heatmap in Grafana is exactly the same thing as the heatmap in the OP's example. I'll probably keep playing with the heatmaps later...

image

image

The copy of the data was accomplished using a Python script, so it should be easy to schedule the script to run and keep the data flowing into the MySQL database.

If anyone is interested in doing something like this, I'll do a separate post under 'show & tell' as there were a bunch of steps needed to get to this result.

jasonacox commented 1 year ago

Nice work @youzer-name ! I definitely think we should add this MySQL data pipeline script to the tools folder for anyone wanting to do the same.

youzer-name commented 1 year ago

@jasonacox Ugh... I found something else amiss in my data as a result of getting it over into MySQL. The issue @mcbirse noticed with the timezones and month tags also spills over into the downsampled daily data.

Background info: I just updated the telegraf.conf file to add the TZ reference per the recent discussion and bug report, and recreated the stack after that. I am using America/New_York for my TZ (UTC-04:00 during DST, UTC-05:00 in standard time).

When I tried to set the datetime as a primary key in MySQL on the 'daily' table (to prevent duplicates being created) I got an error that the key couldn't be created as there were duplicate values in the datetime field.

Looking at the data, I have TWO entries for the last day of each month, each with a different value in the 'month' field.

2022-08-29T04:00:00Z 27.61983005411255   9.093889971139971    50.55084173656204       Aug   26.54738099927849      0.09192271464646465   12.595960317460316        2022
2022-08-30T04:00:00Z 19.71932777777778   13.123890782828282   41.86573716856061       Aug   20.400520580808077     0.11669671717171716   11.23269696969697         2022
2022-08-31T04:00:00Z 3.800571969696969   3.667393939393939    7.439026704545457       Sep   -0.0024954545454545455 0.018624242424242425  0.0024848484848484843     2022
2022-08-31T04:00:00Z 8.392989318181817   5.115769696969697    26.56693058080808       Aug   24.57233343434344      0.11186666666666668   11.380322474747475        2022
2022-09-01T04:00:00Z 46.81284646464646   3.8834981060606055   61.56610385416667       Sep   28.972736111111104     0.10087462121212122   17.993631944444445        2022

The entry showing the 'next month' (so in this case, the entry on 8/31 that shows 'Sep') never seems to show much in the way of solar, leading me to think this entry is always only using data from the night time.

When I sum up the hourly data from kwh using this query (in MySQL):

select sum(home) from kwh where datetime >= '2022-08-31 04:00:00' and datetime < '2022-09-01 00:00:00'

I get 26.56, which is the value in the daily table for 8/31/22 Aug

When I sum up the hourly data for the last four hours of the day (local time)

select sum(home) from kwh where datetime >= '2022-09-01 00:00:00' and datetime < '2022-09-01 04:00:00'

I get 7.43, which is the value in the daily table for 8/31/22 Sep.

I guess this is a result of the downsampling to 'daily' happening every hour. In the EDT timezone, for the first 20 hours of the last day of each month the process wrote the downsampled data to a row with the correct month, and for the last 4 hours of the day it wrote to a new row with the wrong month.

This doesn't really create too big an issue if you ignore the month column (even though using the month data was kind of the whole reason I moved this into MySQL), except that it means I can't create a primary key on the datetime field to prevent duplicate entries. I could make a multi-field index to prevent duplicates (for example: the combination of datetime and month must be unique), but then I realized that the way the downsampling occurs also creates an issue if I try to pull the daily data for today before the day is over. Since the daily entry for today is updated every hour and isn't a complete sum of today's data until the day is over, if I set it as unique then the first entry for today's date will be the only one that makes it into the SQL database.

The two options I see for working around this are:

  1. Just stick with the hourly data. It is 24x as much data, but may just be simpler.
  2. Set up the query in the Python script to only pull yesterday's data, never today's. Combined with a unique constraint that doesn't allow duplicates with the exact same datetime and month, that should result in accurate data in the MySQL daily table (rough sketch below).
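Roughly what I have in mind for option 2 - the unique key plus an upsert, so re-running the script can't create duplicates (table and column names are just what I'm using in my local schema, adjust as needed):

ALTER TABLE daily ADD UNIQUE KEY uq_daily (datetime, month);

INSERT INTO daily (datetime, month, year, home, solar, from_pw, to_pw, from_grid, to_grid)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)
ON DUPLICATE KEY UPDATE
    home = VALUES(home), solar = VALUES(solar), from_pw = VALUES(from_pw),
    to_pw = VALUES(to_pw), from_grid = VALUES(from_grid), to_grid = VALUES(to_grid);

The %s placeholders get filled in by the Python script, which would only request days that are already complete.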

Given the above, I'm going to update my post in the show & tell section to have it pull both daily and hourly data, and make sure the query for daily data always lags by one day. I'll include notes about the multi-field unique key that needs to be created in the daily table to deal with the old data for anyone that was logging prior to the bug fix in telegraf.conf.

jasonacox commented 1 year ago

Good catch, @youzer-name - I suspect the culprit is the downsampling CQs:

CREATE CONTINUOUS QUERY cq_kwh ON powerwall RESAMPLE EVERY 1m 
BEGIN 
SELECT integral(home)/1000/3600 AS home, 
integral(solar)/1000/3600 AS solar, integral(from_pw)/1000/3600 AS from_pw, 
integral(to_pw)/1000/3600 AS to_pw, integral(from_grid)/1000/3600 AS from_grid, 
integral(to_grid)/1000/3600 AS to_grid 
INTO powerwall.kwh.:MEASUREMENT 
FROM autogen.http 
GROUP BY time(1h), month, year tz('America/Los_Angeles') 
END

CREATE CONTINUOUS QUERY cq_daily ON powerwall RESAMPLE EVERY 1h 
BEGIN 
SELECT sum(home) AS home, sum(solar) AS solar, 
sum(from_pw) AS from_pw, sum(to_pw) AS to_pw, sum(from_grid) AS from_grid, 
sum(to_grid) AS to_grid 
INTO powerwall.daily.:MEASUREMENT 
FROM powerwall.kwh.http 
GROUP BY time(1d), month, year tz('America/Los_Angeles') 
END 

It is pulling the month and year tags (set by Telegraf) into the downsampled series, so prior to the telegraf.conf fix that data is bad too - unless I'm missing something. The existing dashboard graphs won't care since they aren't using the tags, but this definitely creates a problem for anything that does use those tags.

BuongiornoTexas commented 1 year ago

@jasonacox I think this issue crosses over to some early work I'm doing on trying to get the power reporting to line up as closely as possible with the Tesla generation data (which will help me solve an entirely separate problem of being too lazy to get up from my desk to take a reading from an unconnected inverter ...).

From the work I've been doing, I think I can get the result I'm looking for by using a different version of the powerwall.kwh.xxx query, and this may also help with resolving the heatmap and other downsampling issues on the kWh data.

The revised version I've got in my head (but haven't tested, as I'm not quite ready to trash my database just yet) is along these lines:

# New retention policy - grab total kWh generated in a minute
CREATE RETENTION POLICY kwh_pm ON powerwall duration INF replication 1

# ...

# Query pulls minute of data by differencing last cumulative export/import reading between buckets.
# Needs all of  LAST, DIFFERENCE and GROUP BY <time> to work correctly.
# This approach eliminates drift from integral (1:1 correspondence with Powerwall source of truth) and *may* 
# self correct for outages (as it is working on differences of cumulatives)
# Doing this allows easy accumulation of data during hour by sum. Not sure if resampling is needed?
CREATE CONTINUOUS QUERY cq_kwh_pm ON powerwall  
BEGIN 
SELECT DIFFERENCE(LAST(solar_energy_exported))/1000 AS solar,
DIFFERENCE(LAST(load_energy_imported))/1000 AS home, 
DIFFERENCE(LAST(battery_energy_exported))/1000 AS from_pw, 
DIFFERENCE(LAST(battery_energy_imported))/1000 AS to_pw, 
DIFFERENCE(LAST(site_energy_imported))/1000 AS from_grid, 
DIFFERENCE(LAST(site_energy_exported))/1000 AS to_grid 
INTO powerwall.kwh_pm.:MEASUREMENT 
FROM raw.http 
GROUP BY time(1m), month, year fill(0) tz('America/Los_Angeles') 
END

# And then hourly becomes:
CREATE CONTINUOUS QUERY cq_kwh ON powerwall RESAMPLE EVERY 1m 
BEGIN 
SELECT sum(home) AS home, sum(solar) AS solar, 
sum(from_pw) AS from_pw, 
sum(to_pw) AS to_pw, 
sum(from_grid) AS from_grid, 
sum(to_grid) AS to_grid 
INTO powerwall.kwh.:MEASUREMENT FROM powerwall.kwh_pm.http GROUP BY time(1h), month, year tz('America/Los_Angeles') END 

# daily and monthly unchanged.

If this works as I think it does, it should be a bit cleaner than the integral and should have a one-for-one correspondence with the Tesla cumulatives as long as there is no downtime (and may work correctly even then, if InfluxQL is smart enough about the interval differences).

I'll keep working on this for the stuff I'm looking at and will report back, but I thought I would mention it here in case it helps with the current issue.
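One rough sanity check I have in mind once the new series is populated (untested): the daily total from kwh_pm should match the day's change in the raw cumulative counter, e.g. for solar:

SELECT sum(solar) FROM powerwall.kwh_pm.http 
    WHERE time >= '2022-09-01T00:00:00-07:00' AND time < '2022-09-02T00:00:00-07:00'

SELECT (last(solar_energy_exported) - first(solar_energy_exported)) / 1000 
    FROM raw.http 
    WHERE time >= '2022-09-01T00:00:00-07:00' AND time < '2022-09-02T00:00:00-07:00'

If those two agree (give or take the minute at each boundary), the differencing approach is holding up.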