matthew-pakulski opened 1 year ago
Thanks @matthew-pakulski ! We have the "Solar Year" graph, but the heat map is very cool. Do you know if this is a Grafana panel? If you can find the panel JSON for this, please post it! Thanks!
This may not be possible using InfluxDB and InfluxQL. It might require using Flux as the query language since InfluxQL can't do anything like "datepart" to get the month or hour from the row data. Doing the heatmap would require having an 'hour' column on each row and a numeric 'month' column. Right now we only have the year and an alpha 'month' column.
I ran into this issue when I tried to make a chart like the one that shows the past month of home use (or solar, or powerwall use, etc), but I wanted it to show a year's data with one bar showing the total for each month.
I was able to get close...
I can use this query:
SELECT sum("home") FROM "daily"."http" WHERE ("month" != '') AND $timeFilter GROUP BY "month" tz('America/New_York')
The problem is I don't see any way to sort it in the correct order. Influx returns the totals sorted alphabetically by month and I can't see how to re-order it in Grafana.
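One workaround would be to re-sort the results outside InfluxDB after fetching them. A hypothetical Python sketch, assuming the totals arrive as (month-abbreviation, total) pairs:

```python
import calendar

# Map three-letter month abbreviations ("Jan".."Dec") to calendar positions 1..12
# (calendar.month_abbr[0] is an empty string, hence the "if abbr" filter)
MONTH_ORDER = {abbr: i for i, abbr in enumerate(calendar.month_abbr) if abbr}

def sort_by_calendar_month(rows):
    """Re-order (month_abbreviation, total) pairs into calendar order."""
    return sorted(rows, key=lambda row: MONTH_ORDER[row[0]])

# Alphabetical order, as InfluxDB returns it when grouping by the "month" tag
rows = [("Apr", 310.2), ("Aug", 402.7), ("Feb", 180.5), ("Jan", 150.1)]
print(sort_by_calendar_month(rows))
# [('Jan', 150.1), ('Feb', 180.5), ('Apr', 310.2), ('Aug', 402.7)]
```

This doesn't help inside a pure Grafana/InfluxQL panel, but it works for anything with a scripted step in between.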
Maybe we could update the Telegraf -> InfluxDB process to also store the month number, but a little Googling seems to say that isn't at all simple to do. The date processor in Telegraf appears to only output the month as a 3-letter string. There may be a Telegraf plugin that would allow this, but it might be more trouble than figuring out how to make this work with Flux queries.
Then again, between typing the last paragraph and this one I spent about an hour trying to figure out Flux. I was able to get a query to return the month numbers, but it throws an error when trying to sum the values. Google search results say this might be a limitation of running it on the Raspberry Pi. I'm giving up for now.
This would be so much easier with a more standard SQL-type database (like MySQL). I know my way around those. InfluxDB seems to have so many limitations.
Thanks for trying @youzer-name !
I may not be following everything, but looking at the heat map, it is basically just time series data with quanta set to "per day" and then the block colored based on that daily value. This is essentially the same thing we have in our "Solar Year" graph:
Which is based on the daily kwh data built using an InfluxDB continuous query:
CQ:
CREATE CONTINUOUS QUERY cq_kwh ON powerwall RESAMPLE EVERY 1m
BEGIN
SELECT
integral(home)/1000/3600 AS home,
integral(solar)/1000/3600 AS solar,
integral(from_pw)/1000/3600 AS from_pw,
integral(to_pw)/1000/3600 AS to_pw,
integral(from_grid)/1000/3600 AS from_grid,
integral(to_grid)/1000/3600 AS to_grid
INTO powerwall.kwh.:MEASUREMENT
FROM autogen.http GROUP BY time(1h), month, year tz('America/Los_Angeles')
END
Grafana Query:
SELECT sum("solar")
FROM "kwh"."http" WHERE ("solar" > 0)
AND time >= 1632362261667ms and time <= now()
GROUP BY time(1d) fill(0) tz('America/Los_Angeles')
SELECT sum("home")
FROM "kwh"."http" WHERE ("home" > 0)
AND time >= 1632362261667ms and time <= now()
GROUP BY time(1d) fill(0) tz('America/Los_Angeles')
One thing to note here with the CQs and how data is being stored in autogen, kwh, etc. is that all of the data points are being tagged with a "month" and "year" value.
This is created from the GROUP BY clause of the CQ:
FROM autogen.http GROUP BY time(1h), month, year tz('America/Los_Angeles')
A comma after the "time" period is used to add tags to the data. I've only learnt this recently while working on the Tesla history data import program, as I'm needing to replicate the behaviour with the imported data.
In @youzer-name's example query, the data is being grouped by the tag values rather than by specific time periods. If it could be done by time period instead, maybe it could work?
The month tag is the three-letter abbreviation of the month name, e.g. "Sep", which is also why it won't sort well.
I've noticed what seems to be an issue with the tagging of data by "month", "year", etc. as well, in that it seems to tag them in UTC regardless of a tz('your/timezone') in the query.
Here's an example, where you see the "month" tag of my data is changing month at 10am, since my timezone is +10 (at the moment) 😞
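The UTC-based tagging can be sanity-checked outside InfluxDB. A small Python sketch (fixed UTC+10 offset assumed, ignoring DST), using the last "Aug"-tagged timestamp from the sample below:

```python
from datetime import datetime, timezone, timedelta

# Last row tagged "Aug" in the sample data (one minute before the tag flips to "Sep")
ts_ns = 1661990340000000000
ts_utc = datetime.fromtimestamp(ts_ns / 1e9, tz=timezone.utc)
ts_local = ts_utc.astimezone(timezone(timedelta(hours=10)))  # fixed UTC+10 offset

print(ts_utc.strftime("%b %d %H:%M UTC"))   # Aug 31 23:59 UTC -> matches the "Aug" tag
print(ts_local.strftime("%b %d %H:%M +10")) # Sep 01 09:59 +10 -> locally it's already September
```

So the tag follows the UTC month, even though it is already September local time.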
@youzer-name I'm curious, do you see the same timezone issue with tagging by "month" "year" in your data?
Wow @mcbirse - that is odd. Can you share the full query? I ran some tests against InfluxDB and it seems to honor timezone (tz). I wonder if this is a Grafana 9 issue, or did you see this before?
@jasonacox I don't think this is a Grafana issue (actually haven't got around to upgrading my production server to 9 yet!)
You can see the issue if you just query the InfluxDB directly.
Three query examples below:
InfluxDB shell version: 1.8.10
> SELECT * FROM autogen.http WHERE time >= '2022-09-01T00:00:00+10:00' AND time <= '2022-09-01T00:05:00+10:00'
name: http
time from_grid from_pw home host month percentage solar to_grid to_pw url year
---- --------- ------- ---- ---- ----- ---------- ----- ------- ----- --- ----
1661954400000000000 7.636363636363637 316.3636363636364 318.34090909090907 Aug 56.326253953908726 -8.818181818181818 1.0909090909090908 0 2022
1661954460000000000 3.909090909090909 317.27272727272725 313.6818181818182 Aug 56.281066425666516 -8.909090909090908 7.909090909090909 0 2022
1661954520000000000 8.363636363636363 302.72727272727275 299.6818181818182 Aug 56.23587889742431 -8.727272727272727 2.272727272727273 0 2022
1661954580000000000 14.090909090909092 294.54545454545456 296.8863636363636 Aug 56.19069136918211 -8.818181818181818 1.9090909090909092 0 2022
1661954640000000000 9 298.1818181818182 303.77272727272725 Aug 56.1455038409399 -8.636363636363637 2.5454545454545454 0 2022
1661954700000000000 9.818181818181818 300 299.52272727272725 Aug 56.09278505799066 -8.636363636363637 0.6363636363636364 0 2022
> SELECT * FROM autogen.http WHERE time >= '2022-08-31T14:00:00Z' AND time <= '2022-08-31T14:05:00Z' tz('Australia/Sydney')
name: http
time from_grid from_pw home host month percentage solar to_grid to_pw url year
---- --------- ------- ---- ---- ----- ---------- ----- ------- ----- --- ----
1661954400000000000 7.636363636363637 316.3636363636364 318.34090909090907 Aug 56.326253953908726 -8.818181818181818 1.0909090909090908 0 2022
1661954460000000000 3.909090909090909 317.27272727272725 313.6818181818182 Aug 56.281066425666516 -8.909090909090908 7.909090909090909 0 2022
1661954520000000000 8.363636363636363 302.72727272727275 299.6818181818182 Aug 56.23587889742431 -8.727272727272727 2.272727272727273 0 2022
1661954580000000000 14.090909090909092 294.54545454545456 296.8863636363636 Aug 56.19069136918211 -8.818181818181818 1.9090909090909092 0 2022
1661954640000000000 9 298.1818181818182 303.77272727272725 Aug 56.1455038409399 -8.636363636363637 2.5454545454545454 0 2022
1661954700000000000 9.818181818181818 300 299.52272727272725 Aug 56.09278505799066 -8.636363636363637 0.6363636363636364 0 2022
> SELECT * FROM autogen.http WHERE time >= '2022-09-01T09:55:00+10:00' AND time <= '2022-09-01T10:05:00+10:00'
name: http
time from_grid from_pw home host month percentage solar to_grid to_pw url year
---- --------- ------- ---- ---- ----- ---------- ----- ------- ----- --- ----
1661990100000000000 0.6363636363636364 0 392.74999999999983 Aug 65.5294472059045 5994 631.9090909090909 4960.909090909091 2022
1661990160000000000 0 0 408.56818181818164 Aug 66.09429130893206 5899.909090909091 582.2727272727273 4904.545454545455 2022
1661990220000000000 2.272727272727273 0 372.4318181818182 Aug 66.58382286488929 6012.727272727273 1212.5454545454545 4420.909090909091 2022
1661990280000000000 0 0 387.5681818181818 Aug 67.15619822262389 7179.272727272727 1797.3636363636363 4992.727272727273 2022
1661990340000000000 4.363636363636363 0 381.74999999999994 Aug 67.72857358035849 6794.363636363636 1478.8181818181818 4934.545454545455 2022
1661990400000000000 0 0 387.1136363636362 Sep 68.28588642867902 6869.181818181818 1483.8181818181818 4992.727272727273 2022
1661990460000000000 2.5454545454545454 0 375.24999999999983 Sep 68.79048049405031 5788.454545454545 822.9090909090909 4585.454545454545 2022
1661990520000000000 5.909090909090909 0 389.9318181818181 Sep 69.33273083295677 5745.818181818182 814.1818181818181 4557.272727272727 2022
1661990580000000000 6.545454545454546 0 368.81818181818176 Sep 69.80719987949993 4919.727272727273 289.6363636363636 4255.454545454545 2022
1661990640000000000 9.909090909090908 0 369.70454545454544 Sep 70.2440126525079 3786.181818181818 9.545454545454545 3417.2727272727275 2022
1661990700000000000 5 0 375.6590909090908 Sep 70.65070040668775 5927.909090909091 1192.2727272727273 4366.363636363636 2022
Because of this issue, grouping by month/year tags seems completely pointless, unless you live in the UTC timezone and do not have daylight saving!
Still curious to know if other people see the same issue here?
@jasonacox Can you do a similar check as above on your data? I think your timezone would be -7 from UTC?
If no one else is seeing this problem, then there could be an issue with my setup somewhere...
Moved this to issue #80 as I may have discovered it is a problem with telegraf.conf.
After a lot of trial and error (mostly error) I was able to copy all of my InfluxDB data to MySQL and get a query of solar production by month. Still no luck on the heatmap because I'm not sure that a heatmap in Grafana is exactly the same thing as the heatmap in the OP's example. I'll probably keep playing with the heatmaps later...
The copy of the data was accomplished using a Python script, so it should be easy to schedule the script to run and keep the data flowing into the MySQL database.
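For anyone curious what the transfer involves, here is a rough, hypothetical sketch of the transformation step only (table and column names are illustrative; the real script would fetch this JSON from InfluxDB's /query HTTP endpoint and hand the statement to a MySQL driver's executemany()):

```python
def influx_series_to_insert(series, table):
    """Turn one series from an InfluxDB 1.x JSON query result into a
    parameterized MySQL INSERT statement plus its row tuples.
    (Hypothetical helper for illustration only.)"""
    cols = series["columns"]
    placeholders = ", ".join(["%s"] * len(cols))
    sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    return sql, [tuple(row) for row in series["values"]]

# Shape of a single series as returned by the InfluxDB 1.x HTTP API
series = {"columns": ["datetime", "home", "solar"],
          "values": [["2022-08-30T04:00:00Z", 41.87, 20.40],
                     ["2022-08-31T04:00:00Z", 26.57, 24.57]]}
sql, rows = influx_series_to_insert(series, "daily")
print(sql)   # INSERT INTO daily (datetime, home, solar) VALUES (%s, %s, %s)
```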
If anyone is interested in doing something like this, I'll do a separate post under 'show & tell' as there were a bunch of steps needed to get to this result.
Nice work @youzer-name ! I definitely think we should add this MySQL data pipeline script to the tools folder for anyone wanting to do the same.
@jasonacox Ugh... I found something else amiss in my data as a result of getting it over into MySQL. The issue @mcbirse noticed with the timezones and month tags also spills over into the downsampled daily data.
Background info: I just updated the telegraf.conf file to add the TZ reference per the recent discussion and bug report, and recreated the stack after that. I am using America/New_York for my TZ (UTC-04:00 during DST, -05:00 standard time).
When I tried to set the datetime as a primary key in MySQL on the 'daily' table (to prevent duplicates being created) I got an error that the key couldn't be created as there were duplicate values in the datetime field.
Looking at the data, I have TWO entries for the last day of each month, each with a different value in the 'month' field.
2022-08-29T04:00:00Z 27.61983005411255 9.093889971139971 50.55084173656204 Aug 26.54738099927849 0.09192271464646465 12.595960317460316 2022
2022-08-30T04:00:00Z 19.71932777777778 13.123890782828282 41.86573716856061 Aug 20.400520580808077 0.11669671717171716 11.23269696969697 2022
2022-08-31T04:00:00Z 3.800571969696969 3.667393939393939 7.439026704545457 Sep -0.0024954545454545455 0.018624242424242425 0.0024848484848484843 2022
2022-08-31T04:00:00Z 8.392989318181817 5.115769696969697 26.56693058080808 Aug 24.57233343434344 0.11186666666666668 11.380322474747475 2022
2022-09-01T04:00:00Z 46.81284646464646 3.8834981060606055 61.56610385416667 Sep 28.972736111111104 0.10087462121212122 17.993631944444445 2022
The entry showing the 'next month' (so in this case, the entry on 8/31 that shows 'Sep') never seems to show much in the way of solar, leading me to think this entry is always only using data from the night time.
When I sum up the hourly data from kwh using this query (in MySQL):
select sum(home) from kwh where datetime >= '2022-08-31 04:00:00' and datetime < '2022-09-01 00:00:00'
I get 26.56, which is the value in the daily table for 8/31/22 Aug
When I sum up the hourly data for the last four hours of the day (local time)
select sum(home) from kwh where datetime >= '2022-09-01 00:00:00' and datetime < '2022-09-01 04:00:00'
I get 7.43, which is the value in the daily table for 8/31/22 Sep.
I guess this is a result of the downsampling to 'daily' happening every hour. In the EDT timezone, for 20 hours on the last day of each month the process wrote the downsampled data to a row with the correct month, and for the last 4 hours of the day it was writing it to a new row with the wrong month.
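The 20/4 split can be reproduced in a few lines of Python (fixed UTC-4 offset assumed for the sketch, ignoring DST transitions):

```python
from datetime import datetime, timezone, timedelta

eastern = timezone(timedelta(hours=-4))  # EDT as a fixed offset, for illustration
day_start = datetime(2022, 8, 31, 0, 0, tzinfo=eastern)

# Walk the 24 hours of local Aug 31 and see which UTC month each hour's tag would carry
tags = [(day_start + timedelta(hours=h)).astimezone(timezone.utc).strftime("%b")
        for h in range(24)]
print(tags.count("Aug"), tags.count("Sep"))  # 20 4
```

The last 4 local hours land on Sep 1 in UTC, which is exactly the duplicate "Sep" row above.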
This doesn't really create too big an issue if you ignore the month column (which was kind of the whole purpose for me trying to get this data into MySQL) except it means I can't create a primary key on the datetime field to prevent duplicate entries. I could make a multi-field index to prevent duplicates (for example: the combination of datetime and month must be unique), but then I realized that the way the downsampling occurs also creates an issue if I try to pull the daily data for today before the day is over. Since the daily entry for today is updated every hour and isn't a complete sum of today's data until the day is over, if I set it as unique, then the first entry for today's date will be the only one that makes it into the SQL database.
The two options I see for working around this are:
1. Just stick with the hourly data. It is 24x as much data, but may be simpler.
2. Set up the query in the Python script to only pull yesterday's data, never today's. Combined with a unique constraint that doesn't allow duplicates with the exact same datetime and month, that should result in accurate data in the MySQL daily table.
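The "yesterday only" window from option 2 could be computed along these lines (hypothetical helper; a fixed UTC-4 offset is assumed for the sketch, and a real script would use zoneinfo to handle DST):

```python
from datetime import datetime, timedelta, timezone

def yesterday_window(now=None, utc_offset_hours=-4):
    """Return the [start, end) local-day window for *yesterday*, expressed in UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    today_local = (now or datetime.now(tz)).astimezone(tz).date()
    start_local = datetime.combine(today_local - timedelta(days=1), datetime.min.time(), tz)
    end_local = start_local + timedelta(days=1)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)

start, end = yesterday_window(datetime(2022, 9, 1, 12, 0, tzinfo=timezone.utc))
print(start.isoformat(), end.isoformat())
# 2022-08-31T04:00:00+00:00 2022-09-01T04:00:00+00:00
```

Note the 04:00Z boundaries match the datetime values seen in the daily table above.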
Given the above, I'm going to update my post in the show & tell section to have it pull both daily and hourly data, and make sure the query for daily data always lags by one day. I'll include notes about the multi-field unique key that needs to be created in the daily table to deal with the old data for anyone that was logging prior to the bug fix in telegraf.conf.
Good catch, @youzer-name - I suspect the culprit is the downsampling CQs:
CREATE CONTINUOUS QUERY cq_kwh ON powerwall RESAMPLE EVERY 1m
BEGIN
SELECT integral(home)/1000/3600 AS home,
integral(solar)/1000/3600 AS solar, integral(from_pw)/1000/3600 AS from_pw,
integral(to_pw)/1000/3600 AS to_pw, integral(from_grid)/1000/3600 AS from_grid,
integral(to_grid)/1000/3600 AS to_grid
INTO powerwall.kwh.:MEASUREMENT
FROM autogen.http
GROUP BY time(1h), month, year tz('America/Los_Angeles')
END
CREATE CONTINUOUS QUERY cq_daily ON powerwall RESAMPLE EVERY 1h
BEGIN
SELECT sum(home) AS home, sum(solar) AS solar,
sum(from_pw) AS from_pw, sum(to_pw) AS to_pw, sum(from_grid) AS from_grid,
sum(to_grid) AS to_grid
INTO powerwall.daily.:MEASUREMENT
FROM powerwall.kwh.http
GROUP BY time(1d), month, year tz('America/Los_Angeles')
END
It is pulling the month and year tags (set by Telegraf) into the downsampled series, so prior to the telegraf.conf fix that data is bad too, unless I'm missing something. The existing dashboard graphs won't care since they aren't using the tags, but this definitely creates a problem for anything using those tags.
@jasonacox I think this issue crosses over into some early work I'm doing on trying to get the power reporting to line up as closely as possible with the Tesla generation data (which will help me solve an entirely separate problem of being too lazy to get up from my desk to take a reading from an unconnected inverter ...).
From the work I've been doing, I think I can get the result I'm looking for by using a different version of the powerwall.kwh.xxx query, and I think this may also help with resolving the heatmap and other downsampling issues on the kwh data.
The revised version I've got in my head (but haven't tested, as I'm not quite ready to trash my database just yet) is along these lines:
# New retention policy - grab total kWh generated in a minute
CREATE RETENTION POLICY kwh_pm ON powerwall DURATION INF REPLICATION 1
# ...
# Query pulls minute of data by differencing last cumulative export/import reading between buckets.
# Needs all of LAST, DIFFERENCE and GROUP BY <time> to work correctly.
# This approach eliminates drift from integral (1:1 correspondence with Powerwall source of truth) and *may*
# self correct for outages (as it is working on differences of cumulatives)
# Doing this allows easy accumulation of data during hour by sum. Not sure if resampling is needed?
CREATE CONTINUOUS QUERY cq_kwh_pm ON powerwall
BEGIN
SELECT DIFFERENCE(LAST(solar_energy_exported))/1000 AS solar,
DIFFERENCE(LAST(load_energy_imported))/1000 AS home,
DIFFERENCE(LAST(battery_energy_exported))/1000 AS from_pw,
DIFFERENCE(LAST(battery_energy_imported))/1000 AS to_pw,
DIFFERENCE(LAST(site_energy_imported))/1000 AS from_grid,
DIFFERENCE(LAST(site_energy_exported))/1000 AS to_grid
INTO powerwall.kwh_pm.:MEASUREMENT
FROM raw.http
GROUP BY time(1m), month, year fill(0) tz('America/Los_Angeles')
END
# And then hourly becomes:
CREATE CONTINUOUS QUERY cq_kwh ON powerwall RESAMPLE EVERY 1m
BEGIN
SELECT sum(home) AS home, sum(solar) AS solar,
sum(from_pw) AS from_pw,
sum(to_pw) AS to_pw,
sum(from_grid) AS from_grid,
sum(to_grid) AS to_grid
INTO powerwall.kwh.:MEASUREMENT
FROM powerwall.kwh_pm.http
GROUP BY time(1h), month, year tz('America/Los_Angeles')
END
# daily and monthly unchanged.
If this works as I think it does, it should be a bit cleaner than the integral approach and should have a one-for-one correspondence with the Tesla cumulatives as long as there is no downtime (and may work correctly even then, if InfluxQL is smart enough about the interval differences).
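The self-correction idea can be illustrated with a small Python sketch: per-bucket differences of a cumulative meter reading sum back to the final total even when some buckets have no samples, because the next difference spans the gap. (Note this only models the concept; InfluxQL's actual handling of empty buckets, especially with fill(0), may differ.)

```python
# Cumulative export readings sampled once per bucket; None = outage, no samples
readings = [0.0, 1.2, 2.5, 2.5, None, None, 6.1, 7.0]

last_seen = readings[0]
deltas = []
for r in readings[1:]:
    if r is None:
        continue                  # empty bucket contributes nothing...
    deltas.append(r - last_seen)  # ...but the next delta covers the whole gap
    last_seen = r

print(len(deltas))               # 5 buckets produced a delta
print(round(sum(deltas), 6))     # 7.0 -- matches the final cumulative reading
```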
I'll keep working this for the stuff I'm looking at and will report back, but I thought I would mention it here in case it helps with the current issue.
I found someone using a heatmap to chart solar data over time. Thought something like this would be a great addition to the dashboard.