Closed inselbuch closed 1 year ago
+1
Not sure i understand your example correctly but you can achieve a time-weighted average with time_bucket_gapfill:
CREATE TABLE data AS SELECT * FROM (VALUES ('2000-01-01 09:59'::timestamp,15),('2000-01-01 10:01',14),('2000-01-01 10:04',15),('2000-01-01 10:08',10),('2000-01-01 10:00',15),('2000-01-01 10:59',15)) v(time,value);
SELECT
time_bucket('15m', time),
avg(value)
FROM (
SELECT
time_bucket_gapfill('1m', time, '2000-01-01 10:00:00', '2000-01-01 11:00:00') AS time,
locf(avg(value)) AS value
FROM
data
GROUP BY
1) sub
WHERE
time BETWEEN '2000-01-01 10:00:00'
AND '2000-01-01 11:00:00'
GROUP BY
1;
In the inner query you generate a data point for every minute carrying forward the last seen value, in the outer query you calculate 15 minute averages.
That is an arithmetic average of generated one-minute values. And it will give the correct result as long as the values are all aligned on one-minute boundaries. But what happens if the values have resolutions below one minute?
Also, that is quite cumbersome, is it not?
Sent from my iPad
On Oct 13, 2020, at 4:35 PM, Sven Klemm notifications@github.com wrote:
Not sure i understand your example correctly but you can achieve a time-weighted average with time_bucket_gapfill:
CREATE TABLE data AS SELECT * FROM (VALUES ('2000-01-01 09:59'::timestamp,15),('2000-01-01 10:01',14),('2000-01-01 10:04',15),('2000-01-01 10:08',10),('2000-01-01 10:00',15),('2000-01-01 10:59',15)) v(time,value);
SELECT time_bucket('15m', time), avg(value) FROM ( SELECT time_bucket_gapfill('1m', time, '2000-01-01 10:00:00', '2000-01-01 11:00:00') AS time, locf(avg(value)) AS value FROM data GROUP BY 1) sub WHERE time BETWEEN '2000-01-01 10:00:00' AND '2000-01-01 11:00:00' GROUP BY 1; In the inner query you generate a data point for every minute carrying forward the last seen value, in the outer query you calculate 15 minute averages.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.
There are different solutions possible e.g using window functions but unfortunately they dont get less cumbersome when accounting for corner cases
Consider the scenario where you have around a dozen values per day with timestamps significant to the second. You are looking to generate a daily time-weighted average for the past six months.
The algorithm for time-weighted averages would work with the 2,160 values.
To use the bucket-fill method, Timescale needs to produce over a fifteen million values (15,552,000).
And if we want to be accurate to the tenth of a second? Arithmetic average of equally spaced values demands generation of over ten times that number of values (150 million) while still only actually needing 2,160 values.
If you are working with synchrophasors that are recorded 60 times per second, close to a billion values 933,120,000.
Here is a link to Python source for the time-weighted average:
@inselbuch Have you been able to solve this problem somehow with continuous aggregates? I'm currently in the same situation, having to calculate a time-weighted average.
Such a functionality in timescaledb would be very helpful, as in just writing twa()
instead of avg()
. Before using an inner query as a workaround, it is easier to code it in Python.
Looks like it's implemented as time_weight
function in the timescaledb_toolkit
extension:
https://www.timescale.com/blog/what-time-weighted-averages-are-and-why-you-should-care/
Should we close out this issue given time_weight
in the toolkit @davidkohn88 ?
Sorry yes!
It would be valuable for TimescaleDB to generate time-weighted averages for any specified interval on the fly. A time-weighted average differs from an arithmetic average:
1) the value preceding the time interval in question must be included in the computation 2) any value in the interval is weighted by the amount of time it was true (i.e., until the next value arrived)
For example:
select time, twa(value) from myData where time > startTime and time < endTime and period = '+0:15:00'
with a range of 5 hours would generate 4 values per hour.
time-weighted average.xlsx