Expected behavior:
When querying a real-time continuous aggregate (materialized_only=false) built with a time_bucket interval, I should be able to use the time_bucket_gapfill function with a smaller interval and get a) distinct results for each of the desired buckets (via recursively traveling down to the source hypertable to get finer-grained data), and b) standard gapfilling behavior for buckets that are null.
Actual Behavior:
All results of a time_bucket_gapfill query against the CAGG come back with their data downsampled _into the time buckets of the CAGG's time_bucket interval_. For example, if the CAGG has 10m time buckets and I query it with time_bucket_gapfill using an interval of 1m, the 00:10:00 bucket has results, as does the 00:20:00 bucket, but the intervening buckets aren't returned at all!
Example Query to the Real-Time CAGG:
SELECT time_bucket_gapfill(INTERVAL '1 m', bucket, 'ETC/UTC')
,approx_percentile(0.75,percentile_agg_lcp)
FROM cagg_10m
WHERE bucket >= now() - INTERVAL '60 m';
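For comparison, here is a minimal sketch of the same query with an explicit upper time bound and a GROUP BY on the gapfill bucket, which is the shape used in the documented time_bucket_gapfill examples. It assumes percentile_agg_lcp is a Toolkit percentile_agg sketch (so that rollup applies) and drops the time zone argument for brevity; the aliases bucket_1m and p75_lcp are made up for illustration. Whether the missing GROUP BY or upper bound contributes to the behavior described above is an assumption, not something verified here.

```sql
-- Sketch only: the same CAGG and columns as the query above, with the bounds
-- and grouping that time_bucket_gapfill uses to work out which buckets to emit.
SELECT time_bucket_gapfill(INTERVAL '1 minute', bucket) AS bucket_1m,
       -- rollup() merges the stored sketches that fall into one output bucket
       approx_percentile(0.75, rollup(percentile_agg_lcp)) AS p75_lcp
FROM cagg_10m
WHERE bucket >= now() - INTERVAL '60 minutes'
  AND bucket <  now()          -- upper bound, so the gapfill range is closed
GROUP BY bucket_1m
ORDER BY bucket_1m;
```

Even in this form, the CAGG only stores 10m-grained sketches, so the 1m buckets between the 10m boundaries would be expected to come back as NULL rather than carry finer-grained values.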
I thought that maybe this makes sense, given this quote from the CAGG docs:
You can't use time_bucket_gapfill directly in a continuous aggregate. This is because you need access to previous data to determine the gapfill content, which isn't yet available when you create the continuous aggregate. You can work around this by creating the continuous aggregate using time_bucket, then querying the continuous aggregate using time_bucket_gapfill.
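The workaround in that quote can be sketched roughly as follows. The source hypertable web_vitals and its columns ts and lcp are hypothetical stand-ins, since the real schema is not shown here; only cagg_10m, bucket, and percentile_agg_lcp come from the query above.

```sql
-- Assumption: web_vitals(ts timestamptz, lcp double precision) is the source
-- hypertable. These names are hypothetical.
CREATE MATERIALIZED VIEW cagg_10m
WITH (timescaledb.continuous, timescaledb.materialized_only = false) AS
SELECT time_bucket(INTERVAL '10 minutes', ts) AS bucket,
       percentile_agg(lcp) AS percentile_agg_lcp
FROM web_vitals
GROUP BY time_bucket(INTERVAL '10 minutes', ts)
WITH NO DATA;

-- Gapfill happens at query time, here at the CAGG's own 10m bucket width.
SELECT time_bucket_gapfill(INTERVAL '10 minutes', bucket) AS bucket_10m,
       approx_percentile(0.75, rollup(percentile_agg_lcp)) AS p75_lcp
FROM cagg_10m
WHERE bucket >= now() - INTERVAL '60 minutes'
  AND bucket <  now()
GROUP BY bucket_10m
ORDER BY bucket_10m;
```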
However, the CAGG in question was created with time_bucket, and I am querying the CAGG with time_bucket_gapfill, so I'm currently at a loss.
Suspicion/Hypothesis
Given that the time_bucket_gapfill call isn't producing the requested bucket size at all (see above, where everything comes back in 10m buckets instead of 1m buckets), my suspicion is that the value aggregation itself is correct, but the user-facing result is still wrong because the gapfilling is not taking place.
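One way to check this hypothesis would be to run the 1m gapfill directly against the source hypertable and compare the output with what the CAGG returns. The sketch below again uses the hypothetical names web_vitals, ts, and lcp for the source table and its columns, because the real schema is not shown in this report.

```sql
-- Assumption: web_vitals(ts, lcp) is the hypertable behind cagg_10m.
-- Aggregating the raw rows per minute yields true 1m values, with NULL rows
-- gapfilled for minutes that have no data.
SELECT time_bucket_gapfill(INTERVAL '1 minute', ts) AS bucket_1m,
       approx_percentile(0.75, percentile_agg(lcp)) AS p75_lcp
FROM web_vitals
WHERE ts >= now() - INTERVAL '60 minutes'
  AND ts <  now()
GROUP BY bucket_1m
ORDER BY bucket_1m;
```

If this returns sixty 1m rows while the CAGG query does not, that would point at how the CAGG query is shaped or planned rather than at the percentile calculation.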
TimescaleDB version affected
2.14.2
PostgreSQL version used
16.2 (Ubuntu 16.2-1.pgdg22.04+1) on aarch64-unknown-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
What type of bug is this?
Incorrect result
What subsystems and features are affected?
Continuous aggregate
What happened?
Expected behavior:
When querying a real-time continuous aggregate (materialized_only=false) built with a time_bucket interval, I should be able to use the time_bucket_gapfill function with a smaller interval and get a) distinct results for each of the desired buckets (via recursively traveling down to the source hypertable to get finer-grained data), and b) standard gapfilling behavior for buckets that are null.
Actual Behavior:
All results of a time_bucket_gapfill query against the CAGG come back with their data downsampled _into the time buckets of the CAGG's time_bucket interval_. For example, if the CAGG has 10m time buckets and I query it with time_bucket_gapfill using an interval of 1m, the 00:10:00 bucket has results, as does the 00:20:00 bucket, but the intervening buckets aren't returned at all!
Example Query to the Real-Time CAGG:
And here's some sample data from the source hypertable over that same timespan.
I thought that maybe this makes sense, given this quote from the CAGG docs:
However, the CAGG in question was created with time_bucket, and I am querying the CAGG with time_bucket_gapfill, so I'm currently at a loss.
Suspicion/Hypothesis
Given that the time_bucket_gapfill call isn't producing the requested bucket size at all (see above, where everything comes back in 10m buckets instead of 1m buckets), my suspicion is that the value aggregation itself is correct, but the user-facing result is still wrong because the gapfilling is not taking place.
TimescaleDB version affected
2.14.2
PostgreSQL version used
16.2 (Ubuntu 16.2-1.pgdg22.04+1) on aarch64-unknown-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
What operating system did you use?
Mac OS X 10.5 ARM
What installation method did you use?
Docker
What platform did you run on?
Timescale Cloud
Relevant log output and stack trace
No response
How can we reproduce the bug?