timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
16.8k stars 852 forks source link

[Bug]: `time_bucket_gapfill` in real-time CAGGs with time buckets fails to recursively aggregate results from source hypertable #7058

Closed vanderhoop closed 5 days ago

vanderhoop commented 1 week ago

What type of bug is this?

Incorrect result

What subsystems and features are affected?

Continuous aggregate

What happened?

Expected behavior:

When querying a real-time continuous aggregate (materialized_only=false) built with a time_bucket interval, I should be able to use the time_bucket_gapfill function with a smaller interval and get a) distinct results for each of the desired buckets (via recursively traveling down to the source hypertable to get finer-grained data), and b) standard gapfilling behavior for buckets that are null.

Actual Behavior:

All results for a time_bucket_gapfill query to the CAGG returns a) having their data downsampled _into the time buckets of the CAGG's time_bucket interval_, i.e. if the CAGG has 10m time buckets and I query the cagg with time_bucket_gapfill with an interval of 1m, a 00:10:00 bucket will have results, as will the 00:20:00 bucket, but the intervening buckets aren't returned at all!

Example Query to the Real-Time CAGG:

SELECT  time_bucket_gapfill(INTERVAL '1 m', bucket, 'ETC/UTC')
       ,approx_percentile(0.75,percentile_agg_lcp)
FROM cagg_10m
WHERE bucket >= now() - INTERVAL '60 m';
+------------------------+--------------------+
|  time_bucket_gapfill   | approx_percentile  |
+------------------------+--------------------+
| 2024-06-22 15:00:00+00 |  360.3232201304007 |
| 2024-06-22 14:40:00+00 | 285.14601555033624 |
| 2024-06-22 14:40:00+00 | 1144.8190260763279 |
| 2024-06-22 14:40:00+00 |  247.3987325835215 |
| 2024-06-22 14:40:00+00 |  196.9601178933079 |
+------------------------+--------------------+

And here's some sample data from the source hypertable over that same timespan.

SELECT initiated_at, lcp
FROM navigations
WHERE initiated_at >= now() - INTERVAL '60 m';
+----------------------------+--------+
|        initiated_at        |  lcp   |
+----------------------------+--------+
| 2024-06-22 15:01:26.014+00 |  360.2 |
| 2024-06-22 14:49:25+00     |  285.2 |
| 2024-06-22 14:48:40.974+00 |  196.9 |
| 2024-06-22 14:48:31.276+00 |  247.6 |
| 2024-06-22 14:48:27.396+00 | 1145.7 |
| 2024-06-22 14:42:55.875+00 |   NULL |
+----------------------------+--------+

I thought that maybe this makes sense, given this quote from the CAGG docs:

You can't use time_bucket_gapfill directly in a continuous aggregate. This is because you need access to previous data to determine the gapfill content, which isn't yet available when you create the continuous aggregate. You can work around this by creating the continuous aggregate using time_bucket, then querying the continuous aggregate using time_bucket_gapfill.

However, the CAGG in question was created with with time_bucket, and I am querying the cagg with time_bucket_gapfill so I'm currently at a loss.

Suspicion/Hypothesis

Given that the time_bucket_gapfill call isn't providing the requested bucket sizes at all (see above, where they're all returned 10m buckets instead of 1m buckets), my suspicion is that the actual value aggregation calculation formula is correct, but the user-facing result is still incorrect due to the gapfilling itself not taking place?

TimescaleDB version affected

2.14.2

PostgreSQL version used

16.2 (Ubuntu 16.2-1.pgdg22.04+1) on aarch64-unknown-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

What operating system did you use?

Mac OS X 10.5 ARM

What installation method did you use?

Docker

What platform did you run on?

Timescale Cloud

Relevant log output and stack trace

No response

How can we reproduce the bug?

[TO BE COMPLETED, HOPEFULLY THIS WEEKEND :)]
vanderhoop commented 5 days ago

Closing, as the above was due to a lack of an aggregate accessor and including the time_bucket_gapfill in a GROUP BY 🤦🏻‍♂️