timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.86k stars 884 forks source link

ORDER BY is not supported by continuous view #2872

Open karlsosha opened 3 years ago

karlsosha commented 3 years ago

Relevant system information:

Describe the bug [0A000] ERROR: aggregates with FILTER / DISTINCT / ORDER BY are not supported

To Reproduce

create table test_table(ts bigint, price double precision);
select create_hypertable('test_table', 'ts');

Populate test_table with data

create materialized view test_view with (timescaledb.continuous) as select time_bucket(60, ts) as bucket, max(price), min(price), percentile_cont(0.5) within group (order by price) from test_table group by bucket;

Expected behavior Continuous Aggregate Created.

Actual behavior If you're looking to update median price on a test table continuously the view like above will need to be created. But the ORDER BY is not supported currently for the continuous aggregates and the creation will fail.

Screenshots If applicable, add screenshots to help explain your problem.

Additional context You are not only looking min/max/first/last continuously but you also may be looking for median or percentile of the prices. Which means that calculation may need to be supported even though it cannot act only on the new data but it should include the entire subset.

Issue 1432 first reported this.

davidkohn88 commented 3 years ago

This likely won't be supported in continuous aggregates as ordered set aggregates require the entire data set, so you'd have to materialize all of that in order for them to work. However, we do have an issue over in Timescale Analytics to support approximation/sketching algorithms: https://github.com/timescale/timescale-analytics/issues/1 which will provide a bounded-error estimate of various percentiles, and also have already merged code to support UDDSketch: https://github.com/timescale/timescale-analytics/pull/75 and tdigest: https://github.com/timescale/timescale-analytics/pull/61 (though that was a later commit cleaning some of that code up). You can find documentation for each of those here: https://github.com/timescale/timescale-analytics/blob/main/extension/docs and we'd love it if you wanted to try out one of our nightly builds and provide feedback!

Both of those sketches have forms that will work with continuous aggregates.