timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

[Enhancement]: Remove requirement that `compress_after` be larger than refresh policy's `start_offset` #7021

Open RobAtticus opened 3 weeks ago

RobAtticus commented 3 weeks ago

What type of enhancement is this?

Configuration

What subsystems and features will be improved?

Continuous aggregate

What does the enhancement do?

Before compressed chunks supported all DML operations, I could see why it was useful to ensure that compression on a continuous aggregate only happened on data outside the refresh policy's range. Now that compressed chunks support all DML operations, however, the configuration of a continuous aggregate's compression policy should be independent of the configuration of its refresh policy.

In my case, and likely in most cases of time-series data, new data is only ever appended to the hypertable I'm materializing from, with little to no backfill. I do set a wider refresh window than the range I actually expect new data to appear in, just to catch random stragglers. However, doing so forces my compression policy (well, sort of, see below) to wait longer than I'd like before it starts compressing. For example, if I expect data to be 5-15 minutes old, and at most 2 hours old, I might set `start_offset` to a very conservative 24 hours. But then I can't compress the continuous aggregate for at least 24 hours, even though I know the data is likely final.
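To make the scenario concrete, here is a minimal sketch of the setup described above. The continuous aggregate name `conditions_hourly`, the `end_offset`, and the `schedule_interval` are hypothetical values chosen for illustration; only the 24-hour `start_offset` and the roughly 2-hour compression target come from the description.

```sql
-- Hypothetical continuous aggregate: conditions_hourly (name is an assumption).
-- Conservative refresh window: re-materialize anything from the last 24 hours.
SELECT add_continuous_aggregate_policy('conditions_hourly',
    start_offset      => INTERVAL '24 hours',
    end_offset        => INTERVAL '1 hour',      -- assumed value
    schedule_interval => INTERVAL '15 minutes'); -- assumed value

-- Enable compression on the continuous aggregate.
ALTER MATERIALIZED VIEW conditions_hourly SET (timescaledb.compress = true);

-- Desired: compress once the data is likely final (about 2 hours old).
-- Per this issue, this call is rejected because compress_after is
-- smaller than the refresh policy's start_offset.
SELECT add_compression_policy('conditions_hourly',
    compress_after => INTERVAL '2 hours');
```

With the current check, the smallest `compress_after` accepted here would be tied to the 24-hour `start_offset` rather than to how quickly the data actually settles.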

The restriction is also only skin deep. Once I've added the policy, I can call `alter_job` and change the compression policy's `compress_after` to anything I want; no warnings or errors are given.
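A sketch of that workaround, assuming a hypothetical continuous aggregate named `conditions_hourly` with a 24-hour `start_offset`, and assuming the policy was created with job id `1000` (a placeholder; `add_compression_policy` returns the real id):

```sql
-- Step 1: add the policy with a value the check accepts.
-- '25 hours' is an assumed value larger than the 24-hour start_offset.
SELECT add_compression_policy('conditions_hourly',
    compress_after => INTERVAL '25 hours');

-- Step 2: lower compress_after on the existing job.
-- 1000 is a placeholder job id; look up the real one in
-- timescaledb_information.jobs (proc_name = 'policy_compression').
SELECT alter_job(job_id,
    config => jsonb_set(config, '{compress_after}', '"2 hours"'))
FROM timescaledb_information.jobs
WHERE job_id = 1000;
```

As described above, the second statement goes through without any warning, so the policy ends up in exactly the configuration that `add_compression_policy` refused to create.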

Implementation challenges

No response

RobAtticus commented 3 weeks ago

Also worth noting: this restriction prevents you from attaching any compression policy to a continuous aggregate that does not have a refresh policy. While refresh policies are definitely convenient, they may not be suitable in all cases, so it's a bit odd that their absence also blocks compression.

(In my case, I use a custom policy to do the refreshing, and some caggs I just refresh manually as needed.)