timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.74k stars 885 forks source link

[Feature]: Add new strategy for data retention #5132

Open MrLamberg opened 1 year ago

MrLamberg commented 1 year ago

What problem does the new feature solve?

Hello. I met some problems using timescaleDB. Our Datasource can stop sending data for a long time. But other users expect to see data in the database.

Example

  1. Create a table CREATE TABLE test_table ( event_time TIMESTAMPTZ NOT NULL, location TEXT NOT NULL ); SELECT create_hypertable('test_table', 'event_time');
  2. Add a retention policy SELECT add_retention_policy('test_table', INTERVAL '1 year');

If we stop sending data we got the empty table after 1 year.

If it matches your vision of the product I can try to make a pull request for this feature.

What does the feature do?

It is possible to add a feature to make a strategy for retention policy based on max(event_time(or another column that we use for data splitting))?

I am offering to make a feature like this: when we make a retention policy, add a user possibility to choose a retention strategy for the table. SELECT add_retention_policy('test_table', INTERVAL '1 year', MAX_DATE_MODE); And it will delete data only if it is older than the younger record in the table.

Implementation challenges

No response

mfreed commented 1 year ago

One thing to be careful about is you probably want min(now(), MAX_DATE_MODE). Earlier we had done some features by keying off of max date (as opposed to now()), and the tricky bit is that if you ever accidentally add an entry way into the future (including like MAX_INT due to an error, or to erroneous clocks), you risk wiping away all your data.