timescale / timescaledb-backfill

Backfill hypertable data from one timescale instance to another
Apache License 2.0

TS 2.14 compression settings are stored differently #148

alejandrodnm closed this issue 6 months ago

alejandrodnm commented 10 months ago

More context is pending

https://iobeam.slack.com/archives/C04M7NQFH89/p1704788673081989

JamesGuthrie commented 9 months ago

I took a look into this.

What changed?

Previously, when compression was enabled, a parent table was created for the compressed data. Compressed chunks corresponding to the uncompressed chunks were then created as child tables of that parent table:

- parent compressed table
   - compressed chunk 1
   - compressed chunk 2
   - ...
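
The layout above is plain Postgres table inheritance, so it can be inspected through the standard pg_inherits catalog. A quick illustration (the parent table name is an example of TimescaleDB's internal naming, not a literal name):

```sql
-- List the compressed chunks attached as children of a parent compressed
-- table. '_timescaledb_internal._compressed_hypertable_2' is an example
-- name; the actual name depends on the hypertable's internal ID.
SELECT inhrelid::regclass AS compressed_chunk
FROM pg_inherits
WHERE inhparent = '_timescaledb_internal._compressed_hypertable_2'::regclass;
```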

With these changes, each compressed chunk can have a different internal layout, so compressed chunks are no longer created as children of the parent compressed table:

- parent compressed table (placeholder, no columns)
- compressed chunk 1
- compressed chunk 2
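
Alongside this layout change, 2.14 stores the compression settings themselves per relation rather than per hypertable. A rough way to see this (assuming the `_timescaledb_catalog.compression_settings` catalog that 2.14 introduces; the exact column list may differ):

```sql
-- Inspect per-relation compression settings (2.14+). Each compressed
-- relation carries its own segmentby/orderby configuration, which is why
-- chunks can now have differing internal layouts.
SELECT relid::regclass AS relation, segmentby, orderby
FROM _timescaledb_catalog.compression_settings;
```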

How are we affected?

  1. Users must not adjust compression settings in the source or the target before all chunks have been backfilled.
  2. If a compressed chunk in the source has no corresponding (even empty) compressed chunk in the target (this will happen during "normal" operations), our current copy procedure breaks, as explained below.

The way we copy compressed chunks from the source to the target is as follows: if there is a compressed chunk in the source but not in the target, we create a new child table of the parent compressed table, copy the compressed rows from the source into it, and then use an internal API to promote the child table to a chunk belonging to the parent compressed table.
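
Roughly, that procedure looks like the sketch below. All names are illustrative, and the final promotion step is only indicated in a comment, since it goes through an internal API rather than a documented SQL function:

```sql
-- Sketch of the current (pre-2.14) copy procedure; names are illustrative.

-- 1. Create a child of the parent compressed table. Through inheritance it
--    gets the parent's columns, so it can receive the source chunk's rows.
CREATE TABLE _timescaledb_internal.bf_new_compressed_chunk ()
    INHERITS (_timescaledb_internal._compressed_hypertable_2);

-- 2. Stream the compressed rows from the source into the new child table.
COPY _timescaledb_internal.bf_new_compressed_chunk FROM STDIN (FORMAT binary);

-- 3. Promote the child table to a chunk belonging to the parent compressed
--    table via the internal API (call not shown here).
```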

This will no longer work because the "parent compressed table" has no columns, so a child of it can't receive the rows of the source chunk. We also cannot create the chunk in the target using Timescale's create_compress_chunk function: a) it's not exposed via SQL, and b) the compression settings in the target could be different from the compression settings of the source chunk.

What must we do?

When creating the table to receive the compressed rows from the source chunk, we must use the table definition as it exists in the source database.
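
One way to do that is to read the compressed chunk's column list from the source's catalogs and replay it as a CREATE TABLE on the target. A sketch using standard Postgres catalogs (the chunk name is an example):

```sql
-- Read the compressed chunk's column definitions from the source database.
-- 'compress_hyper_2_3_chunk' is an example chunk name.
SELECT a.attname,
       format_type(a.atttypid, a.atttypmod) AS column_type
FROM pg_attribute a
WHERE a.attrelid = '_timescaledb_internal.compress_hyper_2_3_chunk'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY a.attnum;
```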

alejandrodnm commented 9 months ago

cc @giokostis @VineethReddy02

giokostis commented 9 months ago

Let's aim to make a decision on this item's priority by this coming Monday.

According to the latest information we have, 2.14 is coming out in ~2 weeks from now: https://iobeam.slack.com/archives/CDZGF8U7J/p1706610233161829

alejandrodnm commented 6 months ago

closed by #159