Closed JamesGuthrie closed 4 months ago
If I understand the PR correctly, your patch wouldn't send the catalog updates for the chunk tables itself, would it? That would actually break functionality on my tool, since I completely replicate the catalog in memory and need that information.
I'd probably define a second set of markers with slightly different prefix names. In this case everybody would be able to decide to either ignore them or act on them. Alternatively, we could use the message body, like adding the table names, but I think the second prefix is easier and more obvious.
Changing the GUC name is ok, I can let people know in the docs :)
If I understand the PR correctly, your patch wouldn't send the catalog updates for the chunk tables itself, would it?
This PR doesn't change which catalog updates are emitted. One thing that has changed is where exactly the "end decompression marker" is in the WAL. It now surrounds both inserts into the uncompressed chunk, and the associated catalog changes. To visualise this:
Previously the WAL would look like this for a "transparent decompression" event (e.g. UPDATE to compressed data):
BEGIN
message: transactional: 1 prefix: ::timescaledb-decompression-start, sz: 0 content:
table _timescaledb_internal.compress_hyper_2_2_chunk: DELETE: (no-tuple-data)
table _timescaledb_internal._hyper_1_1_chunk: INSERT: "time"[timestamp with time zone]:'2023-06-30 17:00:00-07' device_id[bigint]:1 value[double precision]:1
table _timescaledb_internal._hyper_1_1_chunk: INSERT: "time"[timestamp with time zone]:'2023-06-30 18:00:00-07' device_id[bigint]:1 value[double precision]:1
message: transactional: 1 prefix: ::timescaledb-decompression-end, sz: 0 content:
table _timescaledb_catalog.chunk: UPDATE: id[integer]:1 hypertable_id[integer]:1 schema_name[name]:'_timescaledb_internal' table_name[name]:'_hyper_1_1_chunk' compressed_chunk_id[integer]:2 dropped[boolean]:false status[integer]:9 osm_chunk[boolean]:false
table _timescaledb_internal._hyper_1_1_chunk: UPDATE: old-key: "time"[timestamp with time zone]:'2023-06-30 17:00:00-07' device_id[bigint]:1 value[double precision]:1 new-tuple: "time"[timestamp with time zone]:'2023-06-30 17:00:00-07' device_id[bigint]:1 value[double precision]:22
COMMIT
Now, it looks like this:
BEGIN
message: transactional: 1 prefix: ::timescaledb-decompression-start, sz: 0 content:
table _timescaledb_internal.compress_hyper_2_2_chunk: DELETE: (no-tuple-data)
table _timescaledb_internal._hyper_1_1_chunk: INSERT: "time"[timestamp with time zone]:'2023-06-30 17:00:00-07' device_id[bigint]:1 value[double precision]:1
table _timescaledb_internal._hyper_1_1_chunk: INSERT: "time"[timestamp with time zone]:'2023-06-30 18:00:00-07' device_id[bigint]:1 value[double precision]:1
table _timescaledb_catalog.chunk: UPDATE: id[integer]:1 hypertable_id[integer]:1 schema_name[name]:'_timescaledb_internal' table_name[name]:'_hyper_1_1_chunk' compressed_chunk_id[integer]:2 dropped[boolean]:false status[integer]:9 osm_chunk[boolean]:false
message: transactional: 1 prefix: ::timescaledb-decompression-end, sz: 0 content:
table _timescaledb_internal._hyper_1_1_chunk: UPDATE: old-key: "time"[timestamp with time zone]:'2023-06-30 17:00:00-07' device_id[bigint]:1 value[double precision]:1 new-tuple: "time"[timestamp with time zone]:'2023-06-30 17:00:00-07' device_id[bigint]:1 value[double precision]:22
COMMIT
Note that the timescaledb-decompression-end
message now comes after the UPDATE to the _timescaledb_catalog.chunk
table.
IMHO this is "more correct", as the decompression markers now surround all decompression activities. Your tool can decide which kinds of activities it does or does not ignore in the context of those markers.
Ah I see. Yeah that is totally fine, since I can still act on all catalog entries, while ignoring anything outside the catalog 👍
This allows to distinguish between "user-driven" and "compression-driven" DML on uncompressed chunks. This is a requirement to be able to support DML on compressed chunks in live migration.
Note: A previous commit 1 added wal markers before and after inserts which were part of "transparent decompression". Transparent decompression is triggered when an UPDATE or DELETE statement affects compressed data, or an INSERT statment inserts into a range of compressed data which has a unique or primary key constraint. In these cases, the data is first moved from the compressed chunk to the uncompressed chunk, and then the DML is applied.
This change extends the existing behaviour on two fronts: