timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

[Bug]: Insert directly into compressed chunk doesn't update compression state #7026

Open erimatnor opened 3 weeks ago

erimatnor commented 3 weeks ago

What type of bug is this?

Unexpected error

What subsystems and features are affected?

Compression

What happened?

When inserting data directly into a compressed chunk (not via the hypertable), the "partially compressed" state is not properly updated. This prevents running recompression, instead leading to an error.

TimescaleDB version affected

2.15.2

PostgreSQL version used

16.2

What operating system did you use?

Ubuntu 24.04

What installation method did you use?

Source

What platform did you run on?

Other

Relevant log output and stack trace

insert into :chunk3 values ('2022-06-15 16:00', 8, 8, 8.0, 8.0);
select * from only :chunk3;
          created_at          | location_id | device_id | temp | humidity 
------------------------------+-------------+-----------+------+----------
 Wed Jun 15 16:00:00 2022 PDT |           8 |         8 |    8 |        8
(1 row)

select compress_chunk(:'chunk3', compress_using => 'heap');
NOTICE:  chunk "_hyper_1_5_chunk" is already compressed
             compress_chunk             
----------------------------------------
 _timescaledb_internal._hyper_1_5_chunk
(1 row)

How can we reproduce the bug?

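1. Create a hypertable with compression enabled and at least one compressed chunk.
2. Insert a row directly into the compressed chunk, as shown above.
3. Run compress_chunk() on the chunk. The new row is not recompressed; the call only reports that the chunk is already compressed (see the log above).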
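
The steps above can be sketched as a psql script. This is a minimal sketch, not the original test case: the table name `readings` and the first inserted row are hypothetical, the column layout is inferred from the log output, and the `compress_using` argument from the log is omitted. It assumes the default chunk interval, so both timestamps land in the same chunk.

```sql
-- Hypothetical schema modeled on the column names in the log output.
create table readings (
    created_at  timestamptz not null,
    location_id int,
    device_id   int,
    temp        float,
    humidity    float
);
select create_hypertable('readings', 'created_at');
alter table readings set (timescaledb.compress);

-- Create one chunk by inserting through the hypertable.
insert into readings values ('2022-06-15 12:00', 1, 1, 1.0, 1.0);

-- Store the chunk name in the psql variable :chunk, then compress it.
select show_chunks('readings') as chunk limit 1 \gset
select compress_chunk(:'chunk');

-- Insert directly into the compressed chunk, bypassing hypertable routing.
insert into :chunk values ('2022-06-15 16:00', 8, 8, 8.0, 8.0);

-- Per the report, this does not recompress the new row; it only
-- reports that the chunk is already compressed.
select compress_chunk(:'chunk');
```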

nikkhils commented 2 weeks ago

@erimatnor @antekresic as a rule, shouldn't we reject direct INSERTs into a compressed chunk?

erimatnor commented 2 weeks ago

> @erimatnor @antekresic as a rule, shouldn't we reject direct INSERTs into a compressed chunk?

That's an option, but I see no reason to reject it: it is a valid insert, and uncompressed data goes into that chunk anyway. The only difference when inserting via the hypertable is that the data is routed to the chunk instead of going there directly. The routing step is also where the partially compressed state is updated, which is why a direct insert leaves that state unchanged.