timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.88k stars 882 forks source link

Update show_chunks() function to be space partition aware #1289

Open dianasaur323 opened 5 years ago

dianasaur323 commented 5 years ago

From community user: We can find the set of chunks for a day using show_chunks(). But if there's a second dimension on the hypertable that will show all the chunks for the time range. Can we find which chunk a given value for the second dimension will hash to? It would be very handy to find which individual chunk a time / value combination was stored in when there's a second dimension.

srstsavage commented 3 years ago

I need this info also. I'm looking at compressing older chunks, but will periodically need to decompress certain chunks to allow us to backfill data from new sources. We're exploring distributed hypertables and using a secondary space partioning column. I can't find a way to get the range of the secondary column for each chunk (not in timescaledb_information.chunks or accessible via a show_chunks argument), and without it I would need to decompress every chunk in the time range of the backfill instead of just those chunks whose secondary partitioning column range contains the new data.

srstsavage commented 3 years ago

Ok, after peeking at the secondary partitioning column values in some distributed tables I'm realizing it wouldn't make any sense to have a secondary range in timescaledb_information.chunks since the distribution is determined by hashing.

So instead, as the original request stated, it would be good to be able to feed partitioning column values to show_chunks to show which chunks correspond to those values.