timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.97k stars 883 forks source link

Support for CREATE INDEX CONCURRENTLY #504

Open pedrokost opened 6 years ago

pedrokost commented 6 years ago

Hypertables do not support concurrent index creation.

Hypertable tables often have high ingress rate, and locking the table to create an index is undesirable.

It would be great to see support for concurrent index creation in TimescaleDB.

mfreed commented 5 years ago

@pedrokost This isn't exactly the same, but I can confirm that you can create a concurrent index on an individual chunk, and we also just merged a fix that performs an individual transaction (lock) on each individual chunk, rather than a lock on the entire hypertable:

https://github.com/timescale/timescaledb/pull/1005

dianasaur323 commented 5 years ago

@JLockerman Can we close this out? I think #1005 might adequately address this issue. Thoughts?

kvc0 commented 5 years ago

1005 does not adequately address this issue. Extended table write outages are not really an option in cases like mine, and I can not know which indexes I will need ahead of time (living schema). I have a large amount of RAM (hundreds of gb) so my chunks are often quite sizable - the chunk lock is still critically long.

I would hazard that temporally local write amplification is far preferable to write outage in the general case and while anecdotal, I imagine I'm not the only one who needs CONCURRENTLY support.

joekohlsdorf commented 4 years ago

@pedrokost This isn't exactly the same, but I can confirm that you can create a concurrent index on an individual chunk, and we also just merged a fix that performs an individual transaction (lock) on each individual chunk, rather than a lock on the entire hypertable:

1005

And after creating the index on individual chunks, what do you do to keep creating it on new chunks? I found no way of creating the index on the base table afterwards. If you mess with _timescaledb_catalog.chunk_index you get a duplicate key error and if you run CREATE INDEX IF NOT EXISTS on the base table it will create a second index with the name suffix _1 on each chunk.

1005 still takes the lock far too long if you have big chunks.

stampy88 commented 8 months ago

Is there a solution for this?

And after creating the index on individual chunks, what do you do to keep creating it on new chunks? I found no way of creating the index on the base table afterwards. If you mess with _timescaledb_catalog.chunk_index you get a duplicate key error and if you run CREATE INDEX IF NOT EXISTS on the base table it will create a second index with the name suffix _1 on each chunk.

Normal postgres allows you to create index concurrently for each partition and then from the table. per the docs

When CREATE INDEX is invoked on a partitioned table, the default behavior is to recurse to all partitions to ensure they all have matching indexes. Each partition is first checked to determine whether an equivalent index already exists, and if so, that index will become attached as a partition index to the index being created, which will become its parent index. If no matching index exists, a new index will be created and automatically attached; the name of the new index in each partition will be determined as if no index name had been specified in the command. If the ONLY option is specified, no recursion is done, and the index is marked invalid. (ALTER INDEX ... ATTACH PARTITION marks the index valid, once all partitions acquire matching indexes.) Note, however, that any partition that is created in the future using CREATE TABLE ... PARTITION OF will automatically have a matching index, regardless of whether ONLY is specified.