postgrespro / vops

Other
166 stars 22 forks source link

Does vops support native partition? #10

Closed Mad-Apes closed 12 months ago

Mad-Apes commented 6 years ago

Hello, Does vops support native partition in PostgreSQL 10? How shuld I do to use VOPS in partition tables? Can you provide me a demo?

knizhnik commented 6 years ago

VOPS doesn't provide any special support for partitioning. But VOPS tables are normal PostgreSQL tables, so you can definitely use native PostgreSQL 10 partitioning or pathman extension.

The main question is which goal you want to achieve by using partitioning? If you do not you sharding, i.e. partitioning of data between several nodes, then the only advantage of partitioning from performance point of view is reducing size of indexes which make it possible to hold them in memory and significantly reduce time of index lookups. It first of all critical for secondary indexes, because them use to have worse locality of references than primary index. But VOPS is mostly oriented on sequential query processing and here partitioning doesn't help.

Scatter data between several nodes (sharding), can be done using standard partitining and Postgres FDW (foreign data wrapper). In this case in theory you can take an advantage of parallel query processing at multiple nodes. But in practice it is not yet possible: several patches include aggregates push-down, asynchronous execution of queries by postgres_fdw,... has to be applied to make it really work. So unfortunately right now this approach doesn't work.

Summarizing all above: you can do local or remote partitioning for VOPS tables. You will have to manually insert data in each partition. But right now you will not get some significant benefits in performance because of this partitioning. What can really improve performance, especially on multicore systems, is parallel query execution. Please check you execution plan and make sure that it includes parallel processing. If statistic is up-to-date, then Postgres 10.0 should use parallel plan with default configuration parameters. But you can manually adjust some of them, first of all max_worker_processes, to increase level of concurrency.