Closed Mad-Apes closed 12 months ago
In principle, you can populate data in parallel by spawning several concurrent populate statements with different predicates:
select populate(destination:='vops_table_gb'::regclass, source:='std_table'::regclass, sort:='column_int2,column_int4', predicate:='column_int2 >= 0 and column_int2 < 1');
select populate(destination:='vops_table_gb'::regclass, source:='std_table'::regclass, sort:='column_int2,column_int4', predicate:='column_int2 >= 1 and column_int2 < 2');
...
This allows you to load all CPU cores and perform the import up to N times faster. But please take into account that parallel inserts will violate the sequential order of records in the vops_table_gb table, which may have a negative impact on the performance of subsequent select queries. This is one of the reasons why I did not implement parallel load in VOPS.
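A minimal sketch of the approach above: split the key range into N contiguous, non-overlapping predicates and emit one populate() statement per range, so each statement can be run from its own database session (for example, one psql process or one worker connection per statement). The range bounds, worker count, and the helper name below are hypothetical; the table, column, and parameter names are taken from the statements in this thread.

```python
def make_populate_statements(lo, hi, n_workers,
                             destination="vops_table_gb",
                             source="std_table",
                             sort="column_int2,column_int4",
                             column="column_int2"):
    """Split [lo, hi) into n_workers contiguous ranges and build one
    populate() call per range. The predicates are disjoint and their
    union covers [lo, hi), so every row is imported exactly once."""
    # Ceiling division so the last range absorbs any remainder.
    step = (hi - lo + n_workers - 1) // n_workers
    statements = []
    for start in range(lo, hi, step):
        end = min(start + step, hi)
        predicate = f"{column} >= {start} and {column} < {end}"
        statements.append(
            f"select populate(destination:='{destination}'::regclass, "
            f"source:='{source}'::regclass, "
            f"sort:='{sort}', "
            f"predicate:='{predicate}');"
        )
    return statements

# Each generated statement would then be launched concurrently, e.g.
# one `psql -c "<statement>"` per shell background job, or one thread
# per connection in your client of choice (execution not shown here).
for stmt in make_populate_statements(0, 8, 4):
    print(stmt)
```

Note that this only parallelizes the import; as mentioned above, the resulting row order in vops_table_gb is no longer sequential, which may hurt later select performance.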
Hello, through recent practice I have found that query performance has greatly improved, but the populate function consumes a lot of time: it takes more than 200 seconds when the source table contains 2,000,000 rows. What should I do to make it faster?