GangLiCN opened 4 years ago
Sorry, I forgot one "important" thing.
I also tried to "convert" all worker nodes to "master" nodes (in other words, after conversion, each original worker node serves as both coordinator and worker). The steps are as follows:
1) Run the following SQL statements on the worker nodes:
......
truncate table pg_dist_node;
truncate table pg_dist_partition;
truncate table pg_dist_shard;
--truncate table pg_dist_shard_placement;
truncate table pg_dist_colocation;
copy pg_dist_node from PROGRAM 'psql "host=master1 port=5432 dbname=benchmarksql user=benchmarksql" -Atc "copy pg_dist_node to STDOUT"';
copy pg_dist_partition from PROGRAM 'psql "host=master1 port=5432 dbname=benchmarksql user=benchmarksql" -Atc "copy pg_dist_partition to STDOUT"';
copy pg_dist_shard from PROGRAM 'psql "host=master1 port=5432 dbname=benchmarksql user=benchmarksql" -Atc "copy pg_dist_shard to STDOUT"';
--Must use "copy (select ...) to STDOUT" to copy data out of a view:
copy pg_dist_shard_placement from PROGRAM 'psql "host=master1 port=5432 dbname=benchmarksql user=benchmarksql" -Atc "copy (select * from pg_dist_shard_placement) to STDOUT"';
copy pg_dist_colocation from PROGRAM 'psql "host=master1 port=5432 dbname=benchmarksql user=benchmarksql" -Atc "copy pg_dist_colocation to STDOUT"';
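A quick way to verify the transfer is to compare catalog row counts on the coordinator and on a converted worker (a minimal sanity check, using only the Citus catalog tables already referenced above):

select count(*) from pg_dist_node;
select count(*) from pg_dist_partition;
select count(*) from pg_dist_shard;
select count(*) from pg_dist_shard_placement;
select count(*) from pg_dist_colocation;

The counts should match on both sides if the copy worked.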
Summary: after executing the above SQL statements, the coordinator node's metadata is transferred to the worker nodes, which "converts" each original worker node into a coordinator node as well.
2) Run "runBenchmark.sh".
Simple OLTP benchmarks often end up being bottlenecked on #connections / response time. In Citus, response time is naturally much higher than in PostgreSQL because every statement involves an extra network round-trip, hence throughput is lower unless you make sure that there is a sufficient number of connections to keep the coordinator busy.
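To make that concrete with hypothetical numbers: if a transaction takes 5 ms end-to-end on plain PostgreSQL and the extra round trips push that to 10 ms on Citus, then 20 clients can drive at most 20 / 0.005 s = 4,000 transactions per second on PostgreSQL but only 20 / 0.010 s = 2,000 on Citus; roughly doubling the number of connections is needed just to get back to the same throughput.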
The other challenge with this type of OLTP benchmark is that almost all the work is in planning the query, not in executing it. In the Citus case that means the coordinator has to do about as much work as a single PostgreSQL server would normally do. In practice, applications that require scale (read: Citus) often have much higher execution times (e.g. because data does not fit in memory) and this equation works out very differently.
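One way to see that split on either system is EXPLAIN ANALYZE, which in PostgreSQL 11 reports planning time and execution time separately (the query below is a hypothetical single-row lookup against the BenchmarkSQL schema):

EXPLAIN ANALYZE
SELECT c_balance FROM bmsql_customer
WHERE c_w_id = 1 AND c_d_id = 1 AND c_id = 42;

For point lookups like this, planning time often rivals execution time, which is exactly the regime where the coordinator becomes the bottleneck.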
A somewhat more sophisticated TPC-C implementation is HammerDB, which is published by the TPC-council: https://github.com/TPC-Council/HammerDB
HammerDB wraps transactions in stored procedures, and Citus can delegate the call to the procedure to a worker node with minimal planning overhead on the coordinator and in a single network round trip. Hence, it gives a better reflection of the benefits of scaling out.
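For illustration, the pattern looks roughly like this; treat it as a sketch rather than HammerDB's actual code (the procedure body is a placeholder, and it assumes bmsql_oorder is a table distributed by warehouse id):

CREATE PROCEDURE new_order_txn(p_w_id int, p_d_id int)
LANGUAGE plpgsql AS $$
BEGIN
    -- placeholder: the real TPC-C NEW_ORDER logic would go here
    UPDATE bmsql_district
       SET d_next_o_id = d_next_o_id + 1
     WHERE d_w_id = p_w_id AND d_id = p_d_id;
END;
$$;

-- Tell Citus to delegate each call to the worker that owns
-- the shard for p_w_id:
SELECT create_distributed_function(
    'new_order_txn(int,int)',
    distribution_arg_name := 'p_w_id',
    colocate_with := 'bmsql_oorder');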
We have some tooling to make it easier to run HammerDB against Citus: https://github.com/citusdata/ch-benchmark
I think this blog post explains it in even more detail: https://www.citusdata.com/blog/2020/11/21/making-postgres-stored-procedures-9x-faster-in-citus/
[Summary] I'm hitting a very tough issue: TPC-C performance is very poor (about 1/10 of a single PostgreSQL instance) when using the Citus extension on PostgreSQL.
[Environment]
PostgreSQL: 11.x
BenchmarkSQL: 5.0
Citus: 9.4
Coordinator nodes: 1
Worker nodes: 2
[Description]
SELECT create_reference_table('bmsql_item');
SELECT create_reference_table('bmsql_district');
SELECT create_reference_table('bmsql_warehouse');
SELECT create_reference_table('bmsql_stock');
The reason I created reference tables for [warehouse], [stock], [item], and [district] is a foreign key limitation, which prevented me from creating distributed tables for all table types.
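For context, the remaining BenchmarkSQL tables would then typically be distributed by their warehouse-id column so they stay co-located (column names follow the standard BenchmarkSQL schema; the issue does not show these statements, so this is an assumed sketch of the setup):

SELECT create_distributed_table('bmsql_customer', 'c_w_id');
SELECT create_distributed_table('bmsql_history', 'h_w_id');
SELECT create_distributed_table('bmsql_oorder', 'o_w_id');
SELECT create_distributed_table('bmsql_new_order', 'no_w_id');
SELECT create_distributed_table('bmsql_order_line', 'ol_w_id');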
Executed "runDatabaseBuild.sh" to load warehouse data into database, then created index and foreign keys, finally vacuumed all table's statistics data;
Executed "runBenchmark.sql" to run TPCC test, unfortunately the result is very poor, "tpmC" is about 2000。
[Questions]
Thanks in advance for your help!
Attached are two screenshots of the different TPC-C test results. Test environment: 100 warehouses, 20 clients.