Tencent / TBase

TBase is an enterprise-level distributed HTAP database. It provides strongly consistent distributed database services and high-performance data warehouse services through a single database cluster, forming an integrated enterprise-level solution.

Errors when stress-testing TBase with benchmark 5.0 #130

Open yangql opened 2 years ago

yangql commented 2 years ago

benchmark configuration

warehouses=1000
terminals=500
runMins=10

Architecture

gtm master
cn01 master
cn02 master
datanode1 master
datanode2 master

Connection parameters on the CN nodes

/tbase/pgxc/nodes/coord_master/postgresql.conf:max_pool_size = 2000
/tbase/pgxc/nodes/coord_master/postgresql.conf:max_connections = 2000

Connection parameters on the DN nodes

/tbase/pgxc/nodes/dn001/postgresql.conf:max_connections = 8000
/tbase/pgxc/nodes/dn001/postgresql.conf:max_pool_size = 8000

/tbase/pgxc/nodes/dn002/postgresql.conf:max_connections = 8000
/tbase/pgxc/nodes/dn002/postgresql.conf:max_pool_size = 8000
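As a rough sanity check on these settings, the sketch below estimates how many pooled connections the benchmark could demand in the worst case. It rests on assumptions that are not stated in this thread: that each client session on a coordinator holds at most one pooled connection to every datanode, and that terminals are spread evenly across the coordinators. The function name `worst_case_pool_demand` is hypothetical.

```python
import math


def worst_case_pool_demand(terminals, coordinators, datanodes):
    """Return (pooled connections per CN, backends per DN) in the worst case.

    Assumption (not from the thread): one client session holds at most one
    pooled connection to each datanode at any time.
    """
    sessions_per_cn = math.ceil(terminals / coordinators)
    # Connections a single coordinator's pooler may need to open in total.
    per_cn_pool = sessions_per_cn * datanodes
    # Backends a single datanode may see from all coordinators together.
    per_dn_backends = sessions_per_cn * coordinators
    return per_cn_pool, per_dn_backends


# Values from the report: 500 terminals, 2 coordinators, 2 datanodes.
per_cn, per_dn = worst_case_pool_demand(terminals=500, coordinators=2, datanodes=2)
print(f"per coordinator: {per_cn}, per datanode: {per_dn}")
```

If this simplified model held, the estimate would stay below the configured limits (2000 on each CN, 8000 on each DN), which may suggest looking beyond the raw limits, for example at whether connections are being returned to the pool.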

Error message

11:13:17,498 [Thread-295] ERROR jTPCCTData : ERROR: node:dn001, backend_pid:19703, nodename:dn001,backend_pid:19703,message:Failed to get pooled connections Hint: This may happen because one or more nodes are currently unreachable, either because of node or network failure. Its also possible that the target node may have hit the connection limit or the pooler is configured with low connections. Please check if all nodes are running fine and also review max_connections and max_pool_size configuration parameters
11:13:17,498 [Thread-55] ERROR jTPCCTData : Unexpected SQLException in STOCK_LEVEL
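To see which datanodes the failures cluster on, the benchmark log can be summarized per node. The following is a minimal sketch that assumes log lines look like the ones quoted above; the helper name `count_pool_errors` and the regular expression are illustrative and not part of the benchmark tool.

```python
import re
from collections import Counter

# Matches the "node:<name>, backend_pid:<pid>" prefix seen in the quoted log.
ERROR_RE = re.compile(r"ERROR: node:(?P<node>\w+), backend_pid:(?P<pid>\d+)")


def count_pool_errors(lines):
    """Count 'Failed to get pooled connections' errors per reporting node."""
    counts = Counter()
    for line in lines:
        if "Failed to get pooled connections" not in line:
            continue
        match = ERROR_RE.search(line)
        if match:
            counts[match.group("node")] += 1
    return counts


sample = [
    "11:13:17,498 [Thread-295] ERROR jTPCCTData : ERROR: node:dn001, "
    "backend_pid:19703, nodename:dn001,backend_pid:19703,"
    "message:Failed to get pooled connections",
    "11:13:17,498 [Thread-55] ERROR jTPCCTData : "
    "Unexpected SQLException in STOCK_LEVEL",
]
for node, count in count_pool_errors(sample).items():
    print(node, count)
```

In practice the `sample` list would be replaced by the lines of the actual benchmark log file.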

The pg_stat_activity view shows that SQL requests are blocked, and pg_prepared_xacts lists the following prepared transactions:

postgres=# select * from pg_prepared_xacts;
 transaction |           gid           |           prepared            |   owner   | database
-------------+-------------------------+-------------------------------+-----------+-----------
      966365 | $XC$1529356:cn001:F:2:0 | 2022-04-01 16:13:33.979825+08 | benchmark | benchmark
      965853 | $XC$1528537:cn001:F:2:0 | 2022-04-01 16:13:33.982607+08 | benchmark | benchmark
      966380 | $XC$1529348:cn001:F:2:0 | 2022-04-01 16:13:33.982619+08 | benchmark | benchmark
      966381 | $XC$1529314:cn001:F:2:0 | 2022-04-01 16:13:33.989759+08 | benchmark | benchmark
      966199 | $XC$1529014:cn001:F:2:0 | 2022-04-01 16:13:33.990128+08 | benchmark | benchmark
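One way to judge whether entries like these are lingering is to compare their prepared timestamps with the current time. The sketch below does that for rows copied from the output above. The five-minute threshold, the `now` value, and the helper names are assumptions chosen for illustration only.

```python
import re
from datetime import datetime, timedelta


def parse_pg_timestamp(text):
    """Parse a psql-style timestamptz such as '2022-04-01 16:13:33.979825+08'."""
    # psql prints hour-only offsets like "+08"; pad to "+0800" for %z.
    if re.search(r"[+-]\d{2}$", text):
        text += "00"
    fmt = "%Y-%m-%d %H:%M:%S.%f%z" if "." in text else "%Y-%m-%d %H:%M:%S%z"
    return datetime.strptime(text, fmt)


def stale_gids(rows, now, max_age=timedelta(minutes=5)):
    """Return gids whose prepared time is older than max_age (assumed threshold)."""
    return [gid for gid, prepared in rows
            if now - parse_pg_timestamp(prepared) > max_age]


# (gid, prepared) pairs copied from the query output in the report.
rows = [
    ("$XC$1529356:cn001:F:2:0", "2022-04-01 16:13:33.979825+08"),
    ("$XC$1528537:cn001:F:2:0", "2022-04-01 16:13:33.982607+08"),
]
# Hypothetical observation time, roughly 16 minutes after the prepares.
now = parse_pg_timestamp("2022-04-01 16:30:00+08")
for gid in stale_gids(rows, now):
    print("possibly lingering:", gid)
```

A transaction flagged this way is only a candidate for closer inspection; a prepared transaction that disappears shortly afterwards is behaving normally.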

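Questions

1. The "Failed to get pooled connections" error suggests changing the connection parameters, but they are already set fairly high. What else should be changed?

2. For performance stress testing, are there any database parameters related to distributed transactions?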

JennyJennyChen commented 2 years ago

1. Recommended TPCC configuration: with, for example, three machines of 16 cores and 64 GB each, the tpcc parameters can be set as follows:

conn=jdbc:postgresql://192.168.0.2:11379,102.168.0.3:11381,192.168.0.4:11379/global?loadBalanceHosts=true&oracle_compile=true
warehouses=500
loadWorkers=32
terminals=96

Kernel parameters:

persistent_datanode_connections = 'on'
enable_material = 'off'
enable_bitmapscan = 'off'
max_wal_size = '12GB'
shared_buffers = '16GB'
checkpoint_timeout = '600'
min_wal_size = '4GB'
pooler_scale_factor = '64'
archive_status_control = 'continue'
maintenance_work_mem = '4GB'
effective_cache_size = '50GB'
max_parallel_workers_per_gather = '0'
max_pool_size = '65535'
work_mem = '8MB'
wal_keep_segments = '4096'

2. Regarding your select * from pg_prepared_xacts; output: the rows returned are 2PC transactions in the prepare phase. This is normal in a distributed system, and they finish on their own. If a transaction stays there for a long time, it may be a leftover 2PC transaction; see the following material to clean it up manually: https://github.com/Tencent/TBase/wiki/11-v2.3.0%E5%8D%87%E7%BA%A7%E7%89%B9%E6%80%A7pg_clean%E4%BD%BF%E7%94%A8%E8%AF%B4%E6%98%8E
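Returning to the kernel parameters recommended in point 1, a quick way to see how an existing node differs from them is to parse its postgresql.conf and compare. The sketch below is illustrative: the `RECOMMENDED` subset is taken from the list in this reply, while the parser and function names are hypothetical and handle only simple `key = value` lines.

```python
# Pooler-related subset of the kernel parameters recommended in this reply.
RECOMMENDED = {
    "persistent_datanode_connections": "on",
    "pooler_scale_factor": "64",
    "max_pool_size": "65535",
}


def parse_conf(text):
    """Parse simple 'key = value' lines; '#' starts a comment (no quoting rules)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def diff_against_recommended(current, recommended=RECOMMENDED):
    """Return {name: (current value or None, recommended value)} for mismatches."""
    return {name: (current.get(name), want)
            for name, want in recommended.items()
            if current.get(name) != want}


# Coordinator settings as reported at the top of this issue.
current = parse_conf("max_pool_size = 2000\nmax_connections = 2000\n")
for name, (have, want) in diff_against_recommended(current).items():
    print(f"{name}: current={have!r} recommended={want!r}")
```

A value of `None` for the current setting means the parameter does not appear in the parsed text, so the node is presumably running with its built-in default.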