ApsaraDB / PolarDB-for-PostgreSQL

A cloud-native database based on PostgreSQL developed by Alibaba Cloud.
https://apsaradb.github.io/PolarDB-for-PostgreSQL/zh/
Apache License 2.0

[Question] "ERROR: could not resize shared memory segment" when running the HTAP test #302

Closed rong961 closed 1 year ago

rong961 commented 2 years ago

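Running the HTAP test fails with ERROR: could not resize shared memory segment

Before running the PolarDB test with oltpbench, we verified connectivity to the read-write node and to all read-only nodes; we could log in with psql -h -p -Upostgres postgres. We then ran TP transactions on the read-write node and AP queries on one of the read-write nodes (we assumed the load would be balanced automatically among the read-only nodes). Both the TP and the AP database connections were configured with the read-write node's IP. During the run, the read-write node reported no errors, but the AP run produced the following errors: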

07:26:41,776 (Worker.java:501) WARN  - The DBMS rejected the transaction without an error code
org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:309)
        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:295)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:272)
        at org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:246)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.Q15.run(Q15.java:71)
        at com.oltpbenchmark.benchmarks.chbenchmark.CHBenCHmarkWorker.executeWork(CHBenCHmarkWorker.java:37)
        at com.oltpbenchmark.api.Worker.doWork(Worker.java:388)
        at com.oltpbenchmark.api.Worker.run(Worker.java:296)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.postgresql.util.PSQLException: ERROR: could not resize shared memory segment "/PostgreSQL.1522265545" to 8388608 bytes: No space left on device
  Where: parallel worker
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:158)
        at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.GenericQuery.run(GenericQuery.java:77)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.Q15.run(Q15.java:69)
        ... 4 more
07:28:32,885 (Worker.java:501) WARN  - The DBMS rejected the transaction without an error code
org.postgresql.util.PSQLException: ERROR: could not resize shared memory segment "/PostgreSQL.1509574628" to 8388608 bytes: No space left on device
  Where: parallel worker
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:158)
        at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.GenericQuery.run(GenericQuery.java:77)
        at com.oltpbenchmark.benchmarks.chbenchmark.CHBenCHmarkWorker.executeWork(CHBenCHmarkWorker.java:37)
        at com.oltpbenchmark.api.Worker.doWork(Worker.java:388)
        at com.oltpbenchmark.api.Worker.run(Worker.java:296)
        at java.lang.Thread.run(Thread.java:750)
07:38:58,125 (Worker.java:501) WARN  - The DBMS rejected the transaction without an error code
org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:309)
        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:295)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:272)
        at org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:246)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.Q15.run(Q15.java:71)
        at com.oltpbenchmark.benchmarks.chbenchmark.CHBenCHmarkWorker.executeWork(CHBenCHmarkWorker.java:37)
        at com.oltpbenchmark.api.Worker.doWork(Worker.java:388)
        at com.oltpbenchmark.api.Worker.run(Worker.java:296)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.postgresql.util.PSQLException: ERROR: could not resize shared memory segment "/PostgreSQL.888046644" to 8388608 bytes: No space left on device
  Where: parallel worker
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:158)
        at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.GenericQuery.run(GenericQuery.java:77)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.Q15.run(Q15.java:69)
        ... 4 more
07:52:39,608 (Worker.java:501) WARN  - The DBMS rejected the transaction without an error code
org.postgresql.util.PSQLException: ERROR: could not resize shared memory segment "/PostgreSQL.1289792276" to 8388608 bytes: No space left on device
  Where: parallel worker
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:158)
        at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.GenericQuery.run(GenericQuery.java:77)
        at com.oltpbenchmark.benchmarks.chbenchmark.CHBenCHmarkWorker.executeWork(CHBenCHmarkWorker.java:37)
        at com.oltpbenchmark.api.Worker.doWork(Worker.java:388)
        at com.oltpbenchmark.api.Worker.run(Worker.java:296)
        at java.lang.Thread.run(Thread.java:750)
07:53:24,679 (Worker.java:501) WARN  - The DBMS rejected the transaction without an error code
org.postgresql.util.PSQLException: ERROR: could not resize shared memory segment "/PostgreSQL.335896825" to 8388608 bytes: No space left on device
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:158)
        at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.GenericQuery.run(GenericQuery.java:77)
        at com.oltpbenchmark.benchmarks.chbenchmark.CHBenCHmarkWorker.executeWork(CHBenCHmarkWorker.java:37)
        at com.oltpbenchmark.api.Worker.doWork(Worker.java:388)
        at com.oltpbenchmark.api.Worker.run(Worker.java:296)
        at java.lang.Thread.run(Thread.java:750)
07:53:37,134 (Worker.java:501) WARN  - The DBMS rejected the transaction without an error code
org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:309)
        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:295)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:272)
        at org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:246)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.Q15.run(Q15.java:71)
        at com.oltpbenchmark.benchmarks.chbenchmark.CHBenCHmarkWorker.executeWork(CHBenCHmarkWorker.java:37)
        at com.oltpbenchmark.api.Worker.doWork(Worker.java:388)
        at com.oltpbenchmark.api.Worker.run(Worker.java:296)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.postgresql.util.PSQLException: ERROR: could not resize shared memory segment "/PostgreSQL.759095010" to 8388608 bytes: No space left on device
  Where: parallel worker
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:158)
        at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.GenericQuery.run(GenericQuery.java:77)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.Q15.run(Q15.java:69)
        ... 4 more
07:55:02,147 (Worker.java:501) WARN  - The DBMS rejected the transaction without an error code
org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:309)
        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:295)
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:272)
        at org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:246)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.Q15.run(Q15.java:71)
        at com.oltpbenchmark.benchmarks.chbenchmark.CHBenCHmarkWorker.executeWork(CHBenCHmarkWorker.java:37)
        at com.oltpbenchmark.api.Worker.doWork(Worker.java:388)
        at com.oltpbenchmark.api.Worker.run(Worker.java:296)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.postgresql.util.PSQLException: ERROR: could not resize shared memory segment "/PostgreSQL.67511109" to 8388608 bytes: No space left on device
  Where: parallel worker
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:158)
        at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.GenericQuery.run(GenericQuery.java:77)
        at com.oltpbenchmark.benchmarks.chbenchmark.queries.Q15.run(Q15.java:69)
        ... 4 more

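Based on this error, we tried changing the Docker container's shm_size. We tried two approaches: stopping the container and editing its configuration file, and passing the setting when restarting the container. Neither changed the value. Is the way we are running the test incorrect, or is there a specific parameter we need to change?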

polardb-bot[bot] commented 2 years ago

Hi @rong961 ~ Thanks for opening this issue! 🎉

Please make sure you have provided enough information for subsequent discussion.

We will get back to you as soon as possible. ❤️

mrdrivingduck commented 2 years ago

@rong961 Does this work?

docker run -it \
    ...
    --shm-size=512m \
    ...
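
To confirm that the option took effect, one could check the size of /dev/shm from inside the container. The sketch below is only an illustration: it assumes that the failing segments are backed by /dev/shm (which the "No space left on device" message suggests) and takes the 8388608-byte segment size from the error log above. The function name shm_headroom is hypothetical. For reference, Docker's default --shm-size is 64 MB, which leaves room for only about eight such segments.

```python
import os
import shutil

# Segment size reported in the error message above (8 MiB).
SEGMENT_BYTES = 8388608


def shm_headroom(path="/dev/shm", segment_bytes=SEGMENT_BYTES):
    """Return (total, free, fits), where fits is how many segments of
    segment_bytes still fit into the free space of the filesystem at path."""
    usage = shutil.disk_usage(path)
    return usage.total, usage.free, usage.free // segment_bytes


if __name__ == "__main__":
    # Assumption: run inside the container, where /dev/shm is the tmpfs
    # sized by --shm-size.
    if os.path.isdir("/dev/shm"):
        total, free, fits = shm_headroom()
        print(f"/dev/shm total={total} free={free} "
              f"room for {fits} more segments of {SEGMENT_BYTES} bytes")
```

If the reported total still reads about 64 MB after the container is recreated, the option was probably not applied to the container that is actually running.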

rong961 commented 2 years ago

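Adding this option works, thanks. However, the test results show that TP and AP interfere with each other quite severely. The tests connect to the database from outside the container, and both TP and AP connect to PostgreSQL on the read-write node. What could be the cause?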

AP query performance when only AP runs on the read-write node: "Throughput (requests/second)": 0.15982239541273163, "isolation": "TRANSACTION_REPEATABLE_READ", "scalefactor": "10", "terminals": "2"

AP query performance when TP and AP run on the read-write node at the same time: "Throughput (requests/second)": 0.028856822919917844, "isolation": "TRANSACTION_REPEATABLE_READ", "scalefactor": "10", "terminals": "2"
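
To put the two throughput figures in proportion, the slowdown can be worked out directly. This is plain arithmetic on the numbers reported above; the variable names are only illustrative.

```python
# Throughput values (requests/second) copied from the two runs above.
ap_only = 0.15982239541273163
ap_with_tp = 0.028856822919917844

# How many times slower AP becomes once TP runs alongside it.
slowdown = ap_only / ap_with_tp

# The same change expressed as a percentage drop in throughput.
drop_pct = (1 - ap_with_tp / ap_only) * 100

print(f"{slowdown:.2f}x slower, {drop_pct:.1f}% lower throughput")
```

In other words, AP throughput falls to under a fifth of its standalone value when both workloads share the read-write node.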

mrdrivingduck commented 2 years ago

@rong961 Currently, we haven't implemented any resource isolation between TP and AP workloads, so they do interfere with each other: they use the same hardware resources and, when necessary, acquire locks on the same tables.

Ideally, we recommend isolating TP and AP physically: use the RW node and several RO nodes for TP workloads, and the RW node and several other RO nodes for AP workloads. A proxy may help to implement this isolation.
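
As a rough illustration of the proxy idea, the sketch below routes read-only queries to one of two separate pools of RO nodes depending on the workload type. It is not PolarDB's proxy and not an existing API: the class WorkloadRouter, the "TP"/"AP" labels, and the host names are all hypothetical, and writes, which must still go to the RW node, are left out.

```python
from itertools import cycle


class WorkloadRouter:
    """Round-robin over two separate pools of read-only nodes, so that
    TP reads and AP queries never land on the same RO node."""

    def __init__(self, tp_hosts, ap_hosts):
        # One independent round-robin iterator per workload type.
        self._pools = {"TP": cycle(tp_hosts), "AP": cycle(ap_hosts)}

    def pick(self, workload):
        """Return the next host for the given workload ("TP" or "AP")."""
        return next(self._pools[workload])


# Hypothetical host names, for illustration only.
router = WorkloadRouter(
    tp_hosts=["ro1.example", "ro2.example"],
    ap_hosts=["ro3.example", "ro4.example"],
)
print(router.pick("AP"))  # first host of the AP pool
print(router.pick("TP"))  # first host of the TP pool
```

A benchmark client would then open its connection against the host returned by pick, so that the AP terminals and the TP terminals stop competing for the same node.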

mrdrivingduck commented 1 year ago

/close