I tried to reproduce your observations and got similar results, but to me it looks more like a problem with locks in postgres rather than a client issue.
From my observations, an iteration usually makes some progress within ~1 second, or makes no progress at all for 10-20 seconds (which is why the test fails). What I tried to do was collect various useful info once an iteration stops making progress. I do this by running psql -c "SELECT * FROM pg_stat_activity", and strangely enough, this SELECT hangs for 10-20 seconds and completes only after the iteration starts to show progress again. I reproduced this several times; here's an example:
19:40:37 - new iteration has no progress (txes are stuck)
ps ax shows this:
114225 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43178) UPDATE waiting
114227 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43180) UPDATE waiting
114228 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43182) UPDATE waiting
114229 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43184) UPDATE waiting
114230 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43186) UPDATE waiting
114231 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43188) UPDATE
114232 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43190) UPDATE waiting
114233 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43192) UPDATE waiting
114235 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43194) UPDATE waiting
114236 ? Ss 0:18 postgres: zenith_admin postgres 127.0.0.1(43196) UPDATE
The psql for pg_stat_activity responds only at 19:40:51, when the iteration has already finished, and reports this:
datid | datname | pid | leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query_id | query | backend_type
-------+----------+--------+------------+----------+--------------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+---------------------+---------------------+-------------+--------------+----------+------------------------------------------------------------+------------------------------
| | 114046 | | | | | | | | 2022-03-08 19:38:26.821149+00 | | | | Activity | AutoVacuumMain | | | | | | autovacuum launcher
| | 114049 | | 10 | zenith_admin | | | | | 2022-03-08 19:38:26.821515+00 | | | | Activity | LogicalLauncherMain | | | | | | logical replication launcher
13008 | postgres | 114145 | | 10 | zenith_admin | | 127.0.0.1 | | 43154 | 2022-03-08 19:38:36.926781+00 | | 2022-03-08 19:40:47.030632+00 | 2022-03-08 19:40:47.030846+00 | Client | ClientRead | idle | | | | SELECT pg_current_wal_flush_lsn() | client backend
13008 | postgres | 114225 | | 10 | zenith_admin | | 127.0.0.1 | | 43178 | 2022-03-08 19:38:39.047488+00 | 2022-03-08 19:40:51.076809+00 | 2022-03-08 19:40:51.076809+00 | 2022-03-08 19:40:51.076885+00 | Client | ClientRead | idle in transaction | | | | BEGIN; | client backend
13008 | postgres | 114227 | | 10 | zenith_admin | | 127.0.0.1 | | 43180 | 2022-03-08 19:38:39.047659+00 | 2022-03-08 19:40:51.075756+00 | 2022-03-08 19:40:51.076598+00 | 2022-03-08 19:40:51.076717+00 | | | idle in transaction | 261679 | | | UPDATE bank_accs SET amount = amount - ($1) WHERE uid = $2 | client backend
13008 | postgres | 114228 | | 10 | zenith_admin | | 127.0.0.1 | | 43182 | 2022-03-08 19:38:39.048009+00 | 2022-03-08 19:40:50.459575+00 | 2022-03-08 19:40:51.076975+00 | 2022-03-08 19:40:51.076976+00 | IPC | SyncRep | active | 261578 | | | COMMIT; | client backend
13008 | postgres | 114230 | | 10 | zenith_admin | | 127.0.0.1 | | 43186 | 2022-03-08 19:38:39.048279+00 | 2022-03-08 19:40:51.076312+00 | 2022-03-08 19:40:51.076765+00 | 2022-03-08 19:40:51.076996+00 | Client | ClientRead | idle in transaction | 261681 | | | UPDATE bank_accs SET amount = amount + ($1) WHERE uid = $2 | client backend
13008 | postgres | 114232 | | 10 | zenith_admin | | 127.0.0.1 | | 43190 | 2022-03-08 19:38:39.048398+00 | 2022-03-08 19:40:50.813095+00 | 2022-03-08 19:40:50.813393+00 | 2022-03-08 19:40:50.813393+00 | Lock | transactionid | active | 261661 | 261467 | | UPDATE bank_accs SET amount = amount - ($1) WHERE uid = $2 | client backend
13008 | postgres | 114235 | | 10 | zenith_admin | | 127.0.0.1 | | 43194 | 2022-03-08 19:38:39.048663+00 | 2022-03-08 19:40:51.076451+00 | 2022-03-08 19:40:51.076968+00 | 2022-03-08 19:40:51.076969+00 | Client | ClientRead | active | | 261578 | | UPDATE bank_accs SET amount = amount + ($1) WHERE uid = $2 | client backend
13008 | postgres | 114233 | | 10 | zenith_admin | | 127.0.0.1 | | 43192 | 2022-03-08 19:38:39.048537+00 | 2022-03-08 19:40:51.075164+00 | 2022-03-08 19:40:51.076839+00 | 2022-03-08 19:40:51.076839+00 | IPC | SyncRep | active | 261677 | | | COMMIT; | client backend
13008 | postgres | 114231 | | 10 | zenith_admin | | 127.0.0.1 | | 43188 | 2022-03-08 19:38:39.049858+00 | 2022-03-08 19:40:51.075224+00 | 2022-03-08 19:40:51.076639+00 | 2022-03-08 19:40:51.076725+00 | IPC | SyncRep | idle in transaction | 261678 | | | INSERT INTO bank_log VALUES ($1, $2, $3) | client backend
13008 | postgres | 114236 | | 10 | zenith_admin | | 127.0.0.1 | | 43196 | 2022-03-08 19:38:39.048776+00 | 2022-03-08 19:40:51.075804+00 | 2022-03-08 19:40:51.076184+00 | 2022-03-08 19:40:51.076184+00 | Lock | transactionid | active | 261680 | 261578 | | UPDATE bank_accs SET amount = amount + ($1) WHERE uid = $2 | client backend
13008 | postgres | 114229 | | 10 | zenith_admin | | 127.0.0.1 | | 43184 | 2022-03-08 19:38:39.04795+00 | 2022-03-08 19:40:51.074831+00 | 2022-03-08 19:40:51.076655+00 | 2022-03-08 19:40:51.076656+00 | IPC | SyncRep | active | 261676 | | | COMMIT; | client backend
13008 | postgres | 115100 | | 10 | zenith_admin | psql | 127.0.0.1 | | 46174 | 2022-03-08 19:40:38.111575+00 | 2022-03-08 19:40:50.197456+00 | 2022-03-08 19:40:50.197456+00 | 2022-03-08 19:40:50.197457+00 | | | active | | 261578 | | SELECT * FROM pg_stat_activity | client backend
| | 114044 | | | | | | | | 2022-03-08 19:38:26.820909+00 | | | | Activity | BgWriterMain | | | | | | background writer
| | 114043 | | | | | | | | 2022-03-08 19:38:26.820495+00 | | | | Activity | CheckpointerMain | | | | | | checkpointer
| | 114045 | | | | | | | | 2022-03-08 19:38:26.820935+00 | | | | Activity | WalWriterMain | | | | | | walwriter
(17 rows)
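For context, the stall detection and the pg_stat_activity dump above were driven by a small loop along these lines (a sketch, not the actual test code; the connection string and the progress probe are placeholders):

```python
# Sketch of the diagnostics loop: when an iteration stops making progress,
# dump pg_stat_activity from a separate connection. The connection string
# and the progress probe are placeholders, not the real test harness.
import subprocess
import time

def progress(conn_str: str) -> int:
    # Placeholder probe: count of rows written by the workload so far.
    out = subprocess.run(
        ["psql", conn_str, "-Atc", "SELECT count(*) FROM bank_log"],
        capture_output=True, text=True, check=True)
    return int(out.stdout.strip())

def watch(conn_str: str, stall_seconds: int = 5) -> None:
    last = progress(conn_str)
    while True:
        time.sleep(stall_seconds)
        cur = progress(conn_str)
        if cur == last:
            # No progress: collect pg_stat_activity. In the hangs observed
            # above this psql itself blocked for 10-20 seconds.
            subprocess.run(["psql", conn_str, "-c",
                            "SELECT * FROM pg_stat_activity"])
        last = cur
```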
Not sure if pg_stat_activity tries to acquire some common lock or postgres is just unresponsive to new transactions and connections.
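One way to tell these apart from a second session is to look for ungranted pg_locks entries for the stuck backend's pid; an empty result means it is not waiting in the lock manager at all (a sketch; the connection string and pid are placeholders):

```python
# Sketch: check whether a given backend is waiting on a heavyweight lock.
# Connection string and pid are placeholders for the hung psql backend.
import psycopg2

def ungranted_locks(conn_str: str, pid: int):
    with psycopg2.connect(conn_str) as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT locktype, relation::regclass, transactionid, mode "
            "FROM pg_locks WHERE pid = %s AND NOT granted",
            (pid,),
        )
        return cur.fetchall()

# An empty list here would mean the backend is stuck elsewhere
# (the backtrace below shows it is actually waiting on buffer IO).
print(ungranted_locks("dbname=postgres", 115100))
```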
Tried to dig deeper; it seems I know where the bug is now, but I'm not entirely sure yet.
The issue is still not stable to reproduce, but I managed to get a backtrace of the SELECT * FROM pg_stat_activity backend when everything is stuck:
#0 0x00007f690dd7f58a in epoll_wait (epfd=10, events=0x55fd6c453568, maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1 0x000055fd6ac23956 in WaitEventSetWaitBlock (set=0x55fd6c453508, cur_timeout=-1, occurred_events=0x7ffd5d97a050, nevents=1) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/ipc/latch.c:1450
#2 0x000055fd6ac237d1 in WaitEventSetWait (set=0x55fd6c453508, timeout=-1, occurred_events=0x7ffd5d97a050, nevents=1, wait_event_info=134217734) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/ipc/latch.c:1396
#3 0x000055fd6ac22b11 in WaitLatch (latch=0x7f690d6a8ff4, wakeEvents=33, timeout=-1, wait_event_info=134217734) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/ipc/latch.c:473
#4 0x000055fd6ac35ae6 in ConditionVariableTimedSleep (cv=0x7f690d2f2ca0, timeout=-1, wait_event_info=134217734) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/lmgr/condition_variable.c:163
#5 0x000055fd6ac35a14 in ConditionVariableSleep (cv=0x7f690d2f2ca0, wait_event_info=134217734) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/lmgr/condition_variable.c:100
#6 0x000055fd6ac0fffa in WaitIO (buf=0x7f690d1f1780) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/buffer/bufmgr.c:4427
#7 0x000055fd6ac10088 in StartBufferIO (buf=0x7f690d1f1780, forInput=true) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/buffer/bufmgr.c:4464
#8 0x000055fd6ac0a636 in BufferAlloc (smgr=0x55fd6c4f53f0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, strategy=0x0, foundPtr=0x7ffd5d97a261) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/buffer/bufmgr.c:1165
#9 0x000055fd6ac09aaf in ReadBuffer_common (smgr=0x55fd6c4f53f0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0, hit=0x7ffd5d97a383) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/buffer/bufmgr.c:864
#10 0x000055fd6ac09765 in ReadBufferExtended (reln=0x7f690d10a168, forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/storage/buffer/bufmgr.c:771
#11 0x000055fd6a77da6d in heapgetpage (sscan=0x55fd6c5252a8, page=0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/access/heap/heapam.c:400
#12 0x000055fd6a77f63d in heapgettup_pagemode (scan=0x55fd6c5252a8, dir=ForwardScanDirection, nkeys=0, key=0x0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/access/heap/heapam.c:899
#13 0x000055fd6a780ea3 in heap_getnextslot (sscan=0x55fd6c5252a8, direction=ForwardScanDirection, slot=0x55fd6c524c80) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/access/heap/heapam.c:1396
#14 0x000055fd6aa3f537 in table_scan_getnextslot (sscan=0x55fd6c5252a8, direction=ForwardScanDirection, slot=0x55fd6c524c80) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/include/access/tableam.h:1044
#15 0x000055fd6aa3f603 in SeqNext (node=0x55fd6c524ad0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/nodeSeqscan.c:80
#16 0x000055fd6a9febf5 in ExecScanFetch (node=0x55fd6c524ad0, accessMtd=0x55fd6aa3f56b <SeqNext>, recheckMtd=0x55fd6aa3f614 <SeqRecheck>) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execScan.c:133
#17 0x000055fd6a9fec9a in ExecScan (node=0x55fd6c524ad0, accessMtd=0x55fd6aa3f56b <SeqNext>, recheckMtd=0x55fd6aa3f614 <SeqRecheck>) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execScan.c:199
#18 0x000055fd6aa3f66a in ExecSeqScan (pstate=0x55fd6c524ad0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/nodeSeqscan.c:112
#19 0x000055fd6a9faa97 in ExecProcNodeFirst (node=0x55fd6c524ad0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execProcnode.c:463
#20 0x000055fd6aa1df99 in ExecProcNode (node=0x55fd6c524ad0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/include/executor/executor.h:257
#21 0x000055fd6aa1e267 in MultiExecPrivateHash (node=0x55fd6c5ae670) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/nodeHash.c:166
#22 0x000055fd6aa1e1cb in MultiExecHash (node=0x55fd6c5ae670) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/nodeHash.c:115
#23 0x000055fd6a9fab8a in MultiExecProcNode (node=0x55fd6c5ae670) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execProcnode.c:519
#24 0x000055fd6aa24c6f in ExecHashJoinImpl (pstate=0x55fd6c5a0e18, parallel=false) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/nodeHashjoin.c:291
#25 0x000055fd6aa253da in ExecHashJoin (pstate=0x55fd6c5a0e18) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/nodeHashjoin.c:591
#26 0x000055fd6a9faa97 in ExecProcNodeFirst (node=0x55fd6c5a0e18) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execProcnode.c:463
#27 0x000055fd6aa24841 in ExecProcNode (node=0x55fd6c5a0e18) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/include/executor/executor.h:257
#28 0x000055fd6aa24ba6 in ExecHashJoinImpl (pstate=0x55fd6c5a0b10, parallel=false) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/nodeHashjoin.c:262
#29 0x000055fd6aa253da in ExecHashJoin (pstate=0x55fd6c5a0b10) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/nodeHashjoin.c:591
#30 0x000055fd6a9faa97 in ExecProcNodeFirst (node=0x55fd6c5a0b10) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execProcnode.c:463
#31 0x000055fd6a9eea73 in ExecProcNode (node=0x55fd6c5a0b10) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/include/executor/executor.h:257
#32 0x000055fd6a9f14ac in ExecutePlan (estate=0x55fd6c5a08a0, planstate=0x55fd6c5a0b10, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x55fd6c456e18, execute_once=true) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execMain.c:1551
#33 0x000055fd6a9ef14d in standard_ExecutorRun (queryDesc=0x55fd6c52c570, direction=ForwardScanDirection, count=0, execute_once=true) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execMain.c:361
#34 0x000055fd6a9eef5c in ExecutorRun (queryDesc=0x55fd6c52c570, direction=ForwardScanDirection, count=0, execute_once=true) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/executor/execMain.c:305
#35 0x000055fd6ac6124b in PortalRunSelect (portal=0x55fd6c4c0020, forward=true, count=0, dest=0x55fd6c456e18) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/tcop/pquery.c:921
#36 0x000055fd6ac60e84 in PortalRun (portal=0x55fd6c4c0020, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x55fd6c456e18, altdest=0x55fd6c456e18, qc=0x7ffd5d97ac30) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/tcop/pquery.c:765
#37 0x000055fd6ac5bbca in exec_execute_message (portal_name=0x55fd6c456a00 "", max_rows=9223372036854775807) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/tcop/postgres.c:2197
#38 0x000055fd6ac5ef06 in PostgresMain (argc=1, argv=0x7ffd5d97ae80, dbname=0x55fd6c484038 "postgres", username=0x55fd6c484010 "zenith_admin") at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/tcop/postgres.c:4579
#39 0x000055fd6ab845ff in BackendRun (port=0x55fd6c47d430) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/postmaster/postmaster.c:4536
#40 0x000055fd6ab83eb8 in BackendStartup (port=0x55fd6c47d430) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/postmaster/postmaster.c:4258
#41 0x000055fd6ab7ffa0 in ServerLoop () at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/postmaster/postmaster.c:1751
#42 0x000055fd6ab7f739 in PostmasterMain (argc=3, argv=0x55fd6c4524e0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/postmaster/postmaster.c:1423
#43 0x000055fd6aa74c68 in main (argc=3, argv=0x55fd6c4524e0) at /home/arthur/zenith/tmp_install/build/../../vendor/postgres/src/backend/main/main.c:216
It seems that this backend is blocked waiting for a shared buffer, so I increased shared_buffers to 512MB and tried to reproduce the issue again.
After that, the issue started taking more time to reproduce, but the test now fails at around the 25th iteration and everything hangs for a substantial time (no progress for several minutes, at least). That's convenient for debugging, and I got backtraces of all postgres processes while everything is stuck.
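Backtraces like the one above come from gdb; grabbing them for every backend at once can be scripted roughly like this (a sketch, assuming gdb is installed and the script is allowed to ptrace the postgres processes):

```python
# Sketch: attach gdb to every postgres process and print a backtrace.
import subprocess

def postgres_pids() -> list[int]:
    out = subprocess.run(["pgrep", "-f", "postgres:"],
                         capture_output=True, text=True)
    return [int(p) for p in out.stdout.split()]

for pid in postgres_pids():
    print(f"=== pid {pid} ===")
    subprocess.run(["gdb", "-batch", "-ex", "bt", "-p", str(pid)])
```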
Here's ps ax and the backtraces of all postgres processes.
All processes are waiting for something, and each is blocked on one of:
After one minute in the stuck state, the pageserver errors out with:
2022-03-11T09:41:28.615078Z ERROR pagestream{timeline=e0663c93c24bf813ca8fe4ee74805b87 tenant=6cb477bf02b94fde8fba3be27f7b7d5e}: error reading relation or page version: Timed out while waiting for WAL record at LSN 0/E814A78 to arrive, last_record_lsn 0/C89C390 disk consistent LSN=0/169AFA8
Caused by:
SeqWaitError
The most obvious explanation is that WAL is stuck on the safekeeper and recent records are not streamed to the pageserver; there's likely a bug somewhere near send_wal.rs.
I tried bisecting the issue and it got me to fa8a6c0e94405f03345293f43a6b3ad74c734750, i.e. the issue appeared when we disabled debug logs in the safekeeper (which made the safekeeper 1.5-2x faster in local tests). With the logs disabled, the issue also reproduces on commits from January, so it's not a recent bug.
I added a metric for last_record_lsn in the pageserver and collected all prometheus metrics while the test is running. Here are the results for shared_buffers=512MB:
The red line here is last_record_lsn in the pageserver and the green/yellow/blue lines are flush_lsn from the safekeepers.
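The collection itself is just periodic scraping of the Prometheus endpoints, roughly like this (a sketch; the ports and metric names are assumptions, not the exact ones exposed by the services):

```python
# Sketch: poll pageserver/safekeeper metrics endpoints and record LSNs.
# Ports and metric names are assumptions for illustration only.
import re
import time
import urllib.request

ENDPOINTS = {
    "pageserver": "http://127.0.0.1:9898/metrics",
    "safekeeper": "http://127.0.0.1:7676/metrics",
}

def scrape(url: str) -> dict[str, float]:
    text = urllib.request.urlopen(url).read().decode()
    samples = {}
    for line in text.splitlines():
        m = re.match(r"(\w+)(?:\{[^}]*\})?\s+([0-9.eE+-]+)$", line)
        if m:
            samples[m.group(1)] = float(m.group(2))
    return samples

while True:
    now = time.time()
    for name, url in ENDPOINTS.items():
        s = scrape(url)
        # Hypothetical metric names; the real ones may carry prefixes/labels.
        print(now, name, s.get("last_record_lsn"), s.get("flush_lsn"))
    time.sleep(1)
```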
So, it looks like the pageserver isn't able to keep up with the safekeepers and the LSN gap between them keeps growing. After some time (probably when shared_buffers fill up) the compute starts making requests which call wait_lsn() in the pageserver, and transactions hang at that moment. On the graph it looks like the blue line goes flat at the end.
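Conceptually the gating works like the sketch below: a page request at some LSN blocks until the pageserver has ingested WAL up to that point, or times out after a minute, which is exactly the error quoted above (a Python-level illustration, not the actual Rust implementation):

```python
# Conceptual illustration of wait_lsn-style gating, not the real pageserver code.
import time

def wait_lsn(get_last_record_lsn, request_lsn: int, timeout_s: float = 60.0) -> None:
    deadline = time.monotonic() + timeout_s
    while get_last_record_lsn() < request_lsn:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Timed out while waiting for WAL record at LSN {request_lsn:x}")
        time.sleep(0.01)
```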
Same metrics when shared_buffers=1MB:
This time the pageserver also falls behind, but wait_lsn() is called more frequently, so the gap doesn't grow too big and is usually resolved within a minute. Sometimes the gap is large enough to require a wait of 10s or more, and that is exactly the case where test_restarts_under_load fails.
The situation where transactions hang for 10+ seconds (because the compute needs to wait for the pageserver to catch up) doesn't look good; I wonder if backpressure is supposed to solve this? With shared_buffers=1MB the problem doesn't look too bad, because everything continues to work normally after 10 seconds. With shared_buffers=512MB the situation is worse, because requests time out after 1 minute and that error aborts the transaction.
Not sure what the next step for this issue is.
Nice investigation. Looks like we should harden back pressure settings. CC @knizhnik
we'll harden the settings once we see a test failure again
Test failed on main; the logs have nothing suspicious, but it seems that the pageserver is lagging behind:
pytest logs:
2022-03-28 11:03:59.839 INFO [test_wal_acceptor_async.py:122] Safekeeper at port 15302 has flush_lsn 0/38F8550, waiting for lsn 0/38F8428
pageserver logs:
2022-03-28T11:04:10.249026Z INFO WAL receiver{timeline=de7ca380458cd3988596165d9a630b8f tenant=08767d593179452881f06dbd8720f5b7}: caught up at LSN 0/312D2A8
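Treating each LSN as a 64-bit value (two 32-bit halves in hex), the gap between the safekeeper and the pageserver in these logs comes out to roughly 8 MB of WAL; a quick sketch:

```python
# PostgreSQL LSNs like "0/38F8550" are two hex halves of a 64-bit WAL position.
def parse_lsn(lsn: str) -> int:
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

lag = parse_lsn("0/38F8550") - parse_lsn("0/312D2A8")
print(lag)  # 8172200 bytes of WAL the pageserver has not ingested yet
```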
From https://github.com/zenithdb/zenith/pull/1280#issuecomment-1059826743
After poking it I see that sometimes during the whole iteration period (10s) some backend(s) never manage to finish their transaction. Walproposer looks healthy; it seems like a client problem, as if it just never sends the next query, e.g.
So almost everyone waits for 2403614, and that backend did the first update and now waits for the client to perform the second.
Not sure how that's possible.
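For a case like this, the wait-for graph can be inspected directly from a separate session with pg_blocking_pids() (a sketch; the connection string is a placeholder):

```python
# Sketch: show, for every client backend, which pids it is blocked behind.
import psycopg2

QUERY = """
SELECT pid, pg_blocking_pids(pid) AS blocked_by,
       wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE backend_type = 'client backend'
ORDER BY pid
"""

with psycopg2.connect("dbname=postgres") as conn, conn.cursor() as cur:
    cur.execute(QUERY)
    for row in cur.fetchall():
        print(row)
```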