TritonDataCenter / pg_prefaulter

Faults pages into PostgreSQL shared_buffers or filesystem caches in advance of WAL apply
Apache License 2.0
56 stars 13 forks source link

close all files when PostgreSQL is not running #13

Closed jclulow closed 7 years ago

jclulow commented 7 years ago

When PostgreSQL is not running, the prefaulter appears to retain a large number of open files within the PostgreSQL data directory. So that we can cleanly unmount and remount the data file system in Manatee, the prefaulter needs to close these descriptors promptly, and avoid opening any new ones, until it is determined that PostgreSQL is running again. It would also be good to make sure the current working directory of the prefaulter process never resides within the PostgreSQL data directory.

jjelinek commented 7 years ago

Just to expand on Josh's comment, the above issues are a common bug that we repeatedly encounter and which will prevent the zone from being rebooted, leaving it in the "down" state (e.g. an operator accidentally leaving their shell cwd within a zone subdirectory). When this happens ops opens a ticket and involves an engineer who will have to manually track down the offending process and kill it before the zone can be restarted.

davepacheco commented 7 years ago

We appear to have seen this again a few times with the prefaulter running inside the Manatee zone. This was on Manatee image dd78c1de-9783-11e7-8420-572a658a6ad3.

sean- commented 7 years ago

Huh. When I wrote the initial version of this I apparently noticed this was missing and totally forgot about it, but filed an issue requesting the desired functionality required to get the eviction handlers to fire on a Purge event: bluele/gcache#37 I'll add this to our copy and the PR this upstream.

sean- commented 7 years ago
$ uname -a
SunOS fault-test 5.11 joyent_20170303T182017Z i86pc i386 i86pc

See pg_prefaulter when the database is up and there is no lag:

$ pfiles `pgrep pg_prefaulter`
34983:  ./pg_prefaulter run --config=pg_prefaulter-follower.toml --retry-db-in
  Current rlimit: 65536 file descriptors
   0: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:3165767
   1: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:3165767
   2: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:3165767
   3: S_IFCHR mode:0644 dev:561,174 ino:4286469645 uid:0 gid:3 rdev:49,1
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /dev/urandom
      offset:16
   4: S_IFPORT mode:0000 dev:572,0 uid:907 gid:907 size:0
   5: S_IFDOOR mode:0444 dev:571,0 ino:666 uid:0 gid:0 rdev:570,0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[28218]
   6: S_IFSOCK mode:0666 dev:569,0 ino:60124 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576)
        sockname: AF_INET 127.0.0.1  port: 43223
   7: S_IFSOCK mode:0666 dev:569,0 ino:29156 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_BROADCAST,SO_SNDBUF(16384),SO_RCVBUF(5120)
        sockname: AF_UNIX
        peername: AF_UNIX /tmp/.s.PGSQL.5433
        peer: postgres[33045] zone: 8d5fc727-270e-c4bf-dc79-874b8f3d9d9d[361]

Generate some data:

$ make gendata
2>&1 PGSSLMODE=disable PGHOST=/tmp PGUSER=postgres PGPASSWORD="`cat \".pwfile\"`" PGOPTIONS="-c synchronous_commit=off" "/opt/local/bin/psql" test -c 'INSERT INTO garbage SELECT s, md5(random()::text) FROM generate_series(1,1000000) s' | tee -a test.log
INSERT 0 1000000

Look in the prefaulter logs:

{"time":"2017-09-30T16:23:17.627980073Z","level":"debug","hit":14085,"miss":26,"lookup":14111,"hit-rate":0.9981574658068174,"message":"filehandle-stats"}
{"time":"2017-09-30T16:23:17.969357212Z","level":"debug","hit":8,"miss":18,"lookup":26,"hit-rate":0.3076923076923077,"message":"walcache-stats"}
{"time":"2017-09-30T16:23:18.146309635Z","level":"debug","hit":665194,"miss":901793,"lookup":1566987,"hit-rate":0.4245051171451965,"message":"iocache-stats"}

Look at pfiles(1):

$ pfiles `pgrep pg_prefaulter`
34983:  ./pg_prefaulter run --config=pg_prefaulter-follower.toml --retry-db-in
  Current rlimit: 65536 file descriptors
   0: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:5374604
   1: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:5374604
   2: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:5374604
   3: S_IFCHR mode:0644 dev:561,174 ino:4286469645 uid:0 gid:3 rdev:49,1
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /dev/urandom
      offset:16
   4: S_IFPORT mode:0000 dev:572,0 uid:907 gid:907 size:0
   5: S_IFDOOR mode:0444 dev:571,0 ino:666 uid:0 gid:0 rdev:570,0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[28218]
   6: S_IFSOCK mode:0666 dev:569,0 ino:60124 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576)
        sockname: AF_INET 127.0.0.1  port: 43223
   7: S_IFSOCK mode:0666 dev:569,0 ino:29156 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_BROADCAST,SO_SNDBUF(16384),SO_RCVBUF(5120)
        sockname: AF_UNIX
        peername: AF_UNIX /tmp/.s.PGSQL.5433
        peer: postgres[33045] zone: 8d5fc727-270e-c4bf-dc79-874b8f3d9d9d[361]
   8: S_IFREG mode:0600 dev:90,68350 ino:233634 uid:907 gid:907 size:136536064
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/16394
      offset:0
   9: S_IFREG mode:0600 dev:90,68350 ino:233389 uid:907 gid:907 size:73728
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/1247
      offset:0
  10: S_IFREG mode:0600 dev:90,68350 ino:233434 uid:907 gid:907 size:16384
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2703
      offset:0
  11: S_IFREG mode:0600 dev:90,68350 ino:233565 uid:907 gid:907 size:450560
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2608
      offset:0
  12: S_IFREG mode:0600 dev:90,68350 ino:233485 uid:907 gid:907 size:40960
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2704
      offset:0
  13: S_IFREG mode:0600 dev:90,68350 ino:233483 uid:907 gid:907 size:352256
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2674
      offset:0
  14: S_IFREG mode:0600 dev:90,68350 ino:233428 uid:907 gid:907 size:319488
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2673
      offset:0
  15: S_IFREG mode:0600 dev:90,68350 ino:233380 uid:907 gid:907 size:40960
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2663
      offset:0
  16: S_IFREG mode:0600 dev:90,68350 ino:233412 uid:907 gid:907 size:90112
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/1259
      offset:0
  17: S_IFREG mode:0600 dev:90,68350 ino:233526 uid:907 gid:907 size:32768
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2662
      offset:0
  18: S_IFREG mode:0600 dev:90,68350 ino:233452 uid:907 gid:907 size:106496
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2658
      offset:0
  19: S_IFREG mode:0600 dev:90,68350 ino:233463 uid:907 gid:907 size:368640
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/1249
      offset:0
  20: S_IFREG mode:0600 dev:90,68350 ino:233367 uid:907 gid:907 size:32768
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/3455
      offset:0
  21: S_IFREG mode:0600 dev:90,68350 ino:233609 uid:907 gid:907 size:73728
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2659
      offset:0
  22: S_IFREG mode:0600 dev:90,68350 ino:233433 uid:907 gid:907 size:16384
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2679
      offset:0
  23: S_IFREG mode:0600 dev:90,68350 ino:233615 uid:907 gid:907 size:16384
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2678
      offset:0
  24: S_IFREG mode:0600 dev:90,68350 ino:233475 uid:907 gid:907 size:24576
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/2610
      offset:0
  25: S_IFREG mode:0600 dev:90,68350 ino:233636 uid:907 gid:907 size:8192
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16393/16399
      offset:0

Terminate postgres sitter-style:

$ pkill -9 postgres

Generate some more data to create a backlog and somewhere in the middle of this restart postgres (or generate load and then kill/restart postgres, doesn't matter - both tested):

$ make gendata
2>&1 PGSSLMODE=disable PGHOST=/tmp PGUSER=postgres PGPASSWORD="`cat \".pwfile\"`" PGOPTIONS="-c synchronous_commit=off" "/opt/local/bin/psql" test -c 'INSERT INTO garbage SELECT s, md5(random()::text) FROM generate_series(1,1000000) s' | tee -a test.log
INSERT 0 1000000
$ make gendata
2>&1 PGSSLMODE=disable PGHOST=/tmp PGUSER=postgres PGPASSWORD="`cat \".pwfile\"`" PGOPTIONS="-c synchronous_commit=off" "/opt/local/bin/psql" test -c 'INSERT INTO garbage SELECT s, md5(random()::text) FROM generate_series(1,1000000) s' | tee -a test.log
INSERT 0 1000000
$ make gendata
2>&1 PGSSLMODE=disable PGHOST=/tmp PGUSER=postgres PGPASSWORD="`cat \".pwfile\"`" PGOPTIONS="-c synchronous_commit=off" "/opt/local/bin/psql" test -c 'INSERT INTO garbage SELECT s, md5(random()::text) FROM generate_series(1,1000000) s' | tee -a test.log
INSERT 0 1000000

Verifying the state of the prefaulter before starting PostgreSQL:

$ pfiles `pgrep pg_prefaulter`
20796:  ./pg_prefaulter run --config=pg_prefaulter-follower.toml --retry-db-in
  Current rlimit: 65536 file descriptors
   0: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:90414
   1: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:90414
   2: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:90414
   3: S_IFCHR mode:0644 dev:561,174 ino:4286469645 uid:0 gid:3 rdev:49,1
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /dev/urandom
      offset:16
   4: S_IFPORT mode:0000 dev:572,0 uid:907 gid:907 size:0
   5: S_IFDOOR mode:0444 dev:571,0 ino:666 uid:0 gid:0 rdev:570,0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[28218]
   6: S_IFSOCK mode:0666 dev:569,0 ino:34274 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576)
        sockname: AF_INET 127.0.0.1  port: 35332

Look at the logs to see what's there:

{"time":"2017-09-30T20:38:46.59842733Z","level":"debug","hit":0,"miss":0,"lookup":0,"hit-rate":0,"message":"filehandle-stats"}
{"time":"2017-09-30T20:38:47.002619298Z","level":"debug","hit":0,"miss":0,"lookup":0,"hit-rate":0,"message":"iocache-stats"}
{"time":"2017-09-30T20:38:47.002642715Z","level":"debug","hit":0,"miss":0,"lookup":0,"hit-rate":0,"message":"walcache-stats"}
{"time":"2017-09-30T20:39:46.598609172Z","level":"debug","hit":7260,"miss":6,"lookup":7266,"hit-rate":0.9991742361684558,"message":"filehandle-stats"}
{"time":"2017-09-30T20:39:47.00377543Z","level":"debug","hit":6,"miss":10,"lookup":16,"hit-rate":0.375,"message":"walcache-stats"}
{"time":"2017-09-30T20:39:47.003772888Z","level":"debug","hit":343361,"miss":527554,"lookup":870915,"hit-rate":0.39425317051606645,"message":"iocache-stats"}

Check pfiles(1) again:

$ pfiles `pgrep pg_prefaulter`                                                                                                                                                                                          [302/1995]
33395:  ./pg_prefaulter run --config=pg_prefaulter-follower.toml --retry-db-in
  Current rlimit: 65536 file descriptors
   0: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:6152242
   1: S_IFIFO mode:0000 dev:558,0 ino:480599548 uid:907 gid:907 rdev:0,0
      O_RDWR
   2: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:6152242
   3: S_IFCHR mode:0644 dev:561,174 ino:4286469645 uid:0 gid:3 rdev:49,1
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /dev/urandom
      offset:16
   4: S_IFPORT mode:0000 dev:572,0 uid:907 gid:907 size:0
   5: S_IFDOOR mode:0444 dev:571,0 ino:666 uid:0 gid:0 rdev:570,0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[28218]
   6: S_IFSOCK mode:0666 dev:569,0 ino:55086 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576)
        sockname: AF_INET 127.0.0.1  port: 39104
   7: S_IFSOCK mode:0666 dev:569,0 ino:10736 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_BROADCAST,SO_SNDBUF(16384),SO_RCVBUF(5120)
        sockname: AF_UNIX
        peername: AF_UNIX /tmp/.s.PGSQL.5433
        peer: postgres[34335] zone: 8d5fc727-270e-c4bf-dc79-874b8f3d9d9d[361]
   8: S_IFREG mode:0600 dev:90,68350 ino:260963 uid:907 gid:907 size:1073741824
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16384/16385
      offset:0
   9: S_IFREG mode:0600 dev:90,68350 ino:261021 uid:907 gid:907 size:147456
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16384/2619
      offset:0
  11: S_IFREG mode:0600 dev:90,68350 ino:260951 uid:907 gid:907 size:65536
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16384/2840
      offset:0
  12: S_IFREG mode:0600 dev:90,68350 ino:260927 uid:907 gid:907 size:16384
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16384/2696
      offset:0
  14: S_IFREG mode:0600 dev:90,68350 ino:260996 uid:907 gid:907 size:90112
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16384/1259
      offset:0
  15: S_IFREG mode:0600 dev:90,68350 ino:260869 uid:907 gid:907 size:16384
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /var/pgsql/go/src/github.com/joyent/pg_prefaulter/.pgdata_follower/base/16384/2841
      offset:0

Terminate PostgreSQL at will and when we do, check pfiles(1):

$ pfiles `pgrep pg_prefaulter`
33395:  ./pg_prefaulter run --config=pg_prefaulter-follower.toml --retry-db-in
  Current rlimit: 65536 file descriptors
   0: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:6155692
   1: S_IFIFO mode:0000 dev:558,0 ino:480599548 uid:907 gid:907 rdev:0,0
      O_RDWR
   2: S_IFCHR mode:0620 dev:561,174 ino:3690371815 uid:0 gid:7 rdev:6,19
      O_RDWR|O_APPEND|O_NOCTTY|O_LARGEFILE
      /dev/pts/19
      offset:6155692
   3: S_IFCHR mode:0644 dev:561,174 ino:4286469645 uid:0 gid:3 rdev:49,1
      O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC
      /dev/urandom
      offset:16
   4: S_IFPORT mode:0000 dev:572,0 uid:907 gid:907 size:0
   5: S_IFDOOR mode:0444 dev:571,0 ino:666 uid:0 gid:0 rdev:570,0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[28218]
   6: S_IFSOCK mode:0666 dev:569,0 ino:55086 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576)
        sockname: AF_INET 127.0.0.1  port: 39104

Useful logging that is relevant:

{"time":"2017-09-30T20:45:08.352018215Z","level":"error","error":"unable to query PostgreSQL checkpoint information: dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory: (retriable: true, purge cache: true","next step":"retrying","message":"unable to find WAL files"}
{"time":"2017-09-30T20:45:08.352073436Z","level":"debug","message":"purging cache"}
{"time":"2017-09-30T20:45:09.558287638Z","level":"debug","walfile":"000000010000000000000067","message":"found WAL segment from /proc"}
{"time":"2017-09-30T20:45:09.86240054Z","level":"debug","walfile":"000000010000000000000067","message":"found WAL segment from /proc"}
{"time":"2017-09-30T20:45:11.144263402Z","level":"debug","walfile":"000000010000000000000067","message":"found WAL segment from /proc"}
{"time":"2017-09-30T20:45:12.379228418Z","level":"debug","walfile":"000000010000000000000068","message":"found WAL segment from /proc"}
{"time":"2017-09-30T20:45:22.760509194Z","level":"debug","backend-pid":45923,"version":"PostgreSQL 9.6.5 on x86_64-sun-solaris2.11, compiled by gcc (GCC) 4.9.4, 64-bit","message":"established DB connection"}
{"time":"2017-09-30T20:45:46.711599818Z","level":"debug","hit":15969,"miss":19,"lookup":15988,"hit-rate":0.9988116087065299,"message":"filehandle-stats"}
{"time":"2017-09-30T20:45:47.005034935Z","level":"debug","hit":785037,"miss":1131044,"lookup":1916081,"hit-rate":0.40970971477719365,"message":"iocache-stats"}
{"time":"2017-09-30T20:45:47.006109038Z","level":"debug","hit":20,"miss":20,"lookup":40,"hit-rate":0.5,"message":"walcache-stats"}
{"time":"2017-09-30T20:46:46.711745518Z","level":"debug","hit":15969,"miss":19,"lookup":15988,"hit-rate":0.9988116087065299,"message":"filehandle-stats"}
{"time":"2017-09-30T20:46:47.005231435Z","level":"debug","hit":785037,"miss":1131044,"lookup":1916081,"hit-rate":0.40970971477719365,"message":"iocache-stats"}
{"time":"2017-09-30T20:46:47.00627201Z","level":"debug","hit":20,"miss":20,"lookup":40,"hit-rate":0.5,"message":"walcache-stats"}

Of note, for non-SmartOS platforms that don't potentially have /proc support, pargs(1) is used as a fallback:

{"time":"2017-09-30T20:33:46.849351855Z","level":"error","error":"unable to query PostgreSQL checkpoint information: dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory: (retriable: true, purge cache: true","next step":"re
trying","message":"unable to find WAL files"}
{"time":"2017-09-30T20:33:46.849404092Z","level":"debug","message":"purging cache"}
{"time":"2017-09-30T20:33:48.302207541Z","level":"debug","walfile":"000000010000000000000060","message":"found WAL segment from pargs(1)"}
{"time":"2017-09-30T20:33:49.690287117Z","level":"debug","walfile":"000000010000000000000061","message":"found WAL segment from pargs(1)"}

Unexpected errors by the database will result in the process exiting. Similarly, if the open(2) count doesn't match the close(2) count, the process will panic. I abused PostgreSQL mightily today and am pretty confident in the error handling at this point and have run pg_prefaulter in a pretty abusive loop where the follower and primary both are regularly kill -9'ed, have their databases recreated regularly, and are under load via make gendata or make pgbench. Cheers.

davepacheco commented 7 years ago

As I understand it, a version of the prefaulter having this fix was running in production today and we ran into the same problem again.

sean- commented 7 years ago

Mantee Failover Testing

Test Procedure

  1. Intsall pg_prefaulter on three nodes in a shard. Ensure the version is at least as new as the October 14th build: https://us-east.manta.joyent.com/sean.chittenden/public/pg_prefaulter/pg_prefaulter_SNAPSHOT-21e0ec67ea7d97f4334b61517cbf0983c96a274a_Illumos_64-bit.tar.gz
    1. [X] Verify the state of the shard:
      
      [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm show
      zookeeper:   172.25.14.11
      cluster:     lab0
      generation:  23 (0/7705788)
      mode:        normal
      freeze:      frozen since 2017-10-24T08:46:27.851Z
      freeze info: test

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok sync 0/7991B28 0/7991B28 0/7991B28 -
sync 3ef75d54 ok async 0/7991B28 0/7991B28 0/7991B28 -
async a5ec1ae3 ok - - - - -

   1. [X] `primary`:

[root@headnode] # cd /zones/cd1aecdc-8634-699e-a6cc-81cfaf649ec5/root/var/tmp [root@headnode] # mkdir pg_prefaulter [root@headnode] # cd pg_prefaulter [root@headnode] # curl -kO https://us-east.manta.joyent.com/sean.chittenden/public/pg_prefaulter/pg_prefaulter_SNAPSHOT-21e0ec67ea7d97f4334b61517cbf0983c96a274a_Illumos_64-bit.tar.gz % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 4459k 100 4459k 0 0 3519k 0 0:00:01 0:00:01 --:--:-- 3522k [root@headnode] # zlogin cd1aecdc-8634-699e-a6cc-81cfaf649ec5 [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# ll total 18738 -rw-r--r-- 1 root root 11358 Jul 4 03:06 LICENSE -rw-r--r-- 1 root root 17316 Oct 14 15:48 README.adoc -rwxr-xr-x 1 root root 14508897 Oct 14 16:33 pg_prefaulter* -rw-r--r-- 1 root root 1601 Oct 9 22:48 pg_prefaulter.defaults.toml -rw-r--r-- 1 root root 566 Oct 1 01:01 pg_prefaulter.toml.sample -rw-r--r-- 1 root root 4566316 Nov 6 16:56 pg_prefaulter_SNAPSHOT-21e0ec67ea7d97f4334b61517cbf0983c96a274a_Illumos_64-bit.tar.gz [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# cat <<'EOF' > pg_prefaulter.smf

[snip] EOF [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# $EDITOR pg_prefaulter.smf [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# cat pg_prefaulter.smf <?xml version='1.0'?> <!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>

<dependency name='network' grouping='require_all' restart_on='error' type='service'>
  <service_fmri value='svc:/milestone/network:default'/>
</dependency>

<dependency name='filesystem' grouping='require_all' restart_on='error' type='service'>
  <service_fmri value='svc:/system/filesystem/local'/>
</dependency>

<method_context/>
<exec_method name='start' type='method' exec='/var/tmp/pg_prefaulter/pg_prefaulter run --config=%{config_file}' timeout_seconds='60'>
  <method_context>
    <method_environment>
      <envvar name='HOME' value='/var/tmp'/>
      <envvar name='PATH' value='/usr/local/bin:/opt/local/bin:/usr/bin:/bin'/>
      <envvar name='USER' value='postgres'/>
      <envvar name='GOMAXPROCS' value='%{num_cpus}'/>
      <envvar name='LD_LIBRARY_PATH' value='/opt/local/lib/amd64'/>
    </method_environment>
  </method_context>
</exec_method>
<exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'/>

<property_group name='startd' type='framework'>
  <propval name='duration' type='astring' value='child'/>
  <propval name='ignore_error' type='astring' value='core,signal'/>
</property_group>

<property_group name='application' type='application'>
  <propval name='config_file' type='astring' value='/var/tmp/pg_prefaulter/pg_prefaulter.toml'/>
  <propval name='num_cpus' type='astring' value='24'/>
</property_group>

<stability value='Evolving'/>
<template>
  <common_name>
    <loctext xml:lang='C'>pg_prefaulter agent</loctext>
  </common_name>
</template>

[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# cat <<'EOF' > pg_prefaulter.toml

[snip] EOF [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# $EDITOR pg_prefaulter.toml [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# cat pg_prefaulter.toml [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# cat pg_prefaulter.toml [log] level = "DEBUG"

[postgresql] host = "/tmp" mode = "auto" port = 5432 pgdata = "/manatee/pg/data/" user = "postgres" database = "postgres"

[postgresql.xlog] pg_xlogdump-path = "/opt/postgresql/9.6.3/bin/pg_xlogdump"

[circonus] enabled = true

[circonus.api]

token is a required value if circonus.enabled=true

token = "1234"

[circonus.broker]

id should be set to "35" (the public HTTPTrap broker) if you have enterprise

brokers configured in your account.

id = 35

[circonus.check] tags = "dc1:emy1" [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# svccfg import pg_prefaulter.smf [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# svcs pg_prefaulter STATE STIME FMRI online 19:23:26 svc:/network/cloudops/agent/pg_prefaulter:default [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# head svcs -L pg_prefaulter [ Nov 6 19:23:26 Enabled. ] [ Nov 6 19:23:26 Rereading configuration. ] [ Nov 6 19:23:26 Executing start method ("/var/tmp/pg_prefaulter/pg_prefaulter run --config=/var/tmp/pg_prefaulter/pg_prefaulter.toml"). ] {"time":"2017-11-06T19:23:26.816903977Z","level":"debug","config-file":"/var/tmp/pg_prefaulter/pg_prefaulter.toml"} {"time":"2017-11-06T19:23:26.817112595Z","level":"debug","message":"args: []"} {"time":"2017-11-06T19:23:26.823318705Z","level":"debug","postgresql.pgdata":"/manatee/pg/data/","postgresql.host":"/tmp","postgresql.port":5432,"postgresql.user":"postgres","postgresql.xlog.mode":"pg","postgresql.xlog.pg_xlogdump-path":"/opt/postgresql/9.6.3/bin/pg_xlogdump","postgresql.poll-interval":1000,"message":"flags"} {"time":"2017-11-06T19:23:26.824105750Z","level":"info","pid":75182,"message":"Starting pg_prefaulter"} {"time":"2017-11-06T19:23:26.832152028Z","level":"debug","message":"starting gops(1) agent"} {"time":"2017-11-06T19:23:26.832315904Z","level":"debug","pprof-port":4242,"message":"starting pprof endpoing agent"} {"time":"2017-11-06T19:23:26.834930515Z","level":"debug","rlimit-nofile":65536,"filehandle-cache-size":2000,"filehandle-cache-ttl":300000,"message":"filehandle cache initialized"} [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# tail svcs -L pg_prefaulter {"time":"2017-11-06T19:23:28.039824632Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 72, tx: 1171, lsn: 0/070BF168, prev 0/070BF118, desc: INSERT_LEAF off 7, blkref #0: rel 1664/0/1232 blk 1","message":"database 0"} {"time":"2017-11-06T19:23:28.039890574Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 64, tx: 1171, lsn: 0/070BF1B0, prev 0/070BF168, desc: INSERT_LEAF off 3, blkref #0: rel 1664/0/1233 blk 1","message":"database 0"} {"time":"2017-11-06T19:24:26.835260451Z","level":"debug","hit":90,"miss":83,"lookup":173,"hit-rate":0.5202312138728323,"message":"filehandle-stats"} {"time":"2017-11-06T19:24:26.898057581Z","level":"debug","hit":3032,"miss":390,"lookup":3422,"hit-rate":0.886031560490941,"message":"iocache-stats"} {"time":"2017-11-06T19:24:26.898121893Z","level":"debug","hit":59,"miss":1,"lookup":60,"hit-rate":0.9833333333333333,"message":"walcache-stats"} {"time":"2017-11-06T19:24:26.900901716Z","level":"debug","sender-state":"","db-state":"primary","peer-sync-state":"","durability-lag":0,"flush-lag":0,"visibility-lag":0,"visibility-lag_ms":0,"message":"db-stats"} {"time":"2017-11-06T19:25:26.835491408Z","level":"debug","hit":90,"miss":83,"lookup":173,"hit-rate":0.5202312138728323,"message":"filehandle-stats"} {"time":"2017-11-06T19:25:26.898309091Z","level":"debug","hit":3032,"miss":390,"lookup":3422,"hit-rate":0.886031560490941,"message":"iocache-stats"} {"time":"2017-11-06T19:25:26.898332745Z","level":"debug","hit":119,"miss":1,"lookup":120,"hit-rate":0.9916666666666667,"message":"walcache-stats"} {"time":"2017-11-06T19:25:26.901052136Z","level":"debug","sender-state":"","db-state":"primary","peer-sync-state":"","durability-lag":0,"flush-lag":0,"visibility-lag":0,"visibility-lag_ms":0,"message":"db-stats"}

   2. [X] `sync`: Replicate the same procedure as above with `3ef75d54-c0d8-e579-f0df-b893b768c9ed`.

[root@3ef75d54-c0d8-e579-f0df-b893b768c9ed /var/tmp/pg_prefaulter]# head svcs -L pg_prefaulter [ Nov 6 21:40:46 Enabled. ] [ Nov 6 21:40:46 Rereading configuration. ] [ Nov 6 21:40:46 Executing start method ("/var/tmp/pg_prefaulter/pg_prefaulter run --config=/var/tmp/pg_prefaulter/pg_prefaulter.toml"). ] {"time":"2017-11-06T21:40:46.253909146Z","level":"debug","config-file":"/var/tmp/pg_prefaulter/pg_prefaulter.toml"} {"time":"2017-11-06T21:40:46.254034038Z","level":"debug","message":"args: []"} {"time":"2017-11-06T21:40:46.254526283Z","level":"debug","message":"starting gops(1) agent"} {"time":"2017-11-06T21:40:46.254617217Z","level":"debug","pprof-port":4242,"message":"starting pprof endpoing agent"} {"time":"2017-11-06T21:40:46.254504246Z","level":"debug","postgresql.pgdata":"/manatee/pg/data/","postgresql.host":"/tmp","postgresql.port":5432,"postgresql.user":"postgres","postgresql.xlog.mode":"pg","postgresql.xlog.pg_xlogdump-path":"/opt/postgresql/9.6.3/bin/pg_xlogdump","postgresql.poll-interval":1000,"message":"flags"} {"time":"2017-11-06T21:40:46.255194657Z","level":"info","pid":97910,"message":"Starting pg_prefaulter"} {"time":"2017-11-06T21:40:46.258205525Z","level":"debug","rlimit-nofile":65536,"filehandle-cache-size":2000,"filehandle-cache-ttl":300000,"message":"filehandle cache initialized"} [root@3ef75d54-c0d8-e579-f0df-b893b768c9ed /var/tmp/pg_prefaulter]# tail svcs -L pg_prefaulter
{"time":"2017-11-06T21:40:46.748354419Z","level":"info","input":"rmgr: Heap len (rec/tot): 3/ 80, tx: 1171, lsn: 0/070BD848, prev 0/070BD800, desc: INSERT off 7, blkref #0: rel 1664/0/1214 blk 0","message":"database 0"} {"time":"2017-11-06T21:40:46.748410544Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 72, tx: 1171, lsn: 0/070BD898, prev 0/070BD848, desc: INSERT_LEAF off 6, blkref #0: rel 1664/0/1232 blk 1","message":"database 0"} {"time":"2017-11-06T21:40:46.748448014Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 64, tx: 1171, lsn: 0/070BD8E0, prev 0/070BD898, desc: INSERT_LEAF off 3, blkref #0: rel 1664/0/1233 blk 1","message":"database 0"} {"time":"2017-11-06T21:40:46.749013902Z","level":"info","input":"rmgr: Heap len (rec/tot): 3/ 80, tx: 1171, lsn: 0/070BF118, prev 0/070BF0D0, desc: INSERT off 8, blkref #0: rel 1664/0/1214 blk 0","message":"database 0"} {"time":"2017-11-06T21:40:46.749079348Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 72, tx: 1171, lsn: 0/070BF168, prev 0/070BF118, desc: INSERT_LEAF off 7, blkref #0: rel 1664/0/1232 blk 1","message":"database 0"} {"time":"2017-11-06T21:40:46.749108909Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 64, tx: 1171, lsn: 0/070BF1B0, prev 0/070BF168, desc: INSERT_LEAF off 3, blkref #0: rel 1664/0/1233 blk 1","message":"database 0"} {"time":"2017-11-06T21:41:46.258454929Z","level":"debug","hit":91,"miss":82,"lookup":173,"hit-rate":0.5260115606936416,"message":"filehandle-stats"} {"time":"2017-11-06T21:41:46.317581987Z","level":"debug","hit":118,"miss":2,"lookup":120,"hit-rate":0.9833333333333333,"message":"walcache-stats"} {"time":"2017-11-06T21:41:46.318613289Z","level":"debug","sender-state":"receiving","db-state":"follower","peer-sync-state":"applying","durability-lag":0,"flush-lag":0,"visibility-lag":0,"visibility-lag_ms":0,"message":"db-stats"} {"time":"2017-11-06T21:41:46.318616936Z","level":"debug","hit":3028,"miss":394,"lookup":3422,"hit-rate":0.8848626534190532,"message":"iocache-stats"}

   3. [X] `async`: Replicate the same procedure as above with `a5ec1ae3-9d69-4ae8-9918-abaf41da24d0`.

[root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 /var/tmp/pg_prefaulter]# head svcs -L pg_prefaulter [ Nov 6 21:43:18 Enabled. ] [ Nov 6 21:43:18 Rereading configuration. ] [ Nov 6 21:43:18 Executing start method ("/var/tmp/pg_prefaulter/pg_prefaulter run --config=/var/tmp/pg_prefaulter/pg_prefaulter.toml"). ] {"time":"2017-11-06T21:43:18.559327506Z","level":"debug","config-file":"/var/tmp/pg_prefaulter/pg_prefaulter.toml"} {"time":"2017-11-06T21:43:18.559471304Z","level":"debug","message":"args: []"} {"time":"2017-11-06T21:43:18.560034653Z","level":"debug","message":"starting gops(1) agent"} {"time":"2017-11-06T21:43:18.560316805Z","level":"debug","pprof-port":4242,"message":"starting pprof endpoing agent"} {"time":"2017-11-06T21:43:18.559974758Z","level":"debug","postgresql.pgdata":"/manatee/pg/data/","postgresql.host":"/tmp","postgresql.port":5432,"postgresql.user":"postgres","postgresql.xlog.mode":"pg","postgresql.xlog.pg_xlogdump-path":"/opt/postgresql/9.6.3/bin/pg_xlogdump","postgresql.poll-interval":1000,"message":"flags"} {"time":"2017-11-06T21:43:18.560727483Z","level":"info","pid":98980,"message":"Starting pg_prefaulter"} {"time":"2017-11-06T21:43:18.563906109Z","level":"debug","rlimit-nofile":65536,"filehandle-cache-size":2000,"filehandle-cache-ttl":300000,"message":"filehandle cache initialized"} [root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 /var/tmp/pg_prefaulter]# tail svcs -L pg_prefaulter
{"time":"2017-11-06T21:43:19.590317327Z","level":"info","input":"rmgr: Heap len (rec/tot): 3/ 80, tx: 1169, lsn: 0/070BA468, prev 0/070BA418, desc: INSERT off 6, blkref #0: rel 1664/0/1214 blk 0","message":"database 0"} {"time":"2017-11-06T21:43:19.590401347Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 72, tx: 1169, lsn: 0/070BA4B8, prev 0/070BA468, desc: INSERT_LEAF off 5, blkref #0: rel 1664/0/1232 blk 1","message":"database 0"} {"time":"2017-11-06T21:43:19.590464544Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 64, tx: 1169, lsn: 0/070BA500, prev 0/070BA4B8, desc: INSERT_LEAF off 3, blkref #0: rel 1664/0/1233 blk 1","message":"database 0"} {"time":"2017-11-06T21:43:19.590740162Z","level":"info","input":"rmgr: Heap len (rec/tot): 14/ 79, tx: 1170, lsn: 0/070BBD50, prev 0/070BBD18, desc: HOT_UPDATE off 4 xmax 1170 ; new off 5 xmax 0, blkref #0: rel 1664/0/1260 blk 0","message":"database 0"} {"time":"2017-11-06T21:43:19.591196494Z","level":"info","input":"rmgr: Heap len (rec/tot): 3/ 80, tx: 1171, lsn: 0/070BD848, prev 0/070BD800, desc: INSERT off 7, blkref #0: rel 1664/0/1214 blk 0","message":"database 0"} {"time":"2017-11-06T21:43:19.591277400Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 72, tx: 1171, lsn: 0/070BD898, prev 0/070BD848, desc: INSERT_LEAF off 6, blkref #0: rel 1664/0/1232 blk 1","message":"database 0"} {"time":"2017-11-06T21:43:19.591324493Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 64, tx: 1171, lsn: 0/070BD8E0, prev 0/070BD898, desc: INSERT_LEAF off 3, blkref #0: rel 1664/0/1233 blk 1","message":"database 0"} {"time":"2017-11-06T21:43:19.591638034Z","level":"info","input":"rmgr: Heap len (rec/tot): 3/ 80, tx: 1171, lsn: 0/070BF118, prev 0/070BF0D0, desc: INSERT off 8, blkref #0: rel 1664/0/1214 blk 0","message":"database 0"} {"time":"2017-11-06T21:43:19.591716052Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 72, tx: 1171, lsn: 0/070BF168, prev 0/070BF118, desc: INSERT_LEAF off 7, blkref #0: rel 1664/0/1232 blk 1","message":"database 0"} {"time":"2017-11-06T21:43:19.591775684Z","level":"info","input":"rmgr: Btree len (rec/tot): 2/ 64, tx: 1171, lsn: 0/070BF1B0, prev 0/070BF168, desc: INSERT_LEAF off 3, blkref #0: rel 1664/0/1233 blk 1","message":"database 0"}

2. [X] Generate load against the Manatee `primary`.  Run the following in a
   `tmux` session:

$ env PGSSLMODE=disable PGHOST=/tmp PGUSER=postgres /opt/postgresql/current/bin/pgbench -i -s 10 -F 90 $ while 1; do env PGSSLMODE=disable PGHOST=/tmp PGUSER=postgres /opt/postgresql/current/bin/pgbench -j 64 -P 60 -r -T 900 --no-vacuum --protocol=prepared; done

3. [X] Unfreeze the shard and verify WAL entries are flowing:

[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 23 (0/7705788) mode: normal freeze: frozen since 2017-10-24T08:46:27.851Z freeze info: test

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok sync 0/FF06FE8 0/FF06FE8 0/FF06E58 -
sync 3ef75d54 ok async 0/FF06FE8 0/FF06FE8 0/FF06E58 0m00s async a5ec1ae3 ok - - - - 0m00s [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 23 (0/7705788) mode: normal freeze: frozen since 2017-10-24T08:46:27.851Z freeze info: test

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok sync 0/FF9FBA8 0/FF9FBA8 0/FF9FA50 -
sync 3ef75d54 ok async 0/FF9FED8 0/FF9FD40 0/FF9FBA8 0m00s async a5ec1ae3 ok - - - - 0m00s [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# manatee-adm unfreeze Unfrozen. [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# manatee-adm show
zookeeper: 172.25.14.11 cluster: lab0 generation: 23 (0/7705788) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok sync 0/108BA3D0 0/108BA238 0/108BA0A0 -
sync 3ef75d54 ok async 0/108BA0A0 0/108BA0A0 0/108B9EF0 0m00s async a5ec1ae3 ok - - - - 0m00s

4. [X] Validate replication.  Ensure updated WAL records are flowing through to:
   1. [X] `primary`.
      1. [X] Obtain a list of the open files:

[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# pgrep pg_prefaulter | xargs -n1 pfiles | head -n 45 75182: /var/tmp/pg_prefaulter/pg_prefaulter run --config=/var/tmp/pg_prefault Current rlimit: 65536 file descriptors 0: S_IFCHR mode:0666 dev:559,138372 ino:2364281560 uid:0 gid:3 rdev:38,2 O_RDONLY|O_LARGEFILE /dev/null offset:0 1: S_IFREG mode:0644 dev:90,69592 ino:66286 uid:0 gid:0 size:109814 O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE /var/svc/log/network-cloudops-agent-pg_prefaulter:default.log offset:109814 2: S_IFREG mode:0644 dev:90,69592 ino:66286 uid:0 gid:0 size:109814 O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE /var/svc/log/network-cloudops-agent-pg_prefaulter:default.log offset:109814 3: S_IFCHR mode:0644 dev:559,138372 ino:1983103238 uid:0 gid:3 rdev:49,1 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /dev/urandom offset:16 4: S_IFPORT mode:0000 dev:570,0 uid:0 gid:0 size:0 5: S_IFSOCK mode:0666 dev:567,0 ino:47400 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576) sockname: AF_INET 127.0.0.1 port: 4242 6: S_IFSOCK mode:0666 dev:567,0 ino:35184 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576) sockname: AF_INET 127.0.0.1 port: 47375 7: S_IFIFO mode:0000 dev:556,0 ino:700086872 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC 8: S_IFREG mode:0600 dev:90,70221 ino:2189 uid:907 gid:907 size:106496 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12851/1259 offset:0 9: S_IFSOCK mode:0666 dev:567,0 ino:54637 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_BROADCAST,SO_SNDBUF(16384),SO_RCVBUF(5120) sockname: AF_UNIX peername: AF_UNIX /tmp/.s.PGSQL.5432 peer: postgres[50800] zone: cd1aecdc-8634-699e-a6cc-81cfaf649ec5[1576] 10: S_IFREG mode:0600 dev:90,70221 ino:2213 uid:907 gid:907 size:40960 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12851/2663 [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# pgrep pg_prefaulter | xargs -n1 pfiles | tail 101: S_IFREG mode:0600 dev:90,70221 ino:9119 uid:907 gid:0 size:204800 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12852/16430 offset:0 103: S_IFREG mode:0600 dev:90,70221 ino:9111 uid:907 gid:0 size:6848512 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12852/16421 offset:0 105: S_IFIFO mode:0000 dev:556,0 ino:700087532 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC

      2. [X] Obtain a list of the args and make sure they're changing (or examine the log output from the prefaulter):

[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# pgrep postgres | 2>&1 xargs -n1 pargs | grep -i WAL && sleep 30 && pgrep postgres | 2>&1 xargs -n1 pargs | grep -i WAL argv[0]: postgres: wal sender process postgres 172.25.14.134(41564) streaming 0/14364818 argv[0]: postgres: wal writer process argv[0]: postgres: wal sender process postgres 172.25.14.134(41564) streaming 0/1491A758 argv[0]: postgres: wal writer process

   2. [X] `sync`
      1. [X] Obtain a list of the open files:

[root@3ef75d54-c0d8-e579-f0df-b893b768c9ed ~]# pgrep pg_prefaulter | xargs -n1 pfiles | head -n 45 97910: /var/tmp/pg_prefaulter/pg_prefaulter run --config=/var/tmp/pg_prefault Current rlimit: 65536 file descriptors 0: S_IFCHR mode:0666 dev:559,138373 ino:1688795282 uid:0 gid:3 rdev:38,2 O_RDONLY|O_LARGEFILE /dev/null offset:0 1: S_IFREG mode:0644 dev:90,69579 ino:66266 uid:0 gid:0 size:65578 O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE /var/svc/log/network-cloudops-agent-pg_prefaulter:default.log offset:65578 2: S_IFREG mode:0644 dev:90,69579 ino:66266 uid:0 gid:0 size:65578 O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE /var/svc/log/network-cloudops-agent-pg_prefaulter:default.log offset:65578 3: S_IFCHR mode:0644 dev:559,138373 ino:3095488064 uid:0 gid:3 rdev:49,1 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /dev/urandom offset:16 4: S_IFPORT mode:0000 dev:570,0 uid:0 gid:0 size:0 5: S_IFSOCK mode:0666 dev:567,0 ino:64242 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576) sockname: AF_INET 127.0.0.1 port: 59224 6: S_IFSOCK mode:0666 dev:567,0 ino:18969 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576) sockname: AF_INET 127.0.0.1 port: 4242 7: S_IFREG mode:0600 dev:90,70223 ino:2591 uid:907 gid:907 size:8192 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12852/2606 offset:0 8: S_IFREG mode:0600 dev:90,70223 ino:2213 uid:907 gid:907 size:40960 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12851/2663 offset:0 9: S_IFSOCK mode:0666 dev:567,0 ino:53504 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_BROADCAST,SO_SNDBUF(16384),SO_RCVBUF(5120) sockname: AF_UNIX peername: AF_UNIX /tmp/.s.PGSQL.5432 peer: postgres[51316] zone: 3ef75d54-c0d8-e579-f0df-b893b768c9ed[1577] 10: S_IFREG mode:0600 dev:90,70223 ino:2189 uid:907 gid:907 size:106496 [root@3ef75d54-c0d8-e579-f0df-b893b768c9ed ~]# pgrep pg_prefaulter | xargs -n1 pfiles | tail /manatee/pg/data/base/12852/16436 offset:0 110: S_IFREG mode:0600 dev:90,70223 ino:2507 uid:907 gid:907 size:24576 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12852/2610 offset:0 111: S_IFREG mode:0600 dev:90,70223 ino:1443 uid:907 gid:0 size:22487040 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12852/16438 offset:0

      2. [X] Obtain a list of the args and make sure they're changing (or examine the log output from the prefaulter):

[root@3ef75d54-c0d8-e579-f0df-b893b768c9ed ~]# pgrep postgres | 2>&1 xargs -n1 pargs | grep -i WAL && sleep 30 && pgrep postgres | 2>&1 xargs -n1 pargs | grep -i WAL argv[0]: postgres: wal receiver process streaming 0/1843B070 argv[0]: postgres: wal sender process postgres 172.25.14.132(62171) streaming 0/1843D648 argv[0]: postgres: wal receiver process streaming 0/18A026B8 argv[0]: postgres: wal sender process postgres 172.25.14.132(62171) streaming 0/18A05880 [root@3ef75d54-c0d8-e579-f0df-b893b768c9ed ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 23 (0/7705788) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok sync 0/1AB4D070 0/1AB4D070 0/1AB4D038 -
sync 3ef75d54 ok async 0/1AB4D070 0/1AB4D070 0/1AB4D038 0m16s async a5ec1ae3 ok - - - - 0m16s

   3. [X] `async`.
      1. [X] Obtain a list of the open files:

[root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 ~]# pgrep pg_prefaulter | xargs -n1 pfiles | head -n 45 98980: /var/tmp/pg_prefaulter/pg_prefaulter run --config=/var/tmp/pg_prefault Current rlimit: 65536 file descriptors 0: S_IFCHR mode:0666 dev:559,138371 ino:2116658781 uid:0 gid:3 rdev:38,2 O_RDONLY|O_LARGEFILE /dev/null offset:0 1: S_IFREG mode:0644 dev:90,69588 ino:66346 uid:0 gid:0 size:29215 O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE /var/svc/log/network-cloudops-agent-pg_prefaulter:default.log offset:29215 2: S_IFREG mode:0644 dev:90,69588 ino:66346 uid:0 gid:0 size:29215 O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE /var/svc/log/network-cloudops-agent-pg_prefaulter:default.log offset:29215 3: S_IFCHR mode:0644 dev:559,138371 ino:1051367549 uid:0 gid:3 rdev:49,1 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /dev/urandom offset:16 4: S_IFPORT mode:0000 dev:570,0 uid:0 gid:0 size:0 5: S_IFSOCK mode:0666 dev:567,0 ino:6812 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576) sockname: AF_INET 127.0.0.1 port: 65407 6: S_IFSOCK mode:0666 dev:567,0 ino:56979 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576) sockname: AF_INET 127.0.0.1 port: 4242 7: S_IFIFO mode:0000 dev:556,0 ino:700121633 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC 8: S_IFREG mode:0600 dev:90,70222 ino:2213 uid:907 gid:907 size:40960 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12851/2663 offset:0 9: S_IFSOCK mode:0666 dev:567,0 ino:23000 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_BROADCAST,SO_SNDBUF(16384),SO_RCVBUF(5120) sockname: AF_UNIX peername: AF_UNIX /tmp/.s.PGSQL.5432 peer: postgres[50860] zone: a5ec1ae3-9d69-4ae8-9918-abaf41da24d0[1575] 10: S_IFREG mode:0600 dev:90,70222 ino:2296 uid:907 gid:907 size:32768 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12851/2662 [root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 ~]# pgrep pg_prefaulter | xargs -n1 pfiles | tail
/manatee/pg/data/base/12852/16434 offset:0 110: S_IFREG mode:0600 dev:90,70222 ino:1449 uid:907 gid:0 size:22487040 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12852/16438 offset:0 111: S_IFREG mode:0600 dev:90,70222 ino:1448 uid:907 gid:0 size:90112 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /manatee/pg/data/base/12852/16436 offset:0

      2. [X] Obtain a list of the args and make sure they're changing (or examine the log output from the prefaulter):

[root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 ~]# pgrep postgres | 2>&1 xargs -n1 pargs | grep -i WAL && sleep 30 && pgrep postgres | 2>&1 xargs -n1 pargs | grep -i WAL
argv[0]: postgres: wal receiver process streaming 0/1AB4D3F0 argv[0]: postgres: wal receiver process streaming 0/1AB4D4D0

5. [X] Stop the `sync`:

[root@3ef75d54-c0d8-e579-f0df-b893b768c9ed ~]# svcs manatee-sitter STATE STIME FMRI online 10:30:16 svc:/manta/application/manatee-sitter:default [root@3ef75d54-c0d8-e579-f0df-b893b768c9ed ~]# svcadm disable manatee-sitter [root@3ef75d54-c0d8-e579-f0df-b893b768c9ed ~]# svcs manatee-sitter STATE STIME FMRI disabled 22:17:18 svc:/manta/application/manatee-sitter:default [root@3ef75d54-c0d8-e579-f0df-b893b768c9ed ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 23 (0/7705788) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok - - - - -
sync 3ef75d54 fail - - - - -
async a5ec1ae3 ok - - - - 9m24s

error: cannot query postgres on sync: peer "3ef75d54": ECONNREFUSED [root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 24 (0/1AB4E1D0) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok - - - - -
sync a5ec1ae3 fail - - - - -

error: cannot query postgres on sync: peer "a5ec1ae3": ECONNREFUSED warning: cluster has no async peers [root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 24 (0/1AB4E1D0) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok potential 0/F0E0000 0/F000000 0/E0FE358 -
sync a5ec1ae3 ok - - - - 27m16s

error: peer "cd1aecdc": downstream replication not yet established (expected state "streaming", found "catchup") warning: cluster has no async peers [root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 24 (0/1AB4E1D0) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok potential 0/14F80000 0/14000000 0/123235C0 -
sync a5ec1ae3 ok - - - - 21m03s

error: peer "cd1aecdc": downstream replication not yet established (expected state "streaming", found "catchup") warning: cluster has no async peers [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 /var/tmp/pg_prefaulter]# env PGSSLMODE=disable PGHOST=/tmp PGUSER=postgres /opt/postgresql/current/bin/pgbench -j 64 -P 60 -r -T 900 --no-vacuum --protocol=prepared progress: 60.0 s, 533.0 tps, lat 1.873 ms stddev 0.853 progress: 120.0 s, 498.7 tps, lat 2.003 ms stddev 1.007 progress: 180.0 s, 536.5 tps, lat 1.862 ms stddev 4.018 progress: 240.0 s, 514.6 tps, lat 1.941 ms stddev 1.187 progress: 300.0 s, 516.2 tps, lat 1.935 ms stddev 1.334 progress: 360.0 s, 491.0 tps, lat 2.035 ms stddev 1.137 progress: 420.0 s, 477.4 tps, lat 2.092 ms stddev 5.804 progress: 480.0 s, 535.7 tps, lat 1.864 ms stddev 1.084 progress: 540.0 s, 506.1 tps, lat 1.974 ms stddev 3.084 progress: 600.0 s, 490.5 tps, lat 2.036 ms stddev 2.268 progress: 660.0 s, 490.4 tps, lat 2.037 ms stddev 3.119 progress: 720.0 s, 490.9 tps, lat 2.035 ms stddev 1.134 progress: 780.0 s, 503.1 tps, lat 1.986 ms stddev 1.595 progress: 840.0 s, 501.1 tps, lat 1.993 ms stddev 0.907 transaction type: <builtin: TPC-B (sort of)> scaling factor: 10 query mode: prepared number of clients: 1 number of threads: 1 duration: 900 s number of transactions actually processed: 454097 latency average = 1.980 ms latency stddev = 2.814 ms tps = 504.559554 (including connections establishing) tps = 504.574002 (excluding connections establishing) script statistics:

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok sync 0/1AB4E9E8 0/1AB4E9E8 0/1AB4E9E8 -
sync a5ec1ae3 ok async 0/1AB4E9E8 0/1AB4E9E8 0/1AB4E9E8 14m22s async 3ef75d54 ok - - - - -

echo "Lag caused because the above pgbench finished. Restarting pgbench reset the time-based lag metric to 0."

Lag caused because the above pgbench finished. Restarting pgbench reset the time-based lag metric to 0.

7. [X] Restart the deposed `sync` (it should now become the `async`):

[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 24 (0/1AB4E1D0) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary cd1aecdc ok sync 0/1ACB2668 0/1ACB2668 0/1ACB24D8 -
sync a5ec1ae3 ok async 0/1ACB2668 0/1ACB2668 0/1ACB24D8 0m00s async 3ef75d54 ok - - - - 0m00s

8. [X] Wait until lag comes down to acceptable levels.
9. [X] Stop the `primary`:

[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# svcs manatee-sitter STATE STIME FMRI online 10:29:16 svc:/manta/application/manatee-sitter:default [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# svcadm disable manatee-sitter [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# svcs manatee-sitter STATE STIME FMRI disabled 22:31:12 svc:/manta/application/manatee-sitter:default [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# env PGSSLMODE=disable PGHOST=/tmp PGUSER=postgres /opt/postgresql/current/bin/pgbench -j 64 -P 60 -r -T 900 --no-vacuum --protocol=prepared progress: 60.0 s, 607.7 tps, lat 1.643 ms stddev 0.988 progress: 120.0 s, 498.6 tps, lat 2.003 ms stddev 3.468 progress: 180.0 s, 509.1 tps, lat 1.962 ms stddev 1.905 progress: 240.0 s, 484.3 tps, lat 2.062 ms stddev 0.958 progress: 300.0 s, 519.1 tps, lat 1.924 ms stddev 1.128 client 0 aborted in state 8; perhaps the backend died while processing transaction type: <builtin: TPC-B (sort of)> scaling factor: 10 query mode: prepared number of clients: 1 number of threads: 1 duration: 900 s number of transactions actually processed: 161806 latency average = 1.911 ms latency stddev = 1.906 ms tps = 522.243830 (including connections establishing) tps = 522.303625 (excluding connections establishing) script statistics:

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary a5ec1ae3 ok sync 0/1EBAE028 0/1EBAE028 0/1EBADFD8 -
sync 3ef75d54 ok - - - - 1m04s deposed cd1aecdc ok - - - - -

warning: cluster has a deposed peer warning: cluster has no async peers [root@a5ec1ae3-9d69-4ae8-9918-abaf41da24d0 ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 25 (0/1EBADF68) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary a5ec1ae3 ok sync 0/1EE61B78 0/1EE61B78 0/1EE619E0 -
sync 3ef75d54 ok - - - - 0m00s deposed cd1aecdc ok - - - - -

warning: cluster has a deposed peer warning: cluster has no async peers

10. [X] Watch the `sync` take over for the now failed `primary`.
11. [X] Watch the `async` take over as the new `sync`:

[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm show
zookeeper: 172.25.14.11 cluster: lab0 generation: 25 (0/1EBADF68) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary a5ec1ae3 ok sync 0/2185D210 0/2185D080 0/2185CEE8 -
sync 3ef75d54 ok - - - - 0m00s deposed cd1aecdc ok - - - - -

warning: cluster has a deposed peer warning: cluster has no async peers [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm rebuild This peer is a deposed former primary and may have state that needs to be removed before it can successfully rejoin the cluster. This operation will remove all local data and then unmark this peer as deposed so that it can rejoin the cluster. This operation is destructive, and you should only proceed after confirming that this peer's copy of the data is no longer required. (yes/no)? prompt: no: yes Disabling manatee-sitter SMF service to ensure that Manatee is not running. Waiting up to 90 seconds for this Manatee instance's ZooKeeper session to expire | Attempting to remove ZFS dataset's contents Enabling sitter Watching sitter for postgres startup
assert.js:92 throw new assert.AssertionError({ ^ AssertionError: operation "tcp://postgres@172.25.14.171:5432/postgres" is already pending at Barrier.start (/opt/smartdc/manatee/node_modules/manatee/node_modules/vasync/lib/vasync.js:466:13) at /opt/smartdc/manatee/node_modules/manatee/lib/adm.js:374:21 at Array.forEach (native) at /opt/smartdc/manatee/node_modules/manatee/lib/adm.js:371:15 at Array.forEach (native) at Object._addPostgresStatus [as func] (/opt/smartdc/manatee/node_modules/manatee/lib/adm.js:351:26) at Object._onImmediate (/opt/smartdc/manatee/node_modules/manatee/node_modules/vasync/lib/vasync.js:213:20) at processImmediate [as _immediateCallback] (timers.js:330:15) [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# pwd /root [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm show
zookeeper: 172.25.14.11 cluster: lab0 generation: 25 (0/1EBADF68) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary a5ec1ae3 ok sync 0/22D043A8 0/22D043A8 0/22D04218 -
sync 3ef75d54 ok async 0/206A0000 0/20000000 0/1E1F30F8 0m00s async cd1aecdc ok - - - - 6m50s

warning: peer "3ef75d54": downstream replication not yet established (expected state "streaming", found "catchup") [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 25 (0/1EBADF68) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary a5ec1ae3 ok sync 0/2341C4B8 0/2341C4B8 0/2341C320 -
sync 3ef75d54 ok async 0/2341C4B8 0/2341C4B8 0/2341C320 0m00s async cd1aecdc ok - - - - 0m00s

14. [X] Stop the `async`:

[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# pgrep pg_prefaulter | xargs -n1 pfiles pg_prefaulter pfiles: cannot examine pg_prefaulter: no such core file 62985: /var/tmp/pg_prefaulter/pg_prefaulter run --config=/var/tmp/pg_prefault Current rlimit: 65536 file descriptors 0: S_IFCHR mode:0666 dev:559,138372 ino:2364281560 uid:0 gid:3 rdev:38,2 O_RDONLY|O_LARGEFILE /dev/null offset:0 1: S_IFREG mode:0644 dev:90,69592 ino:66286 uid:0 gid:0 size:8058192 O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE /var/svc/log/network-cloudops-agent-pg_prefaulter:default.log offset:8058192 2: S_IFREG mode:0644 dev:90,69592 ino:66286 uid:0 gid:0 size:8058192 O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE /var/svc/log/network-cloudops-agent-pg_prefaulter:default.log offset:8058192 3: S_IFCHR mode:0644 dev:559,138372 ino:1983103238 uid:0 gid:3 rdev:49,1 O_RDONLY|O_NONBLOCK|O_LARGEFILE FD_CLOEXEC /dev/urandom offset:16 4: S_IFPORT mode:0000 dev:570,0 uid:0 gid:0 size:2 5: S_IFSOCK mode:0666 dev:567,0 ino:12914 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576) sockname: AF_INET 127.0.0.1 port: 4242 6: S_IFSOCK mode:0666 dev:567,0 ino:21818 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_STREAM SO_REUSEADDR,SO_BROADCAST,SO_SNDBUF(128000),SO_RCVBUF(1048576) sockname: AF_INET 127.0.0.1 port: 53305 7: S_IFSOCK mode:0666 dev:567,0 ino:1628 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_DGRAM SO_BROADCAST,SO_DGRAM_ERRIND,SO_SNDBUF(57344),SO_RCVBUF(57344) sockname: AF_INET 172.25.14.133 port: 43659 peername: AF_INET 172.25.14.11 port: 53 8: S_IFSOCK mode:0666 dev:567,0 ino:26429 uid:0 gid:0 rdev:0,0 O_RDWR|O_NONBLOCK FD_CLOEXEC SOCK_DGRAM SO_BROADCAST,SO_DGRAM_ERRIND,SO_SNDBUF(57344),SO_RCVBUF(57344) sockname: AF_INET 172.25.14.133 port: 42632 peername: AF_INET 172.25.14.11 port: 53 [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm show
zookeeper: 172.25.14.11 cluster: lab0 generation: 25 (0/1EBADF68) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary a5ec1ae3 ok sync 0/240689A0 0/240689A0 0/24068810 -
sync 3ef75d54 ok - - - - 0m00s

warning: cluster has no async peers [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# svcadm enable manatee-sitter
[root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm show
zookeeper: 172.25.14.11 cluster: lab0 generation: 25 (0/1EBADF68) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary a5ec1ae3 ok sync 0/2428A500 0/2428A500 0/2428A370 -
sync 3ef75d54 ok - - - - 0m00s async cd1aecdc fail - - - - -

warning: peer "3ef75d54": downstream replication peer not connected [root@cd1aecdc-8634-699e-a6cc-81cfaf649ec5 ~]# manatee-adm show zookeeper: 172.25.14.11 cluster: lab0 generation: 25 (0/1EBADF68) mode: normal freeze: not frozen

ROLE PEER PG REPL SENT FLUSH REPLAY LAG
primary a5ec1ae3 ok sync 0/243AAE78 0/243AAE78 0/243AACE0 -
sync 3ef75d54 ok async 0/243AAE78 0/243AAE78 0/23CA4C58 0m00s async cd1aecdc ok - - - - 0m31s


15. [X] Restart the deposed `async` (it should just rejoin the cluster as the `async`).