MasahikoSawada / pg_keeper

Simplified clustering module for PostgreSQL

after following the steps, when i start postgres db it throws error. #3

Open priyankur05 opened 7 years ago

priyankur05 commented 7 years ago

Hi,

I followed the steps, and when I start the Postgres database it throws an error in the log: pg_keeper directory not found. Please help.

Thanks, Ankur

MasahikoSawada commented 7 years ago

I think this error message is emitted by postgres, not pg_keeper. Could you give me more information, such as the pg_keeper version you are using and any other logs?

priyankur05 commented 7 years ago

Thanks for your quick response. Here I will explain in detail what I have done and what errors I see.

Replication is in place between my master and slave servers in synchronous mode (the slave server is read-only).

Version of Postgres: 9.5
CentOS: 7.2
pg_keeper: version 1.0 from https://github.com/MasahikoSawada/pg_keeper/tree/REL1_0_STABLE

I copied pg_keeper-REL1_0.tar.gz under the /usr folder, unpacked it, ran cd pg_keeper-REL1_0, and then ran make USE_PGXS=1.

It gave me this error:

In file included from pg_keeper.c:12:0:
pg_keeper.h:13:33: fatal error: postmaster/bgworker.h: No such file or directory
 #include "postmaster/bgworker.h"
                                 ^
compilation terminated.

I copied the bgworker.h header file from the Postgres source code folder to /usr/include/pgsql/server/postmaster, and this error went away.

I ran the same command again and it failed because an rpm was not found. I installed the required rpm and ran the command again.

This time it ran, but with warnings:

pg_keeper.c: In function ‘KeeperMain’:
pg_keeper.c:196:2: warning: implicit declaration of function ‘pqsignal’ [-Wimplicit-function-declaration]
  pqsignal(SIGHUP, pg_keeper_sighup);
  ^
pg_keeper.c:243:11: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  proc_exit(ret);

I ignored the warnings and ran the next commands:

$ su
# make USE_PGXS=1 install

It gave me this output:

[root@clm-pun-29531 pg_keeper-REL1_0]# make USE_PGXS=1 install
/usr/bin/mkdir -p '/usr/lib64/pgsql'
/bin/sh /usr/lib64/pgsql/pgxs/src/makefiles/../../config/install-sh -c -m 755 pg_keeper.so '/usr/lib64/pgsql/pg_keeper.so'

I am not sure whether it got installed or not.

I used the same configuration on master and slave, as below, in postgresql.conf:

shared_preload_libraries = 'pg_keeper'
max_worker_processes = 8
pg_keeper.keepalive_time = 5
pg_keeper.keepalive_count = 3
pg_keeper.node2_conninfo = 'host=10.133.72.17 port=5432 dbname=postgres'
pg_keeper.node1_conninfo = 'host=10.133.79.112 port=5432 user=replicador application_name=postgresql2'
// as the doc suggests, this should be the same as the one in recovery.conf on the slave server

master server file.zip

(From the documentation: pg_keeper.node1_conninfo(*) specifies a connection string to be used for pg_keeper to connect to the first master, which is used by the standby mode server. It should be the same as the primary_conninfo in recovery.conf on the first standby server.)

After making this change, Postgres was stopped, and now when I try to restart it I see this error:


[root@clm-pun-029531 data]# journalctl -xe
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit postgresql-9.5.service has begun shutting down.
Feb 16 10:52:36 clm-pun-029531.bmc.com systemd[1]: Stopped PostgreSQL 9.5 database server.
-- Subject: Unit postgresql-9.5.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit postgresql-9.5.service has finished shutting down.
Feb 16 10:52:36 clm-pun-029531.bmc.com polkitd[727]: Unregistered Authentication Agent for unix-process:29097:6110107 (system bus name :1.3
Feb 16 10:52:38 clm-pun-029531.bmc.com dhclient[742]: DHCPREQUEST on ens192 to 172.29.65.137 port 67 (xid=0x273266d1)
Feb 16 10:52:39 clm-pun-029531.bmc.com polkitd[727]: Registered Authentication Agent for unix-process:29205:6110443 (system bus name :1.317
Feb 16 10:52:39 clm-pun-029531.bmc.com systemd[1]: Starting PostgreSQL 9.5 database server...
-- Subject: Unit postgresql-9.5.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit postgresql-9.5.service has begun starting up.
Feb 16 10:52:39 clm-pun-029531.bmc.com pg_ctl[29227]: < 2017-02-16 10:52:39.233 IST >FATAL: could not access file "pg_keeper": No such fil
Feb 16 10:52:40 clm-pun-029531.bmc.com pg_ctl[29227]: pg_ctl: could not start server
Feb 16 10:52:40 clm-pun-029531.bmc.com pg_ctl[29227]: Examine the log output.
Feb 16 10:52:40 clm-pun-029531.bmc.com systemd[1]: postgresql-9.5.service: control process exited, code=exited status=1
Feb 16 10:52:40 clm-pun-029531.bmc.com systemd[1]: Failed to start PostgreSQL 9.5 database server.
-- Subject: Unit postgresql-9.5.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit postgresql-9.5.service has failed.

-- The result is failed.
Feb 16 10:52:40 clm-pun-029531.bmc.com systemd[1]: Unit postgresql-9.5.service entered failed state.
Feb 16 10:52:40 clm-pun-029531.bmc.com systemd[1]: postgresql-9.5.service failed.
Feb 16 10:52:40 clm-pun-029531.bmc.com polkitd[727]: Unregistered Authentication Agent for unix-process:29205:6110443 (system bus name :1.3
Feb 16 10:52:52 clm-pun-029531.bmc.com dhclient[742]: DHCPREQUEST on ens192 to 172.29.65.137 port 67 (xid=0x273266d1)
Feb 16 10:53:05 clm-pun-029531.bmc.com dhclient[742]: DHCPREQUEST on ens192 to 172.29.65.137 port 67 (xid=0x273266d1)
Feb 16 10:53:13 clm-pun-029531.bmc.com dhclient[742]: DHCPREQUEST on ens192 to 172.29.65.137 port 67 (xid=0x273266d1)


I am attaching the files for the master server as a zip file.

I want to set up automated failover for my machines. Any help will be appreciated.

MasahikoSawada commented 7 years ago

Thank you for the more detailed information.

Regarding the installation of pg_keeper: instead of copying the header file manually, you need to install the postgresql-devel rpm package and add the directory containing the pg_config command to the PATH environment variable. You can run rpm -qa | grep postgresql and which pg_config to check that this was done successfully. For example, in my environment I checked these as follows:

$ rpm -qa | grep postgresql
postgresql95-contrib-9.5.3-2PGDG.rhel7.x86_64
postgresql95-devel-9.5.3-2PGDG.rhel7.x86_64
postgresql95-libs-9.5.3-2PGDG.rhel7.x86_64
postgresql95-server-9.5.3-2PGDG.rhel7.x86_64
postgresql95-9.5.3-2PGDG.rhel7.x86_64
$ which pg_config
/usr/pgsql-9.5/bin/pg_config

After you have confirmed them, you can run make USE_PGXS=1 in the pg_keeper directory. (The krb5-devel and openssl-devel packages may also be required.)
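The two checks above can be combined into one small script. This is a minimal sketch, assuming a POSIX shell; `check_pgxs_env` is a hypothetical helper name, not part of pg_keeper. It relies only on `pg_config --pgxs` and `pg_config --includedir-server`, which print the PGXS makefile path and the server header directory.

```shell
#!/bin/sh
# Sketch: verify that the pg_config on PATH (or the one passed as $1) points
# at an installation that ships the PGXS makefile and the server header
# pg_keeper includes. check_pgxs_env is a hypothetical helper name.
check_pgxs_env() {
  pg_config_bin=${1:-pg_config}

  if ! command -v "$pg_config_bin" >/dev/null 2>&1; then
    echo "pg_config not found: add its directory to PATH" >&2
    return 1
  fi

  # Ask pg_config where the build files should live.
  pgxs_mk=$("$pg_config_bin" --pgxs)
  header="$("$pg_config_bin" --includedir-server)/postmaster/bgworker.h"

  if [ ! -f "$pgxs_mk" ]; then
    echo "missing $pgxs_mk: is the matching -devel package installed?" >&2
    return 1
  fi
  if [ ! -f "$header" ]; then
    echo "missing $header: is the matching -devel package installed?" >&2
    return 1
  fi
  echo "build environment looks usable: $("$pg_config_bin" --version)"
}
```

Calling `check_pgxs_env` with no argument tests whichever pg_config comes first on PATH, which is the one `make USE_PGXS=1` would normally pick up.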

You can check whether pg_keeper has been installed successfully by executing SHOW shared_preload_libraries. You also need to check with the ps command whether the pg_keeper process has been launched successfully. As the documentation says, the pg_keeper process appears in the output of the ps command as follows.

$ ps x | grep pg_keeper | grep -v grep
33525 ?        Ss     0:00 postgres: bgworker: pg_keeper   (master mode:connected)
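Note that SHOW is an SQL statement, so it is run through a client such as psql rather than typed at the shell prompt. A minimal sketch of that check follows; `is_preloaded` is a hypothetical helper name, and the psql user in the usage comment is an assumption to adapt to your setup.

```shell
#!/bin/sh
# Sketch: return 0 if library "$2" is listed in the comma-separated value "$1"
# of shared_preload_libraries. is_preloaded is a hypothetical helper name.
is_preloaded() {
  # Strip whitespace so "a, b" and "a,b" are treated the same.
  list=$(printf '%s' "$1" | tr -d '[:space:]')
  case ",$list," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (assumes a local server and a superuser named postgres):
#   value=$(psql -U postgres -Atc 'SHOW shared_preload_libraries')
#   is_preloaded "$value" pg_keeper && echo "pg_keeper is preloaded"
```

This only confirms the setting is in effect on a running server; the ps check is still needed to see whether the worker process itself is up.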

priyankur05 commented 7 years ago

After I exported the path:

[root@clm-pun-029531 pg_keeper-REL1_0]# ps x | grep pg_keeper | grep -v grep
[root@clm-pun-029531 pg_keeper-REL1_0]# SHOW shared_preload_libraries
bash: SHOW: command not found...
[root@clm-pun-029531 pg_keeper-REL1_0]# export PATH=/usr/pgsql-9.5/bin:$PATH
[root@clm-pun-029531 pg_keeper-REL1_0]# SHOW shared_preload_libraries
bash: SHOW: command not found...
[root@clm-pun-029531 pg_keeper-REL1_0]# make USE_PGXS=1
Makefile:12: /usr/pgsql-9.5/lib/pgxs/src/makefiles/pgxs.mk: No such file or directory
make: *** No rule to make target `/usr/pgsql-9.5/lib/pgxs/src/makefiles/pgxs.mk'. Stop.
[root@clm-pun-029531 pg_keeper-REL1_0]#

Now make is not working.

MasahikoSawada commented 7 years ago

Well, some comments.

priyankur05 commented 7 years ago

Here is the output of:

[root@clm-pun-029531 pg_keeper-REL1_0]# rpm -qa|grep postgres
postgresql-contrib-9.2.18-1.el7.x86_64
postgresql95-server-9.5.5-1PGDG.rhel7.x86_64
postgresql-devel-9.2.18-1.el7.x86_64
postgresql95-libs-9.5.5-1PGDG.rhel7.x86_64
postgresql-9.2.18-1.el7.x86_64
postgresql95-9.5.5-1PGDG.rhel7.x86_64
postgresql95-contrib-9.5.5-1PGDG.rhel7.x86_64
postgresql-libs-9.2.18-1.el7.x86_64

When I run make USE_PGXS=1 as a non-root user:

-bash-4.2$ make USE_PGXS=1
gcc -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DLINUX_OOM_SCORE_ADJ=0 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fpic -I/usr/include -I. -I. -I/usr/include/pgsql/server -I/usr/include/pgsql/internal -D_GNU_SOURCE -I/usr/include/libxml2 -c -o pg_keeper.o pg_keeper.c
In file included from pg_keeper.c:12:0:
pg_keeper.h:13:33: fatal error: postmaster/bgworker.h: No such file or directory
 #include "postmaster/bgworker.h"
                                 ^
compilation terminated.
make: *** [pg_keeper.o] Error 1

Please help.

MasahikoSawada commented 7 years ago

You should install postgresql95-devel RPM package.
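The rpm listing above shows both 9.2 and 9.5 packages, with a devel package only for 9.2, so the pg_config used for the build may belong to a different major version than the running server. A minimal sketch for comparing the two follows; `pg_major` and `same_major` are hypothetical helper names, and the sed pattern assumes the three-part X.Y.Z version numbering used by the releases in this thread.

```shell
#!/bin/sh
# Sketch: compare the major version reported by pg_config with the one
# reported by the server binary. pg_major and same_major are hypothetical names.

# Print "9.5" for inputs such as "PostgreSQL 9.5.5" or
# "postgres (PostgreSQL) 9.5.5". Assumes X.Y.Z version numbering.
pg_major() {
  printf '%s\n' "$1" | sed -n 's/.* \([0-9][0-9]*\.[0-9][0-9]*\)\.[0-9][0-9]*.*/\1/p'
}

# Return 0 only if both version strings parse and share a major version.
same_major() {
  a=$(pg_major "$1")
  b=$(pg_major "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage (the server binary path is an assumption based on the PGDG layout):
#   same_major "$(pg_config --version)" "$(/usr/pgsql-9.5/bin/postgres --version)" \
#     || echo "pg_config and the server are different major versions" >&2
```

If the two differ, the build picks up headers and an install directory for the wrong server, which would be consistent with both the missing bgworker.h and the earlier install into /usr/lib64/pgsql.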

priyankur05 commented 7 years ago

As you suggested, I installed the above package, and the installation went fine.

Now when I run:

[root@master data]# ps x | grep pg_keeper | grep -v grep

I don't see any result.

[root@master data]# ps -eaf | grep pg_keeper | grep -v grep
postgres 10073 10035 0 16:40 ? 00:00:00 postgres: bgworker: pg_keeper (master:ready)

The same status is shown on the slave, and I see that recovery.conf has changed to recovery.done.

[root@slave data]# ps -eaf | grep pg_keeper | grep -v grep
postgres 10073 10035 0 16:40 ? 00:00:00 postgres: bgworker: pg_keeper (master:ready)

What am I doing wrong?

MasahikoSawada commented 7 years ago

> As you suggested, I installed the above package, and the installation went fine.

Great.

> The same status is shown on the slave, and I see that recovery.conf has changed to recovery.done.

What information has been reported in the log file on the slave server? I guess that the connection from the slave server to the primary server failed.
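Besides the log file, one way to see which role a node currently has is to ask the server itself. The sketch below maps the output of `SELECT pg_is_in_recovery()` (printed by psql as t on a standby and f on a primary when run with -At) to a label. `node_role` is a hypothetical helper name, and the host and user in the usage comment are assumptions to adapt.

```shell
#!/bin/sh
# Sketch: translate the t/f result of SELECT pg_is_in_recovery() into a role
# name. node_role is a hypothetical helper name.
node_role() {
  case "$1" in
    t) echo standby ;;
    f) echo primary ;;
    *) echo unknown; return 1 ;;
  esac
}

# Usage (host and user are assumptions; run it against each node):
#   node_role "$(psql -h 10.133.72.17 -U postgres -Atc 'SELECT pg_is_in_recovery()')"
```

If the node that is supposed to be the standby reports primary, it has left recovery, which would match recovery.conf having been renamed to recovery.done.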

priyankur05 commented 7 years ago

Here are the detailed contents of each file:

postgresql.conf on master:

listen_addresses = '*'
max_connections = 300
shared_buffers = 128MB
dynamic_shared_memory_type = posix
wal_level = 'hot_standby'
max_wal_senders = 1
synchronous_standby_names = 'postgresql2'
wal_keep_segments = 100
shared_preload_libraries = 'pg_keeper'
max_worker_processes = 8
pg_keeper.keepalive_time = 5
pg_keeper.keepalive_count = 3
pg_keeper.node1_conninfo = 'host=10.133.79.112 port=5432 dbname=postgres'
# master ip add
pg_keeper.node2_conninfo = 'host=10.133.72.17 port=5432 dbname=postgres'
# slave ip add

pg_hba.conf on master:

local all all md5
# IPv4 local connections:
host all all 0.0.0.0/0 md5
host replication replicador 10.133.72.17/32 trust
# IPv6 local connections:
host all all ::1/128 ident

log on master:

< 2017-02-24 00:14:12.440 IST >WARNING: canceling wait for synchronous replication due to user request
< 2017-02-24 00:14:12.440 IST >DETAIL: The transaction has already committed locally, but might not have been replicated to the standby.
< 2017-02-24 00:33:14.812 IST >LOG: received fast shutdown request
< 2017-02-24 00:33:14.812 IST >LOG: aborting any active transactions
< 2017-02-24 00:33:14.813 IST >FATAL: terminating connection due to administrator command
< 2017-02-24 00:33:14.814 IST >LOG: autovacuum launcher shutting down
< 2017-02-24 00:33:14.815 IST >LOG: worker process: pg_keeper (PID 19545) exited with exit code 1
< 2017-02-24 00:33:14.815 IST >LOG: shutting down
< 2017-02-24 00:33:14.826 IST >LOG: database system is shut down
< 2017-02-24 00:34:15.340 IST >LOG: database system was shut down at 2017-02-24 00:33:14 IST
< 2017-02-24 00:34:15.342 IST >LOG: MultiXact member wraparound protections are now enabled
< 2017-02-24 00:34:15.344 IST >LOG: database system is ready to accept connections
< 2017-02-24 00:34:15.345 IST >LOG: autovacuum launcher started

postgresql.conf on slave:

listen_addresses = '*'
max_connections = 300
hot_standby = on
shared_preload_libraries = 'pg_keeper'
max_worker_processes = 8
pg_keeper.keepalive_time = 5
pg_keeper.keepalive_count = 3
pg_keeper.node1_conninfo = 'host=10.133.79.112 port=5432 dbname=postgres'
pg_keeper.node2_conninfo = 'host=10.133.72.17 port=5432 dbname=postgres'

pg_hba.conf on slave:

# "local" is for Unix domain socket connections only
local all all md5
# IPv4 local connections:
host all all 0.0.0.0/0 md5
host replication replicador 10.133.79.112/32 trust
# IPv6 local connections:
host all all ::1/128 ident

recovery.done:

standby_mode = 'on'
trigger_file = '/tmp/promotedb'
primary_conninfo = 'host=10.133.79.112 port=5432 user=replicador application_name=postgresql2'

log on slave:

< 2017-02-24 00:00:02.730 IST >LOG: database system was shut down at 2017-02-23 23:59:35 IST
< 2017-02-24 00:00:02.740 IST >LOG: MultiXact member wraparound protections are now enabled
< 2017-02-24 00:00:02.743 IST >LOG: database system is ready to accept connections
< 2017-02-24 00:00:02.744 IST >LOG: autovacuum launcher started
< 2017-02-24 00:11:38.113 IST >ERROR: syntax error at or near "select" at character 23
< 2017-02-24 00:11:38.113 IST >STATEMENT: select * from company select * from company;
< 2017-02-24 00:23:22.886 IST >LOG: received fast shutdown request
< 2017-02-24 00:23:22.886 IST >LOG: aborting any active transactions
< 2017-02-24 00:23:22.887 IST >FATAL: terminating connection due to administrator command
< 2017-02-24 00:23:22.889 IST >LOG: worker process: pg_keeper (PID 12359) exited with exit code 1
< 2017-02-24 00:23:22.890 IST >LOG: autovacuum launcher shutting down
< 2017-02-24 00:23:22.891 IST >LOG: shutting down
< 2017-02-24 00:23:22.917 IST >LOG: database system is shut down
< 2017-02-24 00:23:26.691 IST >LOG: database system was shut down at 2017-02-24 00:23:22 IST
< 2017-02-24 00:23:26.700 IST >LOG: MultiXact member wraparound protections are now enabled
< 2017-02-24 00:23:26.702 IST >LOG: database system is ready to accept connections
< 2017-02-24 00:23:26.708 IST >LOG: autovacuum launcher started
