apache / cloudberry

Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.
https://cloudberrydb.org/
Apache License 2.0

[Bug] cloudberrydb failed to initialize standby master instance #144

Open liyxbeijing opened 1 year ago

liyxbeijing commented 1 year ago

Cloudberry Database version


warehouse=# select version();
                                   version
------------------------------------------------------------------------------
 PostgreSQL 14.4 (Cloudberry Database 1.4.0 build commit:e83e3ffc22d538deb2dbceeeae0138ca2de064e6) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20210130 (Red Hat 10.2.1-11), 64-bit compiled on Aug  3 2023 10:15:47
(1 row)

What happened

Initialization of the standby master instance failed.

The relevant part of gpinitsystem_20230811.log is below:

20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-Failed to start Cloudberry instance; please review gpinitsystem log to determine failure.
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-End Function FORCE_FTS_PROBE
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Start Function CREATE_STANDBY_QD
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Starting initialization of standby coordinator bigdata-019
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-Failed to complete standby coordinator initialization
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-End Function BACKOUT_COMMAND
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-End Function BACKOUT_COMMAND
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-End Function BACKOUT_COMMAND
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-End Function BACKOUT_COMMAND
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-End Function CREATE_STANDBY_QD
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Start Function SCAN_LOG
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-*******************************************************
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-were generated during the array creation
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Please review contents of log file
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20230811.log
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-To determine level of criticality
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-These messages could be from a previous run of the utility
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-that was called today!
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-*******************************************************
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-End Function SCAN_LOG
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Cloudberry Database instance successfully created
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-------------------------------------------------------
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-To complete the environment configuration, please
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-2. Add "export COORDINATOR_DATA_DIRECTORY=/data/hashdata/master/gpseg-1"
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-   to access the Cloudberry scripts for this instance:
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-   or, use -d /data/hashdata/master/gpseg-1 option for the Cloudberry scripts
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-   Example gpstate -d /data/hashdata/master/gpseg-1
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20230811.log
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-Standby Coordinator failed to initialize
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-------------------------------------------------------
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-The Coordinator /data/hashdata/master/gpseg-1/pg_hba.conf post gpinitsystem
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-new array must be explicitly added to this file
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Refer to the Cloudberry Admin support guide which is
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-located in the /usr/local/cloudberry-db/docs directory
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-------------------------------------------------------
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-*******************************************************
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-Cluster setup finished, but Standby Coordinator failed to initialize. Review contents of log files for errors.
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-Use gpinitstandby to create a Standby Coordinator
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[WARN]:-*******************************************************
20230811:09:19:38:498139 gpinitsystem:bigdata-018:gpadmin-[INFO]:-End Main
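
Most of the log above is INFO output, so the few WARN entries are easy to miss. The following is a minimal sketch, not part of the Cloudberry tooling, that pulls only the WARN messages out of a log in the format shown above. The function name `warnings_from_log` and the regular expression are assumptions made for illustration; the line layout is inferred from this log alone.

```python
import re

# Layout inferred from the log above (an assumption, not a documented format):
#   YYYYMMDD:HH:MM:SS:PID program:host:user-[LEVEL]:-message
LOG_LINE = re.compile(
    r"^(?P<ts>\d{8}:\d{2}:\d{2}:\d{2}):(?P<pid>\d+) "
    r"(?P<prog>[^:\s]+):(?P<host>[^:\s]+):(?P<user>\S+?)"
    r"-\[(?P<level>[A-Z]+)\]:-?(?P<msg>.*)$"
)


def warnings_from_log(lines):
    """Return the text of every non-empty WARN entry, in order of appearance."""
    messages = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match is None or match["level"] != "WARN":
            continue  # skip INFO entries and lines that do not fit the layout
        msg = match["msg"].strip()
        # Skip empty WARN entries and the rows of asterisks used as separators.
        if not msg or set(msg) <= {"*"}:
            continue
        messages.append(msg)
    return messages
```

Passing the opened log file (for example `/home/gpadmin/gpAdminLogs/gpinitsystem_20230811.log`, the path reported in the log itself) to `warnings_from_log` would list only the warning messages, such as the failed standby coordinator initialization.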

What you think should happen instead

RD replied that this appears to be a gpinitsystem issue and prepared a patch:

1.patch

How to reproduce

Initialize a cluster with a standby instance.

Operating System

CentOS 7.9

Anything else

No response

github-actions[bot] commented 1 year ago

Hey, @liyxbeijing welcome!🎊 Thanks for taking the time to point this out.🙌

tuhaihe commented 1 year ago

Hi @liyxbeijing: we prefer all communication to be in English. Could you please rewrite your issue in English? Thanks!

liyxbeijing commented 1 year ago

> Hi @liyxbeijing: we prefer all communication to be in English. Could you please rewrite your issue in English? Thanks!

Sure, the English version is done.

my-ship-it commented 3 months ago

Do you have detailed logs from under the data directory?