oracle / docker-images

Official source of container configurations, images, and examples for Oracle products and projects
https://developer.oracle.com/use-cases/#containers
Universal Permissive License v1.0

Error has occurred in Grid Setup, Please verify! #2253

Closed: ifrankrui closed this issue 2 years ago

ifrankrui commented 2 years ago

Hi,

I am creating a RAC cluster using all four images from this repository. When I start the DB node container, I receive "Error has occurred in Grid Setup, Please verify!". Looking through the logs, I don't see much information about the error. Could you please point me in the right direction? Thanks!
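For context, the exact docker run command is not shown here; reconstructed from the environment dump in the log below, it would have looked roughly like this (a sketch only: the image tag and --shm-size are assumptions, and the -e values follow the dump):

# docker run -d --name racnode1 \
    --hostname racnode1 \
    --dns 172.16.1.25 \
    --shm-size 4G \
    --volume /oradata:/oradata \
    -e OP_TYPE=INSTALL \
    -e PUBLIC_IP=172.16.1.150 -e PUBLIC_HOSTNAME=racnode1 \
    -e PRIV_IP=192.168.17.150 -e PRIV_HOSTNAME=racnode1-priv \
    -e NODE_VIP=172.16.1.160 -e VIP_HOSTNAME=racnode1-vip \
    -e SCAN_NAME=racnode-scan -e DOMAIN=example.com \
    -e CMAN_HOSTNAME=racnode-cman1 -e CMAN_IP=172.16.1.15 \
    -e ASM_DISCOVERY_DIR=/oradata \
    -e ASM_DEVICE_LIST=/oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk04.img,/oradata/asm_disk05.img \
    -e COMMON_OS_PWD_FILE=common_os_pwdfile.enc -e PWD_KEY=pwd.key \
    -e DNS_SERVERS=172.16.1.25 \
    oracle/database-rac:21.3.0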

Here is the log when starting the racnode1 container:

PATH=/bin:/usr/bin:/sbin:/usr/sbin
HOSTNAME=racnode1
TERM=xterm
NODE_VIP=172.16.1.160
PUBLIC_IP=172.16.1.150
OP_TYPE=INSTALL
ASM_DISCOVERY_DIR=/oradata
CMAN_IP=172.16.1.15
PUBLIC_HOSTNAME=racnode1
SCAN_NAME=racnode-scan
ASM_DEVICE_LIST=/oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk04.img,/oradata/asm_disk05.img
DNS_SERVERS=172.16.1.25
PRIV_IP=192.168.17.150
DOMAIN=example.com
CMAN_HOSTNAME=racnode-cman1
COMMON_OS_PWD_FILE=common_os_pwdfile.enc
VIP_HOSTNAME=racnode1-vip
PRIV_HOSTNAME=racnode1-priv
PWD_KEY=pwd.key
SETUP_LINUX_FILE=setupLinuxEnv.sh
INSTALL_DIR=/opt/scripts
GRID_BASE=/u01/app/grid
GRID_HOME=/u01/app/21.3.0/grid
INSTALL_FILE_1=LINUX.X64_213000_grid_home.zip
GRID_INSTALL_RSP=gridsetup_21c.rsp
GRID_SW_INSTALL_RSP=grid_sw_install_21c.rsp
GRID_SETUP_FILE=setupGrid.sh
FIXUP_PREQ_FILE=fixupPreq.sh
INSTALL_GRID_BINARIES_FILE=installGridBinaries.sh
INSTALL_GRID_PATCH=applyGridPatch.sh
INVENTORY=/u01/app/oraInventory
CONFIGGRID=configGrid.sh
ADDNODE=AddNode.sh
DELNODE=DelNode.sh
ADDNODE_RSP=grid_addnode_21c.rsp
SETUPSSH=setupSSH.expect
DOCKERORACLEINIT=dockeroracleinit
GRID_USER_HOME=/home/grid
SETUPGRIDENV=setupGridEnv.sh
RESET_OS_PASSWORD=resetOSPassword.sh
MULTI_NODE_INSTALL=MultiNodeInstall.py
DB_BASE=/u01/app/oracle
DB_HOME=/u01/app/oracle/product/21.3.0/dbhome_1
INSTALL_FILE_2=LINUX.X64_213000_db_home.zip
DB_INSTALL_RSP=db_sw_install_21c.rsp
DBCA_RSP=dbca_21c.rsp
DB_SETUP_FILE=setupDB.sh
PWD_FILE=setPassword.sh
RUN_FILE=runOracle.sh
STOP_FILE=stopOracle.sh
ENABLE_RAC_FILE=enableRAC.sh
CHECK_DB_FILE=checkDBStatus.sh
USER_SCRIPTS_FILE=runUserScripts.sh
REMOTE_LISTENER_FILE=remoteListener.sh
INSTALL_DB_BINARIES_FILE=installDBBinaries.sh
GRID_HOME_CLEANUP=GridHomeCleanup.sh
ORACLE_HOME_CLEANUP=OracleHomeCleanup.sh
DB_USER=oracle
GRID_USER=grid
FUNCTIONS=functions.sh
COMMON_SCRIPTS=/common_scripts
CHECK_SPACE_FILE=checkSpace.sh
RESET_FAILED_UNITS=resetFailedUnits.sh
SET_CRONTAB=setCrontab.sh
CRONTAB_ENTRY=crontabEntry
EXPECT=/usr/bin/expect
BIN=/usr/sbin
container=true
INSTALL_SCRIPTS=/opt/scripts/install
SCRIPT_DIR=/opt/scripts/startup
GRID_PATH=/u01/app/21.3.0/grid/bin:/u01/app/21.3.0/grid/OPatch/:/u01/app/21.3.0/grid/perl/bin:/usr/sbin:/bin:/sbin
DB_PATH=/u01/app/oracle/product/21.3.0/dbhome_1/bin:/u01/app/oracle/product/21.3.0/dbhome_1/OPatch/:/u01/app/oracle/product/21.3.0/dbhome_1/perl/bin:/usr/sbin:/bin:/sbin
GRID_LD_LIBRARY_PATH=/u01/app/21.3.0/grid/lib:/usr/lib:/lib
DB_LD_LIBRARY_PATH=/u01/app/oracle/product/21.3.0/dbhome_1/lib:/usr/lib:/lib
HOME=/home/grid
Failed to parse kernel command line, ignoring: No such file or directory
systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization other.
Detected architecture x86-64.

Welcome to Oracle Linux Server 7.9!

Set hostname to <racnode1>.
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
/usr/lib/systemd/system-generators/systemd-fstab-generator failed with error code 1.
[/usr/lib/systemd/system/systemd-pstore.service:22] Unknown lvalue 'StateDirectory' in section 'Service'
Cannot add dependency job for unit display-manager.service, ignoring: Unit not found.
[  OK  ] Reached target Swap.
[  OK  ] Reached target RPC Port Mapper.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Created slice Root Slice.
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Listening on Journal Socket.
[  OK  ] Created slice System Slice.
[  OK  ] Created slice system-getty.slice.
[  OK  ] Reached target Slices.
         Starting Read and set NIS domainname from /etc/sysconfig/network...
         Starting Journal Service...
[  OK  ] Listening on Delayed Shutdown Socket.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
Couldn't determine result for ConditionKernelCommandLine=|rd.modules-load for systemd-modules-load.service, assuming failed: No such file or directory
Couldn't determine result for ConditionKernelCommandLine=|modules-load for systemd-modules-load.service, assuming failed: No such file or directory
         Starting Remount Root and Kernel File Systems...
[  OK  ] Started Journal Service.
[  OK  ] Started Read and set NIS domainname from /etc/sysconfig/network.
[  OK  ] Started Remount Root and Kernel File Systems.
         Starting Configure read-only root support...
[  OK  ] Reached target Local File Systems (Pre).
         Starting Flush Journal to Persistent Storage...
[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started Configure read-only root support.
[  OK  ] Reached target Local File Systems.
         Starting Preprocess NFS configuration...
         Starting Load/Save Random Seed...
         Starting Create Volatile Files and Directories...
[  OK  ] Started Load/Save Random Seed.
[  OK  ] Started Preprocess NFS configuration.
[  OK  ] Started Create Volatile Files and Directories.
         Mounting RPC Pipe File System...
         Starting Update UTMP about System Boot/Shutdown...
[FAILED] Failed to mount RPC Pipe File System.
See 'systemctl status var-lib-nfs-rpc_pipefs.mount' for details.
[DEPEND] Dependency failed for rpc_pipefs.target.
[DEPEND] Dependency failed for RPC security service for NFS client and server.
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on RPCbind Server Activation Socket.
         Starting RPC bind service...
[  OK  ] Reached target Sockets.
[  OK  ] Started Flexible branding.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Basic System.
         Starting Resets System Activity Logs...
         Starting LSB: Bring up/down networking...
         Starting Login Service...
         Starting Self Monitoring and Reporting Technology (SMART) Daemon...
[  OK  ] Started D-Bus System Message Bus.
         Starting GSSAPI Proxy Daemon...
[  OK  ] Started RPC bind service.
         Starting Cleanup of Temporary Directories...
[  OK  ] Started Login Service.
[  OK  ] Started Resets System Activity Logs.
[  OK  ] Started Cleanup of Temporary Directories.
[  OK  ] Started GSSAPI Proxy Daemon.
[  OK  ] Reached target NFS client services.
[  OK  ] Reached target Remote File Systems (Pre).
[  OK  ] Reached target Remote File Systems.
         Starting Permit User Sessions...
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Command Scheduler.
[  OK  ] Started LSB: Bring up/down networking.
[  OK  ] Reached target Network.
[  OK  ] Reached target Network is Online.
         Starting Notify NFS peers of a restart...
         Starting OpenSSH server daemon...
         Starting /etc/rc.d/rc.local Compatibility...
[  OK  ] Started Notify NFS peers of a restart.
[  OK  ] Started /etc/rc.d/rc.local Compatibility.
[  OK  ] Started Console Getty.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started OpenSSH server daemon.
02-23-2022 15:28:47 UTC :  : Process id of the program : 
02-23-2022 15:28:47 UTC :  : #################################################
02-23-2022 15:28:47 UTC :  :  Starting Grid Installation          
02-23-2022 15:28:47 UTC :  : #################################################
02-23-2022 15:28:47 UTC :  : Pre-Grid Setup steps are in process
02-23-2022 15:28:47 UTC :  : Process id of the program : 
02-23-2022 15:28:47 UTC :  : Disable failed service var-lib-nfs-rpc_pipefs.mount
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
02-23-2022 15:28:47 UTC :  : Resetting Failed Services
02-23-2022 15:28:47 UTC :  : Sleeping for 60 seconds
[  OK  ] Started Self Monitoring and Reporting Technology (SMART) Daemon.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Oracle Linux Server 7.9
Kernel 5.4.17-2136.304.4.1.el7uek.x86_64 on an x86_64

racnode1 login: 02-23-2022 15:29:47 UTC :  : Systemctl state is running!
02-23-2022 15:29:47 UTC :  : Setting correct permissions for /bin/ping
02-23-2022 15:29:47 UTC :  : Public IP is set to 172.16.1.150
02-23-2022 15:29:47 UTC :  : RAC Node PUBLIC Hostname is set to racnode1
02-23-2022 15:29:47 UTC :  : racnode1 already exists : 172.16.1.150 racnode1.example.com    racnode1
192.168.17.150  racnode1-priv.example.com   racnode1-priv
172.16.1.160    racnode1-vip.example.com    racnode1-vip, no  update required
02-23-2022 15:29:47 UTC :  : racnode1-priv already exists : 192.168.17.150  racnode1-priv.example.com   racnode1-priv, no  update required
02-23-2022 15:29:47 UTC :  : racnode1-vip already exists : 172.16.1.160 racnode1-vip.example.com    racnode1-vip, no  update required
02-23-2022 15:29:47 UTC :  : Preparing host line for racnode-scan
02-23-2022 15:29:47 UTC :  : racnode-cman1 already exists : 172.16.1.15 racnode-cman1.example.com   racnode-cman1, no  update required
02-23-2022 15:29:47 UTC :  : Preapring Device list
02-23-2022 15:29:47 UTC :  : Changing Disk permission and ownership /oradata/asm_disk01.img
02-23-2022 15:29:47 UTC :  : Changing Disk permission and ownership /oradata/asm_disk02.img
02-23-2022 15:29:47 UTC :  : Changing Disk permission and ownership /oradata/asm_disk03.img
02-23-2022 15:29:47 UTC :  : Changing Disk permission and ownership /oradata/asm_disk04.img
02-23-2022 15:29:47 UTC :  : Changing Disk permission and ownership /oradata/asm_disk05.img
02-23-2022 15:29:47 UTC :  : Preapring Dns Servers list
02-23-2022 15:29:47 UTC :  : Setting DNS Servers
02-23-2022 15:29:47 UTC :  : Adding nameserver 172.16.1.25 in /etc/resolv.conf.
02-23-2022 15:29:47 UTC :  : #####################################################################
02-23-2022 15:29:47 UTC :  :  RAC setup will begin in 2 minutes                                   
02-23-2022 15:29:47 UTC :  : ####################################################################
02-23-2022 15:30:17 UTC :  : ###################################################
02-23-2022 15:30:17 UTC :  : Pre-Grid Setup steps completed
02-23-2022 15:30:17 UTC :  : ###################################################
02-23-2022 15:30:17 UTC :  : Checking if grid is already configured
02-23-2022 15:30:17 UTC :  : Process id of the program : 
02-23-2022 15:30:17 UTC :  : Public IP is set to 172.16.1.150
02-23-2022 15:30:17 UTC :  : RAC Node PUBLIC Hostname is set to racnode1
02-23-2022 15:30:17 UTC :  : Domain is defined to example.com
02-23-2022 15:30:17 UTC :  : Default setting of AUTO GNS VIP set to false. If you want to use AUTO GNS VIP, please pass DHCP_CONF as an env parameter set to true
02-23-2022 15:30:17 UTC :  : RAC VIP set to 172.16.1.160
02-23-2022 15:30:17 UTC :  : RAC Node VIP hostname is set to racnode1-vip 
02-23-2022 15:30:17 UTC :  : SCAN_NAME name is racnode-scan
02-23-2022 15:30:17 UTC :  : SCAN PORT is set to empty string. Setting it to 1521 port.
02-23-2022 15:30:17 UTC :  : 172.16.1.172
172.16.1.170
172.16.1.171
02-23-2022 15:30:17 UTC :  : SCAN Name resolving to IP. Check Passed!
02-23-2022 15:30:17 UTC :  : SCAN_IP set to the empty string
02-23-2022 15:30:17 UTC :  : RAC Node PRIV IP is set to 192.168.17.150 
02-23-2022 15:30:17 UTC :  : RAC Node private hostname is set to racnode1-priv
02-23-2022 15:30:17 UTC :  : CMAN_HOSTNAME name is racnode-cman1
02-23-2022 15:30:17 UTC :  : CMAN_IP name is 172.16.1.15
02-23-2022 15:30:17 UTC :  : Cluster Name is not defined
02-23-2022 15:30:17 UTC :  : Cluster name is set to 'racnode-c'
02-23-2022 15:30:17 UTC :  : Password file generated
02-23-2022 15:30:17 UTC :  : Common OS Password string is set for Grid user
02-23-2022 15:30:17 UTC :  : Common OS Password string is set for  Oracle user
02-23-2022 15:30:17 UTC :  : Common OS Password string is set for Oracle Database
02-23-2022 15:30:17 UTC :  : Setting CONFIGURE_GNS to false
02-23-2022 15:30:17 UTC :  : GRID_RESPONSE_FILE env variable set to empty. configGrid.sh will use standard cluster responsefile
02-23-2022 15:30:17 UTC :  : Location for User script SCRIPT_ROOT set to /common_scripts
02-23-2022 15:30:17 UTC :  : IGNORE_CVU_CHECKS is set to true
02-23-2022 15:30:17 UTC :  : Oracle SID is set to ORCLCDB
02-23-2022 15:30:17 UTC :  : Oracle PDB name is set to ORCLPDB
02-23-2022 15:30:17 UTC :  : Check passed for network card eth1 for public IP 172.16.1.150
02-23-2022 15:30:17 UTC :  : Public Netmask : 255.255.255.0
02-23-2022 15:30:17 UTC :  : Check passed for network card eth0 for private IP 192.168.17.150
02-23-2022 15:30:17 UTC :  : Building NETWORK_STRING to set  networkInterfaceList in Grid Response File
02-23-2022 15:30:17 UTC :  : Network InterfaceList  set to eth1:172.16.1.0:1,eth0:192.168.17.0:5
02-23-2022 15:30:17 UTC :  : Setting random password for grid user
02-23-2022 15:30:17 UTC :  : Setting random password for oracle user
02-23-2022 15:30:17 UTC :  : Calling setupSSH function
02-23-2022 15:30:17 UTC :  : SSh will be setup among racnode1 nodes
02-23-2022 15:30:17 UTC :  : Running SSH setup for grid user between nodes racnode1
02-23-2022 15:30:52 UTC :  : Running SSH setup for oracle user between nodes racnode1
02-23-2022 15:30:57 UTC :  : SSH check fine for the racnode1
02-23-2022 15:30:58 UTC :  : SSH check fine for the oracle@racnode1
02-23-2022 15:30:58 UTC :  : Preapring Device list
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : ASM Disk size : 0
02-23-2022 15:30:58 UTC :  : ASM Device list will be with failure groups /oradata/asm_disk01.img,,/oradata/asm_disk02.img,,/oradata/asm_disk03.img,,/oradata/asm_disk04.img,,/oradata/asm_disk05.img,
02-23-2022 15:30:58 UTC :  : ASM Device list will be groups /oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk04.img,/oradata/asm_disk05.img
02-23-2022 15:30:58 UTC :  : CLUSTER_TYPE env variable is set to STANDALONE, will not process GIMR DEVICE list as default Diskgroup is set to DATA. GIMR DEVICE List will be processed when CLUSTER_TYPE is set to DOMAIN for DSC
02-23-2022 15:30:58 UTC :  : Nodes in the cluster racnode1
02-23-2022 15:30:58 UTC :  : Setting Device permissions for RAC Install  on racnode1
02-23-2022 15:30:58 UTC :  : Preapring ASM Device list
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode1
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode1
02-23-2022 15:30:58 UTC :  : Populate Rac Env Vars on Remote Hosts
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode1
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode1
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode1
02-23-2022 15:30:58 UTC :  : Populate Rac Env Vars on Remote Hosts
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode1
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode1
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode1
02-23-2022 15:30:58 UTC :  : Populate Rac Env Vars on Remote Hosts
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode1
02-23-2022 15:30:58 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:58 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode1
02-23-2022 15:30:59 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode1
02-23-2022 15:30:59 UTC :  : Populate Rac Env Vars on Remote Hosts
02-23-2022 15:30:59 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode1
02-23-2022 15:30:59 UTC :  : Changing Disk permission and ownership
02-23-2022 15:30:59 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode1
02-23-2022 15:30:59 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode1
02-23-2022 15:30:59 UTC :  : Populate Rac Env Vars on Remote Hosts
02-23-2022 15:30:59 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode1
02-23-2022 15:30:59 UTC :  : Generating Reponsefile
02-23-2022 15:30:59 UTC :  : Running cluvfy Checks
02-23-2022 15:30:59 UTC :  : Performing Cluvfy Checks
02-23-2022 15:31:22 UTC :  : Checking /tmp/cluvfy_check.txt if there is any failed check.
This standalone version of CVU is "230" days old. The latest release of standalone CVU can be obtained from the Oracle support site. Refer to MOS note 2731675.1 for more details.

Performing following verification checks ...

  Physical Memory ...PASSED
  Available Physical Memory ...PASSED
  Swap Size ...PASSED
  Free Space: racnode1:/usr,racnode1:/var,racnode1:/etc,racnode1:/sbin,racnode1:/tmp ...PASSED
  User Existence: grid ...
    Users With Same UID: 54332 ...PASSED
  User Existence: grid ...PASSED
  Group Existence: asmadmin ...PASSED
  Group Existence: asmdba ...PASSED
  Group Existence: oinstall ...PASSED
  Group Membership: asmdba ...PASSED
  Group Membership: asmadmin ...PASSED
  Group Membership: oinstall(Primary) ...PASSED
  Run Level ...PASSED
  Hard Limit: maximum open file descriptors ...PASSED
  Soft Limit: maximum open file descriptors ...PASSED
  Hard Limit: maximum user processes ...PASSED
  Soft Limit: maximum user processes ...PASSED
  Soft Limit: maximum stack size ...PASSED
  Architecture ...PASSED
  OS Kernel Version ...PASSED
  OS Kernel Parameter: semmsl ...PASSED
  OS Kernel Parameter: semmns ...PASSED
  OS Kernel Parameter: semopm ...PASSED
  OS Kernel Parameter: semmni ...PASSED
  OS Kernel Parameter: shmmax ...PASSED
  OS Kernel Parameter: shmmni ...PASSED
  OS Kernel Parameter: shmall ...PASSED
  OS Kernel Parameter: file-max ...PASSED
  OS Kernel Parameter: ip_local_port_range ...PASSED
  OS Kernel Parameter: rmem_default ...PASSED
  OS Kernel Parameter: rmem_max ...PASSED
  OS Kernel Parameter: wmem_default ...PASSED
  OS Kernel Parameter: wmem_max ...PASSED
  OS Kernel Parameter: aio-max-nr ...PASSED
  OS Kernel Parameter: panic_on_oops ...PASSED
  Package: kmod-20-21 (x86_64) ...PASSED
  Package: kmod-libs-20-21 (x86_64) ...PASSED
  Package: binutils-2.23.52.0.1 ...PASSED
  Package: libgcc-4.8.2 (x86_64) ...PASSED
  Package: libstdc++-4.8.2 (x86_64) ...PASSED
  Package: sysstat-10.1.5 ...PASSED
  Package: ksh ...PASSED
  Package: make-3.82 ...PASSED
  Package: glibc-2.17 (x86_64) ...PASSED
  Package: glibc-devel-2.17 (x86_64) ...PASSED
  Package: libaio-0.3.109 (x86_64) ...PASSED
  Package: nfs-utils-1.2.3-15 ...PASSED
  Package: smartmontools-6.2-4 ...PASSED
  Package: net-tools-2.0-0.17 ...PASSED
  Package: policycoreutils-2.5-17 ...PASSED
  Package: policycoreutils-python-2.5-17 ...PASSED
  Port Availability for component "Oracle Remote Method Invocation (ORMI)" ...PASSED
  Port Availability for component "Oracle Notification Service (ONS)" ...PASSED
  Port Availability for component "Oracle Cluster Synchronization Services (CSSD)" ...PASSED
  Port Availability for component "Oracle Notification Service (ONS) Enterprise Manager support" ...PASSED
  Port Availability for component "Oracle Database Listener" ...PASSED
  Users With Same UID: 0 ...PASSED
  Current Group ID ...PASSED
  Root user consistency ...PASSED
  Host name ...PASSED
  Node Connectivity ...
    Hosts File ...PASSED
    Check that maximum (MTU) size packet goes through subnet ...PASSED
  Node Connectivity ...PASSED
  Multicast or broadcast check ...PASSED
  ASM Network ...PASSED
  Device Checks for ASM ...
    Package: cvuqdisk-1.0.10-1 ...PASSED
    ASM device sharedness check ...
      Shared Storage Accessibility:/oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk05.img,/oradata/asm_disk04.img ...PASSED
    ASM device sharedness check ...PASSED
    Access Control List check ...PASSED
    I/O scheduler ...PASSED
  Device Checks for ASM ...PASSED
  Same core file name pattern ...PASSED
  User Mask ...PASSED
  User Not In Group "root": grid ...PASSED
  Time zone consistency ...PASSED
  Path existence, ownership, permissions and attributes ...
    Path "/var" ...PASSED
    Path "/var/lib/oracle" ...PASSED
    Path "/dev/shm" ...PASSED
  Path existence, ownership, permissions and attributes ...PASSED
  VIP Subnet configuration check ...PASSED
  resolv.conf Integrity ...PASSED
  DNS/NIS name service ...
    Name Service Switch Configuration File Integrity ...PASSED
  DNS/NIS name service ...PASSED
  Single Client Access Name (SCAN) ...PASSED
  Domain Sockets ...PASSED
  Daemon "avahi-daemon" not configured and running ...PASSED
  Daemon "proxyt" not configured and running ...PASSED
  loopback network interface address ...PASSED
  Oracle base: /u01/app/grid ...
    '/u01/app/grid' ...PASSED
  Oracle base: /u01/app/grid ...PASSED
  User Equivalence ...PASSED
  RPM Package Manager database ...INFORMATION (PRVG-11250)
  Network interface bonding status of private interconnect network interfaces ...PASSED
  /dev/shm mounted as temporary file system ...PASSED
  File system mount options for path /var ...PASSED
  DefaultTasksMax parameter ...PASSED
  zeroconf check ...PASSED
  ASM Filter Driver configuration ...PASSED
  Systemd login manager IPC parameter ...PASSED
  Systemd status ...PASSED

Pre-check for cluster services setup was successful. 
RPM Package Manager database ...INFORMATION
PRVG-11250 : The check "RPM Package Manager database" was not performed because
it needs 'root' user privileges.

Refer to My Oracle Support notes "2548970.1" for more details regarding errors 
PRVG-11250".

CVU operation performed:      stage -pre crsinst
Date:                         Feb 23, 2022 3:30:59 PM
CVU home:                     /u01/app/21.3.0/grid
User:                         grid
Operating system:             Linux5.4.17-2136.304.4.1.el7uek.x86_64
02-23-2022 15:31:22 UTC :  : CVU Checks are ignored as IGNORE_CVU_CHECKS set to true. It is recommended to set IGNORE_CVU_CHECKS to false and meet all the cvu checks requirement. RAC installation might fail, if there are failed cvu checks.
02-23-2022 15:31:22 UTC :  : Running Grid Installation
02-23-2022 15:32:02 UTC :  : Running root.sh
02-23-2022 15:32:02 UTC :  : Nodes in the cluster racnode1
02-23-2022 15:32:02 UTC :  : Running root.sh on racnode1
02-23-2022 15:32:02 UTC :  : Running post root.sh steps
02-23-2022 15:32:02 UTC :  : Running post root.sh steps to setup Grid env
02-23-2022 15:32:07 UTC :  : Checking Cluster Status
02-23-2022 15:32:07 UTC :  : Nodes in the cluster 
02-23-2022 15:32:07 UTC :  : Removing /tmp/cluvfy_check.txt as cluster check has passed
02-23-2022 15:32:07 UTC :  : Generating DB Responsefile Running DB creation
02-23-2022 15:32:07 UTC :  : Running DB creation
02-23-2022 15:32:07 UTC :  : Workaround for Bug 32449232 : Removing /u01/app/grid/kfod
02-23-2022 15:32:12 UTC :  : Checking DB status
02-23-2022 15:32:12 UTC : : ORCLCDB is not up and running on racnode1
02-23-2022 15:32:12 UTC : : Error has occurred in Grid Setup, Please verify!

Here is my system and docker info:

# uname -a
Linux vm-oracle 5.4.17-2136.304.4.1.el7uek.x86_64 #2 SMP Tue Feb 8 11:44:31 PST 2022 x86_64 x86_64 x86_64 GNU/Linux
# free -g
              total        used        free      shared  buff/cache   available
Mem:             27           9           0           0          16          16
Swap:             7           0           7
# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 106
model name  : Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
stepping    : 6
microcode   : 0xffffffff
cpu MHz     : 2793.438
cache size  : 49152 KB
physical id : 0
siblings    : 8
core id     : 0
cpu cores   : 8
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 21
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves nt_good md_clear
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit
bogomips    : 5586.87
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 106
model name  : Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
stepping    : 6
microcode   : 0xffffffff
cpu MHz     : 2793.438
cache size  : 49152 KB
physical id : 0
siblings    : 8
core id     : 1
cpu cores   : 8
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 21
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves nt_good md_clear
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit
bogomips    : 5586.87
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

.....

processor   : 7
vendor_id   : GenuineIntel
cpu family  : 6
model       : 106
model name  : Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
stepping    : 6
microcode   : 0xffffffff
cpu MHz     : 2793.438
cache size  : 49152 KB
physical id : 0
siblings    : 8
core id     : 7
cpu cores   : 8
apicid      : 7
initial apicid  : 7
fpu     : yes
fpu_exception   : yes
cpuid level : 21
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves nt_good md_clear
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit
bogomips    : 5586.87
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:
# docker info
Client:
 Debug Mode: false

Server:
 Containers: 4
  Running: 4
  Paused: 0
  Stopped: 0
 Images: 63
 Server Version: 19.03.11-ol
 Storage Driver: btrfs
  Build Version: Btrfs v4.9.1
  Library Version: 102
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7eba5930496d9bbe375fdf71603e610ad737d2b2
 runc version: 2856f01
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.17-2136.304.4.1.el7uek.x86_64
 Operating System: Oracle Linux Server 7.9
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 27.14GiB
 Name: vm-oracle
 ID: W5EK:Z2WI:KGEH:OJTT:7QFB:NR4J:TT6D:YTDG:CRT4:SIAW:2IJC:JAPL
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Registries: 
# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2022-02-23 10:16:42 UTC; 6h ago
     Docs: https://docs.docker.com
 Main PID: 1664 (dockerd)
    Tasks: 24
   Memory: 207.7M
   CGroup: /system.slice/docker.service
           ├─ 1664 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --cpu-rt-runtime=950000
           └─10069 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 1521 -container-ip 172.16.1.15 -container-port 1521

Feb 23 10:16:42 vm-oracle dockerd[1664]: time="2022-02-23T10:16:42.609381007Z" level=info msg="Docker daemon" commit=9bb540d graphdriver(s)=btrfs version=19.03.11-ol
Feb 23 10:16:42 vm-oracle dockerd[1664]: time="2022-02-23T10:16:42.609507408Z" level=info msg="Daemon has completed initialization"
Feb 23 10:16:42 vm-oracle dockerd[1664]: time="2022-02-23T10:16:42.735841671Z" level=info msg="API listen on /var/run/docker.sock"
Feb 23 10:16:42 vm-oracle systemd[1]: Started Docker Application Container Engine.
Feb 23 11:20:04 vm-oracle dockerd[1664]: time="2022-02-23T11:20:04.881615157Z" level=info msg="Container 54a6651090f20989612fa63f2c13afa15cedafad60d4d29f5b4b38e70b8e49... the force"
Feb 23 11:20:05 vm-oracle dockerd[1664]: time="2022-02-23T11:20:04.995378022Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete ...TaskDelete"
Feb 23 14:30:42 vm-oracle dockerd[1664]: time="2022-02-23T14:30:42.014311769Z" level=info msg="Container 598aae8006a40b7d030eef87c269233760789aa774673ab8625eb9a736247e... the force"
Feb 23 14:30:42 vm-oracle dockerd[1664]: time="2022-02-23T14:30:42.121131428Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete ...TaskDelete"
Feb 23 15:28:31 vm-oracle dockerd[1664]: time="2022-02-23T15:28:31.054721440Z" level=info msg="Container 862a86a69a8b7139352015cbec0fd05f050d2f004f79996bb66d898e866456... the force"
Feb 23 15:28:31 vm-oracle dockerd[1664]: time="2022-02-23T15:28:31.139909953Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete ...TaskDelete"
Hint: Some lines were ellipsized, use -l to show in full.
vegatara commented 2 years ago

Please share the logs from the Grid and DB containers. Also, can you run the following as the grid user and check the output:

$GRID_HOME/bin/crsctl check cluster
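From the host, the same check can be run without attaching to the container, e.g. (a sketch, assuming the container is named racnode1 and using the GRID_HOME shown in the log above):

# docker exec racnode1 su - grid -c "/u01/app/21.3.0/grid/bin/crsctl check cluster"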
psaini79 commented 2 years ago

Please share the output of the following commands:

crsctl stat res -t -init
crsctl check crs
ps -u oracle
getenforce
systemctl status firewalld

Also, please upload the logs from the RDBMS and Grid diag directories on both nodes.
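For example, all of the above can be gathered from the host in one pass (a sketch; the container name and the diag locations under GRID_BASE and DB_BASE are assumptions based on the environment dump above):

# docker exec racnode1 su - grid -c "/u01/app/21.3.0/grid/bin/crsctl stat res -t -init"
# docker exec racnode1 su - grid -c "/u01/app/21.3.0/grid/bin/crsctl check crs"
# docker exec racnode1 bash -c "ps -u oracle; getenforce; systemctl status firewalld"
# docker exec racnode1 tar czf /tmp/diag.tar.gz /u01/app/grid/diag /u01/app/oracle/diag
# docker cp racnode1:/tmp/diag.tar.gz .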

ifrankrui commented 2 years ago

@vegatara @psaini79 Thanks for replying.

I started the process again from scratch on a bigger machine, and this time I got the cluster up and running. However, there are still some issues when joining the second node. I have a few ideas to try and will come back for more assistance if they don't work.
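For anyone following along: joining a second node uses the same image with the add-node flow rather than a fresh install (the image ships AddNode.sh and grid_addnode_21c.rsp, per the environment dump above). A rough sketch; the EXISTING_CLS_NODES variable and all racnode2 addresses below are placeholders to be verified against the OracleRealApplicationClusters README:

# docker run -d --name racnode2 \
    --hostname racnode2 \
    --dns 172.16.1.25 \
    --volume /oradata:/oradata \
    -e EXISTING_CLS_NODES=racnode1 \
    -e PUBLIC_IP=172.16.1.151 -e PUBLIC_HOSTNAME=racnode2 \
    -e PRIV_IP=192.168.17.151 -e PRIV_HOSTNAME=racnode2-priv \
    -e NODE_VIP=172.16.1.161 -e VIP_HOSTNAME=racnode2-vip \
    -e SCAN_NAME=racnode-scan -e DOMAIN=example.com \
    -e ASM_DISCOVERY_DIR=/oradata \
    oracle/database-rac:21.3.0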

ifrankrui commented 2 years ago

To keep this thread tidy, I have raised a new issue to replace this one: https://github.com/oracle/docker-images/issues/2255