oracle / docker-images

Official source of container configurations, images, and examples for Oracle products and projects
https://developer.oracle.com/use-cases/#containers
Universal Permissive License v1.0

Adding second node to cluster: Error has occurred in Grid Setup, Please verify! #2255

Open ifrankrui opened 2 years ago

ifrankrui commented 2 years ago

Hi, I am following the steps in the repo to build a cluster with Docker images. I have the first node up and running and can connect to the database, but when I add the second node to the cluster I hit an error. Here are the output and other logs:

docker create -t -i \
>   --hostname racnode2 \
>   --volume /dev/shm \
>   --tmpfs /dev/shm:rw,exec,size=4G  \
>   --volume /boot:/boot:ro \
>   --dns-search=example.com  \
>   --volume /opt/containers/rac_host_file:/etc/hosts \
>   --volume /opt/.secrets:/run/secrets:ro \
>   --dns=172.16.1.25 \
>   --dns-search=example.com \
>   --privileged=false \
>   --volume racstorage:/oradata \
>   --cap-add=SYS_NICE \
>   --cap-add=SYS_RESOURCE \
>   --cap-add=NET_ADMIN \
>   -e DNS_SERVERS="172.16.1.25" \
>   -e EXISTING_CLS_NODES=racnode1 \
>   -e NODE_VIP=172.16.1.161  \
>   -e VIP_HOSTNAME=racnode2-vip  \
>   -e PRIV_IP=192.168.17.151  \
>   -e PRIV_HOSTNAME=racnode2-priv \
>   -e PUBLIC_IP=172.16.1.151  \
>   -e PUBLIC_HOSTNAME=racnode2  \
>   -e DOMAIN=example.com \
>   -e SCAN_NAME=racnode-scan \
>   -e ASM_DISCOVERY_DIR=/oradata \
>   -e ASM_DEVICE_LIST=/oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk04.img,/oradata/asm_disk05.img \
>   -e ORACLE_SID=ORCLCDB \
>   -e OP_TYPE=ADDNODE \
>   -e COMMON_OS_PWD_FILE=common_os_pwdfile.enc \
>   -e PWD_KEY=pwd.key \
>   --tmpfs=/run -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
>   --cpu-rt-runtime=95000 \
>   --ulimit rtprio=99  \
>   --restart=always \
>   --name racnode2 \
>   oracle/database-rac:21.3.0
6a1aca67d17b80c6b32a9110e686db8d7d3a96ed67437ce3a1577e0d56535fdf
[root@vm-oracle dockerfiles]# 
[root@vm-oracle dockerfiles]# 
[root@vm-oracle dockerfiles]# docker network disconnect bridge racnode2
[root@vm-oracle dockerfiles]# docker network connect rac_pub1_nw --ip 172.16.1.151 racnode2
[root@vm-oracle dockerfiles]# 
[root@vm-oracle dockerfiles]# 
[root@vm-oracle dockerfiles]# 
[root@vm-oracle dockerfiles]# docker network connect rac_priv1_nw --ip 192.168.17.151 racnode2
[root@vm-oracle dockerfiles]# docker start racnode2
racnode2
[root@vm-oracle dockerfiles]# docker logs -f racnode2
PATH=/bin:/usr/bin:/sbin:/usr/sbin
HOSTNAME=racnode2
TERM=xterm
PRIV_IP=192.168.17.151
PUBLIC_IP=172.16.1.151
DNS_SERVERS=172.16.1.25
EXISTING_CLS_NODES=racnode1
DOMAIN=example.com
SCAN_NAME=racnode-scan
ASM_DEVICE_LIST=/oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk04.img,/oradata/asm_disk05.img
NODE_VIP=172.16.1.161
VIP_HOSTNAME=racnode2-vip
PRIV_HOSTNAME=racnode2-priv
COMMON_OS_PWD_FILE=common_os_pwdfile.enc
PUBLIC_HOSTNAME=racnode2
ASM_DISCOVERY_DIR=/oradata
ORACLE_SID=ORCLCDB
OP_TYPE=ADDNODE
PWD_KEY=pwd.key
SETUP_LINUX_FILE=setupLinuxEnv.sh
INSTALL_DIR=/opt/scripts
GRID_BASE=/u01/app/grid
GRID_HOME=/u01/app/21.3.0/grid
INSTALL_FILE_1=LINUX.X64_213000_grid_home.zip
GRID_INSTALL_RSP=gridsetup_21c.rsp
GRID_SW_INSTALL_RSP=grid_sw_install_21c.rsp
GRID_SETUP_FILE=setupGrid.sh
FIXUP_PREQ_FILE=fixupPreq.sh
INSTALL_GRID_BINARIES_FILE=installGridBinaries.sh
INSTALL_GRID_PATCH=applyGridPatch.sh
INVENTORY=/u01/app/oraInventory
CONFIGGRID=configGrid.sh
ADDNODE=AddNode.sh
DELNODE=DelNode.sh
ADDNODE_RSP=grid_addnode_21c.rsp
SETUPSSH=setupSSH.expect
DOCKERORACLEINIT=dockeroracleinit
GRID_USER_HOME=/home/grid
SETUPGRIDENV=setupGridEnv.sh
RESET_OS_PASSWORD=resetOSPassword.sh
MULTI_NODE_INSTALL=MultiNodeInstall.py
DB_BASE=/u01/app/oracle
DB_HOME=/u01/app/oracle/product/21.3.0/dbhome_1
INSTALL_FILE_2=LINUX.X64_213000_db_home.zip
DB_INSTALL_RSP=db_sw_install_21c.rsp
DBCA_RSP=dbca_21c.rsp
DB_SETUP_FILE=setupDB.sh
PWD_FILE=setPassword.sh
RUN_FILE=runOracle.sh
STOP_FILE=stopOracle.sh
ENABLE_RAC_FILE=enableRAC.sh
CHECK_DB_FILE=checkDBStatus.sh
USER_SCRIPTS_FILE=runUserScripts.sh
REMOTE_LISTENER_FILE=remoteListener.sh
INSTALL_DB_BINARIES_FILE=installDBBinaries.sh
GRID_HOME_CLEANUP=GridHomeCleanup.sh
ORACLE_HOME_CLEANUP=OracleHomeCleanup.sh
DB_USER=oracle
GRID_USER=grid
FUNCTIONS=functions.sh
COMMON_SCRIPTS=/common_scripts
CHECK_SPACE_FILE=checkSpace.sh
RESET_FAILED_UNITS=resetFailedUnits.sh
SET_CRONTAB=setCrontab.sh
CRONTAB_ENTRY=crontabEntry
EXPECT=/usr/bin/expect
BIN=/usr/sbin
container=true
INSTALL_SCRIPTS=/opt/scripts/install
SCRIPT_DIR=/opt/scripts/startup
GRID_PATH=/u01/app/21.3.0/grid/bin:/u01/app/21.3.0/grid/OPatch/:/u01/app/21.3.0/grid/perl/bin:/usr/sbin:/bin:/sbin
DB_PATH=/u01/app/oracle/product/21.3.0/dbhome_1/bin:/u01/app/oracle/product/21.3.0/dbhome_1/OPatch/:/u01/app/oracle/product/21.3.0/dbhome_1/perl/bin:/usr/sbin:/bin:/sbin
GRID_LD_LIBRARY_PATH=/u01/app/21.3.0/grid/lib:/usr/lib:/lib
DB_LD_LIBRARY_PATH=/u01/app/oracle/product/21.3.0/dbhome_1/lib:/usr/lib:/lib
HOME=/home/grid
Failed to parse kernel command line, ignoring: No such file or directory
systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization other.
Detected architecture x86-64.

Welcome to Oracle Linux Server 7.9!

Set hostname to <racnode2>.
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
/usr/lib/systemd/system-generators/systemd-fstab-generator failed with error code 1.
[/usr/lib/systemd/system/systemd-pstore.service:22] Unknown lvalue 'StateDirectory' in section 'Service'
Cannot add dependency job for unit display-manager.service, ignoring: Unit not found.
[  OK  ] Reached target Swap.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Created slice Root Slice.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Listening on Delayed Shutdown Socket.
[  OK  ] Listening on Journal Socket.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Created slice System Slice.
[  OK  ] Reached target Slices.
         Starting Read and set NIS domainname from /etc/sysconfig/network...
Couldn't determine result for ConditionKernelCommandLine=|rd.modules-load for systemd-modules-load.service, assuming failed: No such file or directory
Couldn't determine result for ConditionKernelCommandLine=|modules-load for systemd-modules-load.service, assuming failed: No such file or directory
[  OK  ] Created slice system-getty.slice.
         Starting Journal Service...
         Starting Rebuild Hardware Database...
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Reached target RPC Port Mapper.
         Starting Configure read-only root support...
[  OK  ] Started Read and set NIS domainname from /etc/sysconfig/network.
[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Started Configure read-only root support.
         Starting Load/Save Random Seed...
[  OK  ] Reached target Local File Systems.
         Starting Preprocess NFS configuration...
         Starting Rebuild Journal Catalog...
         Starting Mark the need to relabel after reboot...
[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started Load/Save Random Seed.
[  OK  ] Started Mark the need to relabel after reboot.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Preprocess NFS configuration.
[  OK  ] Started Rebuild Journal Catalog.
[  OK  ] Started Create Volatile Files and Directories.
         Mounting RPC Pipe File System...
         Starting Update UTMP about System Boot/Shutdown...
[FAILED] Failed to mount RPC Pipe File System.
See 'systemctl status var-lib-nfs-rpc_pipefs.mount' for details.
[DEPEND] Dependency failed for rpc_pipefs.target.
[DEPEND] Dependency failed for RPC security service for NFS client and server.
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Started Rebuild Hardware Database.
         Starting Update is Completed...
[  OK  ] Started Update is Completed.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on RPCbind Server Activation Socket.
[  OK  ] Reached target Sockets.
         Starting RPC bind service...
[  OK  ] Started Flexible branding.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Basic System.
         Starting OpenSSH Server Key Generation...
[  OK  ] Started D-Bus System Message Bus.
         Starting Resets System Activity Logs...
         Starting GSSAPI Proxy Daemon...
         Starting LSB: Bring up/down networking...
         Starting Login Service...
         Starting Self Monitoring and Reporting Technology (SMART) Daemon...
[  OK  ] Started RPC bind service.
         Starting Cleanup of Temporary Directories...
[  OK  ] Started Resets System Activity Logs.
[  OK  ] Started Login Service.
[  OK  ] Started Cleanup of Temporary Directories.
[  OK  ] Started GSSAPI Proxy Daemon.
[  OK  ] Reached target NFS client services.
[  OK  ] Reached target Remote File Systems (Pre).
[  OK  ] Reached target Remote File Systems.
         Starting Permit User Sessions...
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Command Scheduler.
[  OK  ] Started LSB: Bring up/down networking.
[  OK  ] Reached target Network.
         Starting /etc/rc.d/rc.local Compatibility...
[  OK  ] Reached target Network is Online.
         Starting Notify NFS peers of a restart...
[  OK  ] Started /etc/rc.d/rc.local Compatibility.
[  OK  ] Started Console Getty.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Notify NFS peers of a restart.
02-28-2022 17:24:35 UTC :  : Process id of the program : 
02-28-2022 17:24:35 UTC :  : #################################################
02-28-2022 17:24:35 UTC :  :  Starting Grid Installation          
02-28-2022 17:24:35 UTC :  : #################################################
02-28-2022 17:24:35 UTC :  : Pre-Grid Setup steps are in process
02-28-2022 17:24:35 UTC :  : Process id of the program : 
[  OK  ] Started OpenSSH Server Key Generation.
         Starting OpenSSH server daemon...
[  OK  ] Started OpenSSH server daemon.
02-28-2022 17:24:35 UTC :  : Disable failed service var-lib-nfs-rpc_pipefs.mount
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
02-28-2022 17:24:35 UTC :  : Resetting Failed Services
02-28-2022 17:24:35 UTC :  : Sleeping for 60 seconds
[  OK  ] Started Self Monitoring and Reporting Technology (SMART) Daemon.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Oracle Linux Server 7.9
Kernel 5.4.17-2136.304.4.1.el7uek.x86_64 on an x86_64

racnode2 login: 02-28-2022 17:25:35 UTC :  : Systemctl state is running!
02-28-2022 17:25:35 UTC :  : Setting correct permissions for /bin/ping
02-28-2022 17:25:35 UTC :  : Public IP is set to 172.16.1.151
02-28-2022 17:25:35 UTC :  : RAC Node PUBLIC Hostname is set to racnode2
02-28-2022 17:25:35 UTC :  : Preparing host line for racnode2
02-28-2022 17:25:35 UTC :  : Adding \n172.16.1.151\tracnode2.example.com\tracnode2 to /etc/hosts
02-28-2022 17:25:35 UTC :  : Preparing host line for racnode2-priv
02-28-2022 17:25:35 UTC :  : Adding \n192.168.17.151\tracnode2-priv.example.com\tracnode2-priv to /etc/hosts
02-28-2022 17:25:35 UTC :  : Preparing host line for racnode2-vip
02-28-2022 17:25:35 UTC :  : Adding \n172.16.1.161\tracnode2-vip.example.com\tracnode2-vip to /etc/hosts
02-28-2022 17:25:35 UTC :  : Preparing host line for racnode-scan
02-28-2022 17:25:35 UTC :  : Preapring Device list
02-28-2022 17:25:35 UTC :  : Changing Disk permission and ownership /oradata/asm_disk01.img
02-28-2022 17:25:35 UTC :  : Changing Disk permission and ownership /oradata/asm_disk02.img
02-28-2022 17:25:35 UTC :  : Changing Disk permission and ownership /oradata/asm_disk03.img
02-28-2022 17:25:35 UTC :  : Changing Disk permission and ownership /oradata/asm_disk04.img
02-28-2022 17:25:35 UTC :  : Changing Disk permission and ownership /oradata/asm_disk05.img
02-28-2022 17:25:36 UTC :  : Preapring Dns Servers list
02-28-2022 17:25:36 UTC :  : Setting DNS Servers
02-28-2022 17:25:36 UTC :  : Adding nameserver 172.16.1.25 in /etc/resolv.conf.
02-28-2022 17:25:36 UTC :  : #####################################################################
02-28-2022 17:25:36 UTC :  :  RAC setup will begin in 2 minutes                                   
02-28-2022 17:25:36 UTC :  : ####################################################################
02-28-2022 17:26:06 UTC :  : ###################################################
02-28-2022 17:26:06 UTC :  : Pre-Grid Setup steps completed
02-28-2022 17:26:06 UTC :  : ###################################################
02-28-2022 17:26:06 UTC :  : Checking if grid is already configured
02-28-2022 17:26:06 UTC :  : Public IP is set to 172.16.1.151
02-28-2022 17:26:06 UTC :  : RAC Node PUBLIC Hostname is set to racnode2
02-28-2022 17:26:06 UTC :  : Domain is defined to example.com
02-28-2022 17:26:06 UTC :  : Setting Existing Cluster Node for node addition operation. This will be retrieved from racnode1
02-28-2022 17:26:06 UTC :  : Existing Node Name of the cluster is set to racnode1
02-28-2022 17:26:06 UTC :  : 172.16.1.150
02-28-2022 17:26:06 UTC :  : Existing Cluster node resolved to IP. Check passed
02-28-2022 17:26:06 UTC :  : Default setting of AUTO GNS VIP set to false. If you want to use AUTO GNS VIP, please pass DHCP_CONF as an env parameter set to true
02-28-2022 17:26:06 UTC :  : RAC VIP set to 172.16.1.161
02-28-2022 17:26:06 UTC :  : RAC Node VIP hostname is set to racnode2-vip 
02-28-2022 17:26:06 UTC :  : SCAN_NAME name is racnode-scan
02-28-2022 17:26:06 UTC :  : 172.16.1.172
172.16.1.171
172.16.1.170
02-28-2022 17:26:06 UTC :  : SCAN Name resolving to IP. Check Passed!
02-28-2022 17:26:06 UTC :  : SCAN_IP set to the empty string
02-28-2022 17:26:06 UTC :  : RAC Node PRIV IP is set to 192.168.17.151 
02-28-2022 17:26:06 UTC :  : RAC Node private hostname is set to racnode2-priv
02-28-2022 17:26:06 UTC :  : CMAN_NAME set to the empty string
02-28-2022 17:26:06 UTC :  : CMAN_IP set to the empty string
02-28-2022 17:26:06 UTC :  : Password file generated
02-28-2022 17:26:06 UTC :  : Common OS Password string is set for Grid user
02-28-2022 17:26:06 UTC :  : Common OS Password string is set for  Oracle user
02-28-2022 17:26:06 UTC :  : GRID_RESPONSE_FILE env variable set to empty. AddNode.sh will use standard cluster responsefile
02-28-2022 17:26:06 UTC :  : Location for User script SCRIPT_ROOT set to /common_scripts
02-28-2022 17:26:06 UTC :  : ORACLE_SID is set to ORCLCDB
02-28-2022 17:26:06 UTC :  : Setting random password for root/grid/oracle user
02-28-2022 17:26:06 UTC :  : Setting random password for grid user
02-28-2022 17:26:06 UTC :  : Setting random password for oracle user
02-28-2022 17:26:06 UTC :  : Setting random password for root user
02-28-2022 17:26:06 UTC :  : Cluster Nodes are racnode1 racnode2
02-28-2022 17:26:06 UTC :  : Running SSH setup for grid user between nodes racnode1 racnode2
02-28-2022 17:26:17 UTC :  : Running SSH setup for oracle user between nodes racnode1 racnode2
02-28-2022 17:26:27 UTC :  : SSH check fine for the racnode1
02-28-2022 17:26:28 UTC :  : SSH check fine for the racnode2
02-28-2022 17:26:28 UTC :  : SSH check fine for the racnode2
02-28-2022 17:26:28 UTC :  : SSH check fine for the oracle@racnode1
02-28-2022 17:26:28 UTC :  : SSH check fine for the oracle@racnode2
02-28-2022 17:26:28 UTC :  : SSH check fine for the oracle@racnode2
02-28-2022 17:26:28 UTC :  : Setting Device permission to grid and asmadmin on all the cluster nodes
02-28-2022 17:26:28 UTC :  : Nodes in the cluster racnode2
02-28-2022 17:26:28 UTC :  : Setting Device permissions for RAC Install  on racnode2
02-28-2022 17:26:28 UTC :  : Preapring ASM Device list
02-28-2022 17:26:28 UTC :  : Changing Disk permission and ownership
02-28-2022 17:26:28 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
02-28-2022 17:26:28 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
02-28-2022 17:26:28 UTC :  : Populate Rac Env Vars on Remote Hosts
02-28-2022 17:26:28 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
02-28-2022 17:26:28 UTC :  : Changing Disk permission and ownership
02-28-2022 17:26:28 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
02-28-2022 17:26:28 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Populate Rac Env Vars on Remote Hosts
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
02-28-2022 17:26:29 UTC :  : Changing Disk permission and ownership
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Populate Rac Env Vars on Remote Hosts
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
02-28-2022 17:26:29 UTC :  : Changing Disk permission and ownership
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Populate Rac Env Vars on Remote Hosts
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
02-28-2022 17:26:29 UTC :  : Changing Disk permission and ownership
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Populate Rac Env Vars on Remote Hosts
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
02-28-2022 17:26:29 UTC :  : Checking Cluster Status on racnode1
02-28-2022 17:26:29 UTC :  : Checking Cluster
02-28-2022 17:26:30 UTC :  : Cluster Check on remote node passed
02-28-2022 17:26:30 UTC :  : Cluster Check went fine
02-28-2022 17:26:30 UTC :  : CRSD Check went fine
02-28-2022 17:26:30 UTC :  : CSSD Check went fine
02-28-2022 17:26:30 UTC :  : EVMD Check went fine
02-28-2022 17:26:30 UTC :  : Generating Responsefile for node addition
02-28-2022 17:26:30 UTC :  : Clustered Nodes are set to racnode2:racnode2-vip:HUB
02-28-2022 17:26:30 UTC :  : Running Cluster verification utility for new node racnode2 on racnode1
02-28-2022 17:26:30 UTC :  : Nodes in the cluster racnode2
02-28-2022 17:26:30 UTC :  : ssh to the node racnode1 and executing cvu checks on racnode2
02-28-2022 17:27:13 UTC :  : Checking /tmp/cluvfy_check.txt if there is any failed check.
This software is "235" days old. It is a best practice to update the CRS home by downloading and applying the latest release update. Refer to MOS note 2731675.1 for more details.

Performing following verification checks ...

  Physical Memory ...PASSED
  Available Physical Memory ...PASSED
  Swap Size ...FAILED (PRVF-7573)
  Free Space: racnode2:/usr,racnode2:/var,racnode2:/etc,racnode2:/u01/app/21.3.0/grid,racnode2:/sbin,racnode2:/tmp ...PASSED
  Free Space: racnode1:/usr,racnode1:/var,racnode1:/etc,racnode1:/u01/app/21.3.0/grid,racnode1:/sbin,racnode1:/tmp ...PASSED
  User Existence: oracle ...
    Users With Same UID: 54321 ...PASSED
  User Existence: oracle ...PASSED
  User Existence: grid ...
    Users With Same UID: 54332 ...PASSED
  User Existence: grid ...PASSED
  User Existence: root ...
    Users With Same UID: 0 ...PASSED
  User Existence: root ...PASSED
  Group Existence: asmadmin ...PASSED
  Group Existence: asmoper ...PASSED
  Group Existence: asmdba ...PASSED
  Group Existence: oinstall ...PASSED
  Group Membership: oinstall ...PASSED
  Group Membership: asmdba ...PASSED
  Group Membership: asmadmin ...PASSED
  Group Membership: asmoper ...PASSED
  Run Level ...PASSED
  Hard Limit: maximum open file descriptors ...PASSED
  Soft Limit: maximum open file descriptors ...PASSED
  Hard Limit: maximum user processes ...PASSED
  Soft Limit: maximum user processes ...PASSED
  Soft Limit: maximum stack size ...PASSED
  Architecture ...PASSED
  OS Kernel Version ...PASSED
  OS Kernel Parameter: semmsl ...PASSED
  OS Kernel Parameter: semmns ...PASSED
  OS Kernel Parameter: semopm ...PASSED
  OS Kernel Parameter: semmni ...PASSED
  OS Kernel Parameter: shmmax ...PASSED
  OS Kernel Parameter: shmmni ...PASSED
  OS Kernel Parameter: shmall ...PASSED
  OS Kernel Parameter: file-max ...PASSED
  OS Kernel Parameter: ip_local_port_range ...PASSED
  OS Kernel Parameter: rmem_default ...PASSED
  OS Kernel Parameter: rmem_max ...PASSED
  OS Kernel Parameter: wmem_default ...PASSED
  OS Kernel Parameter: wmem_max ...PASSED
  OS Kernel Parameter: aio-max-nr ...FAILED (PRVH-0521)
  OS Kernel Parameter: panic_on_oops ...PASSED
  Package: kmod-20-21 (x86_64) ...PASSED
  Package: kmod-libs-20-21 (x86_64) ...PASSED
  Package: binutils-2.23.52.0.1 ...PASSED
  Package: libgcc-4.8.2 (x86_64) ...PASSED
  Package: libstdc++-4.8.2 (x86_64) ...PASSED
  Package: sysstat-10.1.5 ...PASSED
  Package: ksh ...PASSED
  Package: make-3.82 ...PASSED
  Package: glibc-2.17 (x86_64) ...PASSED
  Package: glibc-devel-2.17 (x86_64) ...PASSED
  Package: libaio-0.3.109 (x86_64) ...PASSED
  Package: nfs-utils-1.2.3-15 ...PASSED
  Package: smartmontools-6.2-4 ...PASSED
  Package: net-tools-2.0-0.17 ...PASSED
  Package: policycoreutils-2.5-17 ...PASSED
  Package: policycoreutils-python-2.5-17 ...PASSED
  Users With Same UID: 0 ...PASSED
  Current Group ID ...PASSED
  Root user consistency ...PASSED
  Node Addition ...
    CRS Integrity ...PASSED
    Clusterware Version Consistency ...PASSED
    '/u01/app/21.3.0/grid' ...PASSED
  Node Addition ...PASSED
  Host name ...PASSED
  Node Connectivity ...
    Hosts File ...PASSED
    Check that maximum (MTU) size packet goes through subnet ...PASSED
    subnet mask consistency for subnet "172.16.1.0" ...PASSED
    subnet mask consistency for subnet "192.168.17.0" ...PASSED
  Node Connectivity ...PASSED
  Multicast or broadcast check ...PASSED
  ASM Network ...PASSED
  Device Checks for ASM ...
    Package: cvuqdisk-1.0.10-1 ...PASSED
    ASM device sharedness check ...
      Shared Storage Accessibility:/oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk04.img,/oradata/asm_disk05.img ...PASSED
    ASM device sharedness check ...PASSED
    Access Control List check ...PASSED
  Device Checks for ASM ...PASSED
  Database home availability ...PASSED
  OCR Integrity ...PASSED
  Time zone consistency ...PASSED
  User Not In Group "root": grid ...PASSED
  Time offset between nodes ...PASSED
  resolv.conf Integrity ...PASSED
  DNS/NIS name service ...PASSED
  User Equivalence ...PASSED
  Software home: /u01/app/21.3.0/grid ...PASSED
  /dev/shm mounted as temporary file system ...PASSED
  zeroconf check ...PASSED

Pre-check for node addition was unsuccessful on all the nodes. 

Failures were encountered during execution of CVU verification request "stage -pre nodeadd".

Swap Size ...FAILED
racnode2: PRVF-7573 : Sufficient swap size is not available on node "racnode2"
          [Required = 16GB (1.6777216E7KB) ; Found = 0.0 bytes]

racnode1: PRVF-7573 : Sufficient swap size is not available on node "racnode1"
          [Required = 16GB (1.6777216E7KB) ; Found = 0.0 bytes]

OS Kernel Parameter: aio-max-nr ...FAILED
racnode2: PRVH-0521 : OS kernel parameter "aio-max-nr" does not have expected
          current value on node "racnode2" [Expected = "1048576" ; Current =
          "65536";].

racnode1: PRVH-0521 : OS kernel parameter "aio-max-nr" does not have expected
          current value on node "racnode1" [Expected = "1048576" ; Current =
          "65536";].

CVU operation performed:      stage -pre nodeadd
Date:                         Feb 28, 2022 5:26:31 PM
Clusterware version:          21.0.0.0.0
CVU home:                     /u01/app/21.3.0/grid
Grid home:                    /u01/app/21.3.0/grid
User:                         grid
Operating system:             Linux5.4.17-2136.304.4.1.el7uek.x86_64
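
Both failed prechecks are Docker-host settings rather than anything inside the containers: the containers share the host kernel, so fs.aio-max-nr is set on the host, and the swap check looks at host swap. A minimal sketch of a fix on the host, assuming root access (the 16 GB and 1048576 values come from the check output above; confirm them against the Oracle install guide for your release):

# Raise the async I/O limit on the Docker host and persist it
sysctl -w fs.aio-max-nr=1048576
echo "fs.aio-max-nr = 1048576" >> /etc/sysctl.conf

# Create a 16 GB swap file if the host has no swap configured
dd if=/dev/zero of=/swapfile bs=1M count=16384
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo "/swapfile none swap sw 0 0" >> /etc/fstab
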
02-28-2022 17:27:13 UTC :  : CVU Checks are ignored as IGNORE_CVU_CHECKS set to true. It is recommended to set IGNORE_CVU_CHECKS to false and meet all the cvu checks requirement. RAC installation might fail, if there are failed cvu checks.
02-28-2022 17:27:13 UTC :  : Running Node Addition and cluvfy test for node racnode2
02-28-2022 17:27:13 UTC :  : Copying /tmp/grid_addnode_21c.rsp on remote node racnode1
02-28-2022 17:27:13 UTC :  : Running GridSetup.sh on racnode1 to add the node to existing cluster
02-28-2022 17:27:57 UTC :  : Node Addition performed. removing Responsefile
02-28-2022 17:27:57 UTC :  : Running root.sh on node racnode2
02-28-2022 17:27:57 UTC :  : Nodes in the cluster racnode2
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
02-28-2022 17:41:21 UTC :  : Checking Cluster
02-28-2022 17:41:21 UTC :  : Cluster Check passed
02-28-2022 17:41:21 UTC :  : Cluster Check went fine
02-28-2022 17:41:21 UTC : : CRSD Check failed!
02-28-2022 17:41:21 UTC : : Error has occurred in Grid Setup, Please verify!
^C
[root@vm-oracle dockerfiles]# 
[root@vm-oracle dockerfiles]# 
[root@vm-oracle dockerfiles]# docker exec -i -t racnode2 /bin/bash
[grid@racnode2 ~]$ tail -n 50 /tmp/orod.log
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Populate Rac Env Vars on Remote Hosts
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
02-28-2022 17:26:29 UTC :  : Changing Disk permission and ownership
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Populate Rac Env Vars on Remote Hosts
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
02-28-2022 17:26:29 UTC :  : Changing Disk permission and ownership
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
02-28-2022 17:26:29 UTC :  : Populate Rac Env Vars on Remote Hosts
02-28-2022 17:26:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
02-28-2022 17:26:29 UTC :  : Checking Cluster Status on racnode1
02-28-2022 17:26:29 UTC :  : Checking Cluster
02-28-2022 17:26:30 UTC :  : Cluster Check on remote node passed
02-28-2022 17:26:30 UTC :  : Cluster Check went fine
02-28-2022 17:26:30 UTC :  : CRSD Check went fine
02-28-2022 17:26:30 UTC :  : CSSD Check went fine
02-28-2022 17:26:30 UTC :  : EVMD Check went fine
02-28-2022 17:26:30 UTC :  : Generating Responsefile for node addition
02-28-2022 17:26:30 UTC :  : Clustered Nodes are set to racnode2:racnode2-vip:HUB
02-28-2022 17:26:30 UTC :  : Running Cluster verification utility for new node racnode2 on racnode1
02-28-2022 17:26:30 UTC :  : Nodes in the cluster racnode2
02-28-2022 17:26:30 UTC :  : ssh to the node racnode1 and executing cvu checks on racnode2
02-28-2022 17:27:13 UTC :  : Checking /tmp/cluvfy_check.txt if there is any failed check.
02-28-2022 17:27:13 UTC :  : CVU Checks are ignored as IGNORE_CVU_CHECKS set to true. It is recommended to set IGNORE_CVU_CHECKS to false and meet all the cvu checks requirement. RAC installation might fail, if there are failed cvu checks.
02-28-2022 17:27:13 UTC :  : Running Node Addition and cluvfy test for node racnode2
02-28-2022 17:27:13 UTC :  : Copying /tmp/grid_addnode_21c.rsp on remote node racnode1
02-28-2022 17:27:13 UTC :  : Running GridSetup.sh on racnode1 to add the node to existing cluster
Launching Oracle Grid Infrastructure Setup Wizard...

As a root user, execute the following script(s):
    1. /u01/app/21.3.0/grid/root.sh

Execute /u01/app/21.3.0/grid/root.sh on the following nodes: 
[racnode2]

The scripts can be executed in parallel on all the nodes.

Successfully Setup Software.
02-28-2022 17:27:57 UTC :  : Node Addition performed. removing Responsefile
02-28-2022 17:27:57 UTC :  : Running root.sh on node racnode2
02-28-2022 17:27:57 UTC :  : Nodes in the cluster racnode2
02-28-2022 17:41:21 UTC :  : Checking Cluster
02-28-2022 17:41:21 UTC :  : Cluster Check passed
02-28-2022 17:41:21 UTC :  : Cluster Check went fine
02-28-2022 17:41:21 UTC : : CRSD Check failed!
02-28-2022 17:41:21 UTC : : Error has occurred in Grid Setup, Please verify!

Checking crsctl:

[grid@racnode2 debug]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[grid@racnode2 debug]$ $GRID_HOME/bin/crsctl check cluster
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               STABLE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                               STABLE
ora.crf
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  OFFLINE                               STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.ctssd
      1        ONLINE  OFFLINE                               STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE racnode2                 STABLE
ora.gipcd
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.gpnpd
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.mdnsd
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
[grid@racnode2 debug]$ crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.cssd' on 'racnode2'
CRS-2672: Attempting to start 'ora.diskmon' on 'racnode2'
CRS-2676: Start of 'ora.diskmon' on 'racnode2' succeeded
CRS-1609: This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00086:) in /u01/app/grid/diag/crs/racnode2/crs/trace/onmd.trc.
CRS-2674: Start of 'ora.cssd' on 'racnode2' failed
CRS-2679: Attempting to clean 'ora.cssd' on 'racnode2'
CRS-2681: Clean of 'ora.cssd' on 'racnode2' succeeded
CRS-4000: Command Start failed, or completed with errors.

Here is some more information from the trace:

tail -n 50 /u01/app/grid/diag/crs/racnode2/crs/trace/onmd.trc
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitGNS_READY (0x00040000) set
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitHAVE_ICIN (0x00200000) not set
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitACTTHRD_DONE (0x00800000) set
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitOPENBUSS (0x01000000) not set
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitBCCM_COMPL (0x02000000) set
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitCOMPLETE (0x20000000) set
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] Initialization not complete !Error!
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] #### End diagnostic data for the Core layer ####
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] ### Begin diagnostic data for the GM Peer layer ###
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] GMP Status:  State CMStateINIT, incarnation 0, holding incoming requests 0
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] Status for active hub node racnode2, number 2: 
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO]   Connect: Started 1   completed 1   Ready 1   Fully Connected  0   !Error!
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] #### End diagnostic data for the GM Peer layer ####
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] ### Begin diagnostic data for the NM layer ###
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] Local node racnode2, number 2, state is clssnmNodeStateJOINING
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] Status for node racnode1, number 1, uniqueness 1646067289, node ID 0
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO]   State clssnmNodeStateINACTIVE,   Connect: started 1   completed 0   OK
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] Status for node racnode2, number 2, uniqueness 1646070855, node ID 0
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO]   State clssnmNodeStateJOINING,   Connect: started 1   completed 1   OK
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] #### End diagnostic data for the NM layer ####
2022-02-28 18:02:16.118 :    ONMD:140039320811264: [     INFO] ######## End Diagnostic Dump ########
2022-02-28 18:02:16.119 :    ONMD:140039320811264: [     INFO] clsscssdcmExit: Status: 4, Abort flag: 0, Core flag: 0, Don't abort: 0, flag: 112
2022-02-28 18:02:16.119 :    ONMD:140039320811264: scls_dump_stack_all_threads - entry

2022-02-28 18:02:16.119 :    ONMD:140039320811264: scls_dump_stack_all_threads - stat of /usr/bin/gdb failed with errno 2

2022-02-28 18:02:16.119 :    ONMD:140039320811264: [     INFO] clsscssdcmExit: Now aborting
    CLSB:140039320811264: [    ERROR] Oracle Clusterware infrastructure error in ONMD (OS PID 31069): Fatal signal 6 has occurred in program onmd thread 140039320811264; nested signal count is 1
Trace file /u01/app/grid/diag/crs/racnode2/crs/trace/onmd.trc
Oracle Database 21c Clusterware Release 21.0.0.0.0 - Production
Version 21.3.0.0.0 Copyright 1996, 2021 Oracle. All rights reserved.
DDE: Flood control is not active
2022-02-28T18:02:16.136201+00:00
Incident 17 created, dump file: /u01/app/grid/diag/crs/racnode2/crs/incident/incdir_17/onmd_i17.trc
CRS-8503 [] [] [] [] [] [] [] [] [] [] [] []
2022-02-28 18:02:16.499 :    ONMD:140038136391424: [     INFO] clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 541533305, wrtcnt, 4047, LATS 9721354, lastSeqNo 4046, uniqueness 1646067289, timestamp 1646071336/9721084
2022-02-28 18:02:16.514 :GIPCGMOD:140039297148672: [     INFO]  gipcmodGipcCallbackEndpClosed: [gipc]  Endpoint close for endp 0x7f5d040399b0 [0000000000008e40] { gipcEndpoint : localAddr 'gipcha://racnode2:ef23-2c4e-e14a-209e', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/0095-cc7b-01e4-4be7', numPend 0, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x563dd06c1e20, ready 1, wobj 0x7f5d0403c6e0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }
2022-02-28 18:02:16.514 :GIPCHDEM:140039295571712: [     INFO]  gipchaDaemonProcessClientReq: processing req 0x7f5d300fc560 type gipchaClientReqTypeDeleteName (12)
2022-02-28 18:02:16.514 :GIPCGMOD:140038128506624: [     INFO]  gipcmodGipcCompleteRequest: [gipc] completing req 0x7f5d04050db0 [0000000000008ea3] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7f5d040399b0, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-02-28 18:02:16.514 :GIPCGMOD:140038128506624: [     INFO]  gipcmodGipcCompleteRecv: [gipc]  Completed recv for req 0x7f5d04050db0 [0000000000008ea3] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7f5d040399b0, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-02-28 18:02:16.514 :    ONMD:140038128506624: [     INFO] clssnmeventhndlr: Disconnecting endp 0x8e40 ninf 0x563dd06c3250
2022-02-28 18:02:16.514 :    ONMD:140038128506624: [     INFO] clssnmDiscHelper: racnode1, node(1) connection failed, endp (0x8e40), probe(0x7f5d00000000), ninf->endp 0x7f5d00008e40
2022-02-28 18:02:16.514 :    ONMD:140038128506624: [     INFO] clssnmDiscHelper: node 1 clean up, endp (0x8e40), init state 0, cur state 0
2022-02-28 18:02:16.514 :GIPCXCPT:140038128506624: [     INFO]  gipcInternalDissociate: obj 0x7f5d040399b0 [0000000000008e40] { gipcEndpoint : localAddr 'gipcha://racnode2:ef23-2c4e-e14a-209e', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/0095-cc7b-01e4-4be7', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7f5d0403c6e0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 } not associated with any container, ret gipcretFail (1)
2022-02-28 18:02:16.514 :GIPCXCPT:140038128506624: [     INFO]  gipcDissociateF [clssnmDiscHelper : clssnm.c : 4488]: EXCEPTION[ ret gipcretFail (1) ]  failed to dissociate obj 0x7f5d040399b0 [0000000000008e40] { gipcEndpoint : localAddr 'gipcha://racnode2:ef23-2c4e-e14a-209e', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/0095-cc7b-01e4-4be7', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7f5d0403c6e0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }, flags 0x0
2022-02-28 18:02:16.514 :GIPCXCPT:140038128506624: [     INFO]  gipcInternalDissociate: obj 0x7f5d040399b0 [0000000000008e40] { gipcEndpoint : localAddr 'gipcha://racnode2:ef23-2c4e-e14a-209e', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/0095-cc7b-01e4-4be7', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7f5d0403c6e0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 } not associated with any container, ret gipcretFail (1)
2022-02-28 18:02:16.514 :GIPCXCPT:140038128506624: [     INFO]  gipcDissociateF [clssnmDiscHelper : clssnm.c : 4645]: EXCEPTION[ ret gipcretFail (1) ]  failed to dissociate obj 0x7f5d040399b0 [0000000000008e40] { gipcEndpoint : localAddr 'gipcha://racnode2:ef23-2c4e-e14a-209e', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/0095-cc7b-01e4-4be7', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7f5d0403c6e0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }, flags 0x0
2022-02-28 18:02:16.515 :    ONMD:140038128506624: [     INFO] clssscSelect: gipcwait returned with status gipcretPosted (17)
2022-02-28 18:02:16.515 : GIPCTLS:140038128506624: [     INFO]  gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7f5d040399b0 [0000000000008e40] { gipcEndpoint : localAddr 'gipcha://racnode2:ef23-2c4e-e14a-209e', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/0095-cc7b-01e4-4be7', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x563dd06a18a0, ready 1, wobj 0x7f5d0403c6e0, sendp (nil) status 0flags 0x2603860e, flags-2 0x50, usrFlags 0x0 }
2022-02-28 18:02:16.515 :    ONMD:140038128506624: [     INFO] clssnmDiscEndp: gipcDestroy 0x8e40 

It looks like racnode2 can't communicate with racnode1. Do I need to specify the connection manager when creating the racnode2 container?

  -e CMAN_HOSTNAME=racnode-cman1 \
  -e CMAN_IP=172.16.1.15 \
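
For what it's worth, the onmd trace points at the gipcha endpoints on the private network, so connectivity over the -priv hostnames may be worth verifying as well. A sketch, using the hostnames written to /etc/hosts above (cluvfy's node connectivity component can run the same check as the grid user):

ping -c 3 racnode1-priv
ping -c 3 racnode2-priv
$GRID_HOME/bin/cluvfy comp nodecon -n racnode1,racnode2 -verbose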

Many thanks!

psaini79 commented 2 years ago

@ifrankrui

Please provide the output of the following from both containers:

route -n
ifconfig
ping racnode1
ping racnode2

Please also provide the following from the Docker host:

route -n
ifconfig
docker network ls
docker inspect <pub network>
docker inspect <priv network>
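
For example, all of the above can be captured in one pass from the Docker host; a sketch, with container and network names taken from this thread:

for c in racnode1 racnode2; do
  echo "== $c =="
  docker exec $c route -n
  docker exec $c ifconfig
  docker exec $c ping -c 3 racnode1
  docker exec $c ping -c 3 racnode2
done
route -n
ifconfig
docker network ls
docker network inspect rac_pub1_nw
docker network inspect rac_priv1_nw
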
ifrankrui commented 2 years ago

racnode1:

[grid@racnode1 ~]$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.17.1    0.0.0.0         UG    0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.224.0   U     0      0        0 eth0
172.16.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.17.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
[grid@racnode1 ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.17.150  netmask 255.255.255.0  broadcast 192.168.17.255
        ether 02:42:c0:a8:11:96  txqueuelen 0  (Ethernet)
        RX packets 8702  bytes 1852302 (1.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 9893  bytes 3049899 (2.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 169.254.2.240  netmask 255.255.224.0  broadcast 169.254.31.255
        ether 02:42:c0:a8:11:96  txqueuelen 0  (Ethernet)

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.150  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:96  txqueuelen 0  (Ethernet)
        RX packets 16916  bytes 3377502 (3.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 19019  bytes 42043230 (40.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.160  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:96  txqueuelen 0  (Ethernet)

eth1:2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.172  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:96  txqueuelen 0  (Ethernet)

eth1:3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.171  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:96  txqueuelen 0  (Ethernet)

eth1:4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.170  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:96  txqueuelen 0  (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 245368  bytes 691762059 (659.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 245368  bytes 691762059 (659.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[grid@racnode1 ~]$ ping racnode1
PING racnode1.example.com (172.16.1.150) 56(84) bytes of data.
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=1 ttl=64 time=0.025 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=2 ttl=64 time=0.044 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=3 ttl=64 time=0.035 ms

64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=4 ttl=64 time=0.059 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=5 ttl=64 time=0.030 ms
^C
--- racnode1.example.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4104ms
rtt min/avg/max/mdev = 0.025/0.038/0.059/0.013 ms
[grid@racnode1 ~]$ ping racnode2
PING racnode2.example.com (172.16.1.151) 56(84) bytes of data.
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=1 ttl=64 time=0.055 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=2 ttl=64 time=0.081 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=3 ttl=64 time=0.076 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=4 ttl=64 time=0.100 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=5 ttl=64 time=0.112 ms
^C
--- racnode2.example.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4113ms
rtt min/avg/max/mdev = 0.055/0.084/0.112/0.022 ms

racnode2:

[grid@racnode2 ~]$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.17.1    0.0.0.0         UG    0      0        0 eth0
172.16.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.17.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
[grid@racnode2 ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.17.151  netmask 255.255.255.0  broadcast 192.168.17.255
        ether 02:42:c0:a8:11:97  txqueuelen 0  (Ethernet)
        RX packets 8677  bytes 1980401 (1.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8645  bytes 1811090 (1.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.151  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:97  txqueuelen 0  (Ethernet)
        RX packets 18487  bytes 41965250 (40.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 16469  bytes 3312173 (3.1 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 12482  bytes 1433143 (1.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12482  bytes 1433143 (1.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[grid@racnode2 ~]$ ping racnode1
PING racnode1.example.com (172.16.1.150) 56(84) bytes of data.
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=1 ttl=64 time=0.053 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=2 ttl=64 time=0.099 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=3 ttl=64 time=0.060 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=4 ttl=64 time=0.104 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=5 ttl=64 time=0.051 ms
^C
--- racnode1.example.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4089ms
rtt min/avg/max/mdev = 0.051/0.073/0.104/0.024 ms
[grid@racnode2 ~]$ ping racnode2
PING racnode2.example.com (172.16.1.151) 56(84) bytes of data.
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=1 ttl=64 time=0.024 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=2 ttl=64 time=0.039 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=3 ttl=64 time=0.029 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=4 ttl=64 time=0.078 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=5 ttl=64 time=0.077 ms
^C
--- racnode2.example.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4100ms
rtt min/avg/max/mdev = 0.024/0.049/0.078/0.024 ms
[grid@racnode2 ~]$ ping racnode-cman1
PING racnode-cman1.example.com (172.16.1.15) 56(84) bytes of data.
64 bytes from racnode-cman1.example.com (172.16.1.15): icmp_seq=1 ttl=64 time=0.106 ms
64 bytes from racnode-cman1.example.com (172.16.1.15): icmp_seq=2 ttl=64 time=0.093 ms
64 bytes from racnode-cman1.example.com (172.16.1.15): icmp_seq=3 ttl=64 time=0.090 ms
64 bytes from racnode-cman1.example.com (172.16.1.15): icmp_seq=4 ttl=64 time=0.085 ms
64 bytes from racnode-cman1.example.com (172.16.1.15): icmp_seq=5 ttl=64 time=0.046 ms
[grid@racnode2 ~]$ ping 192.168.17.25  
PING 192.168.17.25 (192.168.17.25) 56(84) bytes of data.
64 bytes from 192.168.17.25: icmp_seq=1 ttl=64 time=0.093 ms
64 bytes from 192.168.17.25: icmp_seq=2 ttl=64 time=0.073 ms
64 bytes from 192.168.17.25: icmp_seq=3 ttl=64 time=0.063 ms
64 bytes from 192.168.17.25: icmp_seq=4 ttl=64 time=0.115 ms

Docker host:

[root@vm-oracle ansible]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.10.10.1      0.0.0.0         UG    100    0        0 eth0
10.10.10.0      0.0.0.0         255.255.255.0   U     100    0        0 eth0
168.63.129.16   10.10.10.1      255.255.255.255 UGH   100    0        0 eth0
169.254.169.254 10.10.10.1      255.255.255.255 UGH   100    0        0 eth0
172.16.1.0      0.0.0.0         255.255.255.0   U     0      0        0 br-8cb17468844a
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.17.0    0.0.0.0         255.255.255.0   U     0      0        0 br-c3348c4e73ef
[root@vm-oracle ansible]# ifconfig
br-8cb17468844a: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.1  netmask 255.255.255.0  broadcast 172.16.1.255
        inet6 fe80::42:37ff:fecd:839f  prefixlen 64  scopeid 0x20<link>
        ether 02:42:37:cd:83:9f  txqueuelen 0  (Ethernet)
        RX packets 84  bytes 6922 (6.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 84  bytes 6922 (6.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br-c3348c4e73ef: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.17.1  netmask 255.255.255.0  broadcast 192.168.17.255
        inet6 fe80::42:60ff:fe27:b599  prefixlen 64  scopeid 0x20<link>
        ether 02:42:60:27:b5:99  txqueuelen 0  (Ethernet)
        RX packets 271  bytes 28746 (28.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 505  bytes 72444 (70.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:8dff:fefa:ff77  prefixlen 64  scopeid 0x20<link>
        ether 02:42:8d:fa:ff:77  txqueuelen 0  (Ethernet)
        RX packets 13265  bytes 796296 (777.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 24032  bytes 402760634 (384.1 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.10.101  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::222:48ff:fe3f:740  prefixlen 64  scopeid 0x20<link>
        ether 00:22:48:3f:07:40  txqueuelen 1000  (Ethernet)
        RX packets 5534803  bytes 8086939316 (7.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2470473  bytes 185886276 (177.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 84  bytes 6922 (6.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 84  bytes 6922 (6.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth2548638: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::1c4d:a5ff:febc:5f9e  prefixlen 64  scopeid 0x20<link>
        ether 1e:4d:a5:bc:5f:9e  txqueuelen 0  (Ethernet)
        RX packets 10366  bytes 3115093 (2.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 9174  bytes 1917246 (1.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth70cae52: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::682a:b9ff:fe8b:9022  prefixlen 64  scopeid 0x20<link>
        ether 6a:2a:b9:8b:90:22  txqueuelen 0  (Ethernet)
        RX packets 16562  bytes 3324362 (3.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 18583  bytes 41975589 (40.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth7a5ae2a: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::e867:f1ff:fe1a:9de0  prefixlen 64  scopeid 0x20<link>
        ether ea:67:f1:1a:9d:e0  txqueuelen 0  (Ethernet)
        RX packets 9025  bytes 1863290 (1.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 9060  bytes 2032935 (1.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth9d443d8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::b8e9:63ff:fec9:49c4  prefixlen 64  scopeid 0x20<link>
        ether ba:e9:63:c9:49:c4  txqueuelen 0  (Ethernet)
        RX packets 271  bytes 28746 (28.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 505  bytes 72444 (70.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vetha061b16: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::90e6:faff:fee7:8946  prefixlen 64  scopeid 0x20<link>
        ether 92:e6:fa:e7:89:46  txqueuelen 0  (Ethernet)
        RX packets 19118  bytes 42054189 (40.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 17011  bytes 3390102 (3.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vethc2bb90f: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::d8eb:16ff:fec4:bb15  prefixlen 64  scopeid 0x20<link>
        ether da:eb:16:c4:bb:15  txqueuelen 0  (Ethernet)
        RX packets 652  bytes 89366 (87.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 800  bytes 135457 (132.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vethe3b9deb: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::184c:55ff:fef6:d13a  prefixlen 64  scopeid 0x20<link>
        ether 1a:4c:55:f6:d1:3a  txqueuelen 0  (Ethernet)
        RX packets 920642  bytes 4235196164 (3.9 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1072028  bytes 6854864577 (6.3 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@vm-oracle ansible]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
cd5b36c86309        bridge              bridge              local
087f6de1905d        host                host                local
4c0ba1e8832b        none                null                local
c3348c4e73ef        rac_priv1_nw        bridge              local
8cb17468844a        rac_pub1_nw         bridge              local
[root@vm-oracle ansible]# docker network inspect rac_pub1_nw 
[
    {
        "Name": "rac_pub1_nw",
        "Id": "8cb17468844a86f008b46c95b2dacf85e37c31835a80495444b0a3df18256fff",
        "Created": "2022-03-01T10:37:17.704421146Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.16.1.0/24"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "4d31d422f5f3ae7d5198cf3e6e7f7dfcee3db35a5175b7bd9fab054d4b11dd6e": {
                "Name": "racnode-cman",
                "EndpointID": "a8ca92dd92c3b9d8dc2d0b7d828e2678187bb65f0c1b2d2fb9280bcf5644afa1",
                "MacAddress": "02:42:ac:10:01:0f",
                "IPv4Address": "172.16.1.15/24",
                "IPv6Address": ""
            },
            "7764952e8f96d1a1d9369de7f7a979f4a085c5517ff818d6017ac2c6d88fb60c": {
                "Name": "racnode1",
                "EndpointID": "a92b93779a641ee49340000fb803249e0703acf5d4736449ce567dec31951ee2",
                "MacAddress": "02:42:ac:10:01:96",
                "IPv4Address": "172.16.1.150/24",
                "IPv6Address": ""
            },
            "82bdae1fb4b778669a307a77eaadc22d4cefce1629ff3f345dbe136bd725a189": {
                "Name": "racnode2",
                "EndpointID": "de541a06c92875ee2193f9e115a49d8fdbe3d14ac3ea6f6486425a7539060af3",
                "MacAddress": "02:42:ac:10:01:97",
                "IPv4Address": "172.16.1.151/24",
                "IPv6Address": ""
            },
            "b835a5d904970a7a5ffcf81c101e978a66490628c5757f0bfd44da6210b483f2": {
                "Name": "racdns",
                "EndpointID": "60f0e85c3459544bfe4ce2e99f13d339c1d77a3272aebfe034156c5dfe7d14ef",
                "MacAddress": "02:42:ac:10:01:19",
                "IPv4Address": "172.16.1.25/24",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
[root@vm-oracle ansible]# docker network inspect rac_priv1_nw 
[
    {
        "Name": "rac_priv1_nw",
        "Id": "c3348c4e73ef1c1ee07d7d4a58427b0208ae753e768d0bfb88b76a5c5aad4efd",
        "Created": "2022-03-01T10:37:24.28653065Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "192.168.17.0/24"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "7764952e8f96d1a1d9369de7f7a979f4a085c5517ff818d6017ac2c6d88fb60c": {
                "Name": "racnode1",
                "EndpointID": "d5c34199ca7b0aa509bfad61322bfac5921ba9b41cd795effda5c39006569232",
                "MacAddress": "02:42:c0:a8:11:96",
                "IPv4Address": "192.168.17.150/24",
                "IPv6Address": ""
            },
            "82bdae1fb4b778669a307a77eaadc22d4cefce1629ff3f345dbe136bd725a189": {
                "Name": "racnode2",
                "EndpointID": "96f78b8103e6201233145af341e19a392d0216c5247dc068f6805c3f38294cae",
                "MacAddress": "02:42:c0:a8:11:97",
                "IPv4Address": "192.168.17.151/24",
                "IPv6Address": ""
            },
            "ddad8ec035f2d794bfa67d7f0dee5512c19cfda62ea57169904425527124c1b6": {
                "Name": "racnode-storage",
                "EndpointID": "57d853322c83472bb397e05993c6c73314124ee9f511cf549a1ec3db0e036f7f",
                "MacAddress": "02:42:c0:a8:11:19",
                "IPv4Address": "192.168.17.25/24",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
ifrankrui commented 2 years ago

I can connect to the database via cman on racnode2:


[grid@racnode2 ~]$ sqlplus sys/Welcome1@//racnode-cman1.example.com:1521/ORCLCDB as sysdba

SQL*Plus: Release 21.0.0.0.0 - Production on Tue Mar 1 16:11:03 2022
Version 21.3.0.0.0

Copyright (c) 1982, 2021, Oracle.  All rights reserved.

Connected to:
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0

SQL> 
ifrankrui commented 2 years ago

Here is the trace file:

[grid@racnode2 ~]$ tail -n 200 /u01/app/grid/diag/crs/racnode2/crs/trace/onmd.trc
2022-03-01 13:08:54.369 :    ONMD:140347053082368: [     INFO] clssnmRcfgMgrThread: Local Join
2022-03-01 13:08:54.369 :    ONMD:140347053082368: [     INFO] clssnmLocalJoinEvent: begin on node(2), waittime 193000
2022-03-01 13:08:54.369 :    ONMD:140347053082368: [     INFO] clssnmLocalJoinEvent: set curtime (13541794) for my node
2022-03-01 13:08:54.369 :    ONMD:140347053082368: [     INFO] clssnmLocalJoinEvent: scanning 32 nodes
2022-03-01 13:08:54.369 :    ONMD:140347053082368: [     INFO] clssnmLocalJoinEvent: Node racnode1, number 1, is in an existing cluster with disk state 3
2022-03-01 13:08:54.369 :    ONMD:140347053082368: [  WARNING] clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
2022-03-01 13:08:54.369 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: TLS HANDSHAKE - SUCCESSFUL for endp 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x0 }
2022-03-01 13:08:54.369 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: peerUser: NULL
2022-03-01 13:08:54.369 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: name:CN=2ff6536af6467f6abffec4d933ce42de_7019844,O=Oracle Clusterware, 
2022-03-01 13:08:54.369 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: name:CN=2ff6536af6467f6abffec4d933ce42de_1646133336,O=Oracle_Clusterware, 
2022-03-01 13:08:54.369 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: endpoint 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x0 }, auth state: gipcmodTlsAuthStateReady (3)
2022-03-01 13:08:54.369 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthReady: TLS Auth completed Successfully
2022-03-01 13:08:54.369 :    ONMD:140347051505408: [     INFO] clssscSelect: conn complete ctx 0x55f28910ad30 endp 0x864e
2022-03-01 13:08:54.369 :    ONMD:140347051505408: [     INFO] clssnmInitialMsg: node 1, racnode1, endp (0x7fa50000864e)
2022-03-01 13:08:54.370 :    ONMD:140347051505408: [     INFO] clssnmeventhndlr: CONNCOMPLETE node(1), endp(0x864e) sending InitialMsg, conrc=2
2022-03-01 13:08:54.539 :    ONMD:140347054659328: [     INFO] clssnmSendingThread: sending join msg to all nodes
2022-03-01 13:08:54.539 :    ONMD:140347054659328: [     INFO] clssnmSendingThread: sent 5 join msgs to all nodes
2022-03-01 13:08:54.996 :    ONMD:140347092584192: [     INFO] clssscWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 1000 with cvtimewait status 4294967186
2022-03-01 13:08:55.355 :    ONMD:140347059402496: [     INFO] clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 541599638, wrtcnt, 6482, LATS 13542784, lastSeqNo 6481, uniqueness 1646133623, timestamp 1646140135/13542664
2022-03-01 13:08:55.370 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperProcessDisconnect: processing DISCONNECT for hendp 0x7fa4a8057160 [00000000000086a7] { gipchaEndpoint : port 'd355-49ad-3da1-7d50', peer 'racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', srcCid 00000000-000086a7,  dstCid 00000000-0002ce7a, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x4204 }
2022-03-01 13:08:55.370 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperMsgComplete: completing with ret gipcretConnectionLost (12), umsg 0x7fa4d80ea0a0 { msg 0x7fa4d80e4600, ret gipcretRequestPending (15), flags 0x2 }, msg 0x7fa4d80e4600 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-000086a7, dstCid 00000000-00000000 } dataLen 0
2022-03-01 13:08:55.370 :GIPCGMOD:140347078379264: [     INFO]  gipcmodGipcCallbackDisconnect: [gipc]  Disconnect forced for endp 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 1, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:55.370 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRequest: [gipc] completing req 0x7fa4d80f1f80 [0000000000008705] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8052260, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:08:55.370 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRecv: [gipc]  Completed recv for req 0x7fa4d80f1f80 [0000000000008705] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8052260, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:08:55.370 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 1, numReady 0, numDone 2, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:55.370 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcDisconnect: [gipc]  Issued endpoint close for endp 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 1, numReady 0, numDone 2, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:55.996 :    ONMD:140347092584192: [     INFO] clssscWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 1000 with cvtimewait status 4294967186
2022-03-01 13:08:56.356 :    ONMD:140347059402496: [     INFO] clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 541599638, wrtcnt, 6483, LATS 13543784, lastSeqNo 6482, uniqueness 1646133623, timestamp 1646140136/13543694
2022-03-01 13:08:56.370 :GIPCGMOD:140347078379264: [     INFO]  gipcmodGipcCallbackEndpClosed: [gipc]  Endpoint close for endp 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 0, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:56.371 :GIPCHDEM:140347076802304: [     INFO]  gipchaDaemonProcessClientReq: processing req 0x7fa4d80f5120 type gipchaClientReqTypeDeleteName (12)
2022-03-01 13:08:56.371 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRequest: [gipc] completing req 0x7fa4a8042870 [00000000000086b3] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8052260, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:08:56.371 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRecv: [gipc]  Completed recv for req 0x7fa4a8042870 [00000000000086b3] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8052260, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:08:56.371 :    ONMD:140347051505408: [     INFO] clssnmeventhndlr: Disconnecting endp 0x864e ninf 0x55f28910ad30
2022-03-01 13:08:56.371 :    ONMD:140347051505408: [     INFO] clssnmDiscHelper: racnode1, node(1) connection failed, endp (0x864e), probe(0x7fa500000000), ninf->endp 0x7fa50000864e
2022-03-01 13:08:56.371 :    ONMD:140347051505408: [     INFO] clssnmDiscHelper: node 1 clean up, endp (0x864e), init state 0, cur state 0
2022-03-01 13:08:56.371 :GIPCXCPT:140347051505408: [     INFO]  gipcInternalDissociate: obj 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 } not associated with any container, ret gipcretFail (1)
2022-03-01 13:08:56.371 :GIPCXCPT:140347051505408: [     INFO]  gipcDissociateF [clssnmDiscHelper : clssnm.c : 4488]: EXCEPTION[ ret gipcretFail (1) ]  failed to dissociate obj 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }, flags 0x0
2022-03-01 13:08:56.371 :GIPCXCPT:140347051505408: [     INFO]  gipcInternalDissociate: obj 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 } not associated with any container, ret gipcretFail (1)
2022-03-01 13:08:56.371 :GIPCXCPT:140347051505408: [     INFO]  gipcDissociateF [clssnmDiscHelper : clssnm.c : 4645]: EXCEPTION[ ret gipcretFail (1) ]  failed to dissociate obj 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }, flags 0x0
2022-03-01 13:08:56.371 :    ONMD:140347051505408: [     INFO] clssscSelect: gipcwait returned with status gipcretPosted (17)
2022-03-01 13:08:56.371 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7fa4a8052260 [000000000000864e] { gipcEndpoint : localAddr 'gipcha://racnode2:d355-49ad-3da1-7d50', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/3bc6-a96c-329f-49a9', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f2890e9380, ready 1, wobj 0x7fa4a8052ee0, sendp (nil) status 0flags 0x2603860e, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:56.371 :    ONMD:140347051505408: [     INFO] clssnmDiscEndp: gipcDestroy 0x864e 
2022-03-01 13:08:56.371 :    ONMD:140347051505408: [     INFO] clssscSelect: gipcwait returned with status gipcretPosted (17)
2022-03-01 13:08:56.996 :    ONMD:140347092584192: [     INFO] clssscWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 1000 with cvtimewait status 4294967186
2022-03-01 13:08:57.357 :    ONMD:140347059402496: [     INFO] clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 541599638, wrtcnt, 6484, LATS 13544784, lastSeqNo 6483, uniqueness 1646133623, timestamp 1646140137/13544704
2022-03-01 13:08:57.357 :    ONMD:140347051505408: [     INFO] clssscSelect: gipcwait returned with status gipcretPosted (17)
2022-03-01 13:08:57.357 :    ONMD:140347051505408: [     INFO] clssnmconnect: connecting to addr gipcha://racnode1:nm2_racnode1-c
2022-03-01 13:08:57.363 :GIPCHDEM:140347076802304: [     INFO]  gipchaDaemonProcessClientReq: processing req 0x7fa4a803c780 type gipchaClientReqTypePublish (1)
2022-03-01 13:08:57.363 :    ONMD:140347051505408: [     INFO] clssscConnect: endp 0x8729 - cookie 0x55f28910ad30 - addr gipcha://racnode1:nm2_racnode1-c
2022-03-01 13:08:57.363 :    ONMD:140347051505408: [     INFO] clssnmconnect: connecting to node(1), endp(0x8729), flags 0x10002
2022-03-01 13:08:57.363 :GIPCHTHR:140347078379264: [     INFO]  gipchaWorkerProcessClientConnect: starting resolve from connect for host:racnode1, port:nm2_racnode1-c, cookie:0x7fa4a803c780
2022-03-01 13:08:57.363 :GIPCHDEM:140347076802304: [     INFO]  gipchaDaemonProcessClientReq: processing req 0x7fa4d80f5120 type gipchaClientReqTypeResolve (4)
2022-03-01 13:08:57.364 :GIPCHDEM:140347076802304: [     INFO]  gipchaDaemonCreateResolveResponse: creating resolveResponse for host:racnode1, port:nm2_racnode1-c, haname:8764-6925-ffe7-13cb, ret:0
2022-03-01 13:08:57.364 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperConnect: initiated connect for umsg 0x7fa4d80ea0a0 { msg 0x7fa4d80e48f0, ret gipcretRequestPending (15), flags 0x6 }, msg 0x7fa4d80e48f0 { type gipchaMsgTypeConnect (3), srcPort '0a5f-d359-4cee-fff4', dstPort 'nm2_racnode1-c', srcCid 00000000-00008782, cookie 00007fa4-d80ea0a0 } dataLen 0, endp 0x7fa4a80568c0 [0000000000008782] { gipchaEndpoint : port '0a5f-d359-4cee-fff4', peer ':', srcCid 00000000-00008782,  dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 } node 0x7fa4cc0d12a0 { host 'racnode1', haName '8764-6925-ffe7-13cb', srcLuid 7b00be91-e82deec0, dstLuid bcc7bd2e-572238ad numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7fa4cc0b12f0 type gipchaNodeType12001 (20), nodeIncarnation 0be8266e-006b3202, incarnation 2, cssIncarnation 0, negDigest 7, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0xe10680c}
2022-03-01 13:08:57.364 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperCallbackConnect: completed CONNECT:SEND umsg 0x7fa4d80ea0a0 { msg 0x7fa4d80e48f0, ret gipcretSuccess (0), flags 0xe }, msg 0x7fa4d80e48f0 { type gipchaMsgTypeConnect (3), srcPort '0a5f-d359-4cee-fff4', dstPort 'nm2_racnode1-c', srcCid 00000000-00008782, cookie 00007fa4-d80ea0a0 } dataLen 0, hendp 0x7fa4a80568c0 [0000000000008782] { gipchaEndpoint : port '0a5f-d359-4cee-fff4', peer ':', srcCid 00000000-00008782,  dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 }
2022-03-01 13:08:57.364 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperProcessConnectAck: CONNACK completed umsg 0x7fa4d80ea0a0 { msg 0x7fa4d80e48f0, ret gipcretSuccess (0), flags 0xe }, msg 0x7fa4d80e48f0 { type gipchaMsgTypeConnect (3), srcPort '0a5f-d359-4cee-fff4', dstPort 'nm2_racnode1-c', srcCid 00000000-00008782, cookie 00007fa4-d80ea0a0 } dataLen 0, hendp 0x7fa4a80568c0 [0000000000008782] { gipchaEndpoint : port '0a5f-d359-4cee-fff4', peer 'racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', srcCid 00000000-00008782,  dstCid 00000000-0002cf32, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x204 } node 0x7fa4cc0d12a0 { host 'racnode1', haName '8764-6925-ffe7-13cb', srcLuid 7b00be91-e82deec0, dstLuid bcc7bd2e-572238ad numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7fa4cc0b12f0 type gipchaNodeType12001 (20), nodeIncarnation 0be8266e-006b3202, incarnation 2, cssIncarnation 0, negDigest 7, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0xe10680c}
2022-03-01 13:08:57.365 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteConnect: [gipc] completed connect on endp 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 1, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 0, wobj 0x7fa4a8053150, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x0 }
2022-03-01 13:08:57.365 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthInit: creating connection context ...
2022-03-01 13:08:57.365 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthInit: tls context initialized successfully
2022-03-01 13:08:57.372 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: TLS HANDSHAKE - SUCCESSFUL for endp 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8053150, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x0 }
2022-03-01 13:08:57.372 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: peerUser: NULL
2022-03-01 13:08:57.372 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: name:CN=2ff6536af6467f6abffec4d933ce42de_7019844,O=Oracle Clusterware, 
2022-03-01 13:08:57.372 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: name:CN=2ff6536af6467f6abffec4d933ce42de_1646133336,O=Oracle_Clusterware, 
2022-03-01 13:08:57.372 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: endpoint 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8053150, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x0 }, auth state: gipcmodTlsAuthStateReady (3)
2022-03-01 13:08:57.372 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthReady: TLS Auth completed Successfully
2022-03-01 13:08:57.372 :    ONMD:140347051505408: [     INFO] clssscSelect: conn complete ctx 0x55f28910ad30 endp 0x8729
2022-03-01 13:08:57.372 :    ONMD:140347051505408: [     INFO] clssnmInitialMsg: node 1, racnode1, endp (0x7fa500008729)
2022-03-01 13:08:57.372 :    ONMD:140347051505408: [     INFO] clssnmeventhndlr: CONNCOMPLETE node(1), endp(0x8729) sending InitialMsg, conrc=2
2022-03-01 13:08:57.996 :    ONMD:140347092584192: [     INFO] clssscWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 1000 with cvtimewait status 4294967186
2022-03-01 13:08:58.358 :    ONMD:140347059402496: [     INFO] clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 541599638, wrtcnt, 6485, LATS 13545784, lastSeqNo 6484, uniqueness 1646133623, timestamp 1646140138/13545714
2022-03-01 13:08:58.373 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperProcessDisconnect: processing DISCONNECT for hendp 0x7fa4a80568c0 [0000000000008782] { gipchaEndpoint : port '0a5f-d359-4cee-fff4', peer 'racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', srcCid 00000000-00008782,  dstCid 00000000-0002cf32, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x4204 }
2022-03-01 13:08:58.373 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperMsgComplete: completing with ret gipcretConnectionLost (12), umsg 0x7fa4d80d9380 { msg 0x7fa4d80e50a0, ret gipcretRequestPending (15), flags 0x2 }, msg 0x7fa4d80e50a0 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00008782, dstCid 00000000-00000000 } dataLen 0
2022-03-01 13:08:58.373 :GIPCGMOD:140347078379264: [     INFO]  gipcmodGipcCallbackDisconnect: [gipc]  Disconnect forced for endp 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 1, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:58.373 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRequest: [gipc] completing req 0x7fa4d80e51c0 [00000000000087d9] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8056380, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:08:58.373 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRecv: [gipc]  Completed recv for req 0x7fa4d80e51c0 [00000000000087d9] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8056380, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:08:58.373 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 1, numReady 0, numDone 2, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:58.373 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcDisconnect: [gipc]  Issued endpoint close for endp 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 1, numReady 0, numDone 2, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:58.996 :    ONMD:140347092584192: [     INFO] clssscWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 1000 with cvtimewait status 4294967186
2022-03-01 13:08:59.359 :    ONMD:140347059402496: [     INFO] clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 541599638, wrtcnt, 6486, LATS 13546784, lastSeqNo 6485, uniqueness 1646133623, timestamp 1646140139/13546744
2022-03-01 13:08:59.373 :GIPCGMOD:140347078379264: [     INFO]  gipcmodGipcCallbackEndpClosed: [gipc]  Endpoint close for endp 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 0, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:59.373 :GIPCHDEM:140347076802304: [     INFO]  gipchaDaemonProcessClientReq: processing req 0x7fa4d80f79c0 type gipchaClientReqTypeDeleteName (12)
2022-03-01 13:08:59.373 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRequest: [gipc] completing req 0x7fa4a8041e80 [000000000000878c] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8056380, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:08:59.373 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRecv: [gipc]  Completed recv for req 0x7fa4a8041e80 [000000000000878c] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8056380, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:08:59.374 :    ONMD:140347051505408: [     INFO] clssnmeventhndlr: Disconnecting endp 0x8729 ninf 0x55f28910ad30
2022-03-01 13:08:59.374 :    ONMD:140347051505408: [     INFO] clssnmDiscHelper: racnode1, node(1) connection failed, endp (0x8729), probe(0x7fa500000000), ninf->endp 0x7fa500008729
2022-03-01 13:08:59.374 :    ONMD:140347051505408: [     INFO] clssnmDiscHelper: node 1 clean up, endp (0x8729), init state 0, cur state 0
2022-03-01 13:08:59.374 :GIPCXCPT:140347051505408: [     INFO]  gipcInternalDissociate: obj 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 } not associated with any container, ret gipcretFail (1)
2022-03-01 13:08:59.374 :GIPCXCPT:140347051505408: [     INFO]  gipcDissociateF [clssnmDiscHelper : clssnm.c : 4488]: EXCEPTION[ ret gipcretFail (1) ]  failed to dissociate obj 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }, flags 0x0
2022-03-01 13:08:59.374 :GIPCXCPT:140347051505408: [     INFO]  gipcInternalDissociate: obj 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 } not associated with any container, ret gipcretFail (1)
2022-03-01 13:08:59.374 :GIPCXCPT:140347051505408: [     INFO]  gipcDissociateF [clssnmDiscHelper : clssnm.c : 4645]: EXCEPTION[ ret gipcretFail (1) ]  failed to dissociate obj 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x2003860e, flags-2 0x50, usrFlags 0x0 }, flags 0x0
2022-03-01 13:08:59.374 :    ONMD:140347051505408: [     INFO] clssscSelect: gipcwait returned with status gipcretPosted (17)
2022-03-01 13:08:59.374 :    ONMD:140347051505408: [     INFO] clssscSelect: gipcwait returned with status gipcretPosted (17)
2022-03-01 13:08:59.374 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7fa4a8056380 [0000000000008729] { gipcEndpoint : localAddr 'gipcha://racnode2:0a5f-d359-4cee-fff4', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/5494-ef9f-5bfb-7fbe', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f2890e9380, ready 1, wobj 0x7fa4a8053150, sendp (nil) status 0flags 0x2603860e, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:08:59.374 :    ONMD:140347051505408: [     INFO] clssnmDiscEndp: gipcDestroy 0x8729 
2022-03-01 13:08:59.374 :    ONMD:140347051505408: [     INFO] clssscSelect: gipcwait returned with status gipcretPosted (17)
2022-03-01 13:08:59.544 :    ONMD:140347054659328: [     INFO] clssnmSendingThread: sending join msg to all nodes
2022-03-01 13:08:59.544 :    ONMD:140347054659328: [     INFO] clssnmSendingThread: sent 5 join msgs to all nodes
2022-03-01 13:08:59.996 :    ONMD:140347092584192: [     INFO] clssscWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 1000 with cvtimewait status 4294967186
2022-03-01 13:09:00.360 :    ONMD:140347059402496: [     INFO] clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 541599638, wrtcnt, 6487, LATS 13547784, lastSeqNo 6486, uniqueness 1646133623, timestamp 1646140140/13547764
2022-03-01 13:09:00.360 :    ONMD:140347051505408: [     INFO] clssscSelect: gipcwait returned with status gipcretPosted (17)
2022-03-01 13:09:00.360 :    ONMD:140347051505408: [     INFO] clssnmconnect: connecting to addr gipcha://racnode1:nm2_racnode1-c
2022-03-01 13:09:00.366 :GIPCHDEM:140347076802304: [     INFO]  gipchaDaemonProcessClientReq: processing req 0x7fa4a8047b70 type gipchaClientReqTypePublish (1)
2022-03-01 13:09:00.367 :    ONMD:140347051505408: [     INFO] clssscConnect: endp 0x8801 - cookie 0x55f28910ad30 - addr gipcha://racnode1:nm2_racnode1-c
2022-03-01 13:09:00.367 :    ONMD:140347051505408: [     INFO] clssnmconnect: connecting to node(1), endp(0x8801), flags 0x10002
2022-03-01 13:09:00.367 :GIPCHTHR:140347078379264: [     INFO]  gipchaWorkerProcessClientConnect: starting resolve from connect for host:racnode1, port:nm2_racnode1-c, cookie:0x7fa4a8047b70
2022-03-01 13:09:00.367 :GIPCHDEM:140347076802304: [     INFO]  gipchaDaemonProcessClientReq: processing req 0x7fa4d80f79c0 type gipchaClientReqTypeResolve (4)
2022-03-01 13:09:00.367 :GIPCHDEM:140347076802304: [     INFO]  gipchaDaemonCreateResolveResponse: creating resolveResponse for host:racnode1, port:nm2_racnode1-c, haname:8764-6925-ffe7-13cb, ret:0
2022-03-01 13:09:00.367 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperConnect: initiated connect for umsg 0x7fa4d80ca120 { msg 0x7fa4d80e4da0, ret gipcretRequestPending (15), flags 0x6 }, msg 0x7fa4d80e4da0 { type gipchaMsgTypeConnect (3), srcPort '360b-2c8a-112c-67e0', dstPort 'nm2_racnode1-c', srcCid 00000000-0000885a, cookie 00007fa4-d80ca120 } dataLen 0, endp 0x7fa4a8057be0 [000000000000885a] { gipchaEndpoint : port '360b-2c8a-112c-67e0', peer ':', srcCid 00000000-0000885a,  dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 } node 0x7fa4cc0d12a0 { host 'racnode1', haName '8764-6925-ffe7-13cb', srcLuid 7b00be91-e82deec0, dstLuid bcc7bd2e-572238ad numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7fa4cc0b12f0 type gipchaNodeType12001 (20), nodeIncarnation 0be8266e-006b3202, incarnation 2, cssIncarnation 0, negDigest 7, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0xe10680c}
2022-03-01 13:09:00.367 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperCallbackConnect: completed CONNECT:SEND umsg 0x7fa4d80ca120 { msg 0x7fa4d80e4da0, ret gipcretSuccess (0), flags 0xe }, msg 0x7fa4d80e4da0 { type gipchaMsgTypeConnect (3), srcPort '360b-2c8a-112c-67e0', dstPort 'nm2_racnode1-c', srcCid 00000000-0000885a, cookie 00007fa4-d80ca120 } dataLen 0, hendp 0x7fa4a8057be0 [000000000000885a] { gipchaEndpoint : port '360b-2c8a-112c-67e0', peer ':', srcCid 00000000-0000885a,  dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 }
2022-03-01 13:09:00.368 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperProcessConnectAck: CONNACK completed umsg 0x7fa4d80ca120 { msg 0x7fa4d80e4da0, ret gipcretSuccess (0), flags 0xe }, msg 0x7fa4d80e4da0 { type gipchaMsgTypeConnect (3), srcPort '360b-2c8a-112c-67e0', dstPort 'nm2_racnode1-c', srcCid 00000000-0000885a, cookie 00007fa4-d80ca120 } dataLen 0, hendp 0x7fa4a8057be0 [000000000000885a] { gipchaEndpoint : port '360b-2c8a-112c-67e0', peer 'racnode1:nm2_racnode1-c/815c-7b12-d9d8-945b', srcCid 00000000-0000885a,  dstCid 00000000-0002cff3, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x204 } node 0x7fa4cc0d12a0 { host 'racnode1', haName '8764-6925-ffe7-13cb', srcLuid 7b00be91-e82deec0, dstLuid bcc7bd2e-572238ad numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7fa4cc0b12f0 type gipchaNodeType12001 (20), nodeIncarnation 0be8266e-006b3202, incarnation 2, cssIncarnation 0, negDigest 7, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0xe10680c}
2022-03-01 13:09:00.368 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteConnect: [gipc] completed connect on endp 0x7fa4a8051f00 [0000000000008801] { gipcEndpoint : localAddr 'gipcha://racnode2:360b-2c8a-112c-67e0', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/815c-7b12-d9d8-945b', numPend 1, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 0, wobj 0x7fa4a8042bc0, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x0 }
2022-03-01 13:09:00.368 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthInit: creating connection context ...
2022-03-01 13:09:00.368 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthInit: tls context initialized successfully
2022-03-01 13:09:00.374 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: TLS HANDSHAKE - SUCCESSFUL for endp 0x7fa4a8051f00 [0000000000008801] { gipcEndpoint : localAddr 'gipcha://racnode2:360b-2c8a-112c-67e0', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/815c-7b12-d9d8-945b', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8042bc0, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x0 }
2022-03-01 13:09:00.374 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: peerUser: NULL
2022-03-01 13:09:00.374 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: name:CN=2ff6536af6467f6abffec4d933ce42de_7019844,O=Oracle Clusterware, 
2022-03-01 13:09:00.374 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: name:CN=2ff6536af6467f6abffec4d933ce42de_1646133336,O=Oracle_Clusterware, 
2022-03-01 13:09:00.374 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthStart: endpoint 0x7fa4a8051f00 [0000000000008801] { gipcEndpoint : localAddr 'gipcha://racnode2:360b-2c8a-112c-67e0', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/815c-7b12-d9d8-945b', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8042bc0, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x0 }, auth state: gipcmodTlsAuthStateReady (3)
2022-03-01 13:09:00.374 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsAuthReady: TLS Auth completed Successfully
2022-03-01 13:09:00.374 :    ONMD:140347051505408: [     INFO] clssscSelect: conn complete ctx 0x55f28910ad30 endp 0x8801
2022-03-01 13:09:00.374 :    ONMD:140347051505408: [     INFO] clssnmInitialMsg: node 1, racnode1, endp (0x7fa500008801)
2022-03-01 13:09:00.374 :    ONMD:140347051505408: [     INFO] clssnmeventhndlr: CONNCOMPLETE node(1), endp(0x8801) sending InitialMsg, conrc=2
2022-03-01 13:09:00.804 :    ONMD:140347102041856: [     INFO] clsscssd_ReadyBussTimeout_CB: GMCD ready for business timedout.Exiting.
2022-03-01 13:09:00.804 :    ONMD:140347102041856: [     INFO] clssscExit: Reason for exit: Init Shutdown. Now calling the respective exit function.
2022-03-01 13:09:00.804 :    ONMD:140347102041856: [     INFO] (:CSSSC00011:)clsscssdcmExit: A fatal error occurred during initialization
2022-03-01 13:09:00.805 :    ONMD:140347102041856: [     INFO] clssnmCheckForNetworkFailure: Entered 
2022-03-01 13:09:00.805 :    ONMD:140347102041856: [     INFO] clssnmCheckForNetworkFailure: skipping 0 defined 0 
2022-03-01 13:09:00.805 :    ONMD:140347102041856: [     INFO] clssnmCheckForNetworkFailure: expiring 1  evicted 0 evicting node 0 this node 1
2022-03-01 13:09:00.805 :    ONMD:140347102041856: [     INFO] clssnmCheckForNetworkFailure: network failure
2022-03-01 13:09:00.805 :    ONMD:140347102041856: [     INFO] clsscssdcmSendNlsMsgToGMCDFromQ: sending NLS msgid =1609
2022-03-01 13:09:00.805 :    ONMD:140347102041856: [     INFO] clssscSendToLocalBCCTL: msgtype 2 foundpipe TRUE
2022-03-01 13:09:00.805 :    ONMD:140347102041856: [     INFO] clssscSendToLocalBCCTL: Sent a msg type 2 
2022-03-01 13:09:00.806 :    ONMD:140347098887936: [     INFO] clssscServerBCCMHandler: send complete for type 2
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscShmSetKey: key set = css.nls.data successfully
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clsscssdcmExit: Call to clscal flush successful and clearing the CLSSSCCTX_INIT_CALOG flag so that no further CA logging happens
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] ####### Begin Diagnostic Dump #######
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] ### Begin diagnostic data for the Core layer ###
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitNODENUM (0x00000001) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitMAINCTX_DONE (0x00000002) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitPROF_PARMS (0x00000004) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitSKGXN_DONE (0x00000008) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitGMP_ENDPT (0x00000020) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitNM_MIN (0x00000040) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitNM_COMPL (0x00000100) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitGMP_MIN (0x00000200) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitBNMS_COMPL (0x00000400) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitALARM_DONE (0x00001000) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitCTRL_COMPL (0x00002000) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitFRST_RCFG (0x00004000) not set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitHAVE_DBINFO (0x00020000) not set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitGNS_READY (0x00040000) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitHAVE_ICIN (0x00200000) not set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitACTTHRD_DONE (0x00800000) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitOPENBUSS (0x01000000) not set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitBCCM_COMPL (0x02000000) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] clssscCheckInitCmpl: Initialization state clssscInitCOMPLETE (0x20000000) set
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] Initialization not complete !Error!
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] #### End diagnostic data for the Core layer ####
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] ### Begin diagnostic data for the GM Peer layer ###
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] GMP Status:  State CMStateINIT, incarnation 0, holding incoming requests 0
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] Status for active hub node racnode2, number 2: 
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO]   Connect: Started 1   completed 1   Ready 1   Fully Connected  0   !Error!
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] #### End diagnostic data for the GM Peer layer ####
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] ### Begin diagnostic data for the NM layer ###
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] Local node racnode2, number 2, state is clssnmNodeStateJOINING
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] Status for node racnode1, number 1, uniqueness 1646133623, node ID 0
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO]   State clssnmNodeStateINACTIVE,   Connect: started 1   completed 0   OK
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] Status for node racnode2, number 2, uniqueness 1646139660, node ID 0
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO]   State clssnmNodeStateJOINING,   Connect: started 1   completed 1   OK
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] #### End diagnostic data for the NM layer ####
2022-03-01 13:09:00.806 :    ONMD:140347102041856: [     INFO] ######## End Diagnostic Dump ########
2022-03-01 13:09:00.807 :    ONMD:140347102041856: [     INFO] clsscssdcmExit: Status: 4, Abort flag: 0, Core flag: 0, Don't abort: 0, flag: 112
2022-03-01 13:09:00.807 :    ONMD:140347102041856: scls_dump_stack_all_threads - entry

2022-03-01 13:09:00.807 :    ONMD:140347102041856: scls_dump_stack_all_threads - stat of /usr/bin/gdb failed with errno 2

2022-03-01 13:09:00.807 :    ONMD:140347102041856: [     INFO] clsscssdcmExit: Now aborting
    CLSB:140347102041856: [    ERROR] Oracle Clusterware infrastructure error in ONMD (OS PID 19446): Fatal signal 6 has occurred in program onmd thread 140347102041856; nested signal count is 1
Trace file /u01/app/grid/diag/crs/racnode2/crs/trace/onmd.trc
Oracle Database 21c Clusterware Release 21.0.0.0.0 - Production
Version 21.3.0.0.0 Copyright 1996, 2021 Oracle. All rights reserved.
DDE: Flood control is not active
2022-03-01T13:09:00.821785+00:00
Incident 1 created, dump file: /u01/app/grid/diag/crs/racnode2/crs/incident/incdir_1/onmd_i1.trc
CRS-8503 [] [] [] [] [] [] [] [] [] [] [] []
2022-03-01 13:09:00.996 :    ONMD:140347092584192: [     INFO] clssscWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 1000 with cvtimewait status 4294967186
2022-03-01 13:09:01.361 :    ONMD:140347059402496: [     INFO] clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 541599638, wrtcnt, 6488, LATS 13548784, lastSeqNo 6487, uniqueness 1646133623, timestamp 1646140141/13548764
2022-03-01 13:09:01.374 :    ONMD:140347053082368: [     INFO] clssnmRcfgMgrThread: Local Join
2022-03-01 13:09:01.374 :    ONMD:140347053082368: [     INFO] clssnmLocalJoinEvent: begin on node(2), waittime 193000
2022-03-01 13:09:01.374 :    ONMD:140347053082368: [     INFO] clssnmLocalJoinEvent: set curtime (13548794) for my node
2022-03-01 13:09:01.374 :    ONMD:140347053082368: [     INFO] clssnmLocalJoinEvent: scanning 32 nodes
2022-03-01 13:09:01.374 :    ONMD:140347053082368: [     INFO] clssnmLocalJoinEvent: Node racnode1, number 1, is in an existing cluster with disk state 3
2022-03-01 13:09:01.374 :    ONMD:140347053082368: [  WARNING] clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
2022-03-01 13:09:01.375 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperProcessDisconnect: processing DISCONNECT for hendp 0x7fa4a8057be0 [000000000000885a] { gipchaEndpoint : port '360b-2c8a-112c-67e0', peer 'racnode1:nm2_racnode1-c/815c-7b12-d9d8-945b', srcCid 00000000-0000885a,  dstCid 00000000-0002cff3, numSend 0, maxSend 100, groupListType 1, hagroup 0x55f2890e9c70, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x4204 }
2022-03-01 13:09:01.375 :GIPCHAUP:140347078379264: [     INFO]  gipchaUpperMsgComplete: completing with ret gipcretConnectionLost (12), umsg 0x7fa4d80ea0a0 { msg 0x7fa4d80e5ad0, ret gipcretRequestPending (15), flags 0x2 }, msg 0x7fa4d80e5ad0 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-0000885a, dstCid 00000000-00000000 } dataLen 0
2022-03-01 13:09:01.375 :GIPCGMOD:140347078379264: [     INFO]  gipcmodGipcCallbackDisconnect: [gipc]  Disconnect forced for endp 0x7fa4a8051f00 [0000000000008801] { gipcEndpoint : localAddr 'gipcha://racnode2:360b-2c8a-112c-67e0', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/815c-7b12-d9d8-945b', numPend 1, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8042bc0, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:09:01.375 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRequest: [gipc] completing req 0x7fa4d80e5bf0 [00000000000088d5] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8051f00, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:09:01.375 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcCompleteRecv: [gipc]  Completed recv for req 0x7fa4d80e5bf0 [00000000000088d5] { gipcReceiveRequest : peerName '', data (nil), len 0, olen 0, off 0, parentEndp 0x7fa4a8051f00, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x2 }
2022-03-01 13:09:01.375 : GIPCTLS:140347051505408: [     INFO]  gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7fa4a8051f00 [0000000000008801] { gipcEndpoint : localAddr 'gipcha://racnode2:360b-2c8a-112c-67e0', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/815c-7b12-d9d8-945b', numPend 1, numReady 0, numDone 2, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8042bc0, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
2022-03-01 13:09:01.375 :GIPCGMOD:140347051505408: [     INFO]  gipcmodGipcDisconnect: [gipc]  Issued endpoint close for endp 0x7fa4a8051f00 [0000000000008801] { gipcEndpoint : localAddr 'gipcha://racnode2:360b-2c8a-112c-67e0', remoteAddr 'gipcha://racnode1:nm2_racnode1-c/815c-7b12-d9d8-945b', numPend 1, numReady 0, numDone 2, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x55f289109900, ready 1, wobj 0x7fa4a8042bc0, sendp (nil) status 0flags 0x20038606, flags-2 0x50, usrFlags 0x0 }
ifrankrui commented 2 years ago

The two nodes can talk to each other, and they can also reach the connection manager and storage containers.

@psaini79 Please have a look and let me know if we need more info. Thanks!

psaini79 commented 2 years ago

@ifrankrui

The configuration seems to be correct. Please share the following:

From the Docker host:

systemctl status firewalld
getenforce

From both containers (a convenience sketch bundling these checks follows the list):

cat /etc/hosts
nslookup racnode1
nslookup racnode2
nslookup <vips>
nslookup <scan>
ping <vips>
ping <scan>
cat /etc/resolv.conf
/bin/netstat -in     # <<Check the MTU size on eth0>>
# racnode1
ping -s <MTU> -c 2  -I 192.168.17.150 192.168.17.151
# racnode2
ping -s <MTU> -c 2  -I 192.168.17.151 192.168.17.150
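
For convenience, a minimal shell sketch along these lines could run the checks from inside one container. It is only an illustration, not part of the images: the hostnames, VIP/SCAN names, and private IPs below are the ones from this issue and stand in for your own values.

#!/bin/bash
# Hypothetical helper (not shipped with the repo): runs the basic DNS and
# interconnect checks requested above from inside a RAC container.
MY_PRIV_IP=192.168.17.151          # this node's private interconnect IP
PEER_PRIV_IP=192.168.17.150        # the other node's private interconnect IP

cat /etc/hosts
cat /etc/resolv.conf

# Short names resolve via the search domain (example.com) in resolv.conf
for name in racnode1 racnode2 racnode1-vip racnode2-vip racnode-scan; do
    nslookup "$name"               # each lookup should return the expected address
done

# MTU of eth0 (second column of the kernel interface table)
/bin/netstat -in | awk '$1 == "eth0" { print "eth0 MTU:", $2 }'

# Interconnect reachability at full packet size
ping -s 1500 -c 2 -I "$MY_PRIV_IP" "$PEER_PRIV_IP"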

Also, can you please share the VM details? Have you deployed in a cloud? If yes, which cloud?

ifrankrui commented 2 years ago

Thanks @psaini79

I stopped the Linux firewall before running the containers. Please see the following.

From the Docker host:

[root@vm-oracle ansible]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
[root@vm-oracle ansible]# getenforce
Permissive

From node1:

[grid@racnode1 ~]$ cat /etc/hosts
127.0.0.1   localhost.localdomain   localhost

172.16.1.150    racnode1.example.com    racnode1

192.168.17.150  racnode1-priv.example.com   racnode1-priv

172.16.1.160    racnode1-vip.example.com    racnode1-vip

172.16.1.15 racnode-cman1.example.com   racnode-cman1

172.16.1.151    racnode2.example.com    racnode2

192.168.17.151  racnode2-priv.example.com   racnode2-priv

172.16.1.161    racnode2-vip.example.com    racnode2-vip

172.16.1.70 racnode-scan.example.com    racnode-scan
[grid@racnode1 ~]$ nslookup racnode1
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode1.example.com
Address: 172.16.1.150

[grid@racnode1 ~]$ nslookup racnode2
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode2.example.com
Address: 172.16.1.151

[grid@racnode1 ~]$ nslookup 172.16.1.160  
160.1.16.172.in-addr.arpa   name = racnode1-vip.example.com.

[grid@racnode1 ~]$ nslookup racnode-cman1
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode-cman1.example.com
Address: 172.16.1.2

[grid@racnode1 ~]$ nslookup 172.16.1.161
161.1.16.172.in-addr.arpa   name = racnode2-vip.example.com.

[grid@racnode1 ~]$ nslookup 172.16.1.70
70.1.16.172.in-addr.arpa    name = racnode1-scan.example.com.

[grid@racnode1 ~]$ nslookup racnode-scan.example.com
Server:     172.16.1.25
Address:    172.16.1.25#53
Name:   racnode-scan.example.com
Address: 172.16.1.172
Name:   racnode-scan.example.com
Address: 172.16.1.170
Name:   racnode-scan.example.com
Address: 172.16.1.171

[grid@racnode1 ~]$ nslookup racnode2-priv.example.com
Server:     172.16.1.25
Address:    172.16.1.25#53

** server can't find racnode2-priv.example.com: NXDOMAIN

[grid@racnode1 ~]$ nslookup racnode1-priv.example.com
Server:     172.16.1.25
Address:    172.16.1.25#53

** server can't find racnode1-priv.example.com: NXDOMAIN

[grid@racnode1 ~]$ ping racnode-scan.example.com
PING racnode-scan.example.com (172.16.1.70) 56(84) bytes of data.
From racnode1.example.com (172.16.1.150) icmp_seq=1 Destination Host Unreachable
From racnode1.example.com (172.16.1.150) icmp_seq=2 Destination Host Unreachable
From racnode1.example.com (172.16.1.150) icmp_seq=3 Destination Host Unreachable
^C
--- racnode-scan.example.com ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4134ms
pipe 4
[grid@racnode1 ~]$ ping 172.16.1.70
PING 172.16.1.70 (172.16.1.70) 56(84) bytes of data.
From 172.16.1.150 icmp_seq=1 Destination Host Unreachable
From 172.16.1.150 icmp_seq=2 Destination Host Unreachable
From 172.16.1.150 icmp_seq=3 Destination Host Unreachable
^C
--- 172.16.1.70 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4104ms
pipe 4
[grid@racnode1 ~]$ ping racnode2-vip.example.com
PING racnode2-vip.example.com (172.16.1.161) 56(84) bytes of data.
From racnode1.example.com (172.16.1.150) icmp_seq=1 Destination Host Unreachable
From racnode1.example.com (172.16.1.150) icmp_seq=2 Destination Host Unreachable
From racnode1.example.com (172.16.1.150) icmp_seq=3 Destination Host Unreachable
From racnode1.example.com (172.16.1.150) icmp_seq=4 Destination Host Unreachable
From racnode1.example.com (172.16.1.150) icmp_seq=5 Destination Host Unreachable
From racnode1.example.com (172.16.1.150) icmp_seq=6 Destination Host Unreachable
^C
--- racnode2-vip.example.com ping statistics ---
8 packets transmitted, 0 received, +6 errors, 100% packet loss, time 7182ms
pipe 4
[grid@racnode1 ~]$ ping racnode2-priv.example.com
PING racnode2-priv.example.com (192.168.17.151) 56(84) bytes of data.
64 bytes from racnode2-priv.example.com (192.168.17.151): icmp_seq=1 ttl=64 time=0.056 ms
64 bytes from racnode2-priv.example.com (192.168.17.151): icmp_seq=2 ttl=64 time=0.055 ms
64 bytes from racnode2-priv.example.com (192.168.17.151): icmp_seq=3 ttl=64 time=0.056 ms
64 bytes from racnode2-priv.example.com (192.168.17.151): icmp_seq=4 ttl=64 time=0.051 ms
^C
--- racnode2-priv.example.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3068ms
rtt min/avg/max/mdev = 0.051/0.054/0.056/0.007 ms
[grid@racnode1 ~]$ cat /etc/resolv.conf
search example.com
nameserver 172.16.1.25

[grid@racnode1 ~]$ /bin/netstat -in 
Kernel Interface table
Iface             MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0             1500     5389      0      0 0          5885      0      0      0 BMRU
eth0:1           1500      - no statistics available -                        BMRU
eth1             1500    16427      0      0 0         18801      0      0      0 BMRU
eth1:1           1500      - no statistics available -                        BMRU
eth1:2           1500      - no statistics available -                        BMRU
eth1:3           1500      - no statistics available -                        BMRU
eth1:4           1500      - no statistics available -                        BMRU
lo              65536   195060      0      0 0        195060      0      0      0 LRU
[grid@racnode1 ~]$ ping -s 1500 -c 2  -I 192.168.17.150 192.168.17.151
PING 192.168.17.151 (192.168.17.151) from 192.168.17.150 : 1500(1528) bytes of data.
1508 bytes from 192.168.17.151: icmp_seq=1 ttl=64 time=0.093 ms
1508 bytes from 192.168.17.151: icmp_seq=2 ttl=64 time=0.077 ms

--- 192.168.17.151 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1007ms
rtt min/avg/max/mdev = 0.077/0.085/0.093/0.008 ms
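
The 1500-byte payload above goes out as 1528 bytes on the wire, exceeding the 1500-byte MTU shown by netstat, so it is sent fragmented: it proves basic connectivity but not that full-MTU frames pass unfragmented, which is what the interconnect needs. A stricter sketch pins the payload to the MTU (1472 = 1500 minus 20 bytes of IP header and 8 bytes of ICMP header) and forbids fragmentation:

ping -M do -s 1472 -c 2 -I 192.168.17.150 192.168.17.151   # must succeed with no "message too long" errors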

From node2:

[grid@racnode2 ~]$ cat /etc/hosts
127.0.0.1   localhost.localdomain   localhost
172.16.1.150    racnode1.example.com    racnode1
192.168.17.150  racnode1-priv.example.com   racnode1-priv
172.16.1.160    racnode1-vip.example.com    racnode1-vip
172.16.1.15 racnode-cman1.example.com   racnode-cman1
172.16.1.151    racnode2.example.com    racnode2
192.168.17.151  racnode2-priv.example.com   racnode2-priv
172.16.1.161    racnode2-vip.example.com    racnode2-vip
172.16.1.70 racnode-scan.example.com    racnode-scan
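
Two entries in this file deserve a second look: racnode-cman1 is 172.16.1.15 here but DNS resolved it to 172.16.1.2 earlier, and racnode-scan is pinned to the single address 172.16.1.70 while DNS serves the three SCAN addresses. If the static entries take precedence, that would explain why both nodes ping the SCAN at 172.16.1.70. A sketch for comparing the two views side by side:

for h in racnode-cman1 racnode-scan; do
  echo "== $h"
  grep "$h" /etc/hosts                                # static entry, seen first by the resolver
  nslookup "$h.example.com" 172.16.1.25 | tail -n 3   # DNS view for comparison
done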
[grid@racnode2 ~]$ nslookup racnode1
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode1.example.com
Address: 172.16.1.150

[grid@racnode2 ~]$ nslookup racnode2
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode2.example.com
Address: 172.16.1.151
[grid@racnode2 ~]$ nslookup 172.16.1.150
150.1.16.172.in-addr.arpa   name = racnode1.example.com.

[grid@racnode2 ~]$ nslookup racnode1.example.com
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode1.example.com
Address: 172.16.1.150

[grid@racnode2 ~]$ nslookup racnode1-priv.example.com
Server:     172.16.1.25
Address:    172.16.1.25#53

** server can't find racnode1-priv.example.com: NXDOMAIN

[grid@racnode2 ~]$ nslookup racnode1-vip.example.com
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode1-vip.example.com
Address: 172.16.1.160

[grid@racnode2 ~]$ nslookup racnode2-vip.example.com
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode2-vip.example.com
Address: 172.16.1.161

[grid@racnode2 ~]$ nslookup racnode-scan.example.com
Server:     172.16.1.25
Address:    172.16.1.25#53

Name:   racnode-scan.example.com
Address: 172.16.1.171
Name:   racnode-scan.example.com
Address: 172.16.1.172
Name:   racnode-scan.example.com
Address: 172.16.1.170

[grid@racnode2 ~]$ ping racnode1-vip.example.com
PING racnode1-vip.example.com (172.16.1.160) 56(84) bytes of data.
64 bytes from racnode1-vip.example.com (172.16.1.160): icmp_seq=1 ttl=64 time=0.069 ms
64 bytes from racnode1-vip.example.com (172.16.1.160): icmp_seq=2 ttl=64 time=0.057 ms
64 bytes from racnode1-vip.example.com (172.16.1.160): icmp_seq=3 ttl=64 time=0.053 ms
^C
--- racnode1-vip.example.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2049ms
rtt min/avg/max/mdev = 0.053/0.059/0.069/0.011 ms
[grid@racnode2 ~]$ ping racnode-scan.example.com
PING racnode-scan.example.com (172.16.1.70) 56(84) bytes of data.
From racnode2.example.com (172.16.1.151) icmp_seq=1 Destination Host Unreachable
From racnode2.example.com (172.16.1.151) icmp_seq=2 Destination Host Unreachable
From racnode2.example.com (172.16.1.151) icmp_seq=3 Destination Host Unreachable
From racnode2.example.com (172.16.1.151) icmp_seq=4 Destination Host Unreachable
From racnode2.example.com (172.16.1.151) icmp_seq=5 Destination Host Unreachable
From racnode2.example.com (172.16.1.151) icmp_seq=6 Destination Host Unreachable
^C
--- racnode-scan.example.com ping statistics ---
9 packets transmitted, 0 received, +6 errors, 100% packet loss, time 8199ms
pipe 4
[grid@racnode2 ~]$ ping 172.16.1.70
PING 172.16.1.70 (172.16.1.70) 56(84) bytes of data.
From 172.16.1.151 icmp_seq=1 Destination Host Unreachable
From 172.16.1.151 icmp_seq=2 Destination Host Unreachable
From 172.16.1.151 icmp_seq=3 Destination Host Unreachable
^C
--- 172.16.1.70 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4099ms
pipe 4
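
Since 172.16.1.70 is unreachable from both nodes, it is worth confirming, on the node where Grid Infrastructure is up, which addresses the SCAN VIPs are actually registered and running on. A sketch, run as the grid user with this image's Grid home:

$GRID_HOME/bin/srvctl config scan                   # SCAN name and the VIP addresses it was configured with
$GRID_HOME/bin/srvctl status scan                   # which SCAN VIPs are up, and on which node
$GRID_HOME/bin/crsctl stat res -t | grep -i scan    # resource-level view of the SCAN VIPs and listeners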
[grid@racnode2 ~]$ cat /etc/resolv.conf
search example.com
nameserver 172.16.1.25
[grid@racnode2 ~]$ /bin/netstat -in 
Kernel Interface table
Iface             MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0             1500     5951      0      0 0          5902      0      0      0 BMRU
eth1             1500    18675      0      0 0         16339      0      0      0 BMRU
lo              65536     8767      0      0 0          8767      0      0      0 LRU
[grid@racnode2 ~]$ ping -s 1500 -c 2  -I 192.168.17.151 192.168.17.150
PING 192.168.17.150 (192.168.17.150) from 192.168.17.151 : 1500(1528) bytes of data.
1508 bytes from 192.168.17.150: icmp_seq=1 ttl=64 time=0.090 ms
1508 bytes from 192.168.17.150: icmp_seq=2 ttl=64 time=0.075 ms

--- 192.168.17.150 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1039ms
rtt min/avg/max/mdev = 0.075/0.082/0.090/0.011 ms
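
Before retrying the ADDNODE step, a node-connectivity sweep with cluvfy can rule out interface, subnet, and MTU mismatches in one pass. A minimal sketch, run as the grid user from node1:

$GRID_HOME/bin/cluvfy comp nodecon -n racnode1,racnode2 -verbose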
ifrankrui commented 2 years ago

Moreover, the Docker host is an Azure VM: Standard D16 v4 (16 vCPUs, 64 GiB memory), with a 256 GB OS disk and a 128 GB data disk.

psaini79 commented 2 years ago

@ifrankrui

For further assistance, I recommend running Oracle RAC on Docker on-prem, on KVM or VirtualBox VMs provisioned with OEL 7.x and UEK5, because among cloud platforms Oracle RAC is only supported in the Oracle Cloud: https://www.oracle.com/technetwork/database/options/clustering/overview/rac-cloud-support-2843861.pdf

For details, please refer to the following GitHub thread:

https://github.com/oracle/docker-images/issues/1590

If you still have any questions, please let me know and I will try to get more details.

ifrankrui commented 2 years ago

Hi @psaini79

I came back to work on this issue, and I have also followed the investigation in https://github.com/oracle/docker-images/issues/1590.

If I start node2 before node1, node2 comes up and runs, but node1 is then unable to join. Either way, only one node can ever be active in my cluster.

The cluster logs the following error in /u01/app/grid/diag/crs/racnode1/crs/trace/ocssd.trc:

2022-03-17 13:49:44.293 :    CSSD:3052607232: [     INFO] clssnmeventhndlr: gipcAssociate endp 0x179c1 in container 0x75b type of conn gipcha
2022-03-17 13:49:44.296 : GIPCTLS:3052607232:  gipcmodTlsAuthStart: TLS HANDSHAKE - SUCCESSFUL
2022-03-17 13:49:44.296 : GIPCTLS:3052607232:  gipcmodTlsAuthStart: Peer is anonymous 
2022-03-17 13:49:44.296 : GIPCTLS:3052607232:  gipcmodTlsAuthStart: endpoint 0x7fa68c06ebc0 [00000000000179c1] { gipcEndpoint : localAddr 'gipcha://racnode2:nm2_racnode1-c/79a6-41c8-3402-4cfb', remoteAddr 'gipcha://racnode1:f103-d105-7a95-cfe2', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0x561d604e7e90, ready 1, wobj 0x7fa68c0b7ec0, sendp (nil) status 0flags 0x20138606, flags-2 0x10, usrFlags 0x0 }, auth state: gipcmodTlsAuthStateReady (3)
2022-03-17 13:49:44.296 : GIPCTLS:3052607232:  gipcmodTlsAuthReady: TLS Auth completed Successfully
2022-03-17 13:49:44.297 :    CSSD:3052607232: [    ERROR] clssnmConnComplete: Rejecting connection from node 1 as MultiNode RAC is not supported in this Configuration
2022-03-17 13:49:44.297 : GIPCTLS:3052607232:  gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7fa68c06ebc0 [00000000000179c1] { gipcEndpoint : localAddr 'gipcha://racnode2:nm2_racnode1-c/79a6-41c8-3402-4cfb', remoteAddr 'gipcha://racnode1:f103-d105-7a95-cfe2', numPend 1, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 22360, readyRef 0x561d604c7b40, ready 0, wobj 0x7fa68c0b7ec0, sendp (nil) status 0flags 0x26138606, flags-2 0x50, usrFlags 0x0 }
2022-03-17 13:49:44.297 :GIPCGMOD:3052607232:  gipcmodGipcDisconnect: [gipc]  Issued endpoint close for endp 0x7fa68c06ebc0 [00000000000179c1] { gipcEndpoint : localAddr 'gipcha://racnode2:nm2_racnode1-c/79a6-41c8-3402-4cfb', remoteAddr 'gipcha://racnode1:f103-d105-7a95-cfe2', numPend 1, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 22360, readyRef 0x561d604c7b40, ready 0, wobj 0x7fa68c0b7ec0, sendp (nil) status 0flags 0x26138606, flags-2 0x50, usrFlags 0x0 }

I saw the same clssnmConnComplete rejection when I joined node2 to the cluster. Do you have any idea what causes this error?
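
If the rejection is platform-based, it may help to know what the guest reports about its hypervisor; on Azure the VM openly advertises Hyper-V. A quick sketch, run as root:

systemd-detect-virt                  # prints "microsoft" on Hyper-V/Azure guests
dmidecode -s system-manufacturer     # prints "Microsoft Corporation" on Azure VMs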

** The same error is discussed in https://balazspapp.wordpress.com/2018/12/09/you-may-not-run-multinode-rac-because-it-is-not-supported-or-certified/. I tried the workaround of blocking 169.254.169.254 on all the nodes, but it did not help.
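
For concreteness, a minimal sketch of that workaround (dropping traffic to the link-local metadata address; run as root on each node, shown for illustration since it did not resolve the issue here):

iptables -I OUTPUT -d 169.254.169.254 -j DROP    # block the cloud metadata endpoint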

psaini79 commented 2 years ago

@ifrankrui

Please share the details of your environment. Are you trying this on-premises or in the cloud?