oracle / docker-images

Official source of container configurations, images, and examples for Oracle products and projects
https://developer.oracle.com/use-cases/#containers
Universal Permissive License v1.0

Oracle RAC 19c failed on racnode2 during CRSD check #1590

Closed: hprop closed this issue 3 years ago

hprop commented 4 years ago

Hi! I am trying to spin up an Oracle 19c RAC environment with two nodes. I followed the steps described in the README files and successfully brought up the first node, racnode1. However, when trying to add a second node, I hit the following error:

04-22-2020 13:10:16 UTC :  : Running Node Addition and cluvfy test for node racnode2
04-22-2020 13:10:16 UTC :  : Copying /tmp/grid_addnode.rsp on remote node racnode1
04-22-2020 13:10:16 UTC :  : Running GridSetup.sh on racnode1 to add the node to existing cluster
04-22-2020 13:11:07 UTC :  : Node Addition performed. removing Responsefile
04-22-2020 13:11:07 UTC :  : Running root.sh on node racnode2
04-22-2020 13:11:07 UTC :  : Nodes in the cluster racnode2
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
04-22-2020 13:23:30 UTC :  : Checking Cluster
04-22-2020 13:23:30 UTC :  : Cluster Check passed
04-22-2020 13:23:30 UTC :  : Cluster Check went fine
04-22-2020 13:23:31 UTC : : CRSD Check failed!
04-22-2020 13:23:31 UTC : : Error has occurred in Grid Setup, Please verify!

The complete docker logs for both nodes are attached below:

Further info from the docker host:

$ uname -r
4.14.35-1902.301.1.el7uek.x86_64
$ cat /etc/oracle-release
Oracle Linux Server release 7.8
$ docker info
Client:
 Debug Mode: false

Server:
 Containers: 4
  Running: 4
  Paused: 0
  Stopped: 0
 Images: 106
 Server Version: 19.03.1-ol
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: c4446665cb9c30056f4998ed953e6d4ff22c7c39
 runc version: 4bb1fe4ace1a32d3676bb98f5d3b6a4e32bf6c58
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
  selinux
 Kernel Version: 4.14.35-1902.301.1.el7uek.x86_64
 Operating System: Oracle Linux Server 7.8
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.15GiB
 Name: ip-172-31-2-173.eu-west-1.compute.internal
 ID: WMBC:AVXV:SCZB:AID6:OJRK:35RK:ABR2:MQIT:HER7:RN5E:UVBJ:6JFC
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Registries: 
$ systemctl status -l docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-04-22 11:49:57 UTC; 6h ago
     Docs: https://docs.docker.com
 Main PID: 26317 (dockerd)
    Tasks: 30
   Memory: 139.2M
   CGroup: /system.slice/docker.service
           ├─ 1868 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 1522 -container-ip 172.16.1.15 -container-port 1521
           └─26317 /usr/bin/dockerd --selinux-enabled --cpu-rt-runtime=950000 --cpu-rt-period=1000000 -H fd:// --containerd=/run/containerd/containerd.sock
$ docker images 
REPOSITORY                  TAG                 IMAGE ID            CREATED             SIZE
oracle/rac-storage-server   19.3.0              dd34bc25e944        8 hours ago         302MB
oracle/client-cman          19.3.0              04a55847e7dc        8 hours ago         3.53GB
<none>                      <none>              b2506d08962a        12 hours ago        302MB
<none>                      <none>              b2da3538742e        12 hours ago        3.53GB
<none>                      <none>              c1fb598bb4f3        24 hours ago        302MB
<none>                      <none>              96cc6cee9277        24 hours ago        3.53GB
oracle/database-rac         19.3.0              6519f4ebb17b        26 hours ago        20.5GB
<none>                      <none>              10a28c0fb715        2 days ago          20.5GB
<none>                      <none>              3bce81cc7042        2 days ago          302MB
<none>                      <none>              90495f097cda        2 days ago          3.53GB
oraclelinux                 7-slim              f23503228fa1        12 days ago         120MB
hello-world                 latest              bf756fb1ae65        3 months ago        13.3kB
$ docker ps -a
CONTAINER ID        IMAGE                              COMMAND                  CREATED             STATUS              PORTS                              NAMES
3214f674a6fc        oracle/database-rac:19.3.0         "/usr/sbin/oracleinit"   7 hours ago         Up 2 hours                                             racnode2
f498cd892f55        oracle/database-rac:19.3.0         "/usr/sbin/oracleinit"   8 hours ago         Up 8 hours                                             racnode1
60f8bdc5e59f        oracle/rac-storage-server:19.3.0   "/bin/sh -c 'exec $S…"   8 hours ago         Up 8 hours                                             racnode-storage
3c810e23ec83        oracle/client-cman:19.3.0          "/bin/sh -c 'exec $S…"   8 hours ago         Up 8 hours          5500/tcp, 0.0.0.0:1522->1521/tcp   racnode-cman

Any help would be very appreciated. Also, please let me know if further information is required. Thanks!

psaini79 commented 4 years ago

@hprop

Please provide the following logs. Log in to racnode2 and run:

sudo /bin/bash
cd /u01/app/grid
tar -cvzf racnode2_gridlogs.tgz *

then upload the resulting file.

Also, paste the output of the following:

systemctl status | grep running
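
If it is easier to drive from the Docker host, the same collection can be scripted; a small sketch, assuming passwordless sudo for the grid user inside the container (as the later sessions in this thread suggest), and writing the archive to /tmp so it is not swept into itself:

docker exec racnode2 /bin/bash -lc "sudo tar -czf /tmp/racnode2_gridlogs.tgz -C /u01/app/grid ."
docker cp racnode2:/tmp/racnode2_gridlogs.tgz .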

hprop commented 4 years ago

Thanks for your quick reply @psaini79.

racnode2_gridlogs.tgz

And systemctl status from racnode2:

$ systemctl status|grep running
    State: running
           ├─18234 grep --color=auto running

This is the full systemctl status output for racnode2 and the docker host, just in case:

psaini79 commented 4 years ago

@hprop

It seems racnode1 was not reachable from racnode2 and the node addition failed. Please provide the following:

docker exec -i -t racnode1 /bin/bash
ping racnode2
ssh racnode2

cat /etc/hosts

Try the same thing from racnode2 to racnode1. Also, paste the output of the following from racnode1:

crsctl check cluster
crsctl check crs
olsnodes
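
If it helps, the same checks can be captured non-interactively from the Docker host; a sketch, assuming docker exec lands in the grid user's environment with crsctl on its PATH (as the sessions below suggest):

docker exec racnode1 /bin/bash -lc "ping -c 3 racnode2; cat /etc/hosts"
docker exec racnode1 /bin/bash -lc "crsctl check cluster; crsctl check crs; olsnodes"
docker exec racnode2 /bin/bash -lc "ping -c 3 racnode1; cat /etc/hosts"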

hprop commented 4 years ago

Thanks again for your help @psaini79.

From racnode1:

[grid@racnode1 ~]$ ping racnode2
PING racnode2.example.com (172.16.1.151) 56(84) bytes of data.
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=1 ttl=64 time=0.077 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=2 ttl=64 time=0.066 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=3 ttl=64 time=0.060 ms
64 bytes from racnode2.example.com (172.16.1.151): icmp_seq=4 ttl=64 time=0.063 ms
^C
--- racnode2.example.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3075ms
rtt min/avg/max/mdev = 0.060/0.066/0.077/0.010 ms
[grid@racnode1 ~]$ ssh racnode2
Last login: Wed Apr 22 17:54:21 2020
[grid@racnode2 ~]$ hostname
racnode2
[grid@racnode1 ~]$ cat /etc/hosts
127.0.0.1   localhost.localdomain   localhost

172.16.1.150    racnode1.example.com    racnode1

192.168.17.150  racnode1-priv.example.com   racnode1-priv

172.16.1.160    racnode1-vip.example.com    racnode1-vip

172.16.1.70 racnode-scan.example.com    racnode-scan

172.16.1.15 racnode-cman1.example.com   racnode-cman1

172.16.1.151    racnode2.example.com    racnode2

192.168.17.151  racnode2-priv.example.com   racnode2-priv

172.16.1.161    racnode2-vip.example.com    racnode2-vip
[grid@racnode1 ~]$ crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[grid@racnode1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[grid@racnode1 ~]$ olsnodes
racnode1
racnode2

From racnode2:

[grid@racnode2 ~]$ ping racnode1
PING racnode1.example.com (172.16.1.150) 56(84) bytes of data.
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=1 ttl=64 time=0.066 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=2 ttl=64 time=0.080 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=3 ttl=64 time=0.060 ms
64 bytes from racnode1.example.com (172.16.1.150): icmp_seq=4 ttl=64 time=0.046 ms
^C
--- racnode1.example.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3090ms
rtt min/avg/max/mdev = 0.046/0.063/0.080/0.012 ms
[grid@racnode2 ~]$ ssh racnode1
Last login: Mon Apr 27 09:13:42 2020
[grid@racnode1 ~]$ hostname
racnode1
[grid@racnode2 ~]$ cat /etc/hosts
127.0.0.1   localhost.localdomain   localhost

172.16.1.150    racnode1.example.com    racnode1

192.168.17.150  racnode1-priv.example.com   racnode1-priv

172.16.1.160    racnode1-vip.example.com    racnode1-vip

172.16.1.70 racnode-scan.example.com    racnode-scan

172.16.1.15 racnode-cman1.example.com   racnode-cman1

172.16.1.151    racnode2.example.com    racnode2

192.168.17.151  racnode2-priv.example.com   racnode2-priv

172.16.1.161    racnode2-vip.example.com    racnode2-vip
psaini79 commented 4 years ago

@hprop

I looked at the logs and it seems racnode2 is unable to communicate with racnode1 on the private interconnect. I found the following errors:

gipcd.trc

2020-04-23 01:18:06.383 :GIPCHALO:3033978624:  gipchaLowerProcessNode: no valid interfaces found to node for 4294967286 ms, node 0x7f138c2a63f0 { host 'racnode1', haName 'gipcd_ha_name', srcLuid 3719e9da-39e601dd, dstLuid 60fc3aaa-2bfe43fc numInf 1, sentRegister 1, localMonitor 1, baseStream 0x7f138c2a0fe0 type gipchaNodeType12001 (20), nodeIncarnation 3f82d948-042c614a, incarnation 0, cssIncarnation 1, negDigest 4294967295, roundTripTime 368 lastSeenPingAck 885 nextPingId 887 latencySrc 293 latencyDst 75 flags 0x860080c}

cssd.log

clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 483023861, wrtcnt, 20868, LATS 90856264, lastSeqNo 20867, uniqueness 1587557846, timestamp 1587578679/90855434

Please check the following and provide the details:

Docker Host

systemctl status firewalld
getenforce 

Log in to racnode1 and racnode2 and paste the output of the following from each container:

ifconfig
ping -I eth0 192.168.17.151
ping -S 192.168.17.150 192.168.17.151

Note: Make sure the 192.168.17.0/24 subnet is on eth0; if not, use the network interface that carries that subnet for the ping.

Log in to racnode2 and paste the output of the following from the container:

ifconfig
ping -I eth0 192.168.17.150
ping -S 192.168.17.151 192.168.17.150

Note: Make sure the 192.168.17.0/24 subnet is on eth0; if not, use the network interface that carries that subnet for the ping.
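
Before running the pings, it may also help to confirm which interface actually carries the private subnet, and whether any host-level packet filtering is in place (iptables can be active even when firewalld is not installed); a sketch:

ip -o -4 addr show | grep 192.168.17        # inside racnode1 and racnode2: which interface holds the private subnet
sudo iptables -L -n -v                      # on the Docker host: any filtering rules?
sudo systemctl status iptables              # on the Docker host: is the iptables service active?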

hprop commented 4 years ago

@psaini79 please find below the requested info:

Docker host:

$ systemctl status firewalld
Unit firewalld.service could not be found.
$ getenforce
Permissive

From racnode1:

[grid@racnode1 ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.17.150  netmask 255.255.255.0  broadcast 192.168.17.255
        ether 02:42:c0:a8:11:96  txqueuelen 0  (Ethernet)
        RX packets 424002  bytes 59752360 (56.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 424341  bytes 59885802 (57.1 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 169.254.7.39  netmask 255.255.224.0  broadcast 169.254.31.255
        ether 02:42:c0:a8:11:96  txqueuelen 0  (Ethernet)

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.150  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:96  txqueuelen 0  (Ethernet)
        RX packets 217038  bytes 57680347 (55.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 235780  bytes 102696763 (97.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.160  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:96  txqueuelen 0  (Ethernet)

eth1:2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.70  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:96  txqueuelen 0  (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 7462241  bytes 24503423536 (22.8 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7462241  bytes 24503423536 (22.8 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
[grid@racnode1 ~]$ ping -I eth0 192.168.17.151
PING 192.168.17.151 (192.168.17.151) from 192.168.17.150 eth0: 56(84) bytes of data.
64 bytes from 192.168.17.151: icmp_seq=1 ttl=64 time=0.054 ms
64 bytes from 192.168.17.151: icmp_seq=2 ttl=64 time=0.045 ms
64 bytes from 192.168.17.151: icmp_seq=3 ttl=64 time=0.056 ms
64 bytes from 192.168.17.151: icmp_seq=4 ttl=64 time=0.061 ms
64 bytes from 192.168.17.151: icmp_seq=5 ttl=64 time=0.057 ms
^C
--- 192.168.17.151 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4092ms
rtt min/avg/max/mdev = 0.045/0.054/0.061/0.009 ms
[grid@racnode1 ~]$ ping -S 192.168.17.150 192.168.17.151
PING 192.168.17.151 (192.168.17.151) 56(84) bytes of data.
64 bytes from 192.168.17.151: icmp_seq=1 ttl=64 time=0.070 ms
64 bytes from 192.168.17.151: icmp_seq=2 ttl=64 time=0.057 ms
64 bytes from 192.168.17.151: icmp_seq=3 ttl=64 time=0.046 ms
64 bytes from 192.168.17.151: icmp_seq=4 ttl=64 time=0.045 ms
64 bytes from 192.168.17.151: icmp_seq=5 ttl=64 time=0.080 ms
64 bytes from 192.168.17.151: icmp_seq=6 ttl=64 time=0.055 ms
^C
--- 192.168.17.151 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5146ms
rtt min/avg/max/mdev = 0.045/0.058/0.080/0.016 ms

From racnode2:

[grid@racnode2 ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.17.151  netmask 255.255.255.0  broadcast 192.168.17.255
        ether 02:42:c0:a8:11:97  txqueuelen 0  (Ethernet)
        RX packets 403234  bytes 56877333 (54.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 402794  bytes 56647857 (54.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.151  netmask 255.255.255.0  broadcast 172.16.1.255
        ether 02:42:ac:10:01:97  txqueuelen 0  (Ethernet)
        RX packets 216643  bytes 68650401 (65.4 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 198824  bytes 53836522 (51.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 1571788  bytes 250479733 (238.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1571788  bytes 250479733 (238.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
[grid@racnode2 ~]$ ping -I eth0 192.168.17.150
PING 192.168.17.150 (192.168.17.150) from 192.168.17.151 eth0: 56(84) bytes of data.
64 bytes from 192.168.17.150: icmp_seq=1 ttl=64 time=0.060 ms
64 bytes from 192.168.17.150: icmp_seq=2 ttl=64 time=0.065 ms
64 bytes from 192.168.17.150: icmp_seq=3 ttl=64 time=0.079 ms
64 bytes from 192.168.17.150: icmp_seq=4 ttl=64 time=0.058 ms
64 bytes from 192.168.17.150: icmp_seq=5 ttl=64 time=0.059 ms
^C
--- 192.168.17.150 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4111ms
rtt min/avg/max/mdev = 0.058/0.064/0.079/0.009 ms
[grid@racnode2 ~]$ ping -S 192.168.17.151 192.168.17.150
PING 192.168.17.150 (192.168.17.150) 56(84) bytes of data.
64 bytes from 192.168.17.150: icmp_seq=1 ttl=64 time=0.067 ms
64 bytes from 192.168.17.150: icmp_seq=2 ttl=64 time=0.072 ms
64 bytes from 192.168.17.150: icmp_seq=3 ttl=64 time=0.051 ms
64 bytes from 192.168.17.150: icmp_seq=4 ttl=64 time=0.067 ms
64 bytes from 192.168.17.150: icmp_seq=5 ttl=64 time=0.052 ms
^C
--- 192.168.17.150 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4104ms
rtt min/avg/max/mdev = 0.051/0.061/0.072/0.013 ms

Thanks!

psaini79 commented 4 years ago

@hprop

Your network setup seems to be right, but I am not sure why the network heartbeat error is reported during node addition.

Please do the following. Log in to racnode1 and racnode2, capture the following, and paste the output:

route -n

Log in to racnode1 and run:

$GRID_HOME/bin/crsctl stat res -t
sudo /bin/bash
$GRID_HOME/bin/crsctl stop crs -f
$GRID_HOME/bin/crsctl start crs
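
If GRID_HOME is not set in the root shell you get from sudo, the full Grid home path can be used instead; a small sketch, using the location that appears later in this thread:

/u01/app/19.3.0/grid/bin/crsctl stop crs -f
/u01/app/19.3.0/grid/bin/crsctl start crs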

Also, re-run root.sh on racnode2 as the root user:

$GRID_HOME/root.sh

Let me know if it is still failing with the network heartbeat error.

Since you are trying to bring up a two-node RAC, did you try to bring up RAC on Docker/containers using a response file? Please check the following:

https://github.com/oracle/docker-images/tree/master/OracleDatabase/RAC/OracleRealApplicationClusters/samples/customracdb

I am trying to understand whether the error occurs only during AddNode, or also when setting up a two-node RAC with a response file in your environment.

hprop commented 4 years ago

Thanks @psaini79, please find below the info.

Route tables for racnode1 and racnode2:

[grid@racnode1 ~]$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.17.1    0.0.0.0         UG    0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.224.0   U     0      0        0 eth0
172.16.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.17.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
[grid@racnode2 ~]$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.17.1    0.0.0.0         UG    0      0        0 eth0
172.16.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.17.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0

Clusterware resources info from racnode1:

[grid@racnode1 ~]$ $GRID_HOME/bin/crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       racnode1                 STABLE
ora.chad
               ONLINE  ONLINE       racnode1                 STABLE
ora.net1.network
               ONLINE  ONLINE       racnode1                 STABLE
ora.ons
               ONLINE  ONLINE       racnode1                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       racnode1                 STABLE
      2        OFFLINE OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       racnode1                 STABLE
      2        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       racnode1                 Started,STABLE
      2        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       racnode1                 STABLE
      2        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.orclcdb.db
      1        ONLINE  ONLINE       racnode1                 Open,HOME=/u01/app/o
                                                             racle/product/19.3.0
                                                             /dbhome_1,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.racnode1.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
--------------------------------------------------------------------------------
bash-4.2# /u01/app/19.3.0/grid/bin/crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'racnode1'
CRS-2673: Attempting to stop 'ora.crsd' on 'racnode1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on server 'racnode1'
CRS-2673: Attempting to stop 'ora.qosmserver' on 'racnode1'
CRS-2673: Attempting to stop 'ora.chad' on 'racnode1'
CRS-2673: Attempting to stop 'ora.orclcdb.db' on 'racnode1'
CRS-2677: Stop of 'ora.qosmserver' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.orclcdb.db' on 'racnode1' succeeded
CRS-33673: Attempting to stop resource group 'ora.asmgroup' on server 'racnode1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'racnode1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'racnode1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'racnode1'
CRS-2673: Attempting to stop 'ora.cvu' on 'racnode1'
CRS-2677: Stop of 'ora.DATA.dg' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'racnode1'
CRS-2677: Stop of 'ora.cvu' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.racnode1.vip' on 'racnode1'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'racnode1'
CRS-2677: Stop of 'ora.racnode1.vip' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.asm' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.ASMNET1LSNR_ASM.lsnr' on 'racnode1'
CRS-2677: Stop of 'ora.chad' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.ASMNET1LSNR_ASM.lsnr' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.asmnet1.asmnetwork' on 'racnode1'
CRS-2677: Stop of 'ora.asmnet1.asmnetwork' on 'racnode1' succeeded
CRS-33677: Stop of resource group 'ora.asmgroup' on server 'racnode1' succeeded.
CRS-2673: Attempting to stop 'ora.ons' on 'racnode1'
CRS-2677: Stop of 'ora.ons' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'racnode1'
CRS-2677: Stop of 'ora.net1.network' on 'racnode1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'racnode1' has completed
CRS-2677: Stop of 'ora.crsd' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.storage' on 'racnode1'
CRS-2673: Attempting to stop 'ora.crf' on 'racnode1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'racnode1'
CRS-2677: Stop of 'ora.crf' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.storage' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'racnode1'
CRS-2677: Stop of 'ora.mdnsd' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.asm' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'racnode1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'racnode1'
CRS-2673: Attempting to stop 'ora.evmd' on 'racnode1'
CRS-2677: Stop of 'ora.ctssd' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'racnode1'
CRS-2677: Stop of 'ora.cssd' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'racnode1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'racnode1'
CRS-2677: Stop of 'ora.gipcd' on 'racnode1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'racnode1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'racnode1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
bash-4.2# /u01/app/19.3.0/grid/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

Re-running root.sh from racnode2:

[grid@racnode2 ~]$ sudo /bin/bash
bash-4.2# /u01/app/19.3.0/grid/root.sh
Check /u01/app/19.3.0/grid/install/root_racnode2_2020-05-01_09-48-54-942804239.log for the output of root script
[grid@racnode2 ~]$ cat /u01/app/19.3.0/grid/install/root_racnode2_2020-05-01_09-48-54-942804239.log
Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/19.3.0/grid
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /u01/app/19.3.0/grid/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/grid/crsdata/racnode2/crsconfig/rootcrs_racnode2_2020-05-01_09-48-55AM.log
2020/05/01 09:49:00 CLSRSC-594: Executing installation step 1 of 19: 'SetupTFA'.
2020/05/01 09:49:00 CLSRSC-594: Executing installation step 2 of 19: 'ValidateEnv'.
2020/05/01 09:49:01 CLSRSC-363: User ignored prerequisites during installation
2020/05/01 09:49:01 CLSRSC-594: Executing installation step 3 of 19: 'CheckFirstNode'.
2020/05/01 09:49:01 CLSRSC-4002: Successfully installed Oracle Trace File Analyzer (TFA) Collector.
2020/05/01 09:49:01 CLSRSC-594: Executing installation step 4 of 19: 'GenSiteGUIDs'.
2020/05/01 09:49:02 CLSRSC-594: Executing installation step 5 of 19: 'SetupOSD'.
2020/05/01 09:49:02 CLSRSC-594: Executing installation step 6 of 19: 'CheckCRSConfig'.
2020/05/01 09:49:03 CLSRSC-594: Executing installation step 7 of 19: 'SetupLocalGPNP'.
2020/05/01 09:49:04 CLSRSC-594: Executing installation step 8 of 19: 'CreateRootCert'.
2020/05/01 09:49:04 CLSRSC-594: Executing installation step 9 of 19: 'ConfigOLR'.
2020/05/01 09:49:05 CLSRSC-594: Executing installation step 10 of 19: 'ConfigCHMOS'.
2020/05/01 09:49:36 CLSRSC-594: Executing installation step 11 of 19: 'CreateOHASD'.
2020/05/01 09:49:37 CLSRSC-594: Executing installation step 12 of 19: 'ConfigOHASD'.
2020/05/01 09:49:40 CLSRSC-594: Executing installation step 13 of 19: 'InstallAFD'.
2020/05/01 09:49:41 CLSRSC-594: Executing installation step 14 of 19: 'InstallACFS'.
2020/05/01 09:49:43 CLSRSC-594: Executing installation step 15 of 19: 'InstallKA'.
2020/05/01 09:49:44 CLSRSC-594: Executing installation step 16 of 19: 'InitConfig'.
2020/05/01 09:49:49 CLSRSC-594: Executing installation step 17 of 19: 'StartCluster'.
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'racnode2'
CRS-2672: Attempting to start 'ora.evmd' on 'racnode2'
CRS-2676: Start of 'ora.mdnsd' on 'racnode2' succeeded
CRS-2676: Start of 'ora.evmd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'racnode2'
CRS-2676: Start of 'ora.gpnpd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'racnode2'
CRS-2676: Start of 'ora.gipcd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'racnode2'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode2'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'racnode2'
CRS-2672: Attempting to start 'ora.diskmon' on 'racnode2'
CRS-2676: Start of 'ora.diskmon' on 'racnode2' succeeded
CRS-2676: Start of 'ora.crf' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode2'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode2' succeeded
CRS-1722: Cluster Synchronization Service daemon encountered an internal error.
CRS-2883: Resource 'ora.cssd' failed during Clusterware stack start.
CRS-4406: Oracle High Availability Services synchronous start failed.
CRS-41053: checking Oracle Grid Infrastructure for file permission issues
PRVH-0116 : Path "/u01/app/19.3.0/grid/crs/install/cmdllroot.sh" with permissions "rw-r--r--" does not have execute permissions for the owner, file's group, and others on node "racnode2".
PRVG-2031 : Owner of file "/u01/app/19.3.0/grid/crs/install/cmdllroot.sh" did not match the expected value on node "racnode2". [Expected = "grid(54332)" ; Found = "root(0)"]
PRVG-2032 : Group of file "/u01/app/19.3.0/grid/crs/install/cmdllroot.sh" did not match the expected value on node "racnode2". [Expected = "oinstall(54321)" ; Found = "root(0)"]
CRS-4000: Command Start failed, or completed with errors.
2020/05/01 10:00:21 CLSRSC-117: Failed to start Oracle Clusterware stack
Died at /u01/app/19.3.0/grid/crs/install/crsinstall.pm line 1970.

I tried setting the correct permissions for cmdllroot.sh and re-ran root.sh, but it died at the same point:

[grid@racnode2 ~]$ ls -l /u01/app/19.3.0/grid/crs/install/cmdllroot.sh
-rw-r--r--. 1 root root 1276 Apr 22 13:11 /u01/app/19.3.0/grid/crs/install/cmdllroot.sh
[grid@racnode2 ~]$ sudo /bin/bash
bash-4.2# chmod 755 /u01/app/19.3.0/grid/crs/install/cmdllroot.sh
bash-4.2# chown grid:oinstall /u01/app/19.3.0/grid/crs/install/cmdllroot.sh
bash-4.2# ls -l /u01/app/19.3.0/grid/crs/install/cmdllroot.sh
-rwxr-xr-x. 1 grid oinstall 1276 Apr 22 13:11 /u01/app/19.3.0/grid/crs/install/cmdllroot.sh
bash-4.2# /u01/app/19.3.0/grid/root.sh
Check /u01/app/19.3.0/grid/install/root_racnode2_2020-05-01_10-22-40-969725920.log for the output of root script
bash-4.2# cat /u01/app/19.3.0/grid/install/root_racnode2_2020-05-01_10-22-40-969725920.log
Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/19.3.0/grid
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /u01/app/19.3.0/grid/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/grid/crsdata/racnode2/crsconfig/rootcrs_racnode2_2020-05-01_10-22-41AM.log
2020/05/01 10:22:45 CLSRSC-594: Executing installation step 1 of 19: 'SetupTFA'.
2020/05/01 10:22:46 CLSRSC-594: Executing installation step 2 of 19: 'ValidateEnv'.
2020/05/01 10:22:46 CLSRSC-363: User ignored prerequisites during installation
2020/05/01 10:22:46 CLSRSC-594: Executing installation step 3 of 19: 'CheckFirstNode'.
2020/05/01 10:22:46 CLSRSC-4002: Successfully installed Oracle Trace File Analyzer (TFA) Collector.
2020/05/01 10:22:46 CLSRSC-594: Executing installation step 4 of 19: 'GenSiteGUIDs'.
2020/05/01 10:22:47 CLSRSC-594: Executing installation step 5 of 19: 'SetupOSD'.
2020/05/01 10:22:47 CLSRSC-594: Executing installation step 6 of 19: 'CheckCRSConfig'.
2020/05/01 10:22:48 CLSRSC-594: Executing installation step 7 of 19: 'SetupLocalGPNP'.
2020/05/01 10:22:49 CLSRSC-594: Executing installation step 8 of 19: 'CreateRootCert'.
2020/05/01 10:22:49 CLSRSC-594: Executing installation step 9 of 19: 'ConfigOLR'.
2020/05/01 10:22:50 CLSRSC-594: Executing installation step 10 of 19: 'ConfigCHMOS'.
2020/05/01 10:23:21 CLSRSC-594: Executing installation step 11 of 19: 'CreateOHASD'.
2020/05/01 10:23:22 CLSRSC-594: Executing installation step 12 of 19: 'ConfigOHASD'.
2020/05/01 10:23:25 CLSRSC-594: Executing installation step 13 of 19: 'InstallAFD'.
2020/05/01 10:23:26 CLSRSC-594: Executing installation step 14 of 19: 'InstallACFS'.
2020/05/01 10:23:27 CLSRSC-594: Executing installation step 15 of 19: 'InstallKA'.
2020/05/01 10:23:28 CLSRSC-594: Executing installation step 16 of 19: 'InitConfig'.
2020/05/01 10:23:32 CLSRSC-594: Executing installation step 17 of 19: 'StartCluster'.
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.evmd' on 'racnode2'
CRS-2672: Attempting to start 'ora.mdnsd' on 'racnode2'
CRS-2676: Start of 'ora.mdnsd' on 'racnode2' succeeded
CRS-2676: Start of 'ora.evmd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'racnode2'
CRS-2676: Start of 'ora.gpnpd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'racnode2'
CRS-2676: Start of 'ora.gipcd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'racnode2'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode2'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'racnode2'
CRS-2672: Attempting to start 'ora.diskmon' on 'racnode2'
CRS-2676: Start of 'ora.diskmon' on 'racnode2' succeeded
CRS-2676: Start of 'ora.crf' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode2'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode2' succeeded
CRS-1722: Cluster Synchronization Service daemon encountered an internal error.
CRS-2883: Resource 'ora.cssd' failed during Clusterware stack start.
CRS-4406: Oracle High Availability Services synchronous start failed.
CRS-41053: checking Oracle Grid Infrastructure for file permission issues
CRS-4000: Command Start failed, or completed with errors.
2020/05/01 10:34:04 CLSRSC-117: Failed to start Oracle Clusterware stack
Died at /u01/app/19.3.0/grid/crs/install/crsinstall.pm line 1970.

Fragment of the offending script:

bash-4.2# nl -ba /u01/app/19.3.0/grid/crs/install/crsinstall.pm | sed -n '1949,1974p'
  1949  sub start_cluster
  1950  {
  1951    trace(sprintf("Startup level is %d", $CFG->stackStartLevel));
  1952  
  1953    # start the entire stack in shiphome
  1954    if (START_STACK_ALL == $CFG->stackStartLevel)
  1955    {
  1956      trace("Attempt to start the whole CRS stack");
  1957      my $rc = startHasStack($CFG->params('ORACLE_HOME'));
  1958      
  1959      if (WARNING == $rc)
  1960      {
  1961        # maximum number of hub nodes reached, try this as a rim node.
  1962        my $role = NODE_ROLE_RIM;
  1963        setNodeRole($role);
  1964        stopFullStack("force") || die(dieformat(349));
  1965        $rc = startHasStack($CFG->params('ORACLE_HOME'), $role);
  1966      }
  1967  
  1968      if ( SUCCESS != $rc )
  1969      {
  1970        die(dieformat(117));
  1971      }
  1972  
  1973      print_info(343);
  1974  

I still have to take a look at the two-node RAC example you pointed out earlier. Also, any guidance on continuing the diagnosis above is really appreciated.

Thanks!

psaini79 commented 4 years ago

@hprop

Sure, I will assist you. Can you please paste the logs of the recent failure:

Log in to racnode2 and run:

sudo /bin/bash
cd /u01/app/grid
tar -cvzf racnode2_gridlogs.tgz *

then upload the files.

Please upload the logs from racnode1 as well. Also, from the Docker host, please provide the output of the following: route -n

Note: the route -n command needs to be run on the Docker host.
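
A quick way to gather all three routing tables at once, run from the Docker host (a sketch; the docker exec calls assume route is on the container user's login PATH, as the earlier sessions indicate):

route -n                                        # Docker host
docker exec racnode1 /bin/bash -lc "route -n"   # inside node 1
docker exec racnode2 /bin/bash -lc "route -n"   # inside node 2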

psaini79 commented 4 years ago

@hprop

Any update on this? Also, before doing the above tasks (if you have not done them already), please try the following. Check whether iptables is up on your machine (the Docker host); if yes, execute the following steps:

systemctl stop iptables
systemctl disable iptables

Log in to the racnode1 container:

sudo crsctl stop crs -f

Log in to the racnode2 container:

sudo crsctl stop crs -f

Stop racnode2 and racnode1, start the racnode2 container first, and check if grid comes up. If yes, start the racnode1 container and check if grid comes up there as well.

I asked for this because racnode2 is already a part of the cluster, and I want to see if grid comes up when racnode2 is not running.
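
In other words, something along these lines (a sketch of the sequence above, using the full crsctl path and container names from the earlier sessions):

# inside each container (docker exec -i -t racnode1 /bin/bash, then the same for racnode2):
sudo /u01/app/19.3.0/grid/bin/crsctl stop crs -f

# then, from the Docker host:
docker stop racnode2 racnode1
docker start racnode2
docker logs -f racnode2      # wait for the grid stack to come up
docker start racnode1
docker logs -f racnode1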

hprop commented 4 years ago

@psaini79

Iptables was up on the docker host, so I stopped the service as suggested:

$ systemctl status iptables
● iptables.service - IPv4 firewall with iptables
   Loaded: loaded (/usr/lib/systemd/system/iptables.service; enabled; vendor preset: disabled)
   Active: active (exited) since Tue 2020-04-21 16:45:28 UTC; 1 weeks 6 days ago
 Main PID: 534 (code=exited, status=0/SUCCESS)
    Tasks: 0
   Memory: 0B
   CGroup: /system.slice/iptables.service

Apr 21 16:45:27 ip-172-31-2-173.eu-west-1.compute.internal systemd[1]: Starting IPv4 firewall with iptables...
Apr 21 16:45:28 ip-172-31-2-173.eu-west-1.compute.internal iptables.init[534]: iptables: Applying firewall rules:...]
Apr 21 16:45:28 ip-172-31-2-173.eu-west-1.compute.internal systemd[1]: Started IPv4 firewall with iptables.
Hint: Some lines were ellipsized, use -l to show in full.
$ sudo systemctl stop iptables
$ sudo systemctl disable iptables
Removed symlink /etc/systemd/system/basic.target.wants/iptables.service.

Then I followed the steps to stop CRS and the containers in the specified order. Now, after starting racnode2 and then racnode1, I see:

[grid@racnode1 ~]$ crsctl check cluster -all
**************************************************************
racnode1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racnode2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
[grid@racnode1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       racnode1                 STABLE
               OFFLINE OFFLINE      racnode2                 STABLE
ora.chad
               ONLINE  ONLINE       racnode1                 STABLE
               OFFLINE OFFLINE      racnode2                 STABLE
ora.net1.network
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.ons
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       racnode1                 STABLE
      2        ONLINE  ONLINE       racnode2                 STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       racnode1                 STABLE
      2        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       racnode1                 Started,STABLE
      2        ONLINE  ONLINE       racnode2                 Started,STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       racnode1                 STABLE
      2        ONLINE  ONLINE       racnode2                 STABLE
ora.cvu
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.orclcdb.db
      1        ONLINE  ONLINE       racnode1                 Open,HOME=/u01/app/o
                                                             racle/product/19.3.0
                                                             /dbhome_1,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.racnode1.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
--------------------------------------------------------------------------------

I tried to start the listener on racnode2:

[grid@racnode1 ~]$ srvctl start listener -node racnode2
PRCR-1013 : Failed to start resource ora.LISTENER.lsnr
PRCR-1064 : Failed to start resource ora.LISTENER.lsnr on node racnode2
CRS-2805: Unable to start 'ora.LISTENER.lsnr' because it has a 'hard' dependency on resource type 'ora.cluster_vip_net1.type' and no resource of that type can satisfy the dependency
CRS-2525: All instances of the resource 'ora.racnode1.vip' are already running; relocate is not allowed because the force option was not specified
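
For what it is worth, the crsctl output above shows an ora.racnode1.vip resource but no VIP resource for racnode2, which presumably explains the unsatisfied hard dependency. A couple of checks to confirm (standard crsctl/srvctl syntax, with the names used in this cluster):

crsctl stat res -w "TYPE = ora.cluster_vip_net1.type" -t
srvctl config vip -node racnode2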

Also, please find the new grid logs below:

psaini79 commented 4 years ago

@hprop

That is because your RAC setup on node 2 did not complete. I would request that you re-run root.sh on node 2 or recreate the setup. It seems iptables was causing the issue, as I can see the CSSD, CRSD and EVMD processes came up fine on node 2.
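
A minimal sketch of the re-run, using the Grid home path from your earlier sessions:

docker exec -i -t racnode2 /bin/bash
sudo /bin/bash
/u01/app/19.3.0/grid/root.sh
/u01/app/19.3.0/grid/bin/crsctl check crs    # confirm the stack on node 2 afterwards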

psaini79 commented 3 years ago

Please close this thread if the issue is resolved.

hprop commented 3 years ago

My apologies @psaini79, I was working on other things and had no chance to come back to this until now. The issue still persists: I have tried to recreate the whole setup after disabling iptables and encountered the same error when running node2:

$ docker logs -f racnode2
PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=racnode2
TERM=xterm
SCAN_IP=172.16.1.70
ASM_DISCOVERY_DIR=/oradata
ASM_DEVICE_LIST=/oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk04.img,/oradata/asm_disk05.img
DOMAIN=example.com
PUBLIC_IP=172.16.1.151
PUBLIC_HOSTNAME=racnode2
EXISTING_CLS_NODES=racnode1
PRIV_IP=192.168.17.151
SCAN_NAME=racnode-scan
COMMON_OS_PWD_FILE=common_os_pwdfile.enc
VIP_HOSTNAME=racnode2-vip
PRIV_HOSTNAME=racnode2-priv
ORACLE_SID=ORCLCDB
OP_TYPE=ADDNODE
PWD_KEY=pwd.key
NODE_VIP=172.16.1.161
SETUP_LINUX_FILE=setupLinuxEnv.sh
INSTALL_DIR=/opt/scripts
GRID_BASE=/u01/app/grid
GRID_HOME=/u01/app/19.3.0/grid
INSTALL_FILE_1=LINUX.X64_193000_grid_home.zip
GRID_INSTALL_RSP=gridsetup_19c.rsp
GRID_SW_INSTALL_RSP=grid_sw_install_19c.rsp
GRID_SETUP_FILE=setupGrid.sh
FIXUP_PREQ_FILE=fixupPreq.sh
INSTALL_GRID_BINARIES_FILE=installGridBinaries.sh
INSTALL_GRID_PATCH=applyGridPatch.sh
INVENTORY=/u01/app/oraInventory
CONFIGGRID=configGrid.sh
ADDNODE=AddNode.sh
DELNODE=DelNode.sh
ADDNODE_RSP=grid_addnode.rsp
SETUPSSH=setupSSH.expect
DOCKERORACLEINIT=dockeroracleinit
GRID_USER_HOME=/home/grid
SETUPGRIDENV=setupGridEnv.sh
RESET_OS_PASSWORD=resetOSPassword.sh
MULTI_NODE_INSTALL=MultiNodeInstall.py
DB_BASE=/u01/app/oracle
DB_HOME=/u01/app/oracle/product/19.3.0/dbhome_1
INSTALL_FILE_2=LINUX.X64_193000_db_home.zip
DB_INSTALL_RSP=db_sw_install_19c.rsp
DBCA_RSP=dbca_19c.rsp
DB_SETUP_FILE=setupDB.sh
PWD_FILE=setPassword.sh
RUN_FILE=runOracle.sh
STOP_FILE=stopOracle.sh
ENABLE_RAC_FILE=enableRAC.sh
CHECK_DB_FILE=checkDBStatus.sh
USER_SCRIPTS_FILE=runUserScripts.sh
REMOTE_LISTENER_FILE=remoteListener.sh
INSTALL_DB_BINARIES_FILE=installDBBinaries.sh
GRID_HOME_CLEANUP=GridHomeCleanup.sh
ORACLE_HOME_CLEANUP=OracleHomeCleanup.sh
DB_USER=oracle
GRID_USER=grid
FUNCTIONS=functions.sh
COMMON_SCRIPTS=/common_scripts
CHECK_SPACE_FILE=checkSpace.sh
RESET_FAILED_UNITS=resetFailedUnits.sh
SET_CRONTAB=setCrontab.sh
CRONTAB_ENTRY=crontabEntry
EXPECT=/usr/bin/expect
BIN=/usr/sbin
container=true
INSTALL_SCRIPTS=/opt/scripts/install
SCRIPT_DIR=/opt/scripts/startup
GRID_PATH=/u01/app/19.3.0/grid/bin:/u01/app/19.3.0/grid/OPatch/:/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
DB_PATH=/u01/app/oracle/product/19.3.0/dbhome_1/bin:/u01/app/oracle/product/19.3.0/dbhome_1/OPatch/:/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
GRID_LD_LIBRARY_PATH=/u01/app/19.3.0/grid/lib:/usr/lib:/lib
DB_LD_LIBRARY_PATH=/u01/app/oracle/product/19.3.0/dbhome_1/lib:/usr/lib:/lib
HOME=/home/grid
Failed to parse kernel command line, ignoring: No such file or directory
systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Detected virtualization other.
Detected architecture x86-64.

Welcome to Oracle Linux Server 7.8!

Set hostname to <racnode2>.
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
/usr/lib/systemd/system-generators/systemd-fstab-generator failed with error code 1.
[/usr/lib/systemd/system/systemd-pstore.service:22] Unknown lvalue 'StateDirectory' in section 'Service'
Cannot add dependency job for unit display-manager.service, ignoring: Unit not found.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Reached target Swap.
[  OK  ] Created slice Root Slice.
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Created slice System Slice.
[  OK  ] Created slice system-getty.slice.
[  OK  ] Listening on Journal Socket.
Couldn't determine result for ConditionKernelCommandLine=|rd.modules-load for systemd-modules-load.service, assuming failed: No such file or directory
Couldn't determine result for ConditionKernelCommandLine=|modules-load for systemd-modules-load.service, assuming failed: No such file or directory
[  OK  ] Created slice User and Session Slice.
[  OK  ] Reached target Slices.
[  OK  ] Reached target RPC Port Mapper.
         Starting Read and set NIS domainname from /etc/sysconfig/network...
         Starting Journal Service...
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Listening on Delayed Shutdown Socket.
         Starting Rebuild Hardware Database...
[  OK  ] Reached target Local File Systems (Pre).
         Starting Configure read-only root support...
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Started Read and set NIS domainname from /etc/sysconfig/network.
[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Started Configure read-only root support.
         Starting Load/Save Random Seed...
[  OK  ] Reached target Local File Systems.
         Starting Rebuild Journal Catalog...
         Starting Mark the need to relabel after reboot...
         Starting Preprocess NFS configuration...
[  OK  ] Started Mark the need to relabel after reboot.
[  OK  ] Started Load/Save Random Seed.
[  OK  ] Started Rebuild Journal Catalog.
[  OK  ] Started Flush Journal to Persistent Storage.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Preprocess NFS configuration.
[  OK  ] Started Create Volatile Files and Directories.
         Mounting RPC Pipe File System...
         Starting Update UTMP about System Boot/Shutdown...
[FAILED] Failed to mount RPC Pipe File System.
See 'systemctl status var-lib-nfs-rpc_pipefs.mount' for details.
[DEPEND] Dependency failed for rpc_pipefs.target.
[DEPEND] Dependency failed for RPC security service for NFS client and server.
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Started Rebuild Hardware Database.
         Starting Update is Completed...
[  OK  ] Started Update is Completed.
[  OK  ] Reached target System Initialization.
[  OK  ] Listening on RPCbind Server Activation Socket.
         Starting RPC bind service...
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Started Flexible branding.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Basic System.
         Starting Login Service...
         Starting GSSAPI Proxy Daemon...
         Starting LSB: Bring up/down networking...
         Starting Resets System Activity Logs...
         Starting Self Monitoring and Reporting Technology (SMART) Daemon...
[  OK  ] Started D-Bus System Message Bus.
         Starting OpenSSH Server Key Generation...
[  OK  ] Started RPC bind service.
[  OK  ] Started GSSAPI Proxy Daemon.
[  OK  ] Started Resets System Activity Logs.
         Starting Cleanup of Temporary Directories...
[  OK  ] Reached target NFS client services.
[  OK  ] Reached target Remote File Systems (Pre).
[  OK  ] Reached target Remote File Systems.
         Starting Permit User Sessions...
[  OK  ] Started Login Service.
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Command Scheduler.
[  OK  ] Started Cleanup of Temporary Directories.
[  OK  ] Started LSB: Bring up/down networking.
[  OK  ] Reached target Network.
         Starting /etc/rc.d/rc.local Compatibility...
[  OK  ] Reached target Network is Online.
         Starting Notify NFS peers of a restart...
[  OK  ] Started /etc/rc.d/rc.local Compatibility.
[  OK  ] Started Console Getty.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Notify NFS peers of a restart.
09-29-2020 13:02:02 UTC :  : Process id of the program : 
09-29-2020 13:02:02 UTC :  : #################################################
09-29-2020 13:02:02 UTC :  :  Starting Grid Installation          
09-29-2020 13:02:02 UTC :  : #################################################
09-29-2020 13:02:02 UTC :  : Pre-Grid Setup steps are in process
09-29-2020 13:02:02 UTC :  : Process id of the program : 
[  OK  ] Started OpenSSH Server Key Generation.
         Starting OpenSSH server daemon...
09-29-2020 13:02:02 UTC :  : Disable failed service var-lib-nfs-rpc_pipefs.mount
[  OK  ] Started OpenSSH server daemon.
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
09-29-2020 13:02:02 UTC :  : Resetting Failed Services
09-29-2020 13:02:02 UTC :  : Sleeping for 60 seconds
[  OK  ] Started Self Monitoring and Reporting Technology (SMART) Daemon.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Oracle Linux Server 7.8
Kernel 4.14.35-1902.301.1.el7uek.x86_64 on an x86_64

racnode2 login: 09-29-2020 13:03:02 UTC :  : Systemctl state is running!
09-29-2020 13:03:02 UTC :  : Setting correct permissions for /bin/ping
09-29-2020 13:03:02 UTC :  : Public IP is set to 172.16.1.151
09-29-2020 13:03:02 UTC :  : RAC Node PUBLIC Hostname is set to racnode2
09-29-2020 13:03:02 UTC :  : Preparing host line for racnode2
09-29-2020 13:03:02 UTC :  : Adding \n172.16.1.151\tracnode2.example.com\tracnode2 to /etc/hosts
09-29-2020 13:03:02 UTC :  : Preparing host line for racnode2-priv
09-29-2020 13:03:02 UTC :  : Adding \n192.168.17.151\tracnode2-priv.example.com\tracnode2-priv to /etc/hosts
09-29-2020 13:03:02 UTC :  : Preparing host line for racnode2-vip
09-29-2020 13:03:02 UTC :  : Adding \n172.16.1.161\tracnode2-vip.example.com\tracnode2-vip to /etc/hosts
09-29-2020 13:03:02 UTC :  : racnode-scan already exists : 172.16.1.70  racnode-scan.example.com    racnode-scan, no  update required
09-29-2020 13:03:02 UTC :  : Preapring Device list
09-29-2020 13:03:02 UTC :  : Changing Disk permission and ownership /oradata/asm_disk01.img
09-29-2020 13:03:02 UTC :  : Changing Disk permission and ownership /oradata/asm_disk02.img
09-29-2020 13:03:02 UTC :  : Changing Disk permission and ownership /oradata/asm_disk03.img
09-29-2020 13:03:02 UTC :  : Changing Disk permission and ownership /oradata/asm_disk04.img
09-29-2020 13:03:02 UTC :  : Changing Disk permission and ownership /oradata/asm_disk05.img
09-29-2020 13:03:02 UTC :  : DNS_SERVERS is set to empty. /etc/resolv.conf will use default dns docker embedded server.
09-29-2020 13:03:02 UTC :  : #####################################################################
09-29-2020 13:03:02 UTC :  :  RAC setup will begin in 2 minutes                                   
09-29-2020 13:03:02 UTC :  : ####################################################################
09-29-2020 13:03:04 UTC :  : ###################################################
09-29-2020 13:03:04 UTC :  : Pre-Grid Setup steps completed
09-29-2020 13:03:04 UTC :  : ###################################################
09-29-2020 13:03:04 UTC :  : Checking if grid is already configured
09-29-2020 13:03:04 UTC :  : Public IP is set to 172.16.1.151
09-29-2020 13:03:04 UTC :  : RAC Node PUBLIC Hostname is set to racnode2
09-29-2020 13:03:04 UTC :  : Domain is defined to example.com
09-29-2020 13:03:04 UTC :  : Setting Existing Cluster Node for node addition operation. This will be retrieved from racnode1
09-29-2020 13:03:04 UTC :  : Existing Node Name of the cluster is set to racnode1
09-29-2020 13:03:05 UTC :  : 172.16.1.150
09-29-2020 13:03:05 UTC :  : Existing Cluster node resolved to IP. Check passed
09-29-2020 13:03:05 UTC :  : Default setting of AUTO GNS VIP set to false. If you want to use AUTO GNS VIP, please pass DHCP_CONF as an env parameter set to true
09-29-2020 13:03:05 UTC :  : RAC VIP set to 172.16.1.161
09-29-2020 13:03:05 UTC :  : RAC Node VIP hostname is set to racnode2-vip 
09-29-2020 13:03:05 UTC :  : SCAN_NAME name is racnode-scan
09-29-2020 13:03:05 UTC :  : 172.16.1.70
09-29-2020 13:03:05 UTC :  : SCAN Name resolving to IP. Check Passed!
09-29-2020 13:03:05 UTC :  : SCAN_IP name is 172.16.1.70
09-29-2020 13:03:05 UTC :  : RAC Node PRIV IP is set to 192.168.17.151 
09-29-2020 13:03:05 UTC :  : RAC Node private hostname is set to racnode2-priv
09-29-2020 13:03:05 UTC :  : CMAN_NAME set to the empty string
09-29-2020 13:03:05 UTC :  : CMAN_IP set to the empty string
09-29-2020 13:03:05 UTC :  : Password file generated
09-29-2020 13:03:05 UTC :  : Common OS Password string is set for Grid user
09-29-2020 13:03:05 UTC :  : Common OS Password string is set for  Oracle user
09-29-2020 13:03:05 UTC :  : GRID_RESPONSE_FILE env variable set to empty. AddNode.sh will use standard cluster responsefile
09-29-2020 13:03:05 UTC :  : Location for User script SCRIPT_ROOT set to /common_scripts
09-29-2020 13:03:05 UTC :  : ORACLE_SID is set to ORCLCDB
09-29-2020 13:03:05 UTC :  : Setting random password for root/grid/oracle user
09-29-2020 13:03:05 UTC :  : Setting random password for grid user
09-29-2020 13:03:05 UTC :  : Setting random password for oracle user
09-29-2020 13:03:05 UTC :  : Setting random password for root user
09-29-2020 13:03:05 UTC :  : Cluster Nodes are racnode1 racnode2
09-29-2020 13:03:05 UTC :  : Running SSH setup for grid user between nodes racnode1 racnode2
09-29-2020 13:03:17 UTC :  : Running SSH setup for oracle user between nodes racnode1 racnode2
09-29-2020 13:03:29 UTC :  : SSH check fine for the racnode1
09-29-2020 13:03:29 UTC :  : SSH check fine for the racnode2
09-29-2020 13:03:29 UTC :  : SSH check fine for the racnode2
09-29-2020 13:03:29 UTC :  : SSH check fine for the oracle@racnode1
09-29-2020 13:03:29 UTC :  : SSH check fine for the oracle@racnode2
09-29-2020 13:03:29 UTC :  : SSH check fine for the oracle@racnode2
09-29-2020 13:03:29 UTC :  : Setting Device permission to grid and asmadmin on all the cluster nodes
09-29-2020 13:03:29 UTC :  : Nodes in the cluster racnode2
09-29-2020 13:03:29 UTC :  : Setting Device permissions for RAC Install  on racnode2
09-29-2020 13:03:29 UTC :  : Preapring ASM Device list
09-29-2020 13:03:29 UTC :  : Changing Disk permission and ownership
09-29-2020 13:03:29 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
09-29-2020 13:03:30 UTC :  : Populate Rac Env Vars on Remote Hosts
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
09-29-2020 13:03:30 UTC :  : Changing Disk permission and ownership
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
09-29-2020 13:03:30 UTC :  : Populate Rac Env Vars on Remote Hosts
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
09-29-2020 13:03:30 UTC :  : Changing Disk permission and ownership
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
09-29-2020 13:03:30 UTC :  : Populate Rac Env Vars on Remote Hosts
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
09-29-2020 13:03:30 UTC :  : Changing Disk permission and ownership
09-29-2020 13:03:30 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
09-29-2020 13:03:31 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
09-29-2020 13:03:31 UTC :  : Populate Rac Env Vars on Remote Hosts
09-29-2020 13:03:31 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
09-29-2020 13:03:31 UTC :  : Changing Disk permission and ownership
09-29-2020 13:03:31 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chown $GRID_USER:asmadmin $device" execute on racnode2
09-29-2020 13:03:31 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo chmod 660 $device" execute on racnode2
09-29-2020 13:03:31 UTC :  : Populate Rac Env Vars on Remote Hosts
09-29-2020 13:03:31 UTC :  : Command : su - $GRID_USER -c "ssh $node sudo echo \"export ASM_DEVICE_LIST=${ASM_DEVICE_LIST}\" >> /etc/rac_env_vars" execute on racnode2
09-29-2020 13:03:31 UTC :  : Checking Cluster Status on racnode1
09-29-2020 13:03:31 UTC :  : Checking Cluster
09-29-2020 13:03:31 UTC :  : Cluster Check on remote node passed
09-29-2020 13:03:32 UTC :  : Cluster Check went fine
09-29-2020 13:03:32 UTC :  : CRSD Check went fine
09-29-2020 13:03:32 UTC :  : CSSD Check went fine
09-29-2020 13:03:32 UTC :  : EVMD Check went fine
09-29-2020 13:03:32 UTC :  : Generating Responsefile for node addition
09-29-2020 13:03:32 UTC :  : Clustered Nodes are set to racnode2:racnode2-vip:HUB
09-29-2020 13:03:32 UTC :  : Running Cluster verification utility for new node racnode2 on racnode1
09-29-2020 13:03:32 UTC :  : Nodes in the cluster racnode2
09-29-2020 13:03:32 UTC :  : ssh to the node racnode1 and executing cvu checks on racnode2
09-29-2020 13:04:29 UTC :  : Checking /tmp/cluvfy_check.txt if there is any failed check.

Verifying Physical Memory ...PASSED
Verifying Available Physical Memory ...PASSED
Verifying Swap Size ...PASSED
Verifying Free Space: racnode2:/usr,racnode2:/var,racnode2:/etc,racnode2:/u01/app/19.3.0/grid,racnode2:/sbin,racnode2:/tmp ...PASSED
Verifying Free Space: racnode1:/usr,racnode1:/var,racnode1:/etc,racnode1:/u01/app/19.3.0/grid,racnode1:/sbin,racnode1:/tmp ...PASSED
Verifying User Existence: oracle ...
  Verifying Users With Same UID: 54321 ...PASSED
Verifying User Existence: oracle ...PASSED
Verifying User Existence: grid ...
  Verifying Users With Same UID: 54332 ...PASSED
Verifying User Existence: grid ...PASSED
Verifying User Existence: root ...
  Verifying Users With Same UID: 0 ...PASSED
Verifying User Existence: root ...PASSED
Verifying Group Existence: asmadmin ...PASSED
Verifying Group Existence: asmoper ...PASSED
Verifying Group Existence: asmdba ...PASSED
Verifying Group Existence: oinstall ...PASSED
Verifying Group Membership: oinstall ...PASSED
Verifying Group Membership: asmdba ...PASSED
Verifying Group Membership: asmadmin ...PASSED
Verifying Group Membership: asmoper ...PASSED
Verifying Run Level ...PASSED
Verifying Hard Limit: maximum open file descriptors ...PASSED
Verifying Soft Limit: maximum open file descriptors ...PASSED
Verifying Hard Limit: maximum user processes ...PASSED
Verifying Soft Limit: maximum user processes ...PASSED
Verifying Soft Limit: maximum stack size ...PASSED
Verifying Architecture ...PASSED
Verifying OS Kernel Version ...PASSED
Verifying OS Kernel Parameter: semmsl ...PASSED
Verifying OS Kernel Parameter: semmns ...PASSED
Verifying OS Kernel Parameter: semopm ...PASSED
Verifying OS Kernel Parameter: semmni ...PASSED
Verifying OS Kernel Parameter: shmmax ...PASSED
Verifying OS Kernel Parameter: shmmni ...PASSED
Verifying OS Kernel Parameter: shmall ...FAILED (PRVG-1201)
Verifying OS Kernel Parameter: file-max ...PASSED
Verifying OS Kernel Parameter: aio-max-nr ...PASSED
Verifying OS Kernel Parameter: panic_on_oops ...PASSED
Verifying Package: kmod-20-21 (x86_64) ...PASSED
Verifying Package: kmod-libs-20-21 (x86_64) ...PASSED
Verifying Package: binutils-2.23.52.0.1 ...PASSED
Verifying Package: compat-libcap1-1.10 ...PASSED
Verifying Package: libgcc-4.8.2 (x86_64) ...PASSED
Verifying Package: libstdc++-4.8.2 (x86_64) ...PASSED
Verifying Package: libstdc++-devel-4.8.2 (x86_64) ...PASSED
Verifying Package: sysstat-10.1.5 ...PASSED
Verifying Package: ksh ...PASSED
Verifying Package: make-3.82 ...PASSED
Verifying Package: glibc-2.17 (x86_64) ...PASSED
Verifying Package: glibc-devel-2.17 (x86_64) ...PASSED
Verifying Package: libaio-0.3.109 (x86_64) ...PASSED
Verifying Package: libaio-devel-0.3.109 (x86_64) ...PASSED
Verifying Package: nfs-utils-1.2.3-15 ...PASSED
Verifying Package: smartmontools-6.2-4 ...PASSED
Verifying Package: net-tools-2.0-0.17 ...PASSED
Verifying Users With Same UID: 0 ...PASSED
Verifying Current Group ID ...PASSED
Verifying Root user consistency ...PASSED
Verifying Node Addition ...
  Verifying CRS Integrity ...PASSED
  Verifying Clusterware Version Consistency ...PASSED
  Verifying '/u01/app/19.3.0/grid' ...PASSED
Verifying Node Addition ...PASSED
Verifying Host name ...PASSED
Verifying Node Connectivity ...
  Verifying Hosts File ...PASSED
  Verifying Check that maximum (MTU) size packet goes through subnet ...PASSED
  Verifying subnet mask consistency for subnet "172.16.1.0" ...PASSED
  Verifying subnet mask consistency for subnet "192.168.17.0" ...PASSED
Verifying Node Connectivity ...PASSED
Verifying Multicast or broadcast check ...PASSED
Verifying ASM Integrity ...PASSED
Verifying Device Checks for ASM ...
  Verifying Package: cvuqdisk-1.0.10-1 ...PASSED
  Verifying ASM device sharedness check ...
    Verifying Shared Storage Accessibility:/oradata/asm_disk01.img,/oradata/asm_disk02.img,/oradata/asm_disk03.img,/oradata/asm_disk04.img,/oradata/asm_disk05.img ...PASSED
  Verifying ASM device sharedness check ...PASSED
  Verifying Access Control List check ...PASSED
Verifying Device Checks for ASM ...PASSED
Verifying Database home availability ...PASSED
Verifying OCR Integrity ...PASSED
Verifying Time zone consistency ...PASSED
Verifying Network Time Protocol (NTP) ...
  Verifying '/etc/ntp.conf' ...PASSED
  Verifying '/var/run/ntpd.pid' ...PASSED
  Verifying '/var/run/chronyd.pid' ...PASSED
Verifying Network Time Protocol (NTP) ...FAILED (PRVG-1017)
Verifying User Not In Group "root": grid ...PASSED
Verifying Time offset between nodes ...PASSED
Verifying resolv.conf Integrity ...FAILED (PRVG-10048)
Verifying DNS/NIS name service ...PASSED
Verifying User Equivalence ...PASSED
Verifying /dev/shm mounted as temporary file system ...PASSED
Verifying /boot mount ...PASSED
Verifying zeroconf check ...PASSED

Pre-check for node addition was unsuccessful on all the nodes. 

Failures were encountered during execution of CVU verification request "stage -pre nodeadd".

Verifying OS Kernel Parameter: shmall ...FAILED
racnode2: PRVG-1201 : OS kernel parameter "shmall" does not have expected
          configured value on node "racnode2" [Expected = "2251799813685247" ;
          Current = "18446744073692774000"; Configured = "1073741824"].

racnode1: PRVG-1201 : OS kernel parameter "shmall" does not have expected
          configured value on node "racnode1" [Expected = "2251799813685247" ;
          Current = "18446744073692774000"; Configured = "1073741824"].

Verifying Network Time Protocol (NTP) ...FAILED
racnode2: PRVG-1017 : NTP configuration file "/etc/ntp.conf" is present on
          nodes "racnode2,racnode1" on which NTP daemon or service was not
          running

racnode1: PRVG-1017 : NTP configuration file "/etc/ntp.conf" is present on
          nodes "racnode2,racnode1" on which NTP daemon or service was not
          running

Verifying resolv.conf Integrity ...FAILED
racnode2: PRVG-10048 : Name "racnode2" was not resolved to an address of the
          specified type by name servers "127.0.0.11".

racnode1: PRVG-10048 : Name "racnode1" was not resolved to an address of the
          specified type by name servers "127.0.0.11".

CVU operation performed:      stage -pre nodeadd
Date:                         Sep 29, 2020 1:03:34 PM
CVU home:                     /u01/app/19.3.0/grid/
User:                         grid
09-29-2020 13:04:29 UTC :  : CVU Checks are ignored as IGNORE_CVU_CHECKS set to true. It is recommended to set IGNORE_CVU_CHECKS to false and meet all the cvu checks requirement. RAC installation might fail, if there are failed cvu checks.
09-29-2020 13:04:29 UTC :  : Running Node Addition and cluvfy test for node racnode2
09-29-2020 13:04:29 UTC :  : Copying /tmp/grid_addnode.rsp on remote node racnode1
09-29-2020 13:04:29 UTC :  : Running GridSetup.sh on racnode1 to add the node to existing cluster
09-29-2020 13:05:21 UTC :  : Node Addition performed. removing Responsefile
09-29-2020 13:05:21 UTC :  : Running root.sh on node racnode2
09-29-2020 13:05:21 UTC :  : Nodes in the cluster racnode2
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
Failed to parse kernel command line, ignoring: No such file or directory
09-29-2020 13:17:47 UTC :  : Checking Cluster
09-29-2020 13:17:47 UTC :  : Cluster Check passed
09-29-2020 13:17:47 UTC :  : Cluster Check went fine
09-29-2020 13:17:47 UTC : : CRSD Check failed!
09-29-2020 13:17:47 UTC : : Error has occurred in Grid Setup, Please verify!

On the same host machine I am able to create and use a two-node Oracle 12c RAC cluster; however, I am still unable to get a two-node Oracle 19c RAC working.

Please let me know of any further steps to check. I will be focusing on this over the next few days and hope to resolve it with your help soon.
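
In case it is useful, here is a minimal sketch of the manual checks I can run on racnode2 for each item flagged above (the grid home path is taken from the cluvfy output; the exact commands are my own assumption, not part of the setup scripts):

```bash
# Clusterware daemon status (CRSD is the check that fails in the log)
/u01/app/19.3.0/grid/bin/crsctl check crs
/u01/app/19.3.0/grid/bin/crsctl stat res -t -init

# shmall (PRVG-1201): compare the running value with what is configured
sysctl kernel.shmall
grep -R shmall /etc/sysctl.conf /etc/sysctl.d/ 2>/dev/null

# NTP (PRVG-1017): ntp.conf is present but no daemon is running
ls -l /etc/ntp.conf
pgrep -a ntpd || pgrep -a chronyd

# resolv.conf (PRVG-10048): Docker's embedded DNS answers on 127.0.0.11
cat /etc/resolv.conf
nslookup racnode2 127.0.0.11   # requires bind-utils inside the container
```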

Thanks!

psaini79 commented 3 years ago

@hprop

Are you trying RAC on Docker on AWS cloud?

hprop commented 3 years ago

@psaini79 Yes, the Oracle Linux host is an EC2 machine. Is there any limitation on running RAC 19c in that environment? I was able to run RAC 12c following the same steps on this host.

Thanks for your help.

psaini79 commented 3 years ago

@hprop

What version of RAC did you test on AWS VMs? Was it an Oracle Linux VM deployed in AWS?

hprop commented 3 years ago

@psaini79

The host in both cases is an AWS EC2 instance running Oracle Linux 7.7.

psaini79 commented 3 years ago

@onkarnigam14

Sorry for the delayed reply. For further troubleshooting, I recommend running Oracle RAC on Docker on on-prem KVM or VirtualBox VMs provisioned with OEL 7.x and UEK5, because Oracle RAC is only supported in the Oracle Cloud: https://www.oracle.com/technetwork/database/options/clustering/overview/rac-cloud-support-2843861.pdf
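
As a rough sanity check before retrying on-prem (a sketch only; the exact release and kernel strings will vary with your image):

```bash
cat /etc/oracle-release     # expect: Oracle Linux Server release 7.x
uname -r                    # expect a UEK5 kernel, e.g. 4.14.35-*.el7uek.x86_64
docker info --format '{{.ServerVersion}} / storage driver: {{.Driver}}'
```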

psaini79 commented 3 years ago

I am closing this thread. Please reopen it if you see any issues with RAC on Docker running on on-prem KVM or VirtualBox.