fdanapfel closed 9 years ago
Looks like the following part in SAPHanaTopology is responsible for this:
#
# figure-out all needed values from system replication status with ONE call
# we need: mode=primary|sync|syncmem|...; site name=<site>; mapping/<me>=<site>/<node> (multiple lines)
case $(crm_attribute --type crm_config --name cluster-infrastructure -q) in
*corosync* ) nodelist=$(crm_node -l | awk '{ print $2 }');;
*openais* ) nodelist=$(crm_node -l | awk '/member/ {print $2}');;
*cman* ) nodelist=$(crm_node -l);;
esac
hdbANSWER=$(su - ${sidadm} -c "hdbnsutil -sr_state --sapcontrol=1" 2>/dev/null)
super_ocf_log debug "DBG2: hdbANSWER=\$\(su - ${sidadm} -c \"hdbnsutil -sr_state --sapcontrol=1\"\)"
site=$(echo "$hdbANSWER" | awk -F= '/site name/ {print $2}')
srmode=$(echo "$hdbANSWER" | awk -F= '/mode/ {print $2}')
MAPPING=$(echo "$hdbANSWER" | awk -F[=/] '$1 ~ "mapping" && $3 !~ site { print $4 }' site=$site)
super_ocf_log debug "DBG: site=$site, mode=$srmode, MAPPING=$MAPPING"
#
# filter all non-cluster mappings
#
hanaRemoteHost=$(for n1 in $nodelist; do for n2 in $MAPPING; do if [ "$n1" == "$n2" ]; then echo $n1; fi; done; done )
super_ocf_log info "DEC: site=$site, mode=$srmode, MAPPING=$MAPPING, hanaRemoteHost=$hanaRemoteHost"
Looks like it is trying to compare the HANA hostnames against the cluster nodenames, and since they differ, hanaRemoteHost never gets set.
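To make the parsing step easier to follow, here is a minimal sketch that runs the same awk expressions against a hand-written sample. The sample text is an assumption derived only from the comment in the excerpt (mode=..., site name=<site>, mapping/<me>=<site>/<node>); real hdbnsutil output contains more lines, and the host and site names are chosen to match the log line in this report.

```shell
# Assumed sample of "hdbnsutil -sr_state --sapcontrol=1" output (illustrative only).
hdbANSWER='mode=primary
site name=DC2
mapping/node2=DC1/node1
mapping/node2=DC2/node2'

# Same awk expressions as in the SAPHanaTopology excerpt.
site=$(echo "$hdbANSWER" | awk -F= '/site name/ {print $2}')
srmode=$(echo "$hdbANSWER" | awk -F= '/mode/ {print $2}')
# Split on "=" or "/": $3 is the site, $4 the host; keep hosts of the *other* site.
MAPPING=$(echo "$hdbANSWER" | awk -F'[=/]' '$1 ~ "mapping" && $3 !~ site { print $4 }' site="$site")

echo "site=$site, mode=$srmode, MAPPING=$MAPPING"
```

With this sample, MAPPING ends up holding the HANA hostname of the remote site, which is what the subsequent loop compares against the cluster nodenames.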
However in my test environment the remoteHost attribute is still set to the correct value, so it looks like it gets set somewhere else as well. Haven't figured out how, though.
Ah ok, I just reviewed the lines you picked from SAPHanaTopology. I guess that this message (with the empty remote host) only applies when HANA is DOWN. In that case we cannot determine the remote HANA host, because hdbnsutil does not give us this info. In this situation we just use "the" other node in a two-node setup. This is one of the reasons why we are limited to 2 nodes in a scale-up scenario :/
No, on my setup this actually happens also on a running cluster where HANA is UP, and there the message is printed every minute when the "monitor_clone" for the SAPHanaTopology resource is running.
As far as I can see the reason is because of the following line: hanaRemoteHost=$(for n1 in $nodelist; do for n2 in $MAPPING; do if [ "$n1" == "$n2" ]; then echo $n1; fi; done; done )
The problem here is not that "hdbnsutil" can't provide the information, but that the comparison uses "nodelist", which is the list of cluster nodenames, and tries to compare that to the SAP HANA hostname it got by parsing the hdbnsutil output. That obviously fails in an environment where the cluster nodenames are not identical to the hostnames.
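A minimal sketch of that comparison, pulled out of the agent so it can be run standalone. The function name filter_cluster_mappings and the cluster nodenames clnode1/clnode2 are hypothetical and only serve as an illustration; the nested loop itself mirrors the line quoted above.

```shell
# Hypothetical helper: prints every cluster nodename that also appears in the mapping list.
filter_cluster_mappings() {
    nodelist=$1   # space-separated cluster nodenames (as from crm_node -l)
    mapping=$2    # space-separated HANA hostnames (as parsed from hdbnsutil)
    for n1 in $nodelist; do
        for n2 in $mapping; do
            if [ "$n1" = "$n2" ]; then
                echo "$n1"
            fi
        done
    done
}

# Cluster nodenames differ from the HANA hostname: nothing matches, result is empty.
echo "hanaRemoteHost=$(filter_cluster_mappings "clnode1 clnode2" "node1")"

# Cluster nodenames equal the HANA hostnames: the remote host is found.
echo "hanaRemoteHost=$(filter_cluster_mappings "node1 node2" "node1")"
```

The first call reproduces the empty hanaRemoteHost= seen in the log, since a plain string comparison cannot match a nodename to a differently named host.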
I've now tested what happens if you delete the 'hana_
... the remoteHost it is probably safe to assume that we could actually get rid of the attribute. Unfortunately, we need the remote (HANA) host name for the REGISTRATION of a former primary if AUTOMATED_REGISTER is set to true. In that case we need to know the exact HANA virtual host name. Just using another name of the remote host is not sufficient. In my tests the registration then failed :(
I have just created (and answered) a pull request against master. Could you please check whether the error reported here is fixed now?
Thanks, with the latest version of SAPHanaTopology the error does not appear any more and the remoteHost attribute gets set correctly.
Regarding the previous comment about getting rid of the attribute: I'm aware that it is needed for the registration of a former primary, and as far as I can see the SAPHana resource agent has various checks built in to determine the correct remote HANA hostname even if the attribute isn't set or contains an incorrect value. So what I meant was that we could stop having the SAPHanaTopology agent try to set this attribute and let the SAPHana agent determine the correct value when it needs to, as it already does.
Did not see any more issues after applying the patch, therefore closing this issue.
In the debug log of the SAPHanaTopology resource agent you can see that it is unable to determine the correct value for the hanaRemoteHost parameter in environments where the nodename is not identical to the hostname:
Jun 10 17:34:06 node2 SAPHanaTopology(rsc_SAPHanaTopology_HDB_HDB00)[11188]: INFO: DEC: site=DC2, mode=primary, MAPPING=node1, hanaRemoteHost=