SUSE / ha-sap-terraform-deployments

Automated SAP/HA Deployments in Public/Private Clouds
GNU General Public License v3.0

HANA and NetWeaver cluster corosync is not in udpu mode #711

Closed pirat013 closed 3 years ago

pirat013 commented 3 years ago

Used cloud platform libvirt

Used SLES4SAP version SLES15SP2

Used client machine OS Linux SLES15 SP2

Expected behaviour vs observed behaviour Deploying a SAP system with DRBD, HANA and NetWeaver clusters on libvirt results in two different cluster types: the HANA and NetWeaver clusters are multicast clusters, while the DRBD cluster is a UDPU (unicast) cluster. Based on our Best Practices, the HANA and NetWeaver clusters should be deployed as UDPU clusters as well, because this is the only option we have documented and recommend.
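One way to confirm which transport a deployed cluster node is using is to inspect the `totem` section of `/etc/corosync/corosync.conf`. The snippet below is a sketch (the sample file contents and cluster name are illustrative, not taken from this deployment); on a real node you would run the `grep` directly against `/etc/corosync/corosync.conf`:

```shell
# Sample totem section as written for a udpu (unicast) cluster.
# A multicast cluster would instead carry an "mcastaddr" entry
# and no "transport: udpu" line.
cat > /tmp/corosync.conf.sample <<'EOF'
totem {
    version: 2
    secauth: on
    cluster_name: hana_cluster
    transport: udpu
}
EOF

# On a real node, point this at /etc/corosync/corosync.conf instead.
# "transport: udpu" in the output means unicast; an "mcastaddr"
# line means the cluster was set up for multicast.
grep -E 'transport|mcastaddr' /tmp/corosync.conf.sample
```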

How to reproduce The step-by-step process to reproduce the issue usually looks like this:

  1. Move to any of the cloud providers folder
  2. Create the terraform.tfvars file based on terraform.tfvars.example
  3. Run the next terraform commands:
    terraform init
    terraform plan
    terraform apply -auto-approve

Using the provisioning_log_level = "info" option in the terraform.tfvars file yields more information during the terraform command execution, so it is suggested to run the deployment with this option and check what happens before opening any ticket.

Used terraform.tfvars

#################################
# ha-sap-terraform-deployments project configuration file
# Find all the available variables and definitions in the variables.tf file
#################################

# qemu uri, this example is to run locally
qemu_uri = "qemu:///system"

# Use already existing network
#network_name = "my-network"

# Due to some internal limitations, the iprange must be defined
# both for an already existing network and when creating a new one
iprange = "192.168.19.0/24"

# libvirt storage pool, select the libvirt storage pool where the volumes will be stored

storage_pool = "github"

# Base image configuration. These images will be used for all deployed machines unless a specific image is defined
# The source image has preference over the `volume_name` parameter
source_image = "/moon/media/suse/SLES4SAP-15_SP2-JeOS.x86_64-0.3-Build2.264.qcow2"

# Use an already existing image. The image must be in the same storage pool defined in `storage_pool` parameter
# This option is much faster, as the image does not need to be downloaded
#volume_name = "SLES4SAP-15_SP1"

#################################
# General configuration variables
#################################

# Deployment name. This variable is used to complement the name of multiple infrastructure resources, adding the string as a suffix
# If it is not used, the terraform workspace string is used
# The name must be unique among different deployments
deployment_name = "apmor"

# SUSE Customer Center Registration parameters.
#reg_code = "<<REG_CODE>>"
#reg_email = "<<your email>>"
reg_code = "<empty>"
reg_email = "<empty>"

# For any sle12 version the additional module sle-module-adv-systems-management/12/x86_64 is mandatory if reg_code is provided
#reg_additional_modules = {
#    "sle-module-adv-systems-management/12/x86_64" = ""
#    "sle-module-containers/12/x86_64" = ""
#    "sle-ha-geo/12.4/x86_64" = "<<REG_CODE>>"
#}

# Authorize additional keys optionally (in this case, the private key is not required)
# Path to local files or keys content
#authorized_keys = ["/home/myuser/.ssh/id_rsa_second_key.pub", "/home/myuser/.ssh/id_rsa_third_key.pub", "ssh-rsa AAAAB3NzaC1yc2EAAAA...."]
authorized_keys = ["/root/.ssh/id_rsa.pub"]

##########################
# Other deployment options
##########################

# Repository url used to install development versions of HA/SAP deployment packages
# The latest RPM packages can be found at:
# https://download.opensuse.org/repositories/network:ha-clustering:sap-deployments:devel/{YOUR SLE VERSION}
# Contains the salt formulas rpm packages.
# To auto detect the SLE version, just leave it out from the url:
#ha_sap_deployment_repo = "https://download.opensuse.org/repositories/network:ha-clustering:sap-deployments:devel/"
# Otherwise use a specific SLE version:
#ha_sap_deployment_repo = "https://download.opensuse.org/repositories/network:ha-clustering:sap-deployments:devel/SLE_15/"
#ha_sap_deployment_repo = "https://download.opensuse.org/repositories/network:ha-clustering:sap-deployments:devel/SLE_15_SP2/"
ha_sap_deployment_repo = "https://download.opensuse.org/repositories/network:ha-clustering:sap-deployments:v7/"

# Provisioning log level (error by default)
#provisioning_log_level = "info"

# Print colored output of the provisioning execution (true by default)
#provisioning_output_colored = false

# Enable pre deployment steps (disabled by default)
pre_deployment = true

# To disable the provisioning process
#provisioner = ""

# Run provisioner execution in background
#background = true

# QA variables

# Define if the deployment is used for testing purposes
# Disables all extra packages that do not come from the image,
# except salt-minion (for the moment) and the salt formulas
# true or false
#qa_mode = false

# Execute HANA Hardware Configuration Check Tool to bench filesystems
# qa_mode must be set to true for executing hwcct
# true or false (default)
#hwcct = false

#########################
# HANA machines variables
#########################

# Set a specific image for HANA (the same applies for iscsi, monitoring, netweaver and drbd)
# This option has preference over base image options
# hana_source_image = "url-to-your-sles4sap-image"
# hana_volume_name   = "SLES4SAP-15_SP0"

# Disk size for HANA database content in bytes
#hana_node_disk_size  = 68719476736 # 64GB
# For S/4HANA set a big disk size
hana_node_disk_size  = 375809638400 # 350GB

# The next variables define how the HANA installation software is obtained.
# 'hana_inst_master' is an NFS share where the HANA installation files (extracted or not) are stored
# 'hana_inst_master' must always be used! It is the reference path for the other variables

# Local folder where HANA installation master will be mounted
hana_inst_folder = "/sapmedia/HANA"

# There are multiple options:
# 1. Use an already extracted HANA Platform folder structure.
# The last numbered folder is the HANA Platform folder with the extracted files with
# something like `HDB:HANA:2.0:LINUX_X86_64:SAP HANA PLATFORM EDITION 2.0::XXXXXX` in the LABEL.ASC file
hana_inst_master = "192.168.100.1:/moon/media"

# 2. Combine the `hana_inst_master` with `hana_platform_folder` variable.
#hana_inst_master = "url-to-your-nfs-share:/sapdata/sap_inst_media"
# Specify the path to already extracted HANA platform installation media, relative to hana_inst_master mounting point.
# This will have preference over hana archive installation media
#hana_platform_folder = "sap/sps05-plat"

# 3. Specify the path to the HANA installation archive file in either SAR, RAR, ZIP or EXE format, relative to the 'hana_inst_master' mounting point
# For multipart RAR archives, provide the first part EXE file name.
hana_archive_file = "sap/sar/IMDB_SERVER20_048_5-80002031.SAR"

# 4. If using HANA SAR archive, provide the compatible version of sapcar executable to extract the SAR archive
#hana_archive_file = "IMDB_SERVER.SAR"
hana_sapcar_exe = "sap/SAPCAR"

# For option 3 and 4, HANA installation archives are extracted to the path specified
# at hana_extract_dir (optional, by default /sapmedia_extract/HANA). This folder cannot be the same as `hana_inst_folder`!
hana_extract_dir = "/sapmedia_extract/HANA"

# The following SAP HANA Client variables are needed only when you are using a HANA database SAR archive for HANA installation.
# HANA Client is used by monitoring & cost-optimized scenario and it is already included in HANA platform media unless a HANA database SAR archive is used
# You can provide HANA Client in one of the two options below:
# 1. Path to already extracted hana client folder, relative to hana_inst_master mounting point
#hana_client_folder = "sap/sps05/SAP_HANA_CLIENT"
# 2. Or specify the path to the hana client SAR archive file, relative to the 'hana_inst_master'. To extract the SAR archive, you need to also provide compatible version of sapcar executable in variable hana_sapcar_exe
# It will be extracted to hana_client_extract_dir path (optional, by default /sapmedia_extract/HANA_CLIENT)
hana_client_archive_file = "sap/sar/IMDB_CLIENT20_008_22-80002082.SAR"
hana_client_extract_dir = "/sapmedia_extract/HANA_CLIENT"

# More configuration about the HANA machines
# Set the IP addresses for the HANA machines. Leave this commented to get autogenerated addresses
#hana_ips         = ["192.168.XXX.Y", "192.168.XXX.Y+1"]

# HANA instance configuration
# Find some references about the variables in:
# https://help.sap.com
# HANA instance system identifier. It is a 3-character string.
hana_sid = "H2O"
# HANA instance number. It is a 2-digit string
hana_instance_number = "02"
# HANA instance master password. It must follow the SAP Password policies
hana_master_password = "<empty>"
# HANA primary site name. Only used if HANA's system replication feature is enabled (hana_ha_enabled to true)
hana_primary_site = "up"
# HANA secondary site name. Only used if HANA's system replication feature is enabled (hana_ha_enabled to true)
hana_secondary_site = "down"

# Enable system replication and HA cluster
hana_ha_enabled = true
hana_count = 1
#hana_node_disk_size = 214748364800
hana_node_memory = 65356
# Enable Active/Active HANA setup (read-only access in the secondary instance)
#hana_active_active = true

# Cost optimized scenario
#scenario_type: "cost-optimized"

#######################
# SBD related variables
#######################

# In order to enable the iscsi machine creation, _fencing_mechanism must be set to 'sbd' for any of the clusters
# Choose the sbd storage option. Options: iscsi, shared-disk
sbd_storage_type = "iscsi"
#iscsi_srv_ip = "192.168.XXX.Y+6"

# Number of LUNs (logical units) to serve with the iscsi server. Each LUN can be used as a unique sbd disk
iscsi_lun_count = 7

# Disk size in Bytes used to create the LUNs and partitions to be served by the ISCSI service
sbd_disk_size = 10737418240 # 10GB

##############################
# Monitoring related variables
##############################

# Enable the host to be monitored by exporters
monitoring_enabled = true

# IP address of the machine where prometheus and grafana are running
#monitoring_srv_ip = "192.168.XXX.Y+7"

########################
# DRBD related variables
########################

# Enable the DRBD cluster for nfs
drbd_enabled = true

# IP of DRBD cluster
#drbd_ips = ["192.168.XXX.Y+8", "192.168.XXX.Y+9"]

# NFS share mounting point and export. Warning: since cloud images use cloud-init, the /mnt folder cannot be used as the standard mounting point folder
# The DRBD cluster will create the NFS export in /mnt_permanent/sapdata/{netweaver_sid} to be connected as {drbd_cluster_vip}:/{netweaver_sid} (e.g.: 192.168.1.20:/HA1)
drbd_nfs_mounting_point = "/mnt_permanent/sapdata"

#############################
# Netweaver or S/4HANA related variables
#############################

# Enable/disable Netweaver deployment
netweaver_enabled = true

# Netweaver APP server count (PAS and AAS)
# Set to 0 to install the PAS instance in the same machine as the ASCS. This means only 1 machine is installed in the deployment (2 if HA capabilities are enabled)
# Set to 1 to enable only 1 PAS instance in an additional machine
# Set to 2 or higher to deploy additional AAS instances in new machines
netweaver_app_server_count = 2

# Enabling this option will create an ASCS/ERS HA cluster together with PAS and AAS application servers
# Set to false to only create ASCS and PAS instances
netweaver_ha_enabled = true

# Select SBD as fencing mechanism for the Netweaver cluster
netweaver_cluster_fencing_mechanism = "sbd"

# Set the Netweaver product id. The 'HA' suffix means that the installation uses an ASCS/ERS cluster
# Below are the supported SAP Netweaver product ids if using SWPM version 1.0:
# - NW750.HDB.ABAP
# - NW750.HDB.ABAPHA
# - S4HANA1709.CORE.HDB.ABAP
# - S4HANA1709.CORE.HDB.ABAPHA
# Below are the supported SAP Netweaver product ids if using SWPM version 2.0:
# - S4HANA1809.CORE.HDB.ABAP
# - S4HANA1809.CORE.HDB.ABAPHA
# - S4HANA1909.CORE.HDB.ABAP
# - S4HANA1909.CORE.HDB.ABAPHA
# - S4HANA2020.CORE.HDB.ABAP
# - S4HANA2020.CORE.HDB.ABAPHA

# Example:
netweaver_product_id = "S4HANA2020.CORE.HDB.ABAPHA"

# Preparing the Netweaver download basket. Check `doc/sap_software.md` for more information

# NFS share with Netweaver installation folders
netweaver_inst_media = "192.168.100.1:/moon/media/sap"

# This share must contain the following software (select the version you want to install, of course)

# NFS share to store the Netweaver shared files. Only used if drbd_enabled is not set. For single machine deployments (ASCS and PAS in the same machine) set an empty string
#netweaver_nfs_share = "url-to-your-netweaver-sapmnt-nfs-share"

# Path where netweaver sapmnt data is stored.
netweaver_sapmnt_path = "/sapmnt"

# Netweaver installation required folders
# SAP SWPM installation folder, relative to the netweaver_inst_media mounting point
netweaver_swpm_folder     =  "SWPM2"
# Or specify the path to the sapcar executable & SWPM installer sar archive, relative to the netweaver_inst_media mounting point
# The sar archive will be extracted to path specified at netweaver_extract_dir under SWPM directory (optional, by default /sapmedia_extract/NW/SWPM)
#netweaver_sapcar_exe = "your_sapcar_exe_file_path"
#netweaver_swpm_sar = "your_swpm_sar_file_path"
# Folder where needed SAR executables (sapexe, sapdbexe) are stored, relative to the netweaver_inst_media mounting point
netweaver_sapexe_folder   =  "S4HANA_2020"
# Additional media archives or folders (added in start_dir.cd), relative to the netweaver_inst_media mounting point
netweaver_additional_dvds = ["S4HANA_2020", "sps05/SAP_HANA_CLIENT/"]

# IP addresses of the machines hosting Netweaver instances
#netweaver_ips = ["192.168.XXX.Y+2", "192.168.XXX.Y+3", "192.168.XXX.Y+4", "192.168.XXX.Y+5"]
#netweaver_virtual_ips = ["192.168.XXX.Y+6", "192.168.XXX.Y+7", "192.168.XXX.Y+8", "192.168.XXX.Y+9"]

# Netweaver installation configuration
# Netweaver system identifier. It is a 3-character string
netweaver_sid = "h3o"
# Netweaver ASCS instance number. It is a 2-digit string
netweaver_ascs_instance_number = "00"
# Netweaver ERS instance number. It is a 2-digit string
netweaver_ers_instance_number = "10"
# Netweaver PAS instance number. If additional AAS machines are deployed, they get the next numbers starting from the PAS instance number. It is a 2-digit string
netweaver_pas_instance_number = "01"
# Netweaver master password. It must follow the SAP Password policies, such as having at least 8 characters combining upper- and lower-case characters and numbers. It cannot start with special characters.
netweaver_master_password = "<empty>"

Logs Upload the deployment logs to make root-cause finding easier. The logs might have sensitive secrets exposed; remove them before uploading anything here. Otherwise, contact @arbulu89 to send the logs privately.

@arbulu89 if really required I'll send the logs privately.

This is the list of the required logs (each of the deployed machines will have all of them):

Additional logs might be required to deepen the analysis on HANA or NETWEAVER installation. They will be asked specifically in case of need.

yeoldegrove commented 3 years ago

In general, there are sane defaults set to use unicast.

❯ grep -rn unicast .
  pillar_examples/azure/cluster.sls
  5:  unicast: True

  pillar_examples/openstack/cluster.sls
  5:  unicast: True

  pillar_examples/automatic/hana/cluster.sls
  13:  unicast: True

  pillar_examples/aws/cluster.sls
  5:  unicast: True

  pillar_examples/automatic/netweaver/cluster.sls
  11:  unicast: True

  pillar_examples/automatic/drbd/cluster.sls
  10:  unicast: True

Also, a test deployment on azure had unicast set for every cluster type. The issue is that the following pillar_examples files for libvirt are missing the unicast: True lines:

pillar_examples/libvirt/cost_optimized/cluster.sls
pillar_examples/libvirt/performance_optimized/cluster.sls
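Judging from the other pillar_examples files quoted above, the missing line would presumably be the same unicast flag in the cluster pillar (a sketch; the exact indentation and the surrounding keys in the libvirt files are an assumption):

```yaml
# pillar_examples/libvirt/*/cluster.sls (sketch)
cluster:
  # force corosync into unicast (udpu) mode instead of multicast
  unicast: True
```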

#715 includes a ~~fix~~ cosmetic change.

yeoldegrove commented 3 years ago

Actually #715 is not a fix but a cosmetic change.

@pirat013 I could not verify this issue on azure and do not have a libvirt setup at hand ... Did you by any chance not use the pillar_example/automatic/{drbd,hana,netweaver}/cluster.sls files? This could be the case if pre_deployment = true is not set (though it is set, according to your terraform.tfvars).

yeoldegrove commented 3 years ago

#715 now also includes a change for the behavior experienced by @pirat013

yeoldegrove commented 3 years ago

fixed in #715