SUSE / ha-sap-terraform-deployments

Automated SAP/HA Deployments in Public/Private Clouds
GNU General Public License v3.0

NetWeaver 7.5 HA deployment incorrect parameters and notes #779

Closed ab-mohamed closed 3 years ago

ab-mohamed commented 3 years ago

Used cloud platform: GCP

Used SLES4SAP version: SLES15SP2 for SAP Applications

Used client machine OS: Google Cloud Shell

How to reproduce

  1. Clone the master branch.

  2. Configure the Terraform variables file.

  3. Execute the terraform init and terraform apply --auto-approve commands.

  4. The deployment has been completed successfully.

  5. I used https://documentation.suse.com/sbp/all/single-html/SAP_NW740_SLE15_SetupGuide/#id-crm-configuration and https://cloud.google.com/solutions/sap/docs/netweaver-ha-config-sles as a reference for my notes.

  6. In the SAPInstance RA configurations:

    • The on_fail=restart parameter must be replaced by on-fail=restart.
    • The op_params parameter does not appear in either of the above-mentioned documents.
    • The op monitor interval=120 value is incorrect. It should be op monitor interval=11. Below is the configuration sample from my cluster:
      primitive rsc_sap_HA1_ASCS00 SAPInstance \
          operations $id=rsc_sap_HA1_ASCS00-operations \
          op monitor interval=120 timeout=60 \
          op_params on_fail=restart \
          params InstanceName=HA1_ASCS00_sapha1as START_PROFILE="/sapmnt/HA1/profile/HA1_ASCS00_sapha1as" AUTOMATIC_RECOVER=false \
          meta resource-stickiness=5000 failure-timeout=60 migration-threshold=1 priority=10
      primitive rsc_sap_HA1_ERS10 SAPInstance \
          operations $id=rsc_sap_HA1_ERS10-operations \
          op monitor interval=120 timeout=60 \
          op_params on_fail=restart \
          params InstanceName=HA1_ERS10_sapha1er START_PROFILE="/sapmnt/HA1/profile/HA1_ERS10_sapha1er" AUTOMATIC_RECOVER=false IS_ERS=true \
          meta priority=1000
  7. A small typo in the colocation name. It should be col_sap_HA1_not_both instead of col_sap_HA1_no_both. Below is the configuration sample from my cluster:

    colocation col_sap_HA1_no_both -5000: grp_HA1_ERS10 grp_HA1_ASCS00
  8. I am using the gcp-vpc-move-route RA. What is the default value for the gcp-vpc-move-route start and stop intervals? Below is the configuration sample from my cluster:

    primitive rsc_ip_HA1_ASCS00 gcp-vpc-move-route \
        params ip=10.0.1.34 vpc_network=default-network route_name=default-nw-ascs-route \
        op start interval=0 timeout=180 \
        op stop interval=0 timeout=180 \
        op monitor interval=60 timeout=60
    primitive rsc_ip_HA1_ERS10 gcp-vpc-move-route \
        params ip=10.0.1.35 vpc_network=default-network route_name=default-nw-ers-route \
        op start interval=0 timeout=180 \
        op stop interval=0 timeout=180 \
        op monitor interval=60 timeout=60
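
For the SAPInstance primitives in item 6, the three proposed changes can be sketched as follows. This is a hedged illustration, not the project's actual fix: it assumes that on-fail=restart is attached directly to the monitor operation once the op_params keyword is dropped, which is how the referenced SUSE guide appears to write it. All other values are copied unchanged from the cluster sample above.

```text
# Sketch only: on_fail -> on-fail, op_params keyword removed,
# monitor interval changed from 120 to 11.
primitive rsc_sap_HA1_ASCS00 SAPInstance \
    operations $id=rsc_sap_HA1_ASCS00-operations \
    op monitor interval=11 timeout=60 on-fail=restart \
    params InstanceName=HA1_ASCS00_sapha1as START_PROFILE="/sapmnt/HA1/profile/HA1_ASCS00_sapha1as" AUTOMATIC_RECOVER=false \
    meta resource-stickiness=5000 failure-timeout=60 migration-threshold=1 priority=10
primitive rsc_sap_HA1_ERS10 SAPInstance \
    operations $id=rsc_sap_HA1_ERS10-operations \
    op monitor interval=11 timeout=60 on-fail=restart \
    params InstanceName=HA1_ERS10_sapha1er START_PROFILE="/sapmnt/HA1/profile/HA1_ERS10_sapha1er" AUTOMATIC_RECOVER=false IS_ERS=true \
    meta priority=1000
```

Comparing the output of crm configure show for both resources against the referenced documents would confirm whether the deployed values match.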

Best regards, Ab

yeoldegrove commented 3 years ago
  6. In the SAPInstance RA configurations:

The on_fail=restart parameter must be replaced by on-fail=restart. The op_params parameter does not appear in either of the above-mentioned documents.

AFAIK this issue is only cosmetic: op_params on_fail=restart is the old way to write it, and on-fail=restart is the default for op monitor anyhow. Anyway, it can be updated to match the docs.

The op monitor interval=120 value is incorrect. It should be op monitor interval=11.

Can be updated to match the docs.

  7. A small typo in the colocation name. It should be col_sap_HA1_not_both instead of col_sap_HA1_no_both. Below is the configuration sample from my cluster:

Purely cosmetic. Could be updated to match the docs.

  8. I am using the gcp-vpc-move-route RA. What is the default value for the gcp-vpc-move-route start and stop intervals?

There is no interval for start/stop operations. An interval would mean that the operation is called repeatedly at that interval, which is not really what we want for start/stop operations ;)

ab-mohamed commented 3 years ago
  8. I am using the gcp-vpc-move-route RA. What is the default value for the gcp-vpc-move-route start and stop intervals?

There is no interval for start/stop operations. An interval would mean that the operation is called repeatedly at that interval, which is not really what we want for start/stop operations ;)

Based on https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/resources.html#operation-properties, my understanding is:

Interval: How frequently (in seconds) to perform the operation. A value of 0 means “when needed”. A positive value defines a recurring action, which is typically used with monitor.

Should we keep the currently used option, interval=0 as shown below?

primitive rsc_ip_HA1_ASCS00 gcp-vpc-move-route \
        params ip=10.0.1.34 vpc_network=default-network route_name=default-nw-ascs-route \
        op start interval=0 timeout=180 \
        op stop interval=0 timeout=180 \
        op monitor interval=60 timeout=60
primitive rsc_ip_HA1_ERS10 gcp-vpc-move-route \
        params ip=10.0.1.35 vpc_network=default-network route_name=default-nw-ers-route \
        op start interval=0 timeout=180 \
        op stop interval=0 timeout=180 \
        op monitor interval=60 timeout=60

Or should we remove it?

yeoldegrove commented 3 years ago

We can just keep it. It works and also matches the SUSE docs you referenced.
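
To summarize the interval discussion as a sketch: per the Pacemaker documentation quoted above, interval=0 means the operation runs only when needed, while a positive interval defines a recurring action. The fragment below reuses the ASCS primitive from this thread. The comment about omitting the interval rests on the assumption that Pacemaker's documented default interval is 0, so the explicit and implicit forms should behave the same.

```text
# start/stop: interval=0 -> run once, only when the cluster needs to
#             start or stop the resource (never on a schedule).
# monitor:    interval=60 -> recurring check every 60 seconds.
# Assumption: writing "op start timeout=180" without interval=0 is
# treated the same way, because the default interval is 0.
primitive rsc_ip_HA1_ASCS00 gcp-vpc-move-route \
    params ip=10.0.1.34 vpc_network=default-network route_name=default-nw-ascs-route \
    op start interval=0 timeout=180 \
    op stop interval=0 timeout=180 \
    op monitor interval=60 timeout=60
```

Keeping the explicit interval=0 makes the one-shot intent visible to readers of the configuration and stays aligned with the referenced SUSE guide.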