sap-linuxlab / community.sap_operations

Automation for SAP - Collection of Ansible Roles for various operational tasks with SAP Systems
Apache License 2.0

SAP NW Start - wrong starting order. PAS before ASCS #30

Open vsliNQ opened 3 weeks ago

vsliNQ commented 3 weeks ago

Dear community,

We are trying to start an S/4HANA system with the role "community.sap_operations.sap_control".

We noticed that the PAS is started before the ASCS, which leaves the dispatcher stuck in YELLOW status.

Normally the ASCS should start first, and then the PAS.

After the timeout, the task fails with an error:

TASK [community.sap_operations.sap_control : SAP NW Start - Executing sapcontrol -nr 01 -function StartWait 180 2] ***
fatal: [10.0.0.5]: FAILED! => {"changed": true, "cmd": "source ~/.profile && sapcontrol -nr 01 -function StartWait 180 2\n", "delta": "0:03:00.730358", "end": "2024-10-24 10:36:12.309141", "failed_when_result": true, "msg": "non-zero return code", "rc": 2, "start": "2024-10-24 10:33:11.578783", "stderr": "", "stderr_lines": [], "stdout": "\n24.10.2024 10:33:11\nStart\nOK\n\n24.10.2024 10:36:12\nStartWait\nFAIL: Timeout", "stdout_lines": ["", "24.10.2024 10:33:11", "Start", "OK", "", "24.10.2024 10:36:12", "StartWait", "FAIL: Timeout"]}
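A note on reading this output: the task ran `StartWait 180 2`, and the reported `delta` of roughly three minutes matches that 180-second timeout. So the command most likely waited the full period for the instance to come up rather than failing immediately. A minimal Python sketch of that check follows. It assumes the task result is available as a dict. The `result` literal is trimmed from the output above, and the helper names (`elapsed_seconds`, `requested_timeout`, `timed_out`) are hypothetical and not part of the role.

```python
from datetime import datetime

# Fields copied (trimmed) from the failed task result above.
result = {
    "cmd": "source ~/.profile && sapcontrol -nr 01 -function StartWait 180 2\n",
    "rc": 2,
    "start": "2024-10-24 10:33:11.578783",
    "end": "2024-10-24 10:36:12.309141",
    "stdout_lines": ["", "24.10.2024 10:33:11", "Start", "OK", "",
                     "24.10.2024 10:36:12", "StartWait", "FAIL: Timeout"],
}

# Timestamp layout used by the "start" and "end" fields.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(res):
    """Wall-clock duration of the command, in seconds."""
    start = datetime.strptime(res["start"], FMT)
    end = datetime.strptime(res["end"], FMT)
    return (end - start).total_seconds()


def requested_timeout(res):
    """First number after 'StartWait' in the command line (the timeout)."""
    tokens = res["cmd"].split()
    return int(tokens[tokens.index("StartWait") + 1])


def timed_out(res):
    """True if the command failed and sapcontrol reported a timeout."""
    return res["rc"] != 0 and "FAIL: Timeout" in res["stdout_lines"]


if timed_out(result):
    print(f"waited {elapsed_seconds(result):.1f}s "
          f"of a {requested_timeout(result)}s timeout")
```

If the elapsed time is at or just above the requested timeout, raising the timeout alone is unlikely to help when the instance is blocked on a dependency that is not running.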

Thanks in advance!

Best regards, Vasili

Jaro-nqc commented 3 weeks ago

Hi all,

It seems that the start tasks are executed in order of the SAP instance numbers. In our case we have these instance numbers: HANA = 00, PAS = 01, ASCS = 02.

In this case HANA is started first, which is fine. But then Ansible starts the PAS, which can't start properly as long as the ASCS is down.

I've swapped the instance numbers of PAS and ASCS, so that now: ASCS = 01, PAS = 02.

After that the start operation works fine :-)

So the current conclusion is that the instances are started in order of their instance numbers, which breaks whenever the numbering does not happen to match the dependency order. The start order should depend on the instance type and not on the instance number: first the DB, then the ASCS, then the PAS, then the AAS. The stop operation should use the reverse order: first the AAS, then the PAS, then the ASCS, and the DB last.
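To illustrate the proposed ordering, here is a minimal Python sketch. It is not the role's implementation. It assumes the instance type of each instance is already known, and the names `START_PRIORITY`, `start_order` and `stop_order` are hypothetical. Instances are sorted by type priority first, with the instance number only as a tie-breaker.

```python
# Hypothetical priority table: a lower value starts earlier.
START_PRIORITY = {"HANA": 0, "ASCS": 1, "PAS": 2, "AAS": 3}


def start_order(instances):
    """Sort (instance_type, instance_number) pairs by type, then number."""
    return sorted(instances, key=lambda inst: (START_PRIORITY[inst[0]], inst[1]))


def stop_order(instances):
    """Stopping uses the exact reverse of the start order."""
    return list(reversed(start_order(instances)))


# Instance numbers from the setup described above.
instances = [("HANA", "00"), ("PAS", "01"), ("ASCS", "02")]

print(start_order(instances))  # [('HANA', '00'), ('ASCS', '02'), ('PAS', '01')]
print(stop_order(instances))   # [('PAS', '01'), ('ASCS', '02'), ('HANA', '00')]
```

With this ordering, the original numbering (PAS = 01, ASCS = 02) would start correctly without swapping instance numbers.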

Best regards, Jaro