Open ZouhirYachou opened 9 months ago
@ZouhirYachou `sapinst` can take many hours to run. The SSH session tunnel has a tendency to time out, and then the Ansible Task never ends even when the `sapinst` process has ended. This is why the `async` approach, with a check that the process has ended, was used. This is explained in the commented code.
The default behaviour was altered at the request of other end-users: on error, the SWPM stdout/stderr would wipe a terminal window if the scrollback buffer setting was too low (SWPM can easily output 10,000 lines to the terminal window).
Upon end-user request the following commit was created that introduced the variable set to not display output by default: https://github.com/sap-linuxlab/community.sap_install/commit/1861c15972abeeded7351ef41a9425823c39631e
If you use `sap_swpm_display_unattended_output: true` in your variables, you will see the output.
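For illustration, a minimal sketch of where that variable could be set in a playbook. The host group `nwas` and the play layout are assumptions; only the variable name and the role name come from this thread.

```yaml
---
# Hypothetical playbook; the host group name "nwas" is an assumption.
- name: Run SAP SWPM with sapinst output displayed
  hosts: nwas
  become: true
  vars:
    # Show the SWPM stdout/stderr, which is hidden by default.
    sap_swpm_display_unattended_output: true
  roles:
    - role: community.sap_install.sap_swpm
```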
Even with the `sap_swpm_display_unattended_output: true` variable set, my playbook fails before the task that shows the output, so I have no access to the logs.
TASK [community.sap_install.sap_swpm : Display the sapinst command line] *******
ok: [vlh1bse26] => {
"msg": "SAP SWPM install command: 'umask 022 ; ./sapinst SAPINST_INPUT_PARAMETERS_URL=/tmp/ansible.jrzwy2moswpmconfig/inifile.params SAPINST_EXECUTE_PRODUCT_ID=NW_ABAP_ASCS:S4HANA2022.CORE.HDB.ABAP SAPINST_SKIP_DIALOGS=true SAPINST_START_GUISERVER=false '"
}
TASK [community.sap_install.sap_swpm : SAP SWPM -] *****************************
changed: [vlh1bse26]
TASK [community.sap_install.sap_swpm : SAP SWPM - Wait for sapinst process to exit, poll every 60 seconds] ***
ok: [vlh1bse26]
TASK [community.sap_install.sap_swpm : SAP SWPM - Verify if sapinst process finished successfully] ***
fatal: [vlh1bse26]: FAILED! =>
{
"ansible_job_id": "j325284186928.5920",
"changed": false,
"failed_when_result": true,
"finished": 0,
"results_file": "/root/.ansible_async/j325284186928.5920",
"started": 1,
"stderr": "",
"stderr_lines": [],
"stdout": "",
"stdout_lines": []
}
When I run the command manually on the host, I do not get any errors and the script gives a 0 return code.
My proposition allows monitoring with a status update every 30 seconds (we can probably change the value for async to allow more than 30 minutes).
TASK [local_sap_swpm : Display the sapinst command line] ***********************
ok: [vlh1bse26] => {
"msg": "SAP SWPM install command: 'umask 022 ; ./sapinst SAPINST_INPUT_PARAMETERS_URL=/tmp/ansible.n18ypd4cswpmconfig/inifile.params SAPINST_EXECUTE_PRODUCT_ID=NW_ABAP_ASCS:S4HANA2022.CORE.HDB.ABAP SAPINST_SKIP_DIALOGS=true SAPINST_START_GUISERVER=false '"
}
TASK [local_sap_swpm : SAP SWPM -] *********************************************
ASYNC POLL on vlh1bse26: jid=j847759566754.5984 started=1 finished=0
ASYNC POLL on vlh1bse26: jid=j847759566754.5984 started=1 finished=0
ASYNC POLL on vlh1bse26: jid=j847759566754.5984 started=1 finished=0
ASYNC OK on vlh1bse26: jid=j847759566754.5984
changed: [vlh1bse26]
TASK [local_sap_swpm : SAP SWPM - Find last installation location] *************
ok: [vlh1bse26]
Hi @ZouhirYachou, `async: 1800` is very optimistic. I have seen an S/4 install run for 3 hours in a cloud test environment with a slow database, so `async: 32400` makes total sense. If we set poll to 30, we might get the same result as watching for the process to end. At least we get a less confusing shell output. I do not know the previous implementation exactly. Still, I would suggest encapsulating the current and the suggested method in code blocks, which would let us switch between the two with a variable. @ZouhirYachou's suggestion is at least a cleaner implementation, which should become the default if it can be proven to be stable with the current Ansible release. What do you think, @berndfinger, @sean-freeman?
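A rough sketch of how the two methods could sit side by side in blocks and be selected by a variable. The variable `sap_swpm_use_single_task`, the placeholder `__sapinst_command` and the register names are hypothetical and do not exist in the role; the `async` and `poll` values are the ones discussed in this thread.

```yaml
# Sketch only: sap_swpm_use_single_task is a hypothetical switch, and
# __sapinst_command stands in for the command string the role builds.
- name: SAP SWPM - single task, Ansible polls the job itself
  when: sap_swpm_use_single_task | default(false) | bool
  block:
    - name: SAP SWPM - run sapinst and poll every 30 seconds
      ansible.builtin.shell: "{{ __sapinst_command }}"
      args:
        chdir: "{{ sap_swpm_sapinst_path }}"
      async: 32400   # upper limit of 9 hours
      poll: 30
      register: __sapinst_result

- name: SAP SWPM - detached job, watched by separate tasks
  when: not (sap_swpm_use_single_task | default(false) | bool)
  block:
    - name: SAP SWPM - start sapinst detached
      ansible.builtin.shell: "{{ __sapinst_command }}"
      args:
        chdir: "{{ sap_swpm_sapinst_path }}"
      async: 32400
      poll: 0        # return immediately with a job id
      register: __sapinst_async_job
    # The existing wait and verify tasks would follow here.
```

Defaulting the switch to `false` would keep the current behaviour until the single-task method has been proven stable.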
@ZouhirYachou something is not right in this output. Under `ansible_job_id` there should be the executed `cmd` and the `stdout`/`stderr` entries.
Such as:
TASK [community.sap_install.sap_swpm : Display the sapinst command line] *********
ok: [nwas01] => {
"msg": "SAP SWPM install command: 'umask 022 ; ./sapinst SAPINST_INPUT_PARAMETERS_URL=/tmp/ansible.zm7n3b1gswpmconfig/inifile.params SAPINST_EXECUTE_PRODUCT_ID=NW_ABAP_OneHost:S4HANA2021.CORE.HDB.ABAP SAPINST_SKIP_DIALOGS=true SAPINST_START_GUISERVER=false '"
}
TASK [community.sap_install.sap_swpm : SAP SWPM -] ******************************
changed: [nwas01]
TASK [community.sap_install.sap_swpm : SAP SWPM - Wait for sapinst process to exit, poll every 60 seconds] **********
FAILED - RETRYING: [nwas01]: SAP SWPM - Wait for sapinst process to exit, poll every 60 seconds (1000 retries left).
ok: [nwas01]
TASK [community.sap_install.sap_swpm : SAP SWPM - Verify if sapinst process finished successfully] *********
fatal: [nwas01]: FAILED! =>
{
"ansible_job_id": "j444392358629.64741",
"changed": true,
"cmd": "umask 022 ; ./sapinst SAPINST_INPUT_PARAMETERS_URL=/tmp/ansible.zm7n3b1gswpmconfig/inifile.params SAPINST_EXECUTE_PRODUCT_ID=NW_ABAP_OneHost:S4HANA2021.CORE.HDB.ABAP SAPINST_SKIP_DIALOGS=true SAPINST_START_GUISERVER=false \n",
"failed_when_result": true,
"finished": 1,
"msg": "non-zero return code",
"rc": 111,
"results_file": "/root/.ansible_async/j444392358629.64741",
"start": "2023-06-30 18:41:40.436147",
"started": 1,
"stderr_lines": [
"=>sapparam(1c): No Profile used.",
"=>sapparam: SAPSYSTEMNAME neither in Profile nor in Commandline",
"################################################",
"Abort execution because of ",
"Step returns osmod.hosts.getHostByName",
"################################################"
],
"stdout_lines": [
"Extracting...",
"Extraction done!",
"SAPinst build information:"
....
....
"Removed directory /root/.sapinst/nwas01.example.com/64833."
]
}
@ZouhirYachou let's confirm a few things, because I've not seen this behaviour before and the functionality of this Ansible Role has not changed in over 12 months (except for the request to hide output, as shown in the commit above, which has no impact on the debug you showed).
`sap_swpm_sapinst_path` is set to the directory path containing `sapinst`? E.g. if the executable is `/software/sap_swpm_unpack/sapinst`, then the variable would be `sap_swpm_sapinst_path: /software/sap_swpm_unpack`.
Ansible Core and Python version, see example...
$ ansible-playbook --version
ansible-playbook [core 2.16.2]
python version = 3.11.7 (main, Dec 4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)] (/Users/username/.py_venv/py_ansible/bin/python3)
jinja version = 3.1.2
libyaml = True
ansible-galaxy collection list
N.B. Poll is set to 60 seconds so that it is easier for the end-user to mentally calculate how long the installation has taken. If Ansible waits 59 seconds too long on a 5 minute install, it's a bit annoying, but on a 3 hour install it's unnoticeable.
Hello
The variable is set
sap_swpm_sapinst_path: /sapinst/swpm/sap_swpm_extracted/
Ansible version and python version: (we are using Ansible Automation platform 2.4 with Ansible EE 2.15)
bash-4.4# ansible --version
ansible [core 2.15.8]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/home/runner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3.9/site-packages/ansible
ansible collection location = /home/runner/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/bin/ansible
python version = 3.9.18 (main, Sep 22 2023, 17:58:34) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)] (/usr/bin/python3.9)
jinja version = 3.1.2
libyaml = True
bash-4.4# python --version
Python 3.9.18
and the requirements.yml for the collections
collections:
  - name: community.general
    version: 6.5.0
  - name: redhat.rhel_system_roles
    version: 1.22.0
  - name: community.sap_install
    version: 1.4.0
I do not understand why we use 3 tasks and a poll 0 value when we could just use one task with a positive poll value, since we do not run other tasks concurrently. I can't explain the issue I'm having (empty output), but with my proposition I do not have any issues running the script.
@ZouhirYachou I explained this above. After a certain release of SAP SWPM 2.0 (SP10 I think), the Ansible Task that executed SAP SWPM would continue forever even though the sapinst process had exited successfully. It was almost impossible to diagnose, therefore a separation:
- run `sapinst` detached (async with poll 0)
- check the `sapinst` process every 60 seconds, and use `failed_when` if the watch/poll `.finished` was not `1` or the `.rc` was not `0`
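The separation described above can be sketched roughly as three tasks. This is an illustration, not the role's actual code: `__sapinst_command` and the register names are hypothetical, while the task names, the 60 second delay and the 1000 retries mirror the task output shown earlier in this thread.

```yaml
# Sketch only, not the role's actual code. __sapinst_command and the
# register names are hypothetical.
- name: SAP SWPM - start sapinst detached
  ansible.builtin.shell: "{{ __sapinst_command }}"
  args:
    chdir: "{{ sap_swpm_sapinst_path }}"
  async: 32400
  poll: 0            # do not wait; return immediately with a job id
  register: __sapinst_async_job

- name: SAP SWPM - wait for sapinst process to exit, poll every 60 seconds
  community.general.pids:
    name: sapinst
  register: __sapinst_pids
  until: __sapinst_pids.pids | length == 0
  retries: 1000
  delay: 60

- name: SAP SWPM - verify that sapinst finished successfully
  ansible.builtin.async_status:
    jid: "{{ __sapinst_async_job.ansible_job_id }}"
  register: __sapinst_result
  # Fail if the async job never reported completion or exited non-zero.
  failed_when: >-
    __sapinst_result.finished != 1
    or (__sapinst_result.rc | default(1)) != 0
```

Because the second task watches the process list rather than the SSH session, a session timeout would not leave the play hanging, which is the motivation given above.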
I'll run SAP SWPM today with false entries that trigger a failure, using the versions you provided, to replicate your issue.
@ZouhirYachou I have attempted:
- the `ansible.builtin.async_status` Ansible Module
- `community.general` version 6.5.0, to assess the `community.general.pids` Ansible Module
I cannot replicate your output (and subsequent failure) from my laptop. Therefore I have to conclude there is something about the specific setup, and I must run a test from Ansible Automation Platform with Ansible EE 2.15.
Can you please describe the steps you used to upload and execute your Playbook from AAP? I've never used it before and want to be sure the setup is identical to yours.
I have synced the sap_install collection to our internal Automation Hub and we then use it in AAP. We used the Red Hat documentation for the setup.
@ZouhirYachou which documentation specifically?
Like I said, I have never used AAP before and will need to set up everything identically to yours.
This documentation to configure the Hub with AAP https://access.redhat.com/documentation/en-us/red_hat_ansible_automation_platform/2.4/html/getting_started_with_automation_hub/configure-hub-primary#proc-configure-automation-hub-server-gui
and this documentation to sync content from ansible galaxy https://access.redhat.com/documentation/en-us/red_hat_ansible_automation_platform/2.4/html/managing_content_in_automation_hub/managing-cert-valid-content#assembly-creating-tokens-in-automation-hub
@Sean: It should be easier and possible to pull the EE and run from ansible-navigator. @Zouhir: In AAP it is recommended to create an AAP with the 3 collections derived from your EE and not bind mount the collection into the container (although this is possible and should work)
Hello. The task that runs the install script ./sapinst should not be run as async and then monitored with other tasks. When the script fails, Ansible does not provide the stderr and stdout for this script, rendering troubleshooting impossible. There should be one task to run the script.
This task https://github.com/sap-linuxlab/community.sap_install/blob/main/roles/sap_swpm/tasks/swpm.yml#L64 should be:
and remove the following tasks that monitor the script. There is no need to retrieve the RC and output, as the shell module already does this.
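A minimal sketch of what such a single task might look like, assuming the `async: 1800` and 30 second poll values mentioned elsewhere in this thread. `__sapinst_command` and `__sapinst_result` are hypothetical names, and this is not the exact task from the proposal.

```yaml
# Sketch only: __sapinst_command is a hypothetical stand-in for the
# command string that the role displays before running sapinst.
- name: SAP SWPM - run sapinst in a single task
  ansible.builtin.shell: "{{ __sapinst_command }}"
  args:
    chdir: "{{ sap_swpm_sapinst_path }}"
  async: 1800   # 30 minutes; long installs would need a higher limit
  poll: 30      # Ansible reports the job status every 30 seconds
  register: __sapinst_result

- name: SAP SWPM - show the captured output
  ansible.builtin.debug:
    msg: "{{ __sapinst_result.stdout_lines }}"
  when: sap_swpm_display_unattended_output | default(false) | bool
```

With a positive `poll`, the task itself waits for the job and fails on a non-zero return code, so the result already carries `rc`, `stdout` and `stderr`.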