Closed jurajhajka closed 2 years ago
Can you state exactly which driver level the upgrade started from (I guess it was 942) and which level it targeted? Please share the playbook if possible.
Did the HMC come back online after the failure? Is it pingable?
Also, please share the output of the command lshmc -v
I applied ifix MH01857 for the 941 level. The HMC came back online, and the output captured after the reboot shows the ifix is installed. Then I updated to the 950 level with the same result: the HMC came back at the correct updated level, but the playbook finished with the same error.
hscroot@vhmcansible:~> lshmc -V
"version= Version: 9
 Release: 2
 Service Pack: 950
HMC Build level 2010230054
","base_version=V9R2
"
We typically see the error "Hmc not responding after reboot" when the HMC is still not pingable after waiting 60 minutes post update/upgrade. Can you confirm whether it really took that long to come back online after the reboot?
MH01857 is a very small ifix, only a few KB. The vHMC was back online in 10 min. The upgrade to 950 was also faster than 60 min.
Some upgrades/patches could take more than 60 min.
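The wait-for-reboot behaviour discussed above can be pictured as a polling loop with a deadline. The following is only a minimal sketch, not the module's actual implementation: the name wait_until_alive, its parameters, and the 30-second interval are assumptions made for illustration, and the 60-minute default simply mirrors the figure mentioned in this thread.

```python
import time


def wait_until_alive(check, timeout_s=3600, interval_s=30,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns "Alive" or timeout_s has elapsed.

    check is any zero-argument callable returning a status string
    (hypothetical interface, chosen to match the pingTest helper in
    this thread). clock and sleep are injectable so the loop can be
    exercised without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check() == "Alive":
            return True
        if clock() >= deadline:
            # Still unreachable once the deadline has passed.
            return False
        sleep(interval_s)
```

With the pingTest helper from this thread, a call might look like wait_until_alive(lambda: pingTest("<hmc_ip>")). Note that a loop like this can only succeed if the check itself recognises a healthy reply; if the check always reports "No response", the loop runs to the deadline even when the HMC is back within minutes.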
Can you run the Python snippet below from the Ansible control node (the node on which the playbook is triggered)? Replace the <hmc_ip> placeholder with the IP address of your HMC.
import re
import subprocess


def pingTest(i_host):
    pattern = re.compile(r"(\d) received")
    report = ("No response", "Partial Response", "Alive")
    cmd = "ping -c 2 " + i_host.strip()
    result = "No response"
    with subprocess.Popen(cmd, shell=True, executable="/bin/bash",
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE) as proc:
        stdout_value, stderr_value = proc.communicate()
        if isinstance(stdout_value, bytes):
            stdout_value = stdout_value.decode("ascii")
        igot = re.findall(pattern, stdout_value)
        if igot:
            result = report[int(igot[0])]
    return result


print(pingTest("<hmc_ip>"))
The issue appears to be that the ping output on your control node differs slightly from what is usually expected on a Linux machine.
Instead of 2 packets transmitted, 2 received, 0% packet loss, time 1025ms
it is giving
2 packets transmitted, 2 packets received, 0% packet loss, time 1025ms
Can you please run the below modified code snippet again to confirm that?
import re
import subprocess


def pingTest(i_host):
    # Accept both "2 received" and "2 packets received" in the ping summary.
    pattern = re.compile(r"(\d) (packets\s)?received")
    report = ("No response", "Partial Response", "Alive")
    cmd = "ping -c 2 " + i_host.strip()
    result = "No response"
    with subprocess.Popen(cmd, shell=True, executable="/bin/bash",
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE) as proc:
        stdout_value, stderr_value = proc.communicate()
        if isinstance(stdout_value, bytes):
            stdout_value = stdout_value.decode("ascii")
        igot = re.findall(pattern, stdout_value)
        if igot:
            # Two capture groups make findall return tuples; take the digit.
            result = report[int(igot[0][0])]
    return result


print(pingTest("<hmc_ip>"))
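The difference between the two patterns can be checked without sending any pings by running them against the two summary lines quoted in this thread. This is a small illustrative sketch; the names OLD_PATTERN, NEW_PATTERN, and received_count are made up for the example and are not part of the module.

```python
import re

OLD_PATTERN = re.compile(r"(\d) received")
NEW_PATTERN = re.compile(r"(\d) (packets\s)?received")

# The two ping summary lines quoted in this thread.
EXPECTED_SUMMARY = "2 packets transmitted, 2 received, 0% packet loss, time 1025ms"
OBSERVED_SUMMARY = "2 packets transmitted, 2 packets received, 0% packet loss, time 1025ms"


def received_count(pattern, text):
    """Return the received-packet count as an int, or None if no match."""
    match = pattern.search(text)
    return int(match.group(1)) if match else None


for label, line in (("expected", EXPECTED_SUMMARY), ("observed", OBSERVED_SUMMARY)):
    print(label, received_count(OLD_PATTERN, line), received_count(NEW_PATTERN, line))
# expected 2 2
# observed None 2
```

The old pattern finds nothing in the "2 packets received" form, so the helper falls through to its default "No response". Using search with group(1) here also sidesteps the tuple indexing that findall requires once a second capture group is added.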
I'm curious which Linux flavour you are using on the control node.
We are running control node on AIX
ansau@a9tvap105:/home/ansau$ python hmc_test2.py
Alive
ansau@a9tvap105:/home/ansau$
looks much better
diff hmc_test.py hmc_test2.py
4c4
< pattern = re.compile(r"(\d) received")
---
> pattern = re.compile(r"(\d) (packets\s)?received")
8c9
< with subprocess.Popen(cmd, shell=True, executable="/usr/bin/bash",
---
> with subprocess.Popen(cmd, shell=True, executable="/bin/bash",
18c19
< result = report[int(igot[0])]
---
> result = report[int(igot[0][0])]
23d23
Added the fix in commit a9bd5b2a497e3. It will be available in the latest version, v1.6.0.
Thank you.
Describe the bug
The HMC_update_upgrade module finishes with "FAILED: Hmc not responding after reboot".
TASK [debug] **
Tuesday 23 August 2022 07:51:03 EDT (0:00:00.092) 0:00:02.326 **
ok: [vhmc_ansible] =>
  missing_ifixes:

TASK [Installing missing ifixes] **
Tuesday 23 August 2022 07:51:03 EDT (0:00:00.067) 0:00:02.394 **
included: /home/ansau/project/fiserv-ansible/playbooks/hmc_update.yml for vhmc_ansible

TASK [Update the HMC to the V9R2M952 build level from sftp location] **
Tuesday 23 August 2022 07:51:03 EDT (0:00:00.105) 0:00:02.499 **
fatal: [vhmc_ansible]: FAILED! => changed=false
  msg: 'FAILED: Hmc not responding after reboot'
...ignoring

TASK [pause] **
Tuesday 23 August 2022 08:53:52 EDT (1:02:48.995) 1:02:51.495 **
[pause]
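The timestamps in this log show how long the failed update task ran before the playbook moved on. A quick check of the arithmetic, using only the two times printed above (the variable names are illustrative):

```python
from datetime import datetime, timedelta

FMT = "%d %B %Y %H:%M:%S"

# Start of the failing update task and start of the following pause task,
# both copied from the log above (EDT in both cases, so no zone handling).
task_start = datetime.strptime("23 August 2022 07:51:03", FMT)
next_task_start = datetime.strptime("23 August 2022 08:53:52", FMT)

elapsed = next_task_start - task_start
print(elapsed)                          # 1:02:49
print(elapsed > timedelta(minutes=60))  # True
```

This agrees with the (1:02:48.995) duration that Ansible prints for the task, and it is consistent with a wait of roughly 60 minutes for the HMC to respond before the module reports the failure, as described elsewhere in this thread.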
hscroot@vhmcansible:~> who -b
system boot Aug 23 11:53
hscroot@vhmcansible:~> lshmc -V
"version= Version: 9
 Release: 1
 Service Pack: 942
HMC Build level 2011270432
MH01759 - HMC V9R1 M920 [x86_64]
MH01787 - Required fix for HMC V9R1 M920 [x86_64]
MH01789 - HMC V9R1 Service Pack 1 Release (M921) [x86_64]
MH01800 - iFix for HMC V9R1 M921
MH01808 - iFix for HMC V9R1 M921
MH01810 - HMC V9R1 M930
MH01820 - iFix for HMC V9R1 M910+
MH01825 - iFix for HMC V9R1 M930
MH01857 - Save upgrade fix for HMC V9R1 M910+
MH01876 - HMC V9R1 M942
","base_version=V9R1
"
hscroot@vhmcansible:~>
Expected behavior
Reconnect to the rebooted HMC and check versions.
Environment (please complete the following information):
HMC: tested with several versions of HMC code [V9R952, V9R1M910]
Python 3.7.12
OpenSSH_8.1p1, OpenSSL 1.0.2u