Closed flystar0439 closed 7 years ago
@zet809 We need to check whether we can recreate this issue on a local xCAT system.
@daniceexi @zet809 @pdlun92 I think we did see this happening in our local system, but not 100% sure how to reproduce it consistently. Can we show the BMC closing outside of rcons? It would help us go to the firmware team with something...
I have Little Endian Machines for one of my customers, if it helps to test, then happy to do it.
It is already provisioned, and on post-install duties
@whowutwut As @zet809 mentioned, we can recreate this problem using just ipmitool, so we can take this fact to the firmware team. Is there an official way to discuss this with the firmware team or open a bug against them?
@arif-ali Can you recreate this issue consistently on your LE machines?
I don't see an issue myself, but if there is anything I can do to try and reproduce it, I will be happy to do it.
@arif-ali do you have an environment where you have the same open power machines managed by two xcat mgmt nodes? I seem to recall we saw this happening when we were sharing machines in our development env moving them between mgmt nodes...
No, I only have one management node. The only difference in my console setup is that I enable consoleondemand in the site table, since console logging can otherwise fill up /var/log/console while going through the installation phases.
@whowutwut The creator of the issue mentioned that he only has one MN, so the root cause of this issue is NOT multiple MNs.
@daniceexi, @zet809, @chenglch We got some more information about this in email discussion.... It seems like the solution was using ipmitool-xcat-1.8.11 which is the wrong version. But something interesting, when the team compiled ipmitool from source for 1.8.15, the problem of console dropping seemed to go away.
We only really make two changes to the ipmitool package
ipmitool-1.8.15-rflash.patch <== could this be the issue?
We need to try to re-create this issue in our lab with ipmitool-xcat and then verify whether ipmitool compiled from source behaves better....
@dillonfzw, in an email exchange you mentioned that you were running with ipmitool-xcat 1.8.15. Do you see this console dropping issue?
It would be good news if 1.8.15 fixes this issue.
Need to do more investigation together with Firmware team. Move to next release.
@flystar0439 Could you please verify this issue with our latest ipmitool-xcat-1.8.15-3?
I am seeing this problem, too. I had ipmitool-xcat-1.8.15-2 installed and updated it to ipmitool-xcat-1.8.15-3, but I still see the issue.
When I use "ipmitool-xcat ... sol activate" directly, I never see these "SOL session closed by BMC" messages.
When I am using conserver with consoleondemand=no, I see the "SOL session closed by BMC" message every few seconds. It also seems to interfere with the system's ability to boot.
This is on an IBM "Habanero" system - 8348-21C with the latest (as of 09/22/2016) firmware.
[root@smn ~]# /opt/xcat/bin/ipmitool-xcat -I lanplus -U <...> -P <...> -H 50.2.0.68 mc info
Device ID : 32
Device Revision : 1
Firmware Revision : 2.16
IPMI Version : 2.0
Manufacturer ID : 0
Manufacturer Name : Unknown
Product ID : 43707 (0xaabb)
Product Name : Unknown (0xAABB)
Device Available : yes
Provides Device SDRs : no
Additional Device Support :
    Sensor Device
    SDR Repository Device
    SEL Device
    FRU Inventory Device
    IPMB Event Receiver
    IPMB Event Generator
    Chassis Device
Aux Firmware Rev Info :
    0xaa
    0x66
    0x01
    0x00
[root@smn ~]# /opt/xcat/bin/ipmitool-xcat -I lanplus -U <...> -P <...> -H 50.2.0.68 fru print 43
Product Name : OpenPOWER Firmware
Product Version : IBM-habanero-ibm-OP8_v1.7_1.62
Product Extra : hostboot-bc98d0b-1a29dff
Product Extra : occ-0362706-16fdfa7
Product Extra : skiboot-5.1.13
Product Extra : hostboot-binaries-43d5a59
Product Extra : habanero-xml-a71550e-cdd3b31
Product Extra : capp-ucode-105cb8f
[root@smn ~]# /opt/xcat/bin/ipmitool-xcat -V
ipmitool-xcat version 1.8.15
[root@smn ~]# rpm -q --whatprovides /opt/xcat/bin/ipmitool-xcat
ipmitool-xcat-1.8.15-3.ppc64le
[root@smn ~]# rpm -q --whatprovides /etc/rc.d/init.d/conserver
conserver-xcat-8.1.16-10.ppc64le
[root@smn ~]# rpm -qa | grep xCAT
xCAT-2.12-snap201606240318.ppc64le
perl-xCAT-2.12-snap201606240317.noarch
xCAT-server-2.12-snap201606240318.noarch
xCAT-genesis-base-ppc64-2.12-snap201606130335.noarch
xCAT-buildkit-2.12-snap201606240317.noarch
xCAT-client-2.12-snap201606240318.noarch
xCAT-genesis-scripts-ppc64-2.12-snap201606240317.noarch
xCAT-vlan-2.12-snap201606240317.noarch
This release is mainly for Minsky, so I am moving this defect to the next release for further investigation.
Reference to #1963
@whowutwut, I noticed this issue on our frame23cn18; it happens fairly frequently when rebooting the machine. But this issue cannot be fixed from the client side (ipmitool).
[Tue Nov 1 03:32:28 2016] 33.27322|ISTEP 11.10
[Tue Nov 1 03:32:30 2016]SOL session closed by BMC
[-- Console down -- Tue Nov 1 03:32:30 2016]
[-- Console up -- Tue Nov 1 03:32:31 2016]
[Tue Nov 1 03:32:32 2016][SOL Session operational. Use ~? for help]
[Tue Nov 1 03:32:33 2016] 37.08533|ISTEP 13. 8
[Tue Nov 1 03:32:33 2016] 37.43448|ISTEP 13. 9
[Tue Nov 1 03:32:34 2016] 39.19401|ISTEP 13.10
[Tue Nov 1 03:33:21 2016] 94.76739|ISTEP 13.11
[Tue Nov 1 03:33:21 2016] 94.96537|ISTEP 13.12
[Tue Nov 1 03:33:21 2016] 94.96604|ISTEP 14. 1
[Tue Nov 1 03:33:22 2016] 95.28645|ISTEP 14. 2
[Tue Nov 1 03:33:22 2016] 95.29670|ISTEP 14. 3
[Tue Nov 1 03:33:25 2016] 98.59058|ISTEP 14. 4
[Tue Nov 1 03:33:25 2016] 98.64294|ISTEP 14. 5
[Tue Nov 1 03:33:25 2016] 98.71250|ISTEP 14. 6
[Tue Nov 1 03:33:25 2016] 98.71391|ISTEP 14. 7
[Tue Nov 1 03:33:25 2016] 98.88211|ISTEP 14. 8
[Tue Nov 1 03:33:25 2016] 98.88377|ISTEP 15. 1
[Tue Nov 1 03:33:26 2016] 99.72602|ISTEP 15. 2
[Tue Nov 1 03:33:26 2016] 99.74654|ISTEP 15. 3
[Tue Nov 1 03:33:26 2016] 99.86674|ISTEP 16. 1
[Tue Nov 1 03:33:26 2016]101.03400|ISTEP 16. 2
[Tue Nov 1 03:33:26 2016]101.47670|ISTEP 16. 3
[Tue Nov 1 03:33:26 2016]101.53993|ISTEP 16. 4
[Tue Nov 1 03:33:26 2016]101.51723|ISTEP 18.13
[Tue Nov 1 03:33:26 2016]101.65641|ISTEP 18.14
[Tue Nov 1 03:33:26 2016]101.66685|ISTEP 20. 1
[Tue Nov 1 03:33:28 2016]102.30749|ISTEP 21. 1
[Tue Nov 1 03:33:30 2016]SOL session closed by BMC
[-- Console down -- Tue Nov 1 03:33:30 2016]
[-- Console up -- Tue Nov 1 03:33:30 2016]
[Tue Nov 1 03:33:32 2016][SOL Session operational. Use ~? for help]
[Tue Nov 1 03:33:39 2016]115.39851|htmgt|OCCs are now running in ACTIVE state
[Tue Nov 1 03:33:46 2016]123.34082|ISTEP 21. 2
[Tue Nov 1 03:33:46 2016]123.33555|ISTEP 21. 3
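To quantify how often the BMC drops the session in a log like the one above, a small parser can help when taking the data to the firmware team. This is only a sketch: it assumes the log lines carry the bracketed timestamp format shown in this thread and an English locale for the day and month names; the function names `find_sol_drops` and `drop_intervals` are made up for illustration and are not part of xCAT or conserver.

```python
import re
from datetime import datetime

# Matches lines such as "[Tue Nov 1 03:32:30 2016]SOL session closed by BMC".
# Assumption: the timestamp layout is the one seen in this thread's log.
DROP_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s*SOL session closed by BMC")
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def find_sol_drops(lines):
    """Return the timestamps of every 'SOL session closed by BMC' line."""
    drops = []
    for line in lines:
        match = DROP_RE.match(line.strip())
        if match:
            drops.append(datetime.strptime(match.group("ts").strip(), TS_FORMAT))
    return drops


def drop_intervals(drops):
    """Return the gaps, in seconds, between consecutive drops."""
    return [(later - earlier).total_seconds()
            for earlier, later in zip(drops, drops[1:])]
```

Calling `find_sol_drops` on an open console log file and passing the result to `drop_intervals` gives the number of drops and the spacing between them, which is easier to compare across ipmitool-xcat versions than raw log output.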
@flystar0439 We have fixed this issue with our latest ipmitool-xcat-1.8.17-1 build, will you pls help to verify? Thx!
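For anyone verifying this, a quick check of the installed package string (as printed by `rpm -q --whatprovides /opt/xcat/bin/ipmitool-xcat` earlier in the thread) against the build named above might look like the sketch below. It assumes a purely numeric version and release and uses a simple tuple comparison rather than rpm's full version-comparison rules; the helper names `parse_pkg` and `has_fix` are hypothetical. Whether 1.8.17-1 actually resolves the drops was not confirmed in this thread.

```python
import re

# Assumption: package strings look like "ipmitool-xcat-1.8.15-3.ppc64le",
# with a numeric dotted version, a numeric release, and an optional arch.
PKG_RE = re.compile(r"^ipmitool-xcat-(\d+(?:\.\d+)*)-(\d+)(?:\.\w+)?$")


def parse_pkg(nvr):
    """Split a package string into ((major, minor, ...), release)."""
    match = PKG_RE.match(nvr.strip())
    if not match:
        raise ValueError(f"unrecognized package string: {nvr!r}")
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, int(match.group(2))


def has_fix(nvr, fixed="ipmitool-xcat-1.8.17-1"):
    """True if nvr is at least the build the maintainers named as fixed."""
    return parse_pkg(nvr) >= parse_pkg(fixed)
```

Because this is not rpm's comparison algorithm, treat it as a convenience check for the simple version strings seen here, not as a general-purpose comparator.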
no response, closing out.
When the TTY is in a daze, rcons repeatedly reports many 'session closed' entries, which messes up the screen output. We should keep a long-lived connection with the BMC. I have only noticed this issue on OpenPOWER systems, which use the BMC for hardware control (IPMI).