openhpi2 / Open-HPI

Open HPI is an open source implementation of the SA Forum's Hardware Platform Interface (HPI). HPI provides an abstracted interface for managing computer hardware, typically for chassis and rack based servers.

Openhpi reports Server in slot # is removed / added during OA switchover #2562

Closed openhpi2 closed 8 years ago

openhpi2 commented 9 years ago

During OA switchover, ‘Server in slot # is removed/added’ is reported for some of the blades that are already present in the enclosure, as shown below.

Snippet of /var/log/messages:

oa_soap_re_discover.c:164: Re-discovery started
Oct 8 06:29:08 localhost openhpid: oa_soap: oa_soap_re_discover.c:848: Server in slot 16 is removed
Oct 8 06:29:10 localhost openhpid: oa_soap: oa_soap_re_discover.c:865: Server in slot 16 is added
Oct 8 06:29:23 localhost openhpid: ssl: oh_ssl.c:531: Socket connect failed with error: Connection refused
Oct 8 06:29:23 localhost openhpid: oa_soap: oa_soap_callsupport.c:652: oh_ssl_connect() failed
Oct 8 06:29:23 localhost openhpid: oa_soap: oa_soap_callsupport.c:1098: failed to communicate with OA during soap_call()
Oct 8 06:29:51 localhost openhpid: oa_soap: oa_soap_re_discover.c:253: Re-discovery completed

Further analysis showed that during switchover the getBladeInfoArrayResponse has partNumber and serialNumber set to [Unknown] for some of the blades. When these fields are absent from the response structure, we go ahead and remove and re-add that particular blade. Because the OA response carries no data for these fields during switchover, we get these messages. We need to modify the logic in our code to handle this scenario.

The OA XML response during switchover, with partNumber and serialNumber set to [Unknown] for that particular blade, is shown below:

<hpoa:bladeInfo>
  <hpoa:bayNumber>9</hpoa:bayNumber>
  <hpoa:presence>PRESENT</hpoa:presence>
  <hpoa:bladeType>BLADE_TYPE_SERVER</hpoa:bladeType>
  <hpoa:width>1</hpoa:width>
  <hpoa:height>1</hpoa:height>
  <hpoa:name>HP ProLiant BL495c G5</hpoa:name>
  <hpoa:manufacturer>HP</hpoa:manufacturer>
  <hpoa:partNumber>[Unknown]</hpoa:partNumber>
  <hpoa:sparePartNumber>488623-001</hpoa:sparePartNumber>
  <hpoa:serialNumber>[Unknown]</hpoa:serialNumber>
  <hpoa:serverName>[Unknown]</hpoa:serverName>
  <hpoa:uuid></hpoa:uuid>
  <hpoa:rbsuOsName>[Unknown]</hpoa:rbsuOsName>
  <hpoa:assetTag>[Unknown]</hpoa:assetTag>
  <hpoa:romVersion>[Unknown]</hpoa:romVersion>
  <hpoa:numberOfCpus>0</hpoa:numberOfCpus>
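The condition described above, a blade entry whose partNumber or serialNumber comes back as [Unknown], could be detected with a check along these lines. This is only a minimal sketch and not the oa_soap plug-in's actual code: `struct blade_info_view`, `field_is_unknown()` and `blade_identity_is_unknown()` are hypothetical names, and treating either field (rather than both) as sufficient is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Placeholder string seen in the OA response for unpopulated fields. */
#define OA_UNKNOWN_FIELD "[Unknown]"

/* Hypothetical, trimmed-down view of one bladeInfo entry
 * (not the plug-in's real structure). */
struct blade_info_view {
    int bay_number;
    const char *part_number;
    const char *serial_number;
};

/* True if the field is missing, empty, or the "[Unknown]" placeholder. */
bool field_is_unknown(const char *value)
{
    return value == NULL || value[0] == '\0' ||
           strcmp(value, OA_UNKNOWN_FIELD) == 0;
}

/* True if the entry cannot identify the blade. Assumption: one unknown
 * field out of partNumber/serialNumber is enough to flag the entry. */
bool blade_identity_is_unknown(const struct blade_info_view *info)
{
    return field_is_unknown(info->part_number) ||
           field_is_unknown(info->serial_number);
}
```

A re-discovery path could call `blade_identity_is_unknown()` on each entry before deciding how to treat the slot; what it does with the result is a separate policy question.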

Reported by: openhpi2

openhpi2 commented 9 years ago

Original comment by: dr_mohan

openhpi2 commented 9 years ago

Original comment by: dr_mohan

openhpi2 commented 9 years ago

There is a situation where a blade with a normal serial number could be replaced with a blade that has an unknown or wrong serial number. So it is better to replace the blade during re-discovery if the blade comes out as unknown, even if the blade that is already there has a proper serial number. In our testing we found that if we do not replace it, we could end up with two blades with the same serial number when two blades (one good and one bad) are swapped during a switchover. We could make a code change to give the user a message if an improper (non-alphanumeric) serial number is present.
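The policy above (still replace the blade when the reported serial number differs from the stored one, but flag improper serial numbers) might look roughly like the following. It is a sketch under assumptions, not the change made in the actual checkin: `serial_is_alphanumeric()`, `decide_rediscover()` and the `rediscover_action` enum are hypothetical names, and comparing only serial numbers is a simplification.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of comparing a cached blade with the OA report. */
enum rediscover_action { BLADE_KEEP, BLADE_REPLACE };

/* True only for a non-empty string made of letters and digits
 * (as classified by isalnum() in the current C locale). */
bool serial_is_alphanumeric(const char *serial)
{
    if (serial == NULL || serial[0] == '\0')
        return false;
    for (const char *p = serial; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p))
            return false;
    }
    return true;
}

/* Keep the blade only when the reported serial matches the cached one.
 * Any mismatch, including a reported "[Unknown]", leads to a replace.
 * *warn is set when the reported serial is improper, so the caller can
 * log a message for the user. */
enum rediscover_action decide_rediscover(const char *cached,
                                         const char *reported,
                                         bool *warn)
{
    *warn = !serial_is_alphanumeric(reported);
    if (cached != NULL && reported != NULL &&
        strcmp(cached, reported) == 0)
        return BLADE_KEEP;
    return BLADE_REPLACE;
}
```

With this shape, a blade reported as [Unknown] is still removed and re-added, which avoids ending up with two slots sharing one serial number, while the warning flag tells the user why the replace happened.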

Original comment by: dr_mohan

openhpi2 commented 9 years ago

Original comment by: dr_mohan

openhpi2 commented 9 years ago

Fixed with checkin #7616.

Original comment by: dr_mohan

openhpi2 commented 9 years ago

Original comment by: dr_mohan