openhpi2 / openhpi_bug_test


Inventory RDRs no longer visible after Hotswap AMC #1279

Open openhpi2 opened 17 years ago

openhpi2 commented 17 years ago

After extracting, and then re-inserting an AMC on an ATCA Blade (the ATCA Blade is not extracted), the Inventory RDRs are no longer visible. In fact, ALL of the Sensor RDRs are also no longer visible, with the exception of the Hotswap Sensor.

Before extraction of AMC:

+--- {SYSTEM_CHASSIS,1}{PHYSICAL_SLOT,4}{193,5}
    __ Sensor Num: 1, Type: OEM_SENSOR, Category: SENSOR_SPECIFIC, Tag: FRU1 Hot Swap0
    __ Sensor Num: 114, Type: (null), Category: STATE, Tag: B1:Ver change0
    __ Sensor Num: 113, Type: PLATFORM_ALERT, Category: STATE, Tag: B1:IpmC Reboot0
    __ Sensor Num: 112, Type: PLATFORM_ALERT, Category: STATE, Tag: B1:Health Error0
    __ Sensor Num: 111, Type: TEMPERATURE, Category: THRESHOLD, Tag: B1:Temp PSU In
    __ Sensor Num: 110, Type: TEMPERATURE, Category: THRESHOLD, Tag: B1:Temp PSU Out
    __ Sensor Num: 109, Type: TEMPERATURE, Category: THRESHOLD, Tag: B1:Temp Exhaust
    __ Sensor Num: 108, Type: TEMPERATURE, Category: THRESHOLD, Tag: B1:Temp AirInlet
    __ Sensor Num: 107, Type: VOLTAGE, Category: THRESHOLD, Tag: B1:Board BlueLed
    __ Sensor Num: 106, Type: VOLTAGE, Category: THRESHOLD, Tag: B1:Board 12v
    __ Sensor Num: 105, Type: VOLTAGE, Category: THRESHOLD, Tag: B1:Board 3.3VSUS
    __ Sensor Num: 104, Type: VOLTAGE, Category: THRESHOLD, Tag: B1:Board 5V
    __ Sensor Num: 103, Type: OEM_SENSOR, Category: SENSOR_SPECIFIC, Tag: B1:IPMBL State0
    __ Sensor Num: 102, Type: OEM_SENSOR, Category: SENSOR_SPECIFIC, Tag: B1:ModuleHotSwap0
    __ Sensor Num: 101, Type: OEM_SENSOR, Category: AVAILABILITY, Tag: B1:FRU Agent0
    __ Sensor Num: 100, Type: OEM_SENSOR, Category: GENERIC, Tag: B1:IPMI Info-20
    __ Sensor Num: 99, Type: OEM_SENSOR, Category: GENERIC, Tag: B1:IPMI Info-10
    __ Control Num: 0, Type: OEM, Output Type: LED, Tag: Blue LED
    __ Control Num: 1, Type: OEM, Output Type: LED, Tag: LED 1
    __ Control Num: 2, Type: OEM, Output Type: LED, Tag: LED 2
    __ Inventory Idr Num: 1, Num Areas: 2, Tag: B1:AM45X0

After Re-Inserting AMC:

+--- {SYSTEM_CHASSIS,1}{PHYSICAL_SLOT,4}{193,5}
    __ Sensor Num: 1, Type: OEM_SENSOR, Category: SENSOR_SPECIFIC, Tag: FRU1 Hot Swap0

Reported by: ppothier

openhpi2 commented 16 years ago

Logged In: YES user_id=1549494 Originator: YES

Now running openhpi-2.10.1, commenting out one line in ipmi_discover.cpp cIpmiMcThread::HandleHotswapEvent() appears to correct this behavior...

    case eIpmiFruStateNotInstalled:
        // We care only if it's the MC itself
    #if 0
        if ( sensor->Resource()->FruId() == 0 )
    #endif
        {
            // remove mc
            WriteLock();

            if ( m_mc )
                m_domain->CleanupMc( m_mc );

            WriteUnlock();

            m_mc = 0;
        }
        break;

After commenting out the check for FruId==0, both Inventory and Sensor information are correctly provided after pulling out and then re-inserting the AMC card.

Original comment by: ppothier

openhpi2 commented 16 years ago

Original comment by: renierm

openhpi2 commented 16 years ago

Logged In: YES user_id=944149 Originator: NO

This is not the right fix. If you do that, the removal of any FRU will remove the corresponding MC and all of its managed FRUs. In your case, removing the AMC module also removes the ATCA carrier board resource (which is re-added when you reinsert the AMC module). The line in question was added to fix bug #1120217.
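The side effect described above can be illustrated with a toy model. This is only a sketch: `ToyMc`, `MakeCarrierWithAmc` and `HandleNotInstalled` are hypothetical names invented for illustration and are not part of the ipmidirect plugin. It assumes that cleaning up an MC drops every FRU resource the MC manages.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a management controller (MC) and the FRU
// resources it manages. This is NOT the real cIpmiMc class.
struct ToyMc {
    std::map<unsigned, std::string> frus;  // FRU id -> resource tag
    bool alive;
};

// Builds an MC that manages a carrier board (FRU 0) and one AMC (FRU 1).
ToyMc MakeCarrierWithAmc() {
    ToyMc mc;
    mc.alive = true;
    mc.frus[0] = "ATCA carrier";
    mc.frus[1] = "AMC";
    return mc;
}

// Mimics the "not installed" branch of the hot-swap handler.
// guard_on_fru0 == true  : behaviour with the FruId() == 0 check in place.
// guard_on_fru0 == false : behaviour with the check compiled out.
void HandleNotInstalled(ToyMc &mc, unsigned fru_id, bool guard_on_fru0) {
    if (guard_on_fru0 && fru_id != 0)
        return;        // a sub-FRU was removed: the MC is left untouched
    mc.frus.clear();   // the MC and every FRU it manages are dropped
    mc.alive = false;
}
```

Calling `HandleNotInstalled(mc, 1, true)` leaves both resources in place, including the AMC entry that should have gone away, while `HandleNotInstalled(mc, 1, false)` also drops the carrier board.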

Original comment by: psangouard

openhpi2 commented 16 years ago

Original comment by: renierm

openhpi2 commented 16 years ago

Original comment by: renierm

openhpi2 commented 16 years ago

Original comment by: renierm

openhpi2 commented 16 years ago

Logged In: YES user_id=660960 Originator: NO

Anything new on this bug?

Original comment by: renierm

openhpi2 commented 16 years ago

Original comment by: renierm

openhpi2 commented 16 years ago

Logged In: YES user_id=364243 Originator: NO

I posted a patch for this issue to openhpi-devel ML.

https://sourceforge.net/mailarchive/message.php?msg_name=20080314191240.093b0915.ashie%40homa.ne.jp

Although I think the patch isn't perfect, it works fine for me.

I'm now anxious about this issue because Pierre has left the project and currently no one is taking care of it.

Original comment by: makeinu

openhpi2 commented 16 years ago

Original comment by: renierm

openhpi2 commented 16 years ago

Logged In: YES user_id=1884926 Originator: NO

Will take a look at this patch and get back to you in a couple of days. -- Shuah

Original comment by: shuah

openhpi2 commented 16 years ago

Logged In: YES user_id=1884926 Originator: NO

Could you please attach the patch to the defect?

Original comment by: shuah

openhpi2 commented 16 years ago

Logged In: YES user_id=364243 Originator: NO

How can I attach files to this BTS? In my view there seems to be no form for attaching files; the form exists only on the "Submit New" page.

It seems that only the originator can add files on this system.

Original comment by: makeinu

openhpi2 commented 16 years ago

Logged In: YES user_id=1884926 Originator: NO

Anyone should be able to upload files. Once you log in you should see an Upload radio button which lets you upload a file. If the patch is small enough, please add it as a comment. Looks like you were able to add a comment. You said your patch is not perfect, but works for you. Please elaborate on that comment as well and explain how your patch fixes the bug. -- Shuah

Original comment by: shuah

openhpi2 commented 16 years ago

Logged In: YES user_id=364243 Originator: NO

I've logged in, but I can't find the "upload" radio button.

I can find the form on some other bug entries that I opened, and on all bug entries in my own projects, but I can't find it here. I searched this page for "upload", and the only hit is your comment.

So I think this system doesn't allow followers to upload files.

I'll paste the patch as a comment and explain it later.

Original comment by: makeinu

openhpi2 commented 16 years ago

Logged In: YES user_id=364243 Originator: NO

Index: plugins/ipmidirect/ipmi_mc_vendor.cpp

    --- plugins/ipmidirect/ipmi_mc_vendor.cpp (revision 6710)
    +++ plugins/ipmidirect/ipmi_mc_vendor.cpp (working copy)
    @@ -627,16 +627,28 @@

    type = (SaHpiEntityTypeT)sdr->m_data[8];
    instance = (SaHpiEntityLocationT)sdr->m_data[9];

    parent_fru_id = sdrs->FindParentFru( type, instance, parent_type,

    cIpmiResource *res = FindResource( domain, sensor->Mc(), parent_fru_id, parent_type, parent_instance, sdrs );
    +

    if ( sdr ) {
    @@ -819,7 +832,8 @@
    parent_fru_id = sdrs->FindParentFru( type, instance, parent_type,

    stdlog << "CreateSensorEntityPath mc " << source_mc->GetAddress() << " FRU " << parent_fru_id << " type " << type << " instance " << instance << "\n";

Index: plugins/ipmidirect/ipmi_discover.cpp

    --- plugins/ipmidirect/ipmi_discover.cpp (revision 6710)
    +++ plugins/ipmidirect/ipmi_discover.cpp (working copy)
    @@ -247,6 +247,28 @@

    void
    +cIpmiMcThread::DiscoverForSubboard()
    +{

    cIpmiSensor *sensor = m_mc->FindSensor( (event->m_data[5] & 0x3), event->m_data[8] );

    + if ( event->m_data[7] == eIpmiSensorTypeAtcaHotSwap && sensor == 0 )

    m_mc = 0;

Index: plugins/ipmidirect/ipmi_sdr.cpp

    --- plugins/ipmidirect/ipmi_sdr.cpp (revision 6710)
    +++ plugins/ipmidirect/ipmi_sdr.cpp (working copy)
    @@ -1174,7 +1174,8 @@
    cIpmiSdrs::FindParentFru( SaHpiEntityTypeT type, SaHpiEntityLocationT instance, SaHpiEntityTypeT & parent_type,

    parent_type = SAHPI_ENT_UNSPECIFIED;
    parent_instance = 0;

    // First look for FRUs themselves
    for( unsigned int i = 0; i < NumSdrs(); i++ )
    @@ -1218,6 +1220,7 @@
    }
    }

    + exact_matched = false;
    stdlog << "Entity ID " << type << ", Instance " << instance << " is not a FRU\n";

    // SDR entity is not a FRU: look for association records

Index: plugins/ipmidirect/ipmi_mc.h

    --- plugins/ipmidirect/ipmi_mc.h (revision 6710)
    +++ plugins/ipmidirect/ipmi_mc.h (working copy)
    @@ -190,6 +190,7 @@
    cIpmiSel *Sel() const { return m_sel; }

    bool Cleanup(); // true => it is safe to destroy mc

    SaErrorT HandleNew();
    bool DeviceDataCompares( const cIpmiMsg &msg ) const;

Index: plugins/ipmidirect/ipmi_discover.h

    --- plugins/ipmidirect/ipmi_discover.h (revision 6710)
    +++ plugins/ipmidirect/ipmi_discover.h (working copy)
    @@ -82,6 +82,7 @@
    protected:
    // discover MC
    void Discover( cIpmiMsg *get_device_id_rsp = 0 );

    // poll mc task
    void PollAddr( void *userdata );

Index: plugins/ipmidirect/ipmi_sdr.h

    --- plugins/ipmidirect/ipmi_sdr.h (revision 6710)
    +++ plugins/ipmidirect/ipmi_sdr.h (working copy)
    @@ -173,7 +173,8 @@
    unsigned int FindParentFru( SaHpiEntityTypeT type, SaHpiEntityLocationT instance, SaHpiEntityTypeT & parent_type,

Index: plugins/ipmidirect/ipmi_mc.cpp

    --- plugins/ipmidirect/ipmi_mc.cpp (revision 6710)
    +++ plugins/ipmidirect/ipmi_mc.cpp (working copy)
    @@ -184,7 +184,31 @@
    return true;
    }

    +bool
    +cIpmiMc::CleanupResourceSensors(cIpmiResource *res)
    +{

    + for( node = m_sensors_in_my_sdr; node; node = g_list_next(node) )

Original comment by: makeinu

openhpi2 commented 16 years ago

Logged In: YES user_id=364243 Originator: NO

The main reasons for this bug are:

1. cIpmiMcThread::Discover() doesn't search for new resources and sensors on sub-boards such as an AMC.
2. cIpmiMcThread::HandleHotswapEvent() doesn't remove the resource that represents the AMC after the board is removed.

I added a discovery function for sub-boards in the attached patch to solve issue 1, but that is not enough because of issue 2. DiscoverForSubboard() is invoked only when a new sensor is found. Because of issue 2 the sensor already exists, so the function is never invoked again once the board has been inserted.

The solution described in the first comment of this bug entry accidentally solves both of the above problems.

* The patch removes both AMC and its parent resource when the AMC board is removed.

But of course removing the AMC resource with its parent is a wrong solution. So I improved this solution.
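The interaction between the two issues can be sketched with a small model. Everything here is hypothetical (`ToyMcState`, `OnHotSwapInsert`, `OnHotSwapExtract` and `CountDiscoveries` are illustrative names, not plugin code). It assumes only what is stated above: sub-board discovery runs when the hot-swap sensor is not yet known, and extraction may or may not clean up that sensor.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of the per-MC state relevant to the bug.
struct ToyMcState {
    std::set<unsigned> known_sensors;  // sensor numbers already discovered
    int subboard_discoveries = 0;      // how often sub-board discovery ran
};

// Insertion event: discovery runs only if the sensor is unknown,
// mirroring the "sensor == 0" condition in the patch.
void OnHotSwapInsert(ToyMcState &mc, unsigned sensor_num) {
    if (mc.known_sensors.count(sensor_num) == 0) {
        ++mc.subboard_discoveries;
        mc.known_sensors.insert(sensor_num);
    }
}

// Extraction event: cleanup == false models issue 2, where the stale
// sensor stays behind after the board is pulled.
void OnHotSwapExtract(ToyMcState &mc, unsigned sensor_num, bool cleanup) {
    if (cleanup)
        mc.known_sensors.erase(sensor_num);
}

// Runs an insert / extract / re-insert cycle and reports how many times
// sub-board discovery was triggered.
int CountDiscoveries(bool cleanup_on_extract) {
    ToyMcState mc;
    OnHotSwapInsert(mc, 1);
    OnHotSwapExtract(mc, 1, cleanup_on_extract);
    OnHotSwapInsert(mc, 1);
    return mc.subboard_discoveries;
}
```

In this model, `CountDiscoveries(false)` triggers discovery only for the first insertion, whereas `CountDiscoveries(true)` triggers it again on re-insertion. This is why fixing issue 1 alone is not sufficient.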

Original comment by: makeinu

openhpi2 commented 16 years ago

Logged In: YES user_id=364243 Originator: NO

Although it works fine in my environment, I don't fully understand cIpmiMc::HandleNew(), which is called from DiscoverForSubboard(). This is one reason why I think the patch isn't perfect. Could you confirm whether it is OK to call this function from here? I believe it has no side effects, but I can't be sure because I don't yet understand the whole structure of this program.

Original comment by: makeinu

openhpi2 commented 16 years ago

Logged In: YES user_id=1884926 Originator: NO

Patch under review. I do share makeinu's concerns about the patch, that this is not a complete solution. Reviewing to see if it is a good patch as a partial solution even though it doesn't provide a complete solution.

As far as I can tell cIpmiMc::HandleNew() is a good call to make from DiscoverForSubboard(). cIpmiMc::HandleNew() looks for SDRs and enables SEL features if

I will keep posting updates as I make progress on the analysis. Sorry, being new to the plug-in, it is taking me longer to understand the patch.

Original comment by: shuah

openhpi2 commented 14 years ago

Problem still exists today. One wonders if the patch submitted will pass muster or if there's work planned on fixing it properly, in light of the fact that we're looking at roughly a year of looking at the problem with reports of the same going back nearly TWO years.

Original comment by: svartalf

openhpi2 commented 14 years ago

Pastebin URL for log dump: http://pastebin.com/yyhtHSr5

Original comment by: svartalf