The ipmidirect plugin always dumps core for a particular test case in our environment. Backtrace of the problem:
(gdb) bt
#0 0x0000005558bb17bc in cIpmiMc::FindRdr (this=0x1200c7f00, r=0x556401d930)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_mc.cpp:620
#1 0x0000005558ba370c in cIpmiDomain::VerifyRdr (this=0x12009a930, rdr=0x556401d930)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_domain.cpp:774
#2 0x0000005558bbef8c in cIpmiResource::SendCommandReadLock (this=0x5564011cf0, rdr=0x556401d930, msg=..., rsp=..., lun=0, retries=3)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_resource.cpp:94
#3 0x0000005558bccf0c in cIpmiSensor::GetSensorData (this=0x556401d930, rsp=...)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_sensor.cpp:362
#4 0x0000005558bd2100 in cIpmiSensorHotswap::GetPicmgState (this=0x556401d930, state=@0x5559c384b0)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_sensor_hotswap.cpp:319
#5 0x0000005558bc0998 in cIpmiResource::GetHpiState (this=0x5564011cf0)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_resource.cpp:487
#6 0x0000005558bc0324 in cIpmiResource::Populate (this=0x5564011cf0)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_resource.cpp:387
#7 0x0000005558bb2a34 in cIpmiMc::Populate (this=0x5564000970)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_mc.cpp:809
#8 0x0000005558b9dacc in cIpmiMcThread::Discover (this=0x1200a6530, get_device_id_rsp=0x5559c38778)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_discover.cpp:459
#9 0x0000005558b9f2c0 in cIpmiMcThread::PollAddr (this=0x1200a6530, userdata=0x0)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_discover.cpp:816
#10 0x0000005558b9c43c in cIpmiMcThread::Run (this=0x1200a6530)
at SS_OpenHPI/openhpi/plugins/ipmidirect/ipmi_discover.cpp:247
#11 0x0000005558bda4ac in cThread::Thread (param=0x1200a6530)
at SS_OpenHPI/openhpi/plugins/ipmidirect/thread.cpp:115
#12 0x0000005558812660 in start_thread () from /lib64/libpthread.so.0
#13 0x0000005558a8b56c in __thread_start () from /lib64/libc.so.6
Backtrace stopped: frame did not save the PC
(gdb) p *this
$2 = {<cArray<cIpmiResource>> = {m_array = 0x616a61612f434752, m_num = 793137236, m_size = 825191785, m_resize = 1886596973}, _vptr.cIpmiMc = 0x0,
m_vendor = 0x615f323031343035, m_addr = {m_type = 825635120, m_channel = 13109, m_lun = 50 '2', m_slave_addr = 56 '8'}, m_active = 95, m_domain =
0x2f53535f4f70656e, m_sdrs = 0x4850492f6f70656e, m_sensors_in_my_sdr = 0x6870692f6f70656e, m_sel = 0x687069642f657665, m_picmg_major = 110 'n',
m_picmg_minor = 116 't', m_device_id = 46 '.', m_device_revision = 99 'c', m_provides_device_sdrs = 58, m_device_available = 49,
m_device_support = 57 '9', m_chassis_support = 53, m_bridge_support = 58, m_ipmb_event_generator_support = 32, m_ipmb_event_receiver_support = 104,
m_fru_inventory_support = 97, m_sel_device_support = 114, m_sdr_repository_support = 118, m_sensor_device_support = 101, m_major_fw_revision = 115 's',
m_minor_fw_revision = 116 't', m_major_version = 105 'i', m_minor_version = 110 'n', m_manufacturer_id = 543584114, m_product_id = 8241,
m_aux_fw_revision = "\000\000\000", m_is_tca_mc = true, m_is_rms_board = false}
Here I can see that the vptr of the cIpmiMc object has become NULL, and that integer members such as m_num and m_size hold implausibly large values. My suspicion is that memory has been corrupted somewhere.
If valgrind were available in this environment, I could have pinpointed exactly where the memory was overwritten.
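One quick check on the corruption theory is to reinterpret the members printed by `p *this` as raw bytes. The sketch below is a minimal Python helper, not part of OpenHPI. It assumes a big-endian target with 8-byte pointers and 4-byte ints; those widths and the byte order are assumptions, chosen because the values then read as forward text. Under those assumptions the pointer members decode to fragments such as "/SS_Open", "HPI/open" and "hpid/eve", and the single-byte members that follow continue with 'n', 't', '.', 'c'. This suggests the object's memory was overwritten with (or reused for) a string containing a source file path, which fits either a buffer overrun or a use-after-free.

```python
# Values copied from the `p *this` output above: (member, value, assumed width in bytes).
# ASSUMPTION: 8-byte pointers, 4-byte ints, big-endian byte order.
FIELDS = [
    ("m_array", 0x616A61612F434752, 8),
    ("m_num", 793137236, 4),
    ("m_size", 825191785, 4),
    ("m_resize", 1886596973, 4),
    ("m_vendor", 0x615F323031343035, 8),
    ("m_domain", 0x2F53535F4F70656E, 8),
    ("m_sdrs", 0x4850492F6F70656E, 8),
    ("m_sensors_in_my_sdr", 0x6870692F6F70656E, 8),
    ("m_sel", 0x687069642F657665, 8),
]


def as_text(value, width):
    """Render an integer as the ASCII text its big-endian bytes spell out."""
    raw = value.to_bytes(width, "big")
    return raw.decode("ascii", errors="replace")


if __name__ == "__main__":
    for name, value, width in FIELDS:
        print(f"{name:22s} {as_text(value, width)!r}")
```

If the decoded text matches a string that some other code path builds or logs, that code path is a reasonable first place to look for the overwrite.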
Reported by: mukuntharajaa