cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

[ASAN] CondFormats/SiStripObjects/interface/SiStripApvGain.h #11993

Closed davidlt closed 5 years ago

davidlt commented 8 years ago

slc6_amd64_gcc493 and CMSSW_7_6_X_2015-10-19-1100.

Noticed in 1001.0 step3 (ALCA).

=================================================================
==5480==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f8f2cf91c70 at pc 0x7f8f47a8c010 bp 0x7fffbc88c0e0 sp 0x7fffbc88c0c0
READ of size 4 at 0x7f8f2cf91c70 thread T0
   #0 0x7f8f47a8c00f in SiStripApvGain::getApvGain(unsigned short, std::pair<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, __gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > > > const&) (/mnt/build/davidlt/asan2/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libCalibFormatsSiStripObjects.so+0xe000f)
   #1 0x7f8f47a8b22e in SiStripGain::getApvGain(unsigned short const&, std::pair<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, __gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > > > const&, unsigned int) const /mnt/build/davidlt/asan2/CMSSW_7_6_ASAN_X_2015-10-19-1100/src/CalibFormats/SiStripObjects/src/SiStripGain.cc:102
   #2 0x7f8f37e33fb5 in SiStripGainFromCalibTree::algoBeginRun(edm::Run const&, edm::EventSetup const&) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/pluginCalibTrackerSiStripChannelGainPlugins.so+0x79fb5)
   #3 0x7f8f37e1e00f in ConditionDBWriter<SiStripApvGain>::beginRun(edm::Run const&, edm::EventSetup const&) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/pluginCalibTrackerSiStripChannelGainPlugins.so+0x6400f)
   #4 0x7f8f657dc61e in edm::EDAnalyzer::doBeginRun(edm::RunPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const*) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x10461e)
   #5 0x7f8f65a5a271 in edm::WorkerT<edm::EDAnalyzer>::implDoBegin(edm::RunPrincipal&, edm::EventSetup const&, edm::ModuleCallingContext const*) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x382271)
   #6 0x7f8f658676b7 in decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::doWork<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::Context const*)::{lambda()#1}>(bool edm::Worker::doWork<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::Context const*)::{lambda()#1}) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x18f6b7)
   #7 0x7f8f65867b2d in bool edm::Worker::doWork<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::Context const*) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x18fb2d)
   #8 0x7f8f6586868b in void edm::GlobalSchedule::runNow<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, edm::GlobalContext const*) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x19068b)
   #9 0x7f8f65868a9a in decltype ({parm#1}()) edm::convertException::wrap<void edm::GlobalSchedule::processOneGlobal<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, bool)::{lambda()#1}>(void edm::GlobalSchedule::processOneGlobal<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, bool)::{lambda()#1}) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x190a9a)
   #10 0x7f8f65868f5c in void edm::GlobalSchedule::processOneGlobal<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, bool) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x190f5c)
   #11 0x7f8f658529b9 in edm::EventProcessor::beginRun(statemachine::Run const&) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x17a9b9)
   #12 0x7f8f657f3497 in statemachine::HandleRuns::beginRun(statemachine::Run const&) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x11b497)
   #13 0x7f8f657f3676 in statemachine::HandleRuns::setupCurrentRun() (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x11b676)
   #14 0x7f8f657f9574 in statemachine::NewRun::NewRun(boost::statechart::state<statemachine::NewRun, statemachine::HandleRuns, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x121574)
   #15 0x7f8f658181d3 in boost::statechart::state<statemachine::HandleRuns, statemachine::HandleFiles, statemachine::NewRun, (boost::statechart::history_mode)0>::deep_construct(boost::intrusive_ptr<statemachine::HandleFiles> const&, boost::statechart::state_machine<statemachine::Machine, statemachine::Starting, std::allocator<void>, boost::statechart::null_exception_translator>&) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x1401d3)
   #16 0x7f8f65818970 in boost::statechart::simple_state<statemachine::FirstFile, statemachine::HandleFiles, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x140970)
   #17 0x7f8f6586675d in boost::statechart::state_machine<statemachine::Machine, statemachine::Starting, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x18e75d)
   #18 0x7f8f6583fb29 in edm::EventProcessor::runToCompletion() (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libFWCoreFramework.so+0x167b29)
   #19 0x4a97c7 in main::{lambda()#1}::operator()() const (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/bin/slc6_amd64_gcc493/cmsRun+0x4a97c7)
   #20 0x41f2ca in main (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/bin/slc6_amd64_gcc493/cmsRun+0x41f2ca)
   #21 0x7f8f62400d5c in __libc_start_main (/lib64/libc.so.6+0x1ed5c)
   #22 0x41f7f4 (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/bin/slc6_amd64_gcc493/cmsRun+0x41f7f4)

0x7f8f2cf91c70 is located 0 bytes to the right of 287856-byte region [0x7f8f2cf4b800,0x7f8f2cf91c70)
allocated by thread T0 here:
   #0 0x474635 in operator new(unsigned long) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/bin/slc6_amd64_gcc493/cmsRun+0x474635)
   #1 0x7f8f5578b45e in std::vector<float, std::allocator<float> >::reserve(unsigned long) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libCondFormatsRunInfo.so+0x7845e)
   #2 0x7f8f55793996 in boost::archive::detail::iserializer<eos::portable_iarchive, std::vector<float, std::allocator<float> > >::load_object_data(boost::archive::detail::basic_iarchive&, void*, unsigned int) const (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/lib/slc6_amd64_gcc493/libCondFormatsRunInfo.so+0x80996)
   #3 0x7f8f55254d3e in boost::archive::detail::basic_iarchive::load_object(void*, boost::archive::detail::basic_iserializer const&) (/mnt/build/davidlt/asan2/a/slc6_amd64_gcc493/cms/cmssw/CMSSW_7_6_ASAN_X_2015-10-19-1100/external/slc6_amd64_gcc493/lib/libboost_serialization.so.1.57.0+0x33d3e)

SUMMARY: AddressSanitizer: heap-buffer-overflow ??:0 SiStripApvGain::getApvGain(unsigned short, std::pair<__gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > >, __gnu_cxx::__normal_iterator<float const*, std::vector<float, std::allocator<float> > > > const&)
Shadow bytes around the buggy address:
 0x0ff2659ea330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x0ff2659ea340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x0ff2659ea350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x0ff2659ea360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x0ff2659ea370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ff2659ea380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[fa]fa
 0x0ff2659ea390: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
 0x0ff2659ea3a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
 0x0ff2659ea3b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
 0x0ff2659ea3c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
 0x0ff2659ea3d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
 Addressable:           00
 Partially addressable: 01 02 03 04 05 06 07 
 Heap left redzone:       fa
 Heap right redzone:      fb
 Freed heap region:       fd
 Stack left redzone:      f1
 Stack mid redzone:       f2
 Stack right redzone:     f3
 Stack partial redzone:   f4
 Stack after return:      f5
 Stack use after scope:   f8
 Global redzone:          f9
 Global init order:       f6
 Poisoned by user:        f7
 Contiguous container OOB:fc
 ASan internal:           fe
==5480==ABORTING
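
As a cross-check of the report above, here is a minimal sketch of how the faulting address maps to the marked shadow byte. It assumes the default x86_64 Linux ASan mapping (shift by 3, offset 0x7fff8000); the names `kShadowOffset`, `shadowAddress` and `regionSize` are made up for illustration and are not part of ASan or CMSSW.

```cpp
#include <cstdint>

// Assumption: default ASan shadow offset on x86_64 Linux.
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

// One shadow byte covers 8 application bytes, hence the shift by 3.
constexpr std::uint64_t shadowAddress(std::uint64_t addr) {
  return (addr >> 3) + kShadowOffset;
}

// Size in bytes of the half-open region [begin, end) from the report.
constexpr std::uint64_t regionSize(std::uint64_t begin, std::uint64_t end) {
  return end - begin;
}

// The faulting address lands on the byte shown as [fa] in the dump
// (row 0x0ff2659ea380, column 14).
static_assert(shadowAddress(0x7f8f2cf91c70ULL) == 0x0ff2659ea38eULL,
              "faulting address maps to the marked shadow byte");

// The allocation really is 287856 bytes, i.e. 71964 floats of 4 bytes each.
static_assert(regionSize(0x7f8f2cf4b800ULL, 0x7f8f2cf91c70ULL) == 287856ULL,
              "region size matches the report");
```

So the read starts exactly at the first byte past the vector's storage, which is consistent with "0 bytes to the right" in the report.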

I guess the issue is in this line: https://github.com/cms-sw/cmssw/blob/9087f9b34c86c57c3b3bef7e263468cb7e8fc1f4/CondFormats/SiStripObjects/interface/SiStripApvGain.h#L77

davidlt commented 8 years ago

@ggovi

davidlt commented 8 years ago

Just code review, but this looks a bit suspicious: https://github.com/cms-sw/cmssw/blob/9087f9b34c86c57c3b3bef7e263468cb7e8fc1f4/CondFormats/SiStripObjects/interface/SiStripApvGain.h#L77 https://github.com/cms-sw/cmssw/blob/9087f9b34c86c57c3b3bef7e263468cb7e8fc1f4/CondFormats/SiStripObjects/src/SiStripApvGain.cc#L29

Especially return SiStripApvGain::Range(v_gains.end(),v_gains.end());

end() already points one past the last element of the vector, so for a Range built from end() anything like *(range.first+apv) is strictly a no-go.

One would need to run this in a debugger to verify whether this is what happens.
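
To illustrate the concern, a minimal sketch of a predicate that says whether an APV index can be dereferenced for a given range. `Range` here is a stand-in with the same shape as `SiStripApvGain::Range` (a pair of const iterators over `std::vector<float>`), and `holdsApv` is a hypothetical helper, not existing CMSSW code.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

// Stand-in for SiStripApvGain::Range: the half-open interval [first, second).
using Range = std::pair<std::vector<float>::const_iterator,
                        std::vector<float>::const_iterator>;

// Hypothetical helper: true only if range.first + apv is still inside the range.
// For Range(v.end(), v.end()) the distance is 0, so no index is ever valid.
inline bool holdsApv(std::uint16_t apv, const Range& range) {
  const std::ptrdiff_t size = std::distance(range.first, range.second);
  return static_cast<std::ptrdiff_t>(apv) < size;
}
```

With this, `holdsApv(apv, range)` is false for every `apv` when the range is empty, which is exactly the case where `*(range.first+apv)` reads past the allocation.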

davidlt commented 8 years ago

A few snippets from the debug output.

Good one:

DEBUG_ACCESS: 0x7fb36fd468b8
DEBUG_APV: 1
DEBUG_ATTEMPT: 0x7fb36fd468bc
GET_GAIN: 1.12618

Bad one:

RANGE_EMPTY: 0x7fb36fd86470 // v_gains.end()
DEBUG_ACCESS: 0x7fb36fd86470 // range.first
DEBUG_APV: 5 // apv
DEBUG_ATTEMPT: 0x7fb36fd86484 // range.first+apv
GET_GAIN: 0

Basically SiStripApvGain::Range(v_gains.end(),v_gains.end()) was returned, which is an empty range. apv is 5, so we go out of bounds by 4 x 5 = 20 bytes. We then tend to read a gain of 0 or 1.

The gain was 0 in 32,500 cases and 1 in 1,964 cases, out of 177,248 returns.
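
The 20-byte figure can be checked against the addresses printed in the debug output. A small sketch, assuming a 4-byte `float` as on this platform; `byteOffset` and `elementOffset` are names invented for the example.

```cpp
#include <cstdint>

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");

// Distance in bytes between range.first and the address actually read.
constexpr std::uint64_t byteOffset(std::uint64_t first, std::uint64_t attempt) {
  return attempt - first;
}

// The same distance expressed in float elements, i.e. the apv index used.
constexpr std::uint64_t elementOffset(std::uint64_t first, std::uint64_t attempt) {
  return byteOffset(first, attempt) / sizeof(float);
}

// Bad case: range.first == v_gains.end(), apv == 5.
static_assert(byteOffset(0x7fb36fd86470ULL, 0x7fb36fd86484ULL) == 20, "20 bytes past end()");
static_assert(elementOffset(0x7fb36fd86470ULL, 0x7fb36fd86484ULL) == 5, "matches DEBUG_APV: 5");

// Good case: apv == 1, one float past range.first.
static_assert(elementOffset(0x7fb36fd468b8ULL, 0x7fb36fd468bcULL) == 1, "matches DEBUG_APV: 1");
```

In the bad case RANGE_EMPTY and DEBUG_ACCESS print the same address, so the whole 20-byte offset lies beyond the end of v_gains.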

Patch I used:

diff --git a/CalibFormats/SiStripObjects/src/SiStripGain.cc b/CalibFormats/SiStripObjects/src/SiStripGain.cc
index 9115499..b067e63 100644
--- a/CalibFormats/SiStripObjects/src/SiStripGain.cc
+++ b/CalibFormats/SiStripObjects/src/SiStripGain.cc
@@ -99,6 +99,7 @@ float SiStripGain::getStripGain(const uint16_t& strip, const SiStripApvGain::Ran
 float SiStripGain::getApvGain(const uint16_t& apv, const SiStripApvGain::Range& range, const uint32_t index) const
 {
   if( !(apvgainVector_.empty()) ) {
+    std::cout << "GET_GAIN: " << apvgainVector_[index]->getApvGain(apv, range) << std::endl;
     return (apvgainVector_[index]->getApvGain(apv, range))/(normVector_[index]);
   }
   edm::LogError("SiStripGain::getApvGain") << "ERROR: no gain available. Returning gain = 1." << std::endl;
diff --git a/CondFormats/SiStripObjects/interface/SiStripApvGain.h b/CondFormats/SiStripObjects/interface/SiStripApvGain.h
index c3abeb6..10703a4 100644
--- a/CondFormats/SiStripObjects/interface/SiStripApvGain.h
+++ b/CondFormats/SiStripObjects/interface/SiStripApvGain.h
@@ -74,7 +74,13 @@ class SiStripApvGain {
   static float   getApvGain  (const uint16_t& apv, const Range& range);
 #else
   static float   getStripGain (uint16_t strip, const Range& range)  {uint16_t apv = strip/128; return *(range.first+apv);}
-  static float   getApvGain   (uint16_t apv, const Range& range) {return *(range.first+apv);}
+  static float   getApvGain   (uint16_t apv, const Range& range) {
+  //std::terminate();
+  std::cout << "DEBUG_ACCESS: " << std::hex << std::addressof(*range.first) << std::dec << std::endl;
+  std::cout << "DEBUG_APV: " << std::dec << apv << std::endl;
+  std::cout << "DEBUG_ATTEMPT: " << std::hex << std::addressof(*(range.first+apv)) << std::dec << std::endl;
+    return *(range.first+apv);
+  }
 #endif

diff --git a/CondFormats/SiStripObjects/src/SiStripApvGain.cc b/CondFormats/SiStripObjects/src/SiStripApvGain.cc
index 446211f..07acda3 100644
--- a/CondFormats/SiStripObjects/src/SiStripApvGain.cc
+++ b/CondFormats/SiStripObjects/src/SiStripApvGain.cc
@@ -29,8 +29,10 @@ bool SiStripApvGain::put(const uint32_t& DetId, Range input) {
 const SiStripApvGain::Range SiStripApvGain::getRange(const uint32_t DetId) const {
   // get SiStripApvGain Range of DetId
   RegistryConstIterator p = std::lower_bound(v_detids.begin(),v_detids.end(),DetId);
-  if (p==v_detids.end() || *p!=DetId)
-    return SiStripApvGain::Range(v_gains.end(),v_gains.end());
+  if (p==v_detids.end() || *p!=DetId) {
+    std::cout << "RANGE_EMPTY: " << std::hex << std::addressof(*v_gains.end()) << std::dec << std::endl;
+    return SiStripApvGain::Range(v_gains.end(),v_gains.end());
+  }
   else{
     unsigned int pd= p-v_detids.begin();
     unsigned int ibegin = *(v_ibegin.begin()+pd);

davidlt commented 8 years ago

@VinInn I guess, this could be interesting to you. I see you have modified relevant parts of these files.

VinInn commented 8 years ago

So somewhere there is a need to verify that the range is not empty. Actually, WHY is an empty range returned? Notice this is "ConditionDBWriter", not reconstruction, so if the problem exists, it is in the client code. I deal only with the HLT and reconstruction use cases, which need to be highly optimized; the "ConditionDBWriter" can put checks at every second instruction. The problem is most probably in SiStripGainFromCalibTree; maybe it has to use a lower-level interface without the range optimization (which is critical for HLT and reco, as described in the pull request or JIRA somewhere).
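
One possible shape for such a client-side check, as a rough sketch only: the fast accessor stays unchecked for HLT/reco, and a writer-side wrapper guards against an empty or too-short range. `Range`, `fastApvGain` and `gainOrDefault` are illustrative stand-ins, and the fallback value is whatever the caller chooses; none of this is existing CMSSW code.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

// Stand-in for SiStripApvGain::Range.
using Range = std::pair<std::vector<float>::const_iterator,
                        std::vector<float>::const_iterator>;

// Unchecked fast path, same shape as the accessor quoted in the patch.
// The caller must guarantee that apv is inside the range.
inline float fastApvGain(std::uint16_t apv, const Range& range) {
  return *(range.first + apv);
}

// Hypothetical wrapper for non-critical clients (e.g. a DB writer):
// returns the fallback instead of reading outside the range.
inline float gainOrDefault(std::uint16_t apv, const Range& range, float fallback) {
  const std::ptrdiff_t size = std::distance(range.first, range.second);
  if (static_cast<std::ptrdiff_t>(apv) >= size) {
    return fallback;  // unknown DetId (empty range) or APV index too large
  }
  return fastApvGain(apv, range);
}
```

This keeps the cost of the check out of the hot path. Whether a fallback is the right behaviour for SiStripGainFromCalibTree, as opposed to skipping the module or raising an error, is a separate question this sketch does not answer.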

davidlt commented 8 years ago

@quertenmont SiStripGainFromCalibTree::algoBeginRun was added by your commit dcd706f7474a70abc6a83cb81f60b66fe20e256c not so long ago (July). PR #10680

@mmusich

mmusich commented 8 years ago

@diguida might want to watch as well

smuzaffar commented 5 years ago

Closing this, as we no longer see this error in ASAN IBs.

cmsbuild commented 5 years ago

A new Issue was created by @davidlt .

@davidlange6, @Dr15Jones, @smuzaffar, @fabiocos, @kpedro88 can you please review it and eventually sign/assign? Thanks.
