cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

[ASAN_X] stack-buffer-overflow in emtf::phase2::algo::RoadSortingLayer::apply #45469

Closed iarspider closed 3 weeks ago

iarspider commented 1 month ago

In CMSSW_14_1_ASAN_X_2024-07-15-2300, several RelVals failed with stack-buffer-overflow in emtf::phase2::algo::RoadSortingLayer::apply:

=================================================================
==1136493==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7f08a5e205e0 at pc 0x7f084c8aae51 bp 0x7f08a5e1f360 sp 0x7f08a5e1f358
READ of size 1 at 0x7f08a5e205e0 thread T2
%MSG-w GEMClusterProcessor:   CSCTriggerPrimitivesProducer:simCscTriggerPrimitiveDigis  16-Jul-2024 12:17:45 CEST Run: 1 Event: 105
Encountered unphysical GEM pads when making a single cluster, resetting cluster to empty.
%MSG
%MSG-w GEMClusterProcessor:   CSCTriggerPrimitivesProducer:simCscTriggerPrimitiveDigis  16-Jul-2024 12:17:45 CEST Run: 1 Event: 105
Encountered unphysical GEM pads when making a single cluster, resetting cluster to empty.
%MSG
%MSG-w GEMClusterProcessor:   CSCTriggerPrimitivesProducer:simCscTriggerPrimitiveDigis  16-Jul-2024 12:17:45 CEST Run: 1 Event: 105
Encountered unphysical GEM pads when making a single cluster, resetting cluster to empty.
%MSG
%MSG-w GEMClusterProcessor:   CSCTriggerPrimitivesProducer:simCscTriggerPrimitiveDigis  16-Jul-2024 12:17:45 CEST Run: 1 Event: 105
Encountered unphysical GEM pads when making a single cluster, resetting cluster to empty.
%MSG
    #0 0x7f084c8aae50 in ap_private<2, false, true>::get_VAL() const /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02846/el8_amd64_gcc12/external/hls/2019.08-0e37f055a3ed22611ce5edecb14d0695/include/etc/ap_private.h:1368
    #1 0x7f084c8aae50 in ap_private<2, false, true>::operator=(ap_private<2, false, true> const&) /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02846/el8_amd64_gcc12/external/hls/2019.08-0e37f055a3ed22611ce5edecb14d0695/include/etc/ap_private.h:1409
    #2 0x7f084c8aae50 in ap_uint<2>::operator=(ap_uint<2> const&) /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02846/el8_amd64_gcc12/external/hls/2019.08-0e37f055a3ed22611ce5edecb14d0695/include/ap_int.h:281
    #3 0x7f084c8aae50 in emtf::phase2::road_t::operator=(emtf::phase2::road_t const&) src/L1Trigger/L1TMuonEndCapPhase2/interface/EMTFTypes.h:149
    #4 0x7f084c8aae50 in emtf::phase2::algo::RoadSortingLayer::apply(unsigned int const&, std::vector<std::array<emtf::phase2::road_t, 288ul>, std::allocator<std::array<emtf::phase2::road_t, 288ul> > > const&, std::vector<emtf::phase2::road_t, std::allocator<emtf::phase2::road_t> >&) const src/L1Trigger/L1TMuonEndCapPhase2/src/Algo/RoadSortingLayer.cc:102
    #5 0x7f084ca4ab83 in emtf::phase2::SectorProcessor::buildTracks(std::map<int, int, std::less<int>, std::allocator<std::pair<int const, int> > > const&, std::array<emtf::phase2::segment_t, 230ul> const&, bool const&, std::vector<l1t::phase2::EMTFTrack, std::allocator<l1t::phase2::EMTFTrack> >&) src/L1Trigger/L1TMuonEndCapPhase2/src/SectorProcessor.cc:401
    #6 0x7f084ca51e33 in emtf::phase2::SectorProcessor::process(std::vector<l1t::phase2::EMTFHit, std::allocator<l1t::phase2::EMTFHit> >&, std::vector<l1t::phase2::EMTFTrack, std::allocator<l1t::phase2::EMTFTrack> >&, std::vector<l1t::phase2::EMTFInput, std::allocator<l1t::phase2::EMTFInput> >&) src/L1Trigger/L1TMuonEndCapPhase2/src/SectorProcessor.cc:166
    #7 0x7f084ca6d107 in emtf::phase2::TrackFinder::process(edm::Event const&, edm::EventSetup const&, std::vector<l1t::phase2::EMTFHit, std::allocator<l1t::phase2::EMTFHit> >&, std::vector<l1t::phase2::EMTFTrack, std::allocator<l1t::phase2::EMTFTrack> >&, std::vector<l1t::phase2::EMTFInput, std::allocator<l1t::phase2::EMTFInput> >&) src/L1Trigger/L1TMuonEndCapPhase2/src/TrackFinder.cc:202
    #8 0x7f084894d89c in L1TMuonEndCapPhase2TrackProducer::produce(edm::Event&, edm::EventSetup const&) src/L1Trigger/L1TMuonEndCapPhase2/plugins/L1TMuonEndCapPhase2TrackProducer.cc:66
    #9 0x7f0909a9ed8b in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_ASAN_X_2024-07-15-2300/lib/el8_amd64_gcc12/libFWCoreFramework.so+0xabad8b)
    #10 0x7f09099fbb68 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) (/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_ASAN_X_2024-07-15-2300/lib/el8_amd64_gcc12/libFWCoreFramework.so+0xa17b68)
    #11 0x7f0909639887 in decltype ({parm#1}()) edm::convertException::wrap<edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) (/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_ASAN_X_2024-07-15-2300/lib/el8_amd64_gcc12/libFWCoreFramework.so+0x655887)
    #12 0x7f0909639f69 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) (/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_ASAN_X_2024-07-15-2300/lib/el8_amd64_gcc12/libFWCoreFramework.so+0x655f69)
    #13 0x7f090964718d in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() (/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_ASAN_X_2024-07-15-2300/lib/el8_amd64_gcc12/libFWCoreFramework.so+0x66318d)
    #14 0x7f090a39c48e in tbb::detail::d1::function_task<edm::WaitingTaskList::announce()::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) (/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_ASAN_X_2024-07-15-2300/lib/el8_amd64_gcc12/libFWCoreConcurrency.so+0x1548e)
    #15 0x7f0906dc7b3a in tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/task_dispatcher.h:322
    #16 0x7f0906dc7b3a in tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/task_dispatcher.h:458
    #17 0x7f0906dc7b3a in tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/arena.cpp:137
    #18 0x7f0906dc7b3a in tbb::detail::r1::market::process(rml::job&) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/market.cpp:599
    #19 0x7f0906dc9ced in tbb::detail::r1::rml::private_worker::run() /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/private_server.cpp:271
    #20 0x7f0906dc9ced in tbb::detail::r1::rml::private_worker::thread_routine(void*) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/private_server.cpp:221
    #21 0x7f0905f161c9 in start_thread (/lib64/libpthread.so.0+0x81c9)
    #22 0x7f0905b718d2 in __GI___clone (/lib64/libc.so.6+0x398d2)

Address 0x7f08a5e205e0 is located in stack of thread T2 at offset 4384 in frame
    #0 0x7f084c8a128f in emtf::phase2::algo::RoadSortingLayer::apply(unsigned int const&, std::vector<std::array<emtf::phase2::road_t, 288ul>, std::allocator<std::array<emtf::phase2::road_t, 288ul> > > const&, std::vector<emtf::phase2::road_t, std::allocator<emtf::phase2::road_t> >&) const src/L1Trigger/L1TMuonEndCapPhase2/src/Algo/RoadSortingLayer.cc:15

  This frame has 126 object(s):
    [48, 49) '<unknown>'
    [64, 65) '<unknown>'
    [80, 81) '<unknown>'
    [96, 97) '<unknown>'
    [112, 113) '<unknown>'
    [128, 129) '<unknown>'
    [144, 145) '<unknown>'
    [160, 161) 'lhs'
    [176, 177) 'rhs'
    [192, 193) 'lhs'
    [208, 209) 'rhs'
    [224, 225) '<unknown>'
    [240, 241) '<unknown>'
    [256, 257) '<unknown>'
    [272, 273) '<unknown>'
    [288, 289) 'lhs'
    [304, 305) 'rhs'
    [320, 321) 'lhs'
    [336, 337) 'rhs'
    [352, 353) 'lhs'
    [368, 369) 'rhs'
    [384, 385) 'lhs'
    [400, 401) 'rhs'
    [416, 418) '<unknown>'
    [432, 434) '<unknown>'
    [448, 452) 'i_zone' (line 19)
    [464, 468) 'i_col' (line 29)
    [480, 484) '<unknown>'
    [496, 500) '<unknown>'
    [512, 516) 'i_road' (line 147)
    [528, 532) '<unknown>'
    [544, 548) '<unknown>'
    [560, 564) '<unknown>'
    [576, 580) 'step'
    [592, 596) 'block_end'
    [608, 612) '<unknown>'
    [624, 628) '<unknown>'
    [640, 644) 'step'
    [656, 660) 'block_end'
    [672, 676) '<unknown>'
    [688, 692) '<unknown>'
    [704, 708) 'lhs'
    [720, 724) 'rhs'
    [736, 740) '<unknown>'
    [752, 756) 'step'
    [768, 772) 'block_end'
    [784, 788) '<unknown>'
    [800, 804) '<unknown>'
    [816, 820) 'step'
    [832, 836) 'block_end'
    [848, 852) '<unknown>'
    [864, 868) '<unknown>'
    [880, 884) '<unknown>'
    [896, 904) '__for_begin' (line 118)
    [928, 936) '__for_end' (line 118)
    [960, 968) '<unknown>'
    [992, 1000) '<unknown>'
    [1024, 1032) 'lhs'
    [1056, 1064) 'rhs'
    [1088, 1096) 'lhs'
    [1120, 1128) 'rhs'
    [1152, 1160) 'lhs'
    [1184, 1192) 'rhs'
    [1216, 1224) '<unknown>'
    [1248, 1256) '<unknown>'
    [1280, 1288) '<unknown>'
    [1312, 1320) '<unknown>'
    [1344, 1352) '<unknown>'
    [1376, 1384) '<unknown>'
    [1408, 1416) '<unknown>'
    [1440, 1448) '<unknown>'
    [1472, 1480) '<unknown>'
    [1504, 1512) '<unknown>'
    [1536, 1544) '<unknown>'
    [1568, 1576) '<unknown>'
    [1600, 1608) '<unknown>'
    [1632, 1640) '<unknown>'
    [1664, 1672) '<unknown>'
    [1696, 1704) '<unknown>'
    [1728, 1736) '<unknown>'
    [1760, 1768) '<unknown>'
    [1792, 1800) '<unknown>'
    [1824, 1832) '<unknown>'
    [1856, 1864) '<unknown>'
    [1888, 1896) '<unknown>'
    [1920, 1928) '<unknown>'
    [1952, 1960) '<unknown>'
    [1984, 1992) '<unknown>'
    [2016, 2024) '<unknown>'
    [2048, 2056) '<unknown>'
    [2080, 2088) '<unknown>'
    [2112, 2120) '<unknown>'
    [2144, 2152) '<unknown>'
    [2176, 2184) '<unknown>'
    [2208, 2216) '<unknown>'
    [2240, 2248) '<unknown>'
    [2272, 2280) '<unknown>'
    [2304, 2312) '<unknown>'
    [2336, 2344) '<unknown>'
    [2368, 2376) '<unknown>'
    [2400, 2408) '<unknown>'
    [2432, 2440) 'lhs'
    [2464, 2472) 'rhs'
    [2496, 2512) '<unknown>'
    [2528, 2544) '<unknown>'
    [2560, 2576) '<unknown>'
    [2592, 2608) '<unknown>'
    [2624, 2640) '<unknown>'
    [2656, 2672) '<unknown>'
    [2688, 2704) '<unknown>'
    [2720, 2736) '<unknown>'
    [2752, 2768) '<unknown>'
    [2784, 2800) '<unknown>'
    [2816, 2840) 'top_roads' (line 17)
    [2880, 2912) '<unknown>'
    [2944, 2976) '<unknown>'
    [3008, 3040) '<unknown>'
    [3072, 3104) '<unknown>'
    [3136, 3168) '<unknown>'
    [3200, 3232) '<unknown>'
    [3264, 3296) '<unknown>'
    [3328, 3360) '<unknown>'
    [3392, 3424) '<unknown>'
    [3456, 3488) '<unknown>'
    [3520, 4384) 'roads_kept' (line 72) <== Memory access at offset 4384 overflows this variable
    [4512, 6240) 'suppressed_roads' (line 24)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T2 created by T0 here:
    #0 0x7f0909ca4136 in __interceptor_pthread_create ../../../../libsanitizer/asan/asan_interceptors.cpp:207
    #1 0x7f0906dc934f in tbb::detail::r1::rml::internal::thread_monitor::launch(void* (*)(void*), void*, unsigned long) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/rml_thread_monitor.h:208
    #2 0x7f0906dc934f in tbb::detail::r1::rml::private_worker::wake_or_launch() /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/private_server.cpp:305
    #3 0x7f0906dc934f in tbb::detail::r1::rml::private_server::wake_some(int) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-df2a8868ba5d9111864e1a63085d7a4b/tbb-v2021.9.0/src/tbb/private_server.cpp:412

SUMMARY: AddressSanitizer: stack-buffer-overflow /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02846/el8_amd64_gcc12/external/hls/2019.08-0e37f055a3ed22611ce5edecb14d0695/include/etc/ap_private.h:1368 in ap_private<2, false, true>::get_VAL() const
Shadow bytes around the buggy address:
  0x0fe194bbc060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe194bbc070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe194bbc080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe194bbc090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe194bbc0a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0fe194bbc0b0: 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2 f2
  0x0fe194bbc0c0: f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 00 00 00 00
  0x0fe194bbc0d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe194bbc0e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe194bbc0f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe194bbc100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1136493==ABORTING

(example log: link)
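
For readers less familiar with ASan reports, the following is a small sketch of how the numbers above fit together. It assumes the default x86_64 Linux shadow mapping (shift by 3, offset `0x7fff8000`); the names `kShadowOffset` and `shadow_address` are illustrative and are not part of ASan's API.

```cpp
#include <cstdint>

// Default ASan shadow offset on x86_64 Linux (assumed to apply to this build).
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

// One shadow byte describes 8 application bytes, hence the shift by 3.
constexpr std::uint64_t shadow_address(std::uint64_t addr) {
  return (addr >> 3) + kShadowOffset;
}

// The faulting address maps to the bracketed [f2] byte, which is the
// 13th byte (index 12) of the shadow row starting at 0x0fe194bbc0b0.
static_assert(shadow_address(0x7f08a5e205e0ULL) == 0x0fe194bbc0b0ULL + 12,
              "faulting address should map to the marked shadow byte");

// 'roads_kept' occupies frame offsets [3520, 4384), i.e. 864 bytes, so a
// read at offset 4384 is the first byte past the end of the array.
constexpr int kRoadsKeptBegin = 3520;
constexpr int kRoadsKeptEnd = 4384;
static_assert(kRoadsKeptEnd - kRoadsKeptBegin == 864, "size of roads_kept in bytes");
```

The `f2` value is the "stack mid redzone" from the legend, which is consistent with a read that runs just past `roads_kept` and into the padding before `suppressed_roads`.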

iarspider commented 1 month ago

assign L1Trigger/L1TMuonEndCapPhase2

cmsbuild commented 1 month ago

New categories assigned: l1,upgrade

@epalencia,@aloeliger,@srimanob,@subirsarkar you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild commented 1 month ago

cms-bot internal usage

cmsbuild commented 1 month ago

A new Issue was created by @iarspider.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

iarspider commented 1 month ago

Failing line: https://github.com/cms-sw/cmssw/blob/master/L1Trigger/L1TMuonEndCapPhase2/src/Algo/RoadSortingLayer.cc#L101

iarspider commented 1 month ago

@cms-sw/l1-l2 @cms-sw/upgrade-l2 From the comment on line 99 I think the loop should be

    for (unsigned int i = 0; i < keep_n_roads - 16; ++i) {
      roads_kept[i] = roads_kept[i + 16];
    }

The current version of the loop clearly accesses the array past its bounds.
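
As a minimal, self-contained sketch of the bound in question (not the CMSSW code itself): `Road` is a hypothetical stand-in for `emtf::phase2::road_t`, and `shift_left` is an illustrative helper. The point is only that the read index `i + shift` has to stay below the array size, whatever start index is used.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for emtf::phase2::road_t.
struct Road {
  int quality = 0;
};

// Copy roads[i + shift] into roads[i] for i = first, first + 1, ...
// The condition i + shift < N keeps the read index inside the array,
// so no element past the end is ever touched.
template <std::size_t N>
void shift_left(std::array<Road, N>& roads, std::size_t first, std::size_t shift) {
  assert(shift <= N);
  for (std::size_t i = first; i + shift < N; ++i) {
    roads[i] = roads[i + shift];
  }
}
```

With an 8-element array, `first = 0` and `shift = 2`, slots 0 to 5 receive the old contents of slots 2 to 7 and the last two slots are left unchanged. `first` is a parameter because the correct start index depends on the intended algorithm, which this sketch does not assume.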

iarspider commented 1 month ago

@cms-sw/l1-l2 @cms-sw/upgrade-l2 gentle ping

aloeliger commented 1 month ago

@omiguelc I think you're the contact I have on muon endcap.

omiguelc commented 1 month ago

@aloeliger You're right. That is a bug. Should I open a PR into master?

omiguelc commented 1 month ago

This should fix it:

    for (unsigned int i = 16; i < (keep_n_roads - 16); ++i) {
      roads_kept[i] = roads_kept[i + 16];
    }

makortel commented 1 month ago

Should I open a PR into master?

Yes, please. Thanks!

omiguelc commented 1 month ago

Done, here's the PR: https://github.com/cms-sw/cmssw/pull/45581

iarspider commented 3 weeks ago

@cmsbuild please close