czangela closed this issue 3 years ago
A new Issue was created by @czangela .
@Dr15Jones, @perrotta, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
assign reconstruction, heterogeneous
FYI @cms-sw/trk-dpg-l2 @VinInn
New categories assigned: heterogeneous,reconstruction
@slava77,@fwyzard,@perrotta,@makortel,@jpata you have been requested to review this Pull request/Issue and eventually sign? Thanks
Interesting, as it WAS protected for 0 == nHits_, but just by eye it now looks buggy (check acquire and then produce), so it is obvious that produce will crash.
With this:
diff --git a/RecoLocalTracker/SiPixelRecHits/plugins/SiPixelRecHitFromCUDA.cc b/RecoLocalTracker/SiPixelRecHits/plugins/SiPixelRecHitFromCUDA.cc
index f2f1497b4ba..5861a0be734 100644
--- a/RecoLocalTracker/SiPixelRecHits/plugins/SiPixelRecHitFromCUDA.cc
+++ b/RecoLocalTracker/SiPixelRecHits/plugins/SiPixelRecHitFromCUDA.cc
@@ -74,7 +74,7 @@ void SiPixelRecHitFromCUDA::acquire(edm::Event const& iEvent,
nHits_ = inputData.nHits();
- LogDebug("SiPixelRecHitFromCUDA") << "converting " << nHits_ << " Hits";
+ LogDebug("SiPixelRecHitFromCUDA") << "converting " << nHits_ << " Hits" << std::endl;
if (0 == nHits_)
return;
@@ -83,18 +83,21 @@ void SiPixelRecHitFromCUDA::acquire(edm::Event const& iEvent,
}
void SiPixelRecHitFromCUDA::produce(edm::Event& iEvent, edm::EventSetup const& es) {
+
// allocate a buffer for the indices of the clusters
auto hmsp = std::make_unique<uint32_t[]>(gpuClustering::maxNumModules + 1);
- std::copy(hitsModuleStart_.get(), hitsModuleStart_.get() + gpuClustering::maxNumModules + 1, hmsp.get());
- // wrap the buffer in a HostProduct, and move it to the Event, without reallocating the buffer or affecting hitsModuleStart
- iEvent.emplace(hostPutToken_, std::move(hmsp));
SiPixelRecHitCollection output;
+ output.reserve(gpuClustering::maxNumModules, nHits_);
if (0 == nHits_) {
iEvent.emplace(rechitsPutToken_, std::move(output));
+ iEvent.emplace(hostPutToken_, std::move(hmsp));
return;
}
- output.reserve(gpuClustering::maxNumModules, nHits_);
+
+ std::copy(hitsModuleStart_.get(), hitsModuleStart_.get() + gpuClustering::maxNumModules + 1, hmsp.get());
+ // wrap the buffer in a HostProduct, and move it to the Event, without reallocating the buffer or affecting hitsModuleStart
+ iEvent.emplace(hostPutToken_, std::move(hmsp));
auto xl = store32_.get();
auto yl = xl + nHits_;
it runs for me. Acceptable?
One can leave the reserve after the return (very, very minor). I think it should be OK.
Still, it was tested (long, long ago) with nHits_ == 0, and the history has been erased by the file renaming, so there is no way to understand when and why it was changed.
OK, found a version in CMSSW_11_1_0_pre8_Patatrack, when the file was named SiPixelRecHitFromSOA.cc, and the code is the same. No clue "how" it was tested then... (maybe before introducing "hmsp")
maybe easier to just protect the copy
if(hitsModuleStart_) std::copy(hitsModuleStart_.get(), hitsModuleStart_.get() + gpuClustering::maxNumModules + 1, hmsp.get());
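As a rough standalone illustration of the guarded copy suggested above (not the actual CMSSW code): the sketch below assumes a hypothetical constant kMaxNumModules in place of gpuClustering::maxNumModules and a hypothetical helper copyModuleStart. It relies on the fact that std::make_unique for arrays value-initializes its elements, so when the source pointer is null (as after an early return from acquire) the output buffer simply stays all zeros.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for gpuClustering::maxNumModules; the real value is defined in CMSSW.
constexpr std::size_t kMaxNumModules = 2000;

// Hypothetical helper: copy the module-start indices into a fresh host buffer.
// `source` may be null, e.g. when acquire() returned early because there were no hits.
std::unique_ptr<std::uint32_t[]> copyModuleStart(const std::unique_ptr<std::uint32_t[]>& source) {
  // make_unique for arrays value-initializes, so every element starts at 0.
  auto buffer = std::make_unique<std::uint32_t[]>(kMaxNumModules + 1);
  // Only dereference the source when it was actually filled.
  if (source) {
    std::copy(source.get(), source.get() + kMaxNumModules + 1, buffer.get());
  }
  return buffer;
}
```

With this shape the product is always emitted with a valid buffer, whether or not any hits were found, which is the same end result as emplacing hmsp in both branches of the diff.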
+heterogeneous
PRs to master and 11_3_X have been merged
+reconstruction
This issue is fully signed and ready to be closed.
1. Description
Similar to #34197.
The idea here was to remove all FED channels above 1199 [*], and run the reconstruction on this skimmed raw data.
This was run on the release CMSSW_12_0_0_pre3, on the machine cmg-gpu1080.cern.ch.
[*] where 1200 is the minimum FED number for the silicon pixel detector.
2. Crash
The reconstruction crashes with a segmentation fault.
Full log: crash.log
3. Reproduce - Short version
From https://aczirkos.web.cern.ch/aczirkos/pixel_crash_test/ run on the provided dataset:
4. Reproduce - Long version
0. SSH to GPU equipped machine
Don't forget to be nice and
Where P, Q, etc. are the numbers of the visible GPUs, which you can view with nvidia-smi.
Init release area CMSSW_12_0_0_pre3.
1. Generate configs and run
Use the pixelTrackingOnly workflow:
136.885502_RunHLTPhy2018D+RunHLTPhy2018D+HLTDR2_2018+RECODR2_2018reHLT_Patatrack_PixelOnlyGPU+HARVEST2018_pixelTrackingOnly
2. Modify step3_RAW2DIGI_RECO_DQM.py
-> add a new module named rawDataCollector to the beginning of the Schedule -> modules after this will see and use this collection
Path, sequence, task definition:
Add to schedule
3. sed and replace rawDataCollector InputTags
run again