Closed: missirol closed this issue 7 months ago
A new Issue was created by @missirol.
@rappoccio, @makortel, @antoniovilela, @Dr15Jones, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
assign heterogeneous, reconstruction, hlt
@cms-sw/trk-dpg-l2 FYI
New categories assigned: heterogeneous,reconstruction,hlt
@Martin-Grunewald,@mmusich,@fwyzard,@jfernan2,@makortel,@mandrenguyen you have been requested to review this Pull request/Issue and eventually sign? Thanks
The assertion fails because in RecoLocalTracker/SiPixelClusterizer/plugins/alpaka/PixelClustering.h we end up with
hist.size(): 4529
block elements: 256
which require 17 iterations, while maxIterGPU is 16, so the check 17 < 16 fails.
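For illustration, the arithmetic behind the failing check can be sketched as below. This is a minimal sketch and not the code in PixelClustering.h: the helper names `stridesNeeded` and `withinIterationLimit` are hypothetical, and the use of integer division is an assumption chosen because it reproduces the 17 quoted above.

```cpp
#include <cstdint>

// Iteration limit quoted in the comment above.
constexpr std::uint32_t maxIterGPU = 16;

// Hypothetical helper: number of block-sized strides, computed with integer
// division (assumption: this matches how the 17 above was obtained).
constexpr std::uint32_t stridesNeeded(std::uint32_t histSize, std::uint32_t blockElements) {
  return histSize / blockElements;
}

// Hypothetical stand-in for the condition that the assertion checks.
constexpr bool withinIterationLimit(std::uint32_t histSize, std::uint32_t blockElements) {
  return stridesNeeded(histSize, blockElements) < maxIterGPU;
}

// With the values printed above: 4529 / 256 = 17, and 17 < 16 is false.
static_assert(stridesNeeded(4529, 256) == 17);
static_assert(!withinIterationLimit(4529, 256));
```

The static_asserts are evaluated at compile time, so the snippet compiles cleanly only if the numbers are consistent with the description above.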
In the CUDA version the block size is 384 (to accommodate TrackerTraits::maxPixInModule,
which is 6000, divided by 16 and rounded up to a multiple of 128).
In the alpaka version the block size is 256 (which seems arbitrary).
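As a rough check of the numbers above, the block size can be derived as follows. This is only a sketch: `divideRoundUp`, `roundUpToMultiple` and the constant names are hypothetical and do not come from the CMSSW sources; only the values 6000, 16, 128, 384 and 256 are taken from this thread.

```cpp
#include <cstdint>

// Values quoted in the comment above.
constexpr std::uint32_t maxPixInModule = 6000;
constexpr std::uint32_t maxIterations = 16;
constexpr std::uint32_t granularity = 128;

// Hypothetical helpers for rounding arithmetic.
constexpr std::uint32_t divideRoundUp(std::uint32_t a, std::uint32_t b) {
  return (a + b - 1) / b;
}
constexpr std::uint32_t roundUpToMultiple(std::uint32_t value, std::uint32_t multiple) {
  return divideRoundUp(value, multiple) * multiple;
}

// 6000 / 16 = 375, rounded up to the next multiple of 128 gives 384.
constexpr std::uint32_t blockSize =
    roundUpToMultiple(divideRoundUp(maxPixInModule, maxIterations), granularity);
static_assert(blockSize == 384);

// 16 iterations of 384 elements cover 6000 pixels; 16 iterations of 256 do not.
static_assert(maxIterations * 384 >= maxPixInModule);
static_assert(maxIterations * 256 < maxPixInModule);
```

Under these assumptions, 16 iterations with a block of 256 elements cover at most 4096 entries, which is below both the 6000 limit and the 4529 entries seen in the failing job.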
Should be fixed by #44081 and #44082.
+hlt
explicitly tested with:
cmsrel CMSSW_14_0_0
cd CMSSW_14_0_0/src/
cmsenv
git cms-merge-topic 44082
scram b -j 20
and then following the recipe at https://github.com/cms-sw/cmssw/issues/44077#issue-2152647187
+heterogeneous
Running a recent HLT menu with customizeHLTforAlpaka in CMSSW_14_0_0, as in [1], leads to a runtime error.
The full stack trace from running [1] can be found in pixel_findclus_cpu.log. Note that [1] forces the job to run on CPU only.
A similar crash also occurs on GPU (stack trace in pixel_findclus_gpu.log), but a GPU is not needed to reproduce the issue.
There is no runtime error if the Alpaka customisation is not used.
Could experts please have a look?
FYI: @AdrianoDee @borzari @fwyzard @cms-sw/hlt-l2
[1]
PS. Just for my own reference, I encountered this crash while testing a recent HLT menu in 14_0_X on one of the HiLTON nodes as described here.