Closed AdrianoDee closed 1 week ago
type bug-fix
cms-bot internal usage
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46686/42626
A new Pull Request was created by @AdrianoDee for master.
It involves the following packages:
@cmsbuild, @jfernan2, @mandrenguyen can you please review it and eventually sign? Thanks. @GiacomoSguazzoni, @VinInn, @VourMa, @dkotlins, @felicepantaleo, @ferencek, @gpetruc, @missirol, @mmusich, @mroguljic, @mtosi, @rovere, @threus, @tsusa, @tvami this is something you requested to watch as well. @antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here
enable gpu
test parameters:
please test
urgent
please abort
test parameters:
please test
-1
Failed Tests: RelVals-GPU
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-46ae81/42791/summary.html
COMMIT: 2c3f90d5beee66b2ba9f6efb796b89680d85b32b
CMSSW: CMSSW_14_2_X_2024-11-12-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/46686/42791/install.sh
to create a dev area with all the needed externals and cmssw changes.
14949.402_HydjetQMinBias_5020GeV+2022HI_Patatrack_PixelOnlyAlpaka/step3_HydjetQMinBias_5020GeV+2022HI_Patatrack_PixelOnlyAlpaka.log
Summary:
test parameters:
Ok the failure
src/RecoVertex/PixelVertexFinding/plugins/alpaka/clusterTracksByDensity.h:205: void alpaka_cuda_async::vertexFinder::clusterTracksByDensity(const alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1UL>, unsigned int> &, reco::ZVertexLayout<128UL, false>::ViewTemplateFreeParams<128UL, false, true, true> &, reco::ZVertexTracksLayout<128UL, false>::ViewTemplateFreeParams<128UL, false, true, true> &, vertexFinder::PixelVertexWSSoALayout<128UL, false>::ViewTemplateFreeParams<128UL, false, true, true> &, int, float, float, float): block: [0,0,0], thread: [768,0,0] Assertion `static_cast<int>(foundClusters) < data.metadata().size()` failed.
needs to be investigated (I think it's related to https://github.com/cms-sw/cmssw/pull/45887) but it is for sure unrelated to this PR.
+1 Signing due to the urgency, @AdrianoDee please create another PR for the new failure at your earliest convenience
This pull request is fully signed and it will be integrated in one of the next master IBs (but tests are reportedly failing). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @mandrenguyen, @rappoccio, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)
needs to be investigated (I think it's related to https://github.com/cms-sw/cmssw/pull/45887) but it is for sure unrelated to this PR.
indeed, I tested in a plain CMSSW_14_2_X_2024-11-12-2300
(without anything checked out on top) with:
runTheMatrix.py --what gpu,upgrade -l 14949.402 -t 4
and I still see
$ more 14949.402_HydjetQMinBias_5020GeV+2022HI_Patatrack_PixelOnlyAlpaka/step3_HydjetQMinBias_5020GeV+2022HI_Patatrack_PixelOnlyAlpaka.log
RAW2DIGI:RawToDigi_pixelOnly,RECO:reconstruction_pixelTrackingOnly,VALIDATION:@pixelTrackingOnlyValidation,DQM:@pixelTrackingOnlyDQM
We have determined that this is simulation (if not, rerun cmsDriver.py with --data)
with DB:
entry file:step2.root
Step: RAW2DIGI Spec: ['RawToDigi_pixelOnly']
Step: RECO Spec: ['reconstruction_pixelTrackingOnly']
Step: VALIDATION Spec: ['@pixelTrackingOnlyValidation']
@pixelTrackingOnlyValidation in preparing validation
Step: DQM Spec: ['@pixelTrackingOnlyDQM']
customising the process with customiseAlpakaServiceMemoryFilling from HeterogeneousCore/AlpakaServices/customiseAlpakaServiceMemoryFilling
customising the process with setCrossingFrameOn from SimGeneral/MixingModule/fullMixCustomize_cff
Starting cmsRun step3_RAW2DIGI_RECO_VALIDATION_DQM.py
%MSG-i ThreadStreamSetup: (NoModuleName) 13-Nov-2024 14:45:57 CET pre-events
setting # threads 4
setting # streams 4
%MSG
%MSG-i AlpakaService: (NoModuleName) 13-Nov-2024 14:45:59 CET pre-events
AlpakaServiceSerialSync succesfully initialised.
Found 1 device:
- Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
%MSG
%MSG-i CUDAService: (NoModuleName) 13-Nov-2024 14:45:59 CET pre-events
CUDA runtime version 12.4, driver version 12.4, NVIDIA driver version 550.127.05
CUDA device 0: Tesla T4 (sm_75)
%MSG
%MSG-i AlpakaService: (NoModuleName) 13-Nov-2024 14:45:59 CET pre-events
AlpakaServiceCudaAsync succesfully initialised.
Found 1 device:
- Tesla T4
%MSG
13-Nov-2024 14:46:06 CET Initiating request to open file file:step2.root
13-Nov-2024 14:46:11 CET Successfully opened file file:step2.root
%MSG-w NonConsumedConditionalModules: AfterModConstruction 13-Nov-2024 14:46:14 CET pre-events
The following modules were part of some ConditionalTask, but were not
consumed by any other module in any of the Paths to which the ConditionalTask
was associated. Perhaps they should be either removed from the
job, or moved to a Task to make it explicit they are unscheduled.
siPixelDigis@cpu
siPixelRecHitsPreSplitting@cpu
%MSG
Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 2 at 13-Nov-2024 14:46:16.182 CET
Begin processing the 2nd record. Run 1, Event 6, LumiSection 1 on stream 1 at 13-Nov-2024 14:46:16.211 CET
Begin processing the 3rd record. Run 1, Event 7, LumiSection 1 on stream 3 at 13-Nov-2024 14:46:16.229 CET
Begin processing the 4th record. Run 1, Event 5, LumiSection 1 on stream 0 at 13-Nov-2024 14:46:16.229 CET
Begin processing the 5th record. Run 1, Event 9, LumiSection 1 on stream 2 at 13-Nov-2024 14:46:18.913 CET
Begin processing the 6th record. Run 1, Event 10, LumiSection 1 on stream 1 at 13-Nov-2024 14:46:18.955 CET
Begin processing the 7th record. Run 1, Event 2, LumiSection 1 on stream 3 at 13-Nov-2024 14:46:18.995 CET
Begin processing the 8th record. Run 1, Event 8, LumiSection 1 on stream 2 at 13-Nov-2024 14:46:19.258 CET
Begin processing the 9th record. Run 1, Event 4, LumiSection 1 on stream 0 at 13-Nov-2024 14:46:19.258 CET
Begin processing the 10th record. Run 1, Event 3, LumiSection 1 on stream 1 at 13-Nov-2024 14:46:20.056 CET
src/RecoVertex/PixelVertexFinding/plugins/alpaka/clusterTracksByDensity.h:205: void alpaka_cuda_async::vertexFinder::clusterTracksByDensity(const alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::in
tegral_constant<unsigned long, 1UL>, unsigned int> &, reco::ZVertexLayout<128UL, false>::ViewTemplateFreeParams<128UL, false, true, true> &, reco::ZVertexTracksLayout<128UL, false>::ViewTemplateFreeParam
s<128UL, false, true, true> &, vertexFinder::PixelVertexWSSoALayout<128UL, false>::ViewTemplateFreeParams<128UL, false, true, true> &, int, float, float, float): block: [0,0,0], thread: [640,0,0] Asserti
on `static_cast<int>(foundClusters) < data.metadata().size()` failed.
Should 14949.402 be added to the set of workflows being tested in IBs?
Should 14949.402 be added to the set of workflows being tested in IBs?
sounds like it should.
ignore tests-rejected with ib-failure See discussion above. Failures are unrelated to this PR.
+1
sounds like it should.
I opened https://github.com/cms-sw/cmssw/issues/46693 to keep track.
PR description:
Small fix for
SiPixelRecHitFromSoAAlpaka
that uses the obsoletepixelClustering::maxHitsInModule()
to limit the number of hits in a module. It should beTrackerTraits::maxHitsInModule
. The backport to14_1_X
will fix https://github.com/cms-sw/cmssw/issues/46656.PR validation:
*.402
wfs for HI conditions