Closed ariostas closed 1 year ago
This is something I asked about a while ago, and the response was that we didn't want to go down this route because it ties our standalone version to CMSSW. Have people changed their view on this?
I added instructions for how to run LST when CVMFS is not available. The setup scripts already rely heavily on CVMFS, so removing the alpaka_interface headers doesn't really make it any more difficult to run without CVMFS. I think we will almost never need to do this, but it's still reasonably straightforward.
Also, I did more testing and I can confirm that issue #312 no longer occurs. Below are the commands I used to test it, followed by the full output on lnx7188.
git clone --branch upgrade_cmssw https://github.com/SegmentLinking/TrackLooper.git
cd TrackLooper
source setup.sh
make code/rooutil/
sdl_make_tracklooper -c
scramv1 project CMSSW $CMSSW_VERSION
cd $CMSSW_VERSION/src
eval `scramv1 runtime -sh`
git cms-init
git remote add SegLink https://github.com/SegmentLinking/cmssw.git
git fetch SegLink CMSSW_13_3_0_pre3_LST_X
git cms-addpkg RecoTracker Configuration
git checkout CMSSW_13_3_0_pre3_LST_X
cat <<EOF >lst_cpu.xml
<tool name="lst_cpu" version="1.0">
<client>
<environment name="LSTBASE" default="$PWD/../../../TrackLooper"/>
<environment name="LIBDIR" default="\$LSTBASE/SDL"/>
<environment name="INCLUDE" default="\$LSTBASE"/>
</client>
<runtime name="LST_BASE" value="\$LSTBASE"/>
<lib name="sdl_cpu"/>
</tool>
EOF
cat <<EOF >lst_cuda.xml
<tool name="lst_cuda" version="1.0">
<client>
<environment name="LSTBASE" default="$PWD/../../../TrackLooper"/>
<environment name="LIBDIR" default="\$LSTBASE/SDL"/>
<environment name="INCLUDE" default="\$LSTBASE"/>
</client>
<runtime name="LST_BASE" value="\$LSTBASE"/>
<lib name="sdl_cuda"/>
</tool>
EOF
scram setup lst_cpu.xml
scram setup lst_cuda.xml
eval `scramv1 runtime -sh`
git cms-checkdeps -a -A
scram b -j 32
cmsDriver.py step3 -s RAW2DIGI,RECO:reconstruction_trackingOnly,VALIDATION:@trackingOnlyValidation,DQM:@trackingOnlyDQM --conditions auto:phase2_realistic_T21 --datatier GEN-SIM-RECO,DQMIO -n 10 --eventcontent RECOSIM,DQM --geometry Extended2026D88 --era Phase2C17I13M9 --procModifiers gpu,trackingLST,trackingIters01 --no_exec
sed -i "29i process.load('Configuration.StandardSequences.Accelerators_cff')\nprocess.load('HeterogeneousCore.AlpakaCore.ProcessAcceleratorAlpaka_cfi')" step3_RAW2DIGI_RECO_VALIDATION_DQM.py
sed -i "s|fileNames = cms.untracked.vstring('file:step3_DIGI2RAW.root')|fileNames = cms.untracked.vstring('file:/data2/segmentlinking/step2_21034.1_100Events.root')|" step3_RAW2DIGI_RECO_VALIDATION_DQM.py
cmsRun step3_RAW2DIGI_RECO_VALIDATION_DQM.py
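The two tool files in the command list above differ only in the backend name, so they could be generated by one small helper instead of two near-identical heredocs. This is only a sketch: `write_lst_tool` and its arguments are hypothetical names, not something that exists in TrackLooper or CMSSW, and the XML body is copied from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TrackLooper or CMSSW): write one scram
# tool file for the given LST backend.
write_lst_tool() {
  local backend="$1"      # "cpu" or "cuda"
  local lstbase="$2"      # path to the TrackLooper checkout
  local outdir="${3:-.}"  # where to write the file (default: current directory)
  # Unquoted heredoc: ${backend} and ${lstbase} are expanded now, while
  # \$LSTBASE is written literally so scram resolves it later.
  cat >"${outdir}/lst_${backend}.xml" <<EOF
<tool name="lst_${backend}" version="1.0">
  <client>
    <environment name="LSTBASE" default="${lstbase}"/>
    <environment name="LIBDIR" default="\$LSTBASE/SDL"/>
    <environment name="INCLUDE" default="\$LSTBASE"/>
  </client>
  <runtime name="LST_BASE" value="\$LSTBASE"/>
  <lib name="sdl_${backend}"/>
</tool>
EOF
}

# Same relative path as in the commands above (run from $CMSSW_VERSION/src).
for backend in cpu cuda; do
  write_lst_tool "$backend" "$PWD/../../../TrackLooper"
done
```

The `scram setup lst_cpu.xml` and `scram setup lst_cuda.xml` steps would then stay exactly as they are.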
Singularity> cmsRun step3_RAW2DIGI_RECO_VALIDATION_DQM.py
%MSG-i CUDAService: (NoModuleName) 09-Nov-2023 14:35:16 EST pre-events
CUDA runtime version 11.8, driver version 12.2, NVIDIA driver version 535.86.10
CUDA device 0: Tesla V100-SXM2-32GB (sm_70)
CUDA device 1: Tesla V100-SXM2-32GB (sm_70)
%MSG
%MSG-i AlpakaService: (NoModuleName) 09-Nov-2023 14:35:16 EST pre-events
AlpakaServiceSerialSync succesfully initialised.
Found 1 device:
- Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz
%MSG
%MSG-i AlpakaService: (NoModuleName) 09-Nov-2023 14:35:16 EST pre-events
AlpakaServiceCudaAsync succesfully initialised.
Found 2 devices:
- Tesla V100-SXM2-32GB
- Tesla V100-SXM2-32GB
%MSG
09-Nov-2023 14:36:45 EST Initiating request to open file file:/data2/segmentlinking/step2_21034.1_100Events.root
09-Nov-2023 14:36:54 EST Successfully opened file file:/data2/segmentlinking/step2_21034.1_100Events.root
%MSG-w UnusedProductsForCanDeleteEarly: AfterModDestruction 09-Nov-2023 14:37:02 EST pre-events
The following products in the 'canDeleteEarly' list are not used in this job and will be ignored.
If possible, remove the producer from the job.
IntermediateHitDoublets_tripletElectronHitDoublets__RECO
RegionsSeedingHitSets_tripletElectronHitTriplets__RECO
TrajectorysToOnerecoTracksAssociation_muonSeededTracksInOut__RECO
TrajectorysToOnerecoTracksAssociation_muonSeededTracksOutIn__RECO
TrajectorysToOnerecoTracksAssociation_preDuplicateMergingGeneralTracks__RECO
Trajectorys_muonSeededTracksInOut__RECO
Trajectorys_muonSeededTracksOutIn__RECO
Trajectorys_preDuplicateMergingGeneralTracks__RECO
%MSG
Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 09-Nov-2023 14:39:23.177 EST
#--------------------------------------------------------------------------
# FastJet release 3.4.1
# M. Cacciari, G.P. Salam and G. Soyez
# A software package for jet finding and analysis at colliders
# http://fastjet.fr
#
# Please cite EPJC72(2012)1896 [arXiv:1111.6097] if you use this package
# for scientific work and optionally PLB641(2006)57 [hep-ph/0512210].
#
# FastJet is provided without warranty under the GNU GPL v2 or higher.
# It uses T. Chan's closest pair algorithm, S. Fortune's Voronoi code
# and 3rd party plugin jet algorithms. See COPYING file for details.
#--------------------------------------------------------------------------
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:40:31 EST Run: 1 Event: 1
Seed collection size (8692) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:40:31 EST Run: 1 Event: 1
Found too many clusters (179032), bailing out.
%MSG
Begin processing the 2nd record. Run 1, Event 2, LumiSection 1 on stream 0 at 09-Nov-2023 14:40:36.675 EST
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:40:43 EST Run: 1 Event: 2
Seed collection size (6760) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:40:43 EST Run: 1 Event: 2
Found too many clusters (157706), bailing out.
%MSG
Begin processing the 3rd record. Run 1, Event 3, LumiSection 1 on stream 0 at 09-Nov-2023 14:40:47.670 EST
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:40:54 EST Run: 1 Event: 3
Found too many clusters (169109), bailing out.
%MSG
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:40:55 EST Run: 1 Event: 3
Seed collection size (7767) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
Begin processing the 4th record. Run 1, Event 4, LumiSection 1 on stream 0 at 09-Nov-2023 14:41:00.265 EST
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:41:07 EST Run: 1 Event: 4
Seed collection size (8887) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:41:08 EST Run: 1 Event: 4
Found too many clusters (180093), bailing out.
%MSG
Begin processing the 5th record. Run 1, Event 5, LumiSection 1 on stream 0 at 09-Nov-2023 14:41:12.684 EST
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:41:22 EST Run: 1 Event: 5
Seed collection size (11681) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:41:25 EST Run: 1 Event: 5
Found too many clusters (205679), bailing out.
%MSG
Begin processing the 6th record. Run 1, Event 6, LumiSection 1 on stream 0 at 09-Nov-2023 14:41:28.667 EST
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:41:38 EST Run: 1 Event: 6
Seed collection size (12934) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:41:38 EST Run: 1 Event: 6
Found too many clusters (211208), bailing out.
%MSG
Begin processing the 7th record. Run 1, Event 7, LumiSection 1 on stream 0 at 09-Nov-2023 14:41:44.255 EST
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:41:52 EST Run: 1 Event: 7
Found too many clusters (188974), bailing out.
%MSG
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:41:52 EST Run: 1 Event: 7
Seed collection size (10382) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
Begin processing the 8th record. Run 1, Event 8, LumiSection 1 on stream 0 at 09-Nov-2023 14:41:57.749 EST
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:42:05 EST Run: 1 Event: 8
Found too many clusters (180919), bailing out.
%MSG
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:42:05 EST Run: 1 Event: 8
Seed collection size (8971) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
Begin processing the 9th record. Run 1, Event 9, LumiSection 1 on stream 0 at 09-Nov-2023 14:42:10.064 EST
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:42:17 EST Run: 1 Event: 9
Seed collection size (8261) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:42:17 EST Run: 1 Event: 9
Found too many clusters (170110), bailing out.
%MSG
Begin processing the 10th record. Run 1, Event 10, LumiSection 1 on stream 0 at 09-Nov-2023 14:42:21.754 EST
%MSG-w TrackingMonitor: TrackingMonitor:TrackSeedMonhighPtTripletStep 09-Nov-2023 14:42:30 EST Run: 1 Event: 10
Seed collection size (10323) differs from seed stop info collection size (0). This is a sign of inconsistency in the configuration. Not filling associated histograms.
%MSG
%MSG-e TooManyClusters: PhotonConversionTrajectorySeedProducerFromSingleLeg:photonConvTrajSeedFromSingleLeg 09-Nov-2023 14:42:30 EST Run: 1 Event: 10
Found too many clusters (192559), bailing out.
%MSG
09-Nov-2023 14:42:36 EST Closed file file:/data2/segmentlinking/step2_21034.1_100Events.root
Singularity>
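For repeated tests it may be easier to summarize a log like the one above than to read it line by line. The sketch below assumes the cmsRun output was saved to a file (for example with `cmsRun ... 2>&1 | tee cmsRun.log`, where `cmsRun.log` is just a placeholder name) and counts processed events and MessageLogger warnings and errors. `summarize_cmsrun_log` is a hypothetical helper, not an existing CMSSW tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize a saved cmsRun log.
# It only relies on line prefixes that appear in the output above.
summarize_cmsrun_log() {
  local log="$1"
  local events warnings errors
  # One "Begin processing the Nth record" line is printed per event.
  events=$(grep -c '^Begin processing the ' "$log")
  # MessageLogger prefixes warnings with %MSG-w and errors with %MSG-e.
  warnings=$(grep -c '^%MSG-w ' "$log")
  errors=$(grep -c '^%MSG-e ' "$log")
  echo "events=${events} warnings=${warnings} errors=${errors}"
}
```

Comparing the reported event count with the `-n` value passed to cmsDriver.py is a quick check that the job ran through all events.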
This PR is a follow-up to #336. I upgraded the CMSSW version to 13_3_0_pre3 to match the latest version used in the CMSSW fork.
I also changed things so that all libraries are pulled from the CMSSW version that is currently being used. This prevents possible library mismatches that could cause trouble. In particular, I deleted the copy of alpaka_interface that was being used for the caching allocator, since this version of CMSSW has a recent version of it.
The standalone version works fine, but before we merge we'll need to make sure that it still works with CMSSW.