dmwm / T0

NanoScouting in Prompt #4991

Open silviodonato opened 2 months ago

silviodonato commented 2 months ago

Dear Tier-0 experts, the Scouting group would like to start NanoScouting production in Prompt, possibly before the end of the pp collision data taking (16/10/2024). All the code is available in the IB and will be included in 14_2_0_pre1 (#44970) and in one of the upcoming 14_0_X releases (#45950).

Describe the new feature you would like T0 to implement.

What are the changes and technical challenges?

Describe the implementation timeline of the new feature and relevance for the CMS data taking.

Additional information

This has already been discussed at the DeepDive on NanoAOD, with PPD (@malbouis @vlimant) and TSG (@missirol @mtosi). @elfontan

[1] cmsDriver.py step2 -s NANO:@Scout --process NANO --data --eventcontent NANOAOD --datatier NANOAOD -n 10000 --era Run3_2024 --conditions auto:run3_data_prompt --no_exec:

import FWCore.ParameterSet.Config as cms

from Configuration.Eras.Era_Run3_2024_cff import Run3_2024

process = cms.Process('NANO',Run3_2024)

# import of standard configurations
process.load('Configuration.StandardSequences.Services_cff')
process.load('SimGeneral.HepPDTESSource.pythiapdt_cfi')
process.load('FWCore.MessageService.MessageLogger_cfi')
process.load('Configuration.EventContent.EventContent_cff')
process.load('Configuration.StandardSequences.GeometryRecoDB_cff')
process.load('Configuration.StandardSequences.MagneticField_cff')
process.load('PhysicsTools.NanoAOD.custom_run3scouting_cff')
process.load('Configuration.StandardSequences.EndOfProcess_cff')
process.load('Configuration.StandardSequences.FrontierConditions_GlobalTag_cff')

process.maxEvents = cms.untracked.PSet(
    input = cms.untracked.int32(10000),
    output = cms.optional.untracked.allowed(cms.int32,cms.PSet)
)

# Input source
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring('file:step2_PAT.root'),
    secondaryFileNames = cms.untracked.vstring()
)

process.options = cms.untracked.PSet(
    IgnoreCompletely = cms.untracked.vstring(),
    Rethrow = cms.untracked.vstring(),
    TryToContinue = cms.untracked.vstring(),
    accelerators = cms.untracked.vstring('*'),
    allowUnscheduled = cms.obsolete.untracked.bool,
    canDeleteEarly = cms.untracked.vstring(),
    deleteNonConsumedUnscheduledModules = cms.untracked.bool(True),
    dumpOptions = cms.untracked.bool(False),
    emptyRunLumiMode = cms.obsolete.untracked.string,
    eventSetup = cms.untracked.PSet(
        forceNumberOfConcurrentIOVs = cms.untracked.PSet(
            allowAnyLabel_=cms.required.untracked.uint32
        ),
        numberOfConcurrentIOVs = cms.untracked.uint32(0)
    ),
    fileMode = cms.untracked.string('FULLMERGE'),
    forceEventSetupCacheClearOnNewRun = cms.untracked.bool(False),
    holdsReferencesToDeleteEarly = cms.untracked.VPSet(),
    makeTriggerResults = cms.obsolete.untracked.bool,
    modulesToCallForTryToContinue = cms.untracked.vstring(),
    modulesToIgnoreForDeleteEarly = cms.untracked.vstring(),
    numberOfConcurrentLuminosityBlocks = cms.untracked.uint32(0),
    numberOfConcurrentRuns = cms.untracked.uint32(1),
    numberOfStreams = cms.untracked.uint32(0),
    numberOfThreads = cms.untracked.uint32(1),
    printDependencies = cms.untracked.bool(False),
    sizeOfStackForThreadsInKB = cms.optional.untracked.uint32,
    throwIfIllegalParameter = cms.untracked.bool(True),
    wantSummary = cms.untracked.bool(False)
)

# Production Info
process.configurationMetadata = cms.untracked.PSet(
    annotation = cms.untracked.string('step2 nevts:10000'),
    name = cms.untracked.string('Applications'),
    version = cms.untracked.string('$Revision: 1.19 $')
)

# Output definition

process.NANOAODoutput = cms.OutputModule("NanoAODOutputModule",
    compressionAlgorithm = cms.untracked.string('LZMA'),
    compressionLevel = cms.untracked.int32(9),
    dataset = cms.untracked.PSet(
        dataTier = cms.untracked.string('NANOAOD'),
        filterName = cms.untracked.string('')
    ),
    fileName = cms.untracked.string('step2_NANO.root'),
    outputCommands = process.NANOAODEventContent.outputCommands
)

# Additional output definition

# Other statements
from Configuration.AlCa.GlobalTag import GlobalTag
process.GlobalTag = GlobalTag(process.GlobalTag, 'auto:run3_data_prompt', '')

# Path and EndPath definitions
process.nanoAOD_step = cms.Path(process.nanoSequence)
process.endjob_step = cms.EndPath(process.endOfProcess)
process.NANOAODoutput_step = cms.EndPath(process.NANOAODoutput)

# Schedule definition
process.schedule = cms.Schedule(process.nanoAOD_step,process.endjob_step,process.NANOAODoutput_step)
from PhysicsTools.PatAlgos.tools.helpers import associatePatAlgosToolsTask
associatePatAlgosToolsTask(process)

# Customisation from command line

# Add early deletion of temporary data products to reduce peak memory need
from Configuration.StandardSequences.earlyDeleteSettings_cff import customiseEarlyDelete
process = customiseEarlyDelete(process)
# End adding early deletion

silviodonato commented 2 months ago

FYI: https://github.com/cms-sw/cmssw/pull/45950 has just been merged, so this request can be implemented directly on the next 14_0_X release (i.e. CMSSW_14_0_15_patch2 or CMSSW_14_0_16).

patinkaew commented 2 months ago

Hi @silviodonato, all

CMSSW_14_0_16 was just released with https://github.com/cms-sw/cmssw/pull/45950. Release note

The PR also includes the scenario hltScoutingEra_Run3_2024, which is designed for producing ScoutingNano from the HLTSCOUT datatier (ScoutingPFRun3 dataset).

Some testing with RunPromptReco.py was also performed in the PR:

python3 Configuration/DataProcessing/test/RunPromptReco.py --scenario=hltScoutingEra_Run3_2024 \
--global-tag=140X_dataRun3_Prompt_v4 --nanoaod --nanoFlavours=@Scout --lfn=fileIN.root
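
For anyone scripting this test over several input files, the command above could be assembled programmatically. The following is only a minimal sketch: the helper name `build_prompt_reco_command` is hypothetical and not part of CMSSW, and it uses only the options that appear in the command above. It builds the argument list and prints it, without running anything.

```python
import shlex


def build_prompt_reco_command(scenario, global_tag, lfn, nano_flavours=None):
    """Assemble the RunPromptReco.py test command as an argument list.

    Hypothetical helper for illustration; only the options shown in the
    command quoted in this thread are used.
    """
    args = [
        "python3",
        "Configuration/DataProcessing/test/RunPromptReco.py",
        f"--scenario={scenario}",
        f"--global-tag={global_tag}",
    ]
    # NanoAOD output is requested only when a flavour is given.
    if nano_flavours:
        args += ["--nanoaod", f"--nanoFlavours={nano_flavours}"]
    args.append(f"--lfn={lfn}")
    return args


command = build_prompt_reco_command(
    scenario="hltScoutingEra_Run3_2024",
    global_tag="140X_dataRun3_Prompt_v4",
    lfn="fileIN.root",
    nano_flavours="@Scout",
)
# Print a shell-safe, copy-pasteable version of the command.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from inside a CMSSW environment.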

malbouis commented 1 month ago

Dear @patinkaew and @silviodonato, would you please take a look at this cmsTalk post from Antonio, where he tried out the NanoScouting workflow, and let him know if anything is missing from the configuration? Thanks!

LinaresToine commented 1 month ago

Thank you @malbouis @patinkaew @silviodonato for your responses on the cmstalk post. The replay was successful after disabling AOD and MINIAOD output.
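
The replay configuration itself is not shown in this thread. As a rough illustration of what "disabling AOD and MINIAOD output" amounts to, the sketch below models per-dataset output switches in plain Python. The class `OutputFlags`, its field names and the helper `enabled_tiers` are assumptions made for this example and are not the actual Tier-0 configuration API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputFlags:
    """Illustrative per-dataset output switches (names are assumptions)."""
    write_aod: bool = True
    write_miniaod: bool = True
    write_nanoaod: bool = False


def enabled_tiers(flags):
    """Return the data tiers that would be written for these flags."""
    tiers = []
    if flags.write_aod:
        tiers.append("AOD")
    if flags.write_miniaod:
        tiers.append("MINIAOD")
    if flags.write_nanoaod:
        tiers.append("NANOAOD")
    return tiers


# Scouting dataset: only the NanoAOD output is kept, as in the successful replay.
scouting_flags = OutputFlags(write_aod=False, write_miniaod=False, write_nanoaod=True)
print(enabled_tiers(scouting_flags))
```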

malbouis commented 1 month ago

Thanks, @LinaresToine! I think it would be nice if the NanoScouting experts could check the files produced by the replay and give feedback on the content and on whether it matches expectations.

silviodonato commented 1 month ago

Hi @malbouis, from the scouting point of view we confirm that the event content looks as expected, so it is OK to deploy it online.

LinaresToine commented 1 month ago

In agreement with ORM @jeyserma, we will deploy it in production together with the era change that is likely to happen during MD4.

LinaresToine commented 1 month ago

For the record, the new scenario was deployed and the ScoutingPFRun3 PD will now produce NANOAOD in production.

https://cms-talk.web.cern.ch/t/acqusition-era-change-to-run2024i/51084