cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

Failure in NanoAOD v11 on MC samples. #41084

Open bbilin opened 1 year ago

bbilin commented 1 year ago

We observe failures in NanoAOD production due to a mismatch in the size of the generator weight vector (CepGen + PY6 samples, using 12_6_0_patch1).

cmsunified

The message is as follows:

Fatal Exception (Exit code: 8002)
An exception of category 'StdException' occurred while
[0] Processing Event run: 1 lumi: 809 event: 1 stream: 0
[1] Running path 'NANOEDMAODSIMoutput_step'
[2] Prefetching for module PoolOutputModule/'NANOEDMAODSIMoutput'
[3] Calling method for module GenWeightsTableProducer/'genWeightsTable'
Exception Message:
A std::exception was thrown.
vector::_M_range_check: __n (which is 27) >= this->size() (which is 24)

Fatal Exception (Exit code: 8002)
An exception of category 'StdException' occurred while
[0] Processing Event run: 1 lumi: 1 event: 11 stream: 3
[1] Running path 'NANOEDMAODSIMoutput_step'
[2] Prefetching for module PoolOutputModule/'NANOEDMAODSIMoutput'
[3] Calling method for module GenWeightsTableProducer/'genWeightsTable'
Exception Message:
A std::exception was thrown.
vector::_M_range_check: __n (which is 27) >= this->size() (which is 24)

The cmsDriver command is as follows:

cmsDriver.py step1 --filein "dbs:/CEPDijets-GluGlu_M-250_survfact0_13p6TeV_superchic/Run3Summer22MiniAODv3-124X_mcRun3_2022_realistic_v12-v2/MINIAODSIM" --fileout file:PPS-Run3Summer22NanoAODv11-00005.root --mc --eventcontent NANOEDMAODSIM --datatier NANOAODSIM --conditions 126X_mcRun3_2022_realistic_v2 --step NANO --nThreads 4 --scenario pp --era Run3,run3_nanoAOD_124

It can be run on the following MiniAOD dataset: /CEPDijets-GluGlu_M-250_survfact0_13p6TeV_superchic/Run3Summer22MiniAODv3-124X_mcRun3_2022_realistic_v12-v2/MINIAODSIM

PdmV

@kskovpen @sunilUIET @swertz @vlimant

cmsbuild commented 1 year ago

A new Issue was created by @bbilin Bugra Bilin.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

makortel commented 1 year ago

assign xpog

cmsbuild commented 1 year ago

New categories assigned: xpog

@swertz,@vlimant you have been requested to review this Pull request/Issue and eventually sign? Thanks

vlimant commented 1 year ago

assign generators in case they can help out

cmsbuild commented 1 year ago

New categories assigned: generators

@mkirsano,@menglu21,@alberto-sanchez,@SiewYan,@GurpreetSinghChahal,@Saptaparna you have been requested to review this Pull request/Issue and eventually sign? Thanks

swertz commented 1 year ago

This seems to be caused by the Pythia PS weights. The NanoAOD weights producer recognizes the weights as "alternative" ones here: https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/NanoAOD/plugins/GenWeightsTableProducer.cc#L999 It then tries to read weight indices {27, 5, 26, 4}, see https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/NanoAOD/plugins/GenWeightsTableProducer.cc#L186 But for some reason weight index 27 does not exist for that sample, which has only 24 weights.

So, either fix the Pythia fragment, or fix the weights producer for this special case.

Note that we really hoped to not have to touch that code anymore, as we've been eagerly expecting this to be integrated: https://github.com/cms-sw/cmssw/pull/32167
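For illustration, the second option (guarding the weights producer against short weight vectors) could look roughly like the sketch below. It is written in Python for brevity and is not the actual `GenWeightsTableProducer` code: the function name `select_ps_weights` and the choice to fall back to an empty list are assumptions, and only the indices {27, 5, 26, 4} and the count of 24 weights come from this thread.

```python
# Hypothetical sketch, NOT the real GenWeightsTableProducer logic.
# The indices are the ones quoted in this thread.
PS_WEIGHT_INDICES = (27, 5, 26, 4)


def select_ps_weights(weights, indices=PS_WEIGHT_INDICES):
    """Return the weights at the requested indices.

    If any requested index lies beyond the end of the stored weights
    (for example index 27 when only 24 weights exist), return an empty
    list instead of raising, so the caller can skip the PS weight table.
    The empty-list fallback is an assumption made for this sketch.
    """
    if not indices or max(indices) >= len(weights):
        return []
    return [weights[i] for i in indices]


# A sample with only 24 weights no longer triggers an out-of-range error.
short_sample = [1.0] * 24
print(select_ps_weights(short_sample))
```

Whether an empty table, a single nominal weight, or an explicit error message is the right fallback is a design decision for the producer maintainers.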

makortel commented 1 year ago

Do we have any workflow in runTheMatrix testing this generator configuration in conjunction with NanoAOD?

swertz commented 1 year ago

> Do we have any workflow in runTheMatrix testing this generator configuration in conjunction with NanoAOD?

AFAIK not; in any case, there are far too many different generator configurations in production to test them all with the matrix... This sort of thing could be avoided by producing a few NanoGEN events when preparing an MC request and checking the weights.
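A weight check of this kind could be as simple as the sketch below. It assumes the per-event weight lists have already been read from a few test events by some other means; the function name `find_events_missing_weights` and the `(event_id, weights)` input format are hypothetical, and the default indices are the ones mentioned earlier in this thread.

```python
def find_events_missing_weights(events, required_indices=(27, 5, 26, 4)):
    """Report events whose weight list is too short for the required indices.

    `events` is an iterable of (event_id, weights) pairs (a hypothetical
    format chosen for this sketch). Returns a list of
    (event_id, n_weights_found, n_weights_needed) tuples.
    """
    needed = max(required_indices) + 1
    problems = []
    for event_id, weights in events:
        if len(weights) < needed:
            problems.append((event_id, len(weights), needed))
    return problems


# Example: one event with 24 weights, one with 28.
sample = [(1, [1.0] * 24), (2, [1.0] * 28)]
for event_id, found, needed in find_events_missing_weights(sample):
    print(f"event {event_id}: {found} weights stored, at least {needed} needed")
```

Running such a check on a handful of events before submitting a request would flag the mismatch without waiting for the production job to fail.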

makortel commented 1 year ago

> > Do we have any workflow in runTheMatrix testing this generator configuration in conjunction with NanoAOD?
>
> AFAIK not; in any case, there are far too many different generator configurations in production to test them all with the matrix... This sort of thing could be avoided by producing a few NanoGEN events when preparing an MC request and checking the weights.

I realize I was imprecise. By "generator configuration" I meant "CepGen + PY6". I think it would be very beneficial to have at least one workflow for every generator we use in production.

Saptaparna commented 1 year ago

Let me add @SanghyunKo to this discussion.

sunilUIET commented 1 year ago

Hi, are there any further instructions or a way forward to fix this issue? We are facing the same issue in Summer23 with NanoAOD v12 as well: https://cms-unified.web.cern.ch/cms-unified/showlog/?search=task_PPD-Run3Summer23pLHEGS-00001

vlimant commented 1 year ago

I think this would be in the hands of @cms-sw/generators-l2.

vlimant commented 8 months ago

We will get to this soon, similarly to #43784.

vlimant commented 4 months ago

@sunilUIET: the request you pointed out was force-completed, but does not seem to be marked "done". Have there been other cases of such failures?

vlimant commented 3 days ago

New failures in production: https://its.cern.ch/jira/browse/CMSCOMPPR-47504