cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

Pythia8EGun generator module requires synchronization on LuminosityBlock boundaries #37385

Open jordan-martins opened 2 years ago

jordan-martins commented 2 years ago

We have two gun samples (guns are known not to work with concurrent modules) from the Run3Winter22 campaign that fail 100% of the time at the GEN step with an error that looks very odd to us:

Exception Message:
The framework is configured to use at least two streams, but the following modules
require synchronizing on LuminosityBlock boundaries:
Pythia8EGun generator

It seems that multiple streams are configured somewhere, but we (neither PdmV nor PnR) cannot identify where this configuration is set (and we have not set anything ourselves). Here are the two workflows:

wf1 wf2

The workaround proposed to deal with the problem is:

The situation can be fixed by either
* modifying the modules to support concurrent LuminosityBlocks (preferred), or
* setting 'process.options.numberOfConcurrentLuminosityBlocks = 1' in the configuration file
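
For reference, the second option could look roughly like the fragment below. This is only a sketch: it assumes a CMSSW environment and a release in which `cms.Process` provides a default `options` PSet, and the process name "GEN" is a placeholder rather than the name used in the failing workflows.

```python
# Configuration fragment, meant to be run inside a CMSSW environment.
import FWCore.ParameterSet.Config as cms

# "GEN" is a placeholder process name (assumption, not taken from the workflows).
process = cms.Process("GEN")

# Process only one LuminosityBlock at a time, so that modules which
# synchronize on LuminosityBlock boundaries can run with several streams.
process.options.numberOfConcurrentLuminosityBlocks = 1
```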

However, we would prefer to see the fix on the module side if possible.

In addition, the error is reproducible neither on lxplus nor in validation jobs from the McM tool.

Thanks, Jordan

cmsbuild commented 2 years ago

A new Issue was created by @jordan-martins Jordan Martins.

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

makortel commented 2 years ago

assign generators

cmsbuild commented 2 years ago

New categories assigned: generators

@mkirsano,@alberto-sanchez,@SiewYan,@GurpreetSinghChahal,@Saptaparna you have been requested to review this Pull request/Issue and eventually sign? Thanks

makortel commented 2 years ago

Pythia8EGun is declared as one of the module types for which cmsDriver should explicitly set numberOfConcurrentLuminosityBlocks = 1: https://github.com/cms-sw/cmssw/blob/7f2d0d87f04b68a05cb5872399b6a52aacf2e62a/Configuration/Generator/python/concurrentLumisDisable.py#L1-L11

so in that sense this behavior is expected (more details on "why" in https://github.com/cms-sw/cmssw/issues/25090). It is currently unclear what overrides this parameter for these workflows (investigation continues in https://mattermost.web.cern.ch/cms-o-and-c/pl/sw55fqpr7pna8x581q453wcsne).

Nevertheless it would be better to migrate these generator modules to support concurrent lumis :)
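
To illustrate the kind of check described above, here is a minimal stand-alone sketch in plain Python. It is not the code from ConfigBuilder or from the linked file: the names `NO_CONCURRENT_LUMI_TYPES` and `concurrent_lumis_for` are hypothetical, the set contains only the one module type mentioned in this issue, and using 0 to mean "no explicit request" is a convention of this sketch.

```python
# Hypothetical names throughout; this is not the actual ConfigBuilder code.
# Generator module types assumed not to support concurrent LuminosityBlocks.
NO_CONCURRENT_LUMI_TYPES = {"Pythia8EGun"}


def concurrent_lumis_for(module_types, requested=0):
    """Pick the numberOfConcurrentLuminosityBlocks value for a job.

    module_types: C++ type names of the modules in the job.
    requested: value asked for by the user (0 means "no explicit request"
               in this sketch).
    Returns (value_to_use, list_of_blocking_module_types).
    """
    blocked = sorted(set(module_types) & NO_CONCURRENT_LUMI_TYPES)
    if blocked:
        # At least one module must synchronize on LuminosityBlock
        # boundaries, so force a single concurrent LuminosityBlock.
        return 1, blocked
    return requested, blocked
```

With these assumptions, `concurrent_lumis_for(["Pythia8EGun", "SomeProducer"], requested=2)` returns `(1, ["Pythia8EGun"])`, while a job without any listed module type keeps the requested value.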

makortel commented 2 years ago

The problem in ConfigBuilder is fixed in https://github.com/cms-sw/cmssw/pull/37417. I'll make backports to 12_3_X and 12_2_X after the review of that has completed.