dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0

HLT not displayed in CMS_extendedJobType #12006

Open nikodemas opened 4 months ago

nikodemas commented 4 months ago

Impact of the bug: CMS Job Monitoring, managed by https://github.com/dmwm/cms-htcondor-es

Describe the bug: The TaskType field (which is filled from CMS_extendedJobType in WMCore) of StepChain requests quite often has UNKNOWN as one of its steps (visible in Grafana and OpenSearch). We have checked a few requests and it seems that the HLT step ends up being treated as UNKNOWN (also, there is not a single StepChain with HLT in its TaskType). Could some configuration be missing?

How to reproduce it: Check the TaskType field on Grafana or OpenSearch for StepChain requests that should include HLT as one of their steps.

Expected behavior: I would expect to see HLT instead of UNKNOWN (e.g. "GEN,SIM,DIGI_premix,HLT,RECO,MINIAOD,NANOAOD" instead of "GEN,SIM,DIGI_premix,UNKNOWN,RECO,MINIAOD,NANOAOD").

Additional context and error message: The initial feature was documented in https://github.com/dmwm/WMCore/issues/10604

@khurtado @leggerf @brij01

khurtado commented 4 months ago

This is actually expected. HLT is not part of the list of physics steps tagged at present.

More info:

Basically, tagging these physics steps is not automatic: we need to define the selection criteria for each relevant physics step that we want to track from the cmsDriver --step options. The current list was taken from this thread:

https://github.com/cms-sw/cmssw/issues/42587#issuecomment-1697977062

So, HLT is not part of the list right now, and step configurations like --step GEN,somethingelse,SIM,HLT would only display GEN,SIM at present. HLT alone would be treated as UNKNOWN as well.

https://cms-wmcore.docs.cern.ch/wmcore/Job-task-type-characterization-based-on-cmsDriver-command-line-arguments/#characterization-of-physics-task-types-based-on-the-cmsdriver-arguments

Including HLT would be a feature request, and we would need some information on the selection criteria. E.g.: Is HLT alone (meaning the selection is e.g. something,HLT,somethingelse) enough? Are there any cases like HLT:v2, HLT-somethingelse, etc. when defining it in the cmsDriver --step options?
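To make the question concrete, here is a minimal sketch of two candidate selection criteria. It is not WMCore code: the pattern names and the `has_hlt` helper are hypothetical, and the `HLT:<suffix>` form is only an assumption based on the variations asked about above.

```python
import re

# Hypothetical criteria, for discussion only (not taken from WMCore).
HLT_EXACT = re.compile(r"HLT")              # matches the bare token "HLT" only
HLT_WITH_SUFFIX = re.compile(r"HLT(:\S+)?")  # also matches e.g. "HLT:v2"


def has_hlt(step_option, pattern):
    """Return True if any comma-separated token of a --step value fully matches pattern."""
    return any(pattern.fullmatch(token.strip()) for token in step_option.split(","))


# Usage: the two criteria disagree on "HLT:v2", and neither accepts "HLT-somethingelse".
print(has_hlt("something,HLT,somethingelse", HLT_EXACT))
print(has_hlt("GEN,HLT:v2", HLT_EXACT))
print(has_hlt("GEN,HLT:v2", HLT_WITH_SUFFIX))
print(has_hlt("GEN,HLT-somethingelse", HLT_WITH_SUFFIX))
```

Which of these (if either) is right depends on how HLT is actually written in the cmsDriver --step options, which is the information needed here.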

leggerf commented 4 months ago

@khurtado how do you suggest that we proceed then? Open a new feature request? Comment on the issue you linked? As CMS monitoring, we do not really have the knowledge to say whether HLT alone is enough or whether it should be detailed further, but I think having so many UNKNOWNs in the monitoring is the worst choice we can make.

nikodemas commented 4 months ago

@khurtado also I am not sure if I fully understand the example with --step GEN,somethingelse,SIM,HLT displaying GEN,SIM. Wouldn't that mean that there shouldn't be types like GEN,SIM,DIGI_premix,UNKNOWN,RECO,MINIAOD,NANOAOD (where UNKNOWN is filled instead of HLT) that we currently have and it should simply be GEN,SIM,DIGI_premix,RECO,MINIAOD,NANOAOD then?

khurtado commented 3 months ago

@leggerf Yes, I think supporting HLT would be a new feature. For that, we need to understand if HLT only has the following variations:

Or if there are other variations that we should take into account (xx_HLT_yy, HLT_xx, etc?)

@nikodemas This will happen with StepChain workflows when there is more than 1 cmsRun step per job.

For example, if we have a job executing 4 steps like below:

step1.stepPhysicsType = GEN,SIM
step2.stepPhysicsType = DIGI_nopileup,L1,DATAMIX
step3.stepPhysicsType = HLT:v2018r3
step4.stepPhysicsType = RECO,PAT,NANOAOD 

The final string will be: `GEN,SIM,DIGI_nopileup,UNKNOWN,RECO,MINIAOD,NANOAOD`. Step 2 has other values besides DIGI_nopileup, but they are ignored because we don't support reporting those as relevant physics steps. Step 3 shows UNKNOWN because we could not find any match among the currently supported physics types.

Note if the job only had 3 steps like this:

step1.stepPhysicsType = GEN,SIM
step2.stepPhysicsType = DIGI_nopileup,HLT:v2018r3,L1,DATAMIX
step3.stepPhysicsType = RECO,PAT,NANOAOD

Then the string would have no UNKNOWNs; it would look like: GEN,SIM,DIGI_nopileup,RECO,MINIAOD,NANOAOD. If we had a job with a single step like:

step2.stepPhysicsType = L1

This would show UNKNOWN as well, because we don't have L1 in the list of physics types that were discussed in https://github.com/cms-sw/cmssw/issues/42587#issuecomment-1697977062
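The behaviour described in these examples can be sketched roughly as follows. This is an illustration only, not the actual WMCore implementation: the `KNOWN_PHYSICS_TYPES` mapping, `tag_step` and `job_task_type` are hypothetical names, and the mapping is inferred solely from the examples in this thread (including PAT being reported as MINIAOD), so the real list may differ.

```python
# Hypothetical mapping inferred from the examples in this thread only;
# the real list of supported physics types lives in WMCore and may differ.
KNOWN_PHYSICS_TYPES = {
    "GEN": "GEN",
    "SIM": "SIM",
    "DIGI_nopileup": "DIGI_nopileup",
    "DIGI_premix": "DIGI_premix",
    "RECO": "RECO",
    "PAT": "MINIAOD",
    "NANOAOD": "NANOAOD",
}


def tag_step(step_physics_type):
    """Label one cmsRun step: keep the known types, or UNKNOWN if none match."""
    tokens = [token.strip() for token in step_physics_type.split(",")]
    matched = [KNOWN_PHYSICS_TYPES[t] for t in tokens if t in KNOWN_PHYSICS_TYPES]
    return ",".join(matched) if matched else "UNKNOWN"


def job_task_type(steps):
    """Join the per-step labels into the string reported for the whole job."""
    return ",".join(tag_step(step) for step in steps)


# The three cases discussed above.
four_steps = ["GEN,SIM", "DIGI_nopileup,L1,DATAMIX", "HLT:v2018r3", "RECO,PAT,NANOAOD"]
three_steps = ["GEN,SIM", "DIGI_nopileup,HLT:v2018r3,L1,DATAMIX", "RECO,PAT,NANOAOD"]
single_step = ["L1"]

print(job_task_type(four_steps))
print(job_task_type(three_steps))
print(job_task_type(single_step))
```

The point of the sketch is that UNKNOWN is emitted per cmsRun step, only when a step contains no supported type at all; unsupported values that share a step with a supported one are silently dropped.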

nikodemas commented 3 months ago

@khurtado thank you for the explanation. So if we want to add support for HLT, do we need to create a feature request in the WMCore repository or in the cmssw repository (or maybe just comment on https://github.com/cms-sw/cmssw/issues/42587)? Is there any other information that would be required from us?