cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

getMany in OutputModule #27434

Open schneiml opened 5 years ago

schneiml commented 5 years ago

Dear framework experts,

while converting the DQMRootOutputModule to work with the new DQM products, I hit the problem that edm::RunForOutput et al. do not seem to support getManyByType.

Trying to use the GetterOfProducts instead (which is slightly overkill here, getManyByType would suffice) fails because callWhenNewProductsRegistered does not exist for the OutputModule.

Sample code here [1].

Am I missing something? A workaround could be to have a producer that consumes all the MonitorElementCollections and turns them into a single product with a well-defined name, but that seems less than elegant.

[1]

mkdir /dev/shm/$USER
cd /dev/shm/$USER
cmsrel CMSSW_11_0_0_pre1
cd CMSSW_11_0_0_pre1/src/
cmsenv
git cms-merge-topic schneiml:dqm-new-dqmstore-on-CMSSW_11_0_0_pre1-outputmodule
cd DataFormats/
scram b -j10
cd ../DQMServices/FwkIO/
scram b -j10 -k
scram b

cmsbuild commented 5 years ago

A new Issue was created by @schneiml Marcel Schneider.

@davidlange6, @Dr15Jones, @smuzaffar, @fabiocos, @kpedro88 can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

Dr15Jones commented 5 years ago

@wddgit please explain how outputCommands work in OutputModules.

wddgit commented 5 years ago

I am coming into this discussion without knowing all the context, and I am not quite sure how deep and detailed an answer is needed. Please ask if this does not answer the question or if you have more questions. @Dr15Jones, if this is not what you had in mind, or if you want me to spend time on a more detailed answer, let me know.

OutputModule classes are configurable with an interface that looks something like this:

process.out = cms.OutputModule("PoolOutputModule",
     fileName = cms.untracked.string('someFile.root'),
     outputCommands = cms.untracked.vstring(
         'keep *', 
         'drop *_someLabel_*_*'
     )
)

The four underscore-separated fields in the branch name are type, label, instance and process name. An asterisk is a wildcard. There is more detail describing this here: https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideSelectingBranchesForOutput
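To make the keep/drop semantics concrete, here is a minimal Python sketch of how such commands could select branches. It is only an illustration, not the framework's implementation: the function names and the branch names are hypothetical, and it assumes that commands are applied in order (so later commands override earlier ones) and that the four fields themselves contain no underscores.

```python
from fnmatch import fnmatchcase


def branch_matches(branch, pattern):
    """Match a 'type_label_instance_process' name against a pattern, field by field."""
    if pattern == "*":
        # A bare '*' selects every branch.
        return True
    branch_fields = branch.split("_")
    pattern_fields = pattern.split("_")
    if len(branch_fields) != 4 or len(pattern_fields) != 4:
        return False
    # Note: fnmatchcase also understands '?' and '[...]', which this sketch does not restrict.
    return all(fnmatchcase(b, p) for b, p in zip(branch_fields, pattern_fields))


def select_branches(branches, output_commands):
    """Apply 'keep'/'drop' commands in order and return the branches left selected."""
    kept = set()
    for command in output_commands:
        action, pattern = command.split(None, 1)
        if action not in ("keep", "drop"):
            raise ValueError(f"unknown command: {command!r}")
        matched = {b for b in branches if branch_matches(b, pattern.strip())}
        if action == "keep":
            kept |= matched
        else:
            kept -= matched
    # Preserve the input order of the branches.
    return [b for b in branches if b in kept]


# Hypothetical branch names, for illustration only.
branches = [
    "recoTracks_generalTracks__RECO",
    "recoTracks_someLabel__RECO",
    "recoMuons_muons__RECO",
]

selected = select_branches(branches, ["keep *", "drop *_someLabel_*_*"])
print(selected)
```

With the commands from the configuration above, everything is kept except the branch whose label field is someLabel. Reversing the logic ('drop *' followed by 'keep *_someLabel_*_*') would keep only that branch under the same ordering assumption.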

I would presume Chris is suggesting a similar interface for the DQM output module you are working on, for the sake of consistency. There is a lot of code in the base classes that supports this interface and does most of the work for you. For example, the code below is used by the PoolOutputModule and relies on the base class functionality. Without having looked at this carefully or understood all the details, I would think you ought to be able to do something similar (not exactly the same) in your case. See this example in the file IOPool/Output/src/RootOutputFile.cc:

void RootOutputFile::fillBranches(BranchType const& branchType,
                                  OccurrenceForOutput const& occurrence,
                                  StoredProductProvenanceVector* productProvenanceVecPtr,
                                  ProductProvenanceRetriever const* provRetriever) {
  ...
  OutputItemList const& items = om_->selectedOutputItemList()[branchType];
  ...
  for (auto const& item : items) {
    ...
    BasicHandle result = occurrence.getByToken(item.token_, item.branchDescription_->unwrappedTypeID());
    product = result.wrapper();

Note that the base class code takes care of calling consumes automatically and fills the output item list for you. And then you can use getByToken to get the products. We usually try to avoid the getManyByType function if possible.

Note that DQMRootOutputModule already inherits from the base classes that provide this functionality. So this would not be adding new base classes, but using functionality in base classes that are already there. It would take me some time and effort to understand exactly how to make this work. I don't know a lot about the internals of DQMRootOutputModule.

schneiml commented 5 years ago

Thanks @wddgit. Since I don't know much about output modules (there seems to be no FW Guide page on how to write them, and there are not many examples to look at either), I don't follow much of the terminology there, but I'll have a look tomorrow.

Some differences to the PoolOutputModule that might be relevant:

wddgit commented 5 years ago

As far as I know, there is no documentation about writing output modules. It is a rare activity and each case seems to be different. The only way forward seems to be studying the OutputModule base class code, PoolOutputModule as an example, and the DQMRootOutputModule code.

Unless there are changes in the Framework code, I agree that getManyByType and GetterOfProducts are not going to work in an output module. So we are probably left with the options of using the base class functionality (preferred if it works) or developing something new that serves a similar purpose.

Note that it is a holiday here tomorrow so I likely won't reply to any additional questions until Friday. @Dr15Jones, let me know if you want me to stop my other work and make helping with this my first priority.

schneiml commented 5 years ago

Quick status update: