cms-jet / JetToolbox

Python framework for configuration of jet tools via the jet toolbox.
https://twiki.cern.ch/twiki/bin/viewauth/CMS/JetToolbox
7 stars 36 forks source link

Refactor for maintainability and consistency #59

Closed kpedro88 closed 6 years ago

kpedro88 commented 6 years ago

The Python programming language supports a feature called "variables". Using this feature, the program can store the result of a computation and reuse it, rather than repeating the computation every time it is needed. In this PR, I have employed this feature for the string concatenations used to create module names. This makes it less likely for future developers to forget which collection to use, or to apply the postfix, etc.

In the process, I found a few places where the postfix was not applied correctly. These have been fixed. I also added an option to print all the available module names in order (stored in a dictionary) so the user can see what is available (more easily than doing a full edmConfigDump).

I tested this PR by creating a test config that exercised all the available options, dumping it, and comparing with a clean version. The only differences appear because of the aforementioned postfix fixes.

I noticed a few potential inconsistencies:

  1. In some places, addJetCollection is called with the postFix provided as a separate parameter. In other places, the postFix is included in the labelName parameter. I think the module names should be unique either way, but maybe there should be another look at this.
  2. The BoostedJetMerger for CMSTopTag is given patJets as the jetSrc. In other places, it is given the selectedPatJets.
  3. Some substructure quantities are named as jetALGO+'PF'+PUMethod+postFix while others are named as jetALGO+PUMethod+postFix (without the 'PF'). I'm not sure why this is.

Some other suggestions for future improvement (these are all fairly easy to implement, but I wanted to get agreement on this PR first):

  1. Remove QJets: I'm not even sure what that is, but it's been commented out for 3 years.
  2. Convert "warnings" for improper options to exceptions. It's better to stop the code from running when the user specifies something unsupported, rather than potentially letting them think it worked when it didn't (if they miss the warning message).
  3. Add overall verbosity control in case the user wants to suppress all printouts.
  4. Fix bTagDiscriminators = None behavior to avoid the default list (was proposed in #47 with other changes that proved controversial; the other changes should no longer be necessary, as the huge CPU usage I noticed in 80X was caused by prefetching inefficiencies that are no longer present in 94X).
  5. Fix root file creation with tasks (address #51). Probably easiest just to add a switch that disables task association in case the user wants to do it themselves.
kpedro88 commented 6 years ago

@alefisico @rappoccio any comments on the various questions/suggestions?

alefisico commented 6 years ago

Hi @kpedro88 1) QJets was from the old jetToolbox (when it was inside CMSSW) so yeah I guess we can remove it 2) and 3) Yeah why not :) 4) and 5) true, it is something that we need to fix.