KIT-CMS / CROWN

C++ based ROOT Workflow for N-tuples (CROWN)
https://crown.readthedocs.io
MIT License
4 stars 21 forks source link

add exeption for misconfiguration of sampleypes #242

Closed harrypuuter closed 7 months ago

harrypuuter commented 7 months ago

Now, CROWN will throw an error during configuration, when a rule is configured to be applied to a nonexistent sampletype:

code_generation.exceptions.SampleRuleConfigurationError: Sampletype embedding,embedding_mc cannot be used in Rule ProducerRule - add [ProducerGroup: RenameJetsData] for ['embedding,embedding_mc'] in scopes ['global'] since the type is not defined. Available samples types are {'ttbar', 'embedding', 'embedding_mc', 'wjets', 'singletop', 'data', 'ggh_hbb', 'vbf_hbb', 'rem_hbb', 'dyjets', 'diboson', 'rem_htautau', 'ggh_htautau', 'vbf_htautau', 'electroweak_boson'}
CMake Error at CMakeLists.txt:338 (message):
  Code Generation Failed - Exiting !
harrypuuter commented 7 months ago

This should catch some errors, ofc things like

samples=[
            sample
            for sample in available_sample_types
            if sample not in ["data, embedding", "embedding_mc"]
        ],

are still not checked by this

a-monsch commented 7 months ago

This should catch some errors, ofc things like

samples=[
            sample
            for sample in available_sample_types
            if sample not in ["data, embedding", "embedding_mc"]
        ],

are still not checked by this

Would it make sense to add somewhere a check like searching for '("\s*\w+\s*,\s*\w+\s*")' to catch those obvious but from time to time hardly detectable mistakes?

harrypuuter commented 7 months ago

I don't want to add such checks to the framework, since regex checking a configuration is not something a framework should do. On the other hand, the case that is now checked is something well within the information given to CROWN by the configuration, this second way of setting up rules is not really intended.

What would make more sense would be to extend the capabilities of ProducerRules so the user can set the samples, where the rule should not be applied, something like

configuration.add_shift(
        SystematicShiftByQuantity(
            name="metUnclusteredEnUp",
            quantity_change={
                nanoAOD.MET_pt: "PuppiMET_ptUnclusteredUp",
                nanoAOD.MET_phi: "PuppiMET_phiUnclusteredUp",
            },
            scopes=["global"],
        ),
        exclude_samples=["data", "embedding", "embedding_mc"],
    )

which would then mean, we can validate the list again, and the Rule is automatically applied to all samples that are not excluded