scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0

feat: Pruning with logical AND instead of OR #2391

Open vaustrup opened 10 months ago

Description

As initially discussed in #2374, chaining the keywords of the pruning function with a logical AND rather than an OR would be very valuable: it would allow pruning, for example, modifiers only in a certain region (channel).

This PR introduces an additional keyword, mode, in the Workspace.prune function. It can be either logical_or (the default, matching the current behaviour) or logical_and, as suggested by @alexander-held in #2374. Any other value raises a ValueError.
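For illustration, the check on mode could look roughly like the sketch below. The helper name check_mode and the constant VALID_MODES are hypothetical and only serve this example; they are not taken from the pyhf code base or from this PR's diff.

```python
# Hypothetical names for illustration only (not pyhf API).
VALID_MODES = ("logical_or", "logical_and")


def check_mode(mode="logical_or"):
    """Return the mode unchanged if it is supported, else raise ValueError."""
    if mode not in VALID_MODES:
        raise ValueError(
            f"Pruning mode must be one of {VALID_MODES}, got {mode!r}."
        )
    return mode
```

Defaulting to logical_or keeps existing calls to prune working without changes.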

Workspace.prune is extended to call Workspace._prune_and_rename separately for each of the keywords (measurements, channels, samples, ...) when mode == 'logical_or', which results in the same behaviour as before. When mode == 'logical_and', the logic is more involved: to achieve a logical AND between keywords, Workspace._prune_and_rename itself is modified.
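To make the intended difference between the two modes concrete, here is a toy sketch that works on a simplified HistFactory-like dictionary instead of a real Workspace. The function prune_sketch and the reduced spec layout (modifiers carrying only a name) are assumptions made for this example; this is not how Workspace._prune_and_rename is implemented, and it only covers the channels and modifiers keywords.

```python
import copy


def prune_sketch(spec, channels=(), modifiers=(), mode="logical_or"):
    """Toy pruning on a simplified spec dict (hypothetical helper, not pyhf)."""
    if mode not in ("logical_or", "logical_and"):
        raise ValueError(f"Unknown mode: {mode!r}")
    pruned = copy.deepcopy(spec)  # leave the input untouched
    if mode == "logical_or":
        # Each keyword acts on its own: listed channels are dropped entirely,
        # and listed modifiers are dropped from every remaining channel.
        pruned["channels"] = [
            c for c in pruned["channels"] if c["name"] not in channels
        ]
        affected = pruned["channels"]
    else:
        # Keywords are combined: listed modifiers are dropped only inside
        # the listed channels, and no channel is removed.
        affected = [c for c in pruned["channels"] if c["name"] in channels]
    for channel in affected:
        for sample in channel["samples"]:
            sample["modifiers"] = [
                m for m in sample["modifiers"] if m["name"] not in modifiers
            ]
    return pruned


def make_sample():
    return {"name": "signal", "modifiers": [{"name": "lumi"}, {"name": "mu"}]}


spec = {
    "channels": [
        {"name": "channel1", "samples": [make_sample()]},
        {"name": "channel3", "samples": [make_sample()]},
    ]
}

and_result = prune_sketch(
    spec, channels=["channel1"], modifiers=["lumi"], mode="logical_and"
)
or_result = prune_sketch(
    spec, channels=["channel1"], modifiers=["lumi"], mode="logical_or"
)
```

With logical_and, both channels survive and lumi is removed only from channel1; with logical_or, channel1 is removed entirely and lumi is removed from channel3 as well.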

Before proceeding, we need to agree on the desired behaviour since additional logic may be required. Consider the following cases:

# should prune the 'lumi' parameter in
# channels 'channel1' and 'channel2'
# but keep it in measurements
# if it is used in other channels still
ws.prune(
    channels=["channel1", "channel2"],
    modifiers=["lumi"],
    mode="logical_and",
)

# should prune the 'lumi' parameter
# in measurement 'measurement1'
# but keep it for all regions and samples
# if it is used in other measurements still
ws.prune(
    modifiers=["lumi"],
    measurements=["measurement1"],
    mode="logical_and",
)

# this should fail because we cannot prune
# samples and channels for certain measurements
ws.prune(
    modifiers=["mod1", "mod2"],
    samples=["process1", "process2"],
    measurements=["measurement1"],
    channels=["channel1", "channel2"],
    mode="logical_and",
)

Should the logical_and option also be added to Workspace.rename, for example to allow renaming modifiers only for a certain sample?
