cms-nanoAOD / correctionlib

A generic correction library
https://cms-nanoaod.github.io/correctionlib/
BSD 3-Clause "New" or "Revised" License
16 stars 22 forks source link

Unable to call evaluate() on jagged arrays #125

Open alesaggio opened 2 years ago

alesaggio commented 2 years ago

Dear experts,

I am trying to use the evaluate() method to evaluate btv scale factors. I work with jagged arrays and this method does not seem to work in this scenario. Below a minimal example:

import correctionlib
import awkward as ak

btvjson = correctionlib.CorrectionSet.from_file("/cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2018_UL/btagging.json.gz")

pt = ak.Array([[50., 60., 120.],
               [40., 70.],
               [55., 76., 110., 120.]])
abs_eta = ak.Array([[1.2, 2.4, 2.4],
                    [0.9, 2.1],
                    [2.3, 2.4, 2.2, 0.8]])
flav = ak.Array([[0, 4, 5],
                 [4, 4],
                 [5, 4, 5, 0]])
discr = ak.Array([[0.5, 0.6, 0.4],
                  [0.9, 0.8],
                  [0.2, 0.8, 0.9, 0.7]])
jet_sf = btvjson["deepJet_shape"].evaluate("central", flav, abs_eta, pt, discr)

This fails with:

return self._base.evaluate(*args)  # type: ignore
RuntimeError: Unable to cast Python instance to C++ type (compile in debug mode for details)

Calling the method in a loop over the events is not an option when the number of events (or jets) is too high.

Is there a solution for this? Thank you in advance for any help.

Cheers, Alessia

nsmith- commented 2 years ago

Hi,

Currently correctionlib does not accept jagged arrays even when they are broadcastable. This is a planned feature and will be available eventually. For now you can flatten and unflatten as done in this example: https://gist.github.com/nsmith-/8d3d41aaffda92148ebc7bfcc5c827f5 (in particular cell 7)

yimuchen commented 1 year ago

Not entirely self-contained within the original question, but if you are using coffea to get jagged arrays, coffea has a coffea.lookup_tools.correctionlib_wrappers [1] that you can use with something like:

lookup = correctionlib_wrapper.correctionlib_wrapper(
    correctionlib.CorrectionSet.from_file(sf_file)['deepJet_shape'])
lookup('central', jets.hadronFlavor, np.abs(jets.eta), jets.pt, bscore)

[1] https://github.com/CoffeaTeam/coffea/blob/master/coffea/lookup_tools/correctionlib_wrapper.py