nils-braun / b2luigi

Task scheduling and batch running for basf2 jobs made simple
GNU General Public License v3.0

Include global tags in pickled path object #77

Closed philiptgrace closed 3 years ago

philiptgrace commented 3 years ago

Partially addresses #35 by saving the global tags list to the pickle file.
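For readers who have not looked at the diff, the idea can be sketched roughly as below. This is not the code from this PR, only a minimal stand-alone illustration with plain dicts: `write_state_to_file`, `read_globaltags_from_file` and the `"modules"` key are hypothetical names, and a real path serialization would come from basf2 rather than a literal dict. Only the `"globaltags"` key matches what the test script in this thread reads back.

```python
import pickle
import tempfile
from pathlib import Path


def write_state_to_file(serialized_path, globaltags, file_path):
    """Store a serialized path dict plus the global tag list in one pickle file."""
    payload = dict(serialized_path)  # copy, so the caller's dict is untouched
    payload["globaltags"] = list(globaltags)  # a plain list of strings pickles trivially
    with open(file_path, "wb") as pickle_file:
        pickle.dump(payload, pickle_file)


def read_globaltags_from_file(file_path):
    """Return the stored global tags, or an empty list for older pickle files."""
    with open(file_path, "rb") as pickle_file:
        payload = pickle.load(pickle_file)
    return payload.get("globaltags", [])


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_file = Path(tmp_dir) / "demo.pickle"
    # "modules" is a placeholder for whatever the path serialization produces.
    write_state_to_file({"modules": []}, ["tag_a", "tag_b"], demo_file)
    restored_tags = read_globaltags_from_file(demo_file)
```

Falling back to an empty list on the reading side keeps pickle files written before this change usable.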

philiptgrace commented 3 years ago

Tested this works as expected using the following two scripts.

## 01-write-pickle.py

import basf2 as b2
import modularAnalysis as ma

from b2luigi.batch.processes.gbasf2_utils.pickle_utils import (
    write_path_and_aliases_to_file,
)

files = ["/group/belle2/dataprod/MC/SkimTraining/proc11_exp10.mdst.root"]
pickle_file_path = "load_gammas.pickle"

path = b2.Path()
ma.inputMdstList("default", files, path=path)

ma.fillParticleList("gamma:NoSelections", "", True, path)
ma.variablesToNtuple(
    "gamma:NoSelections", ["p", "E"], filename="load_gammas.ntuple.root", path=path
)
write_path_and_aliases_to_file(path, pickle_file_path)

print(f"Global tags: {b2.conditions.globaltags}")

## 02-read-pickle.py

import pickle

import basf2 as b2
from basf2 import pickle_path as b2pp

def get_global_tags_from_file(file_path):
    """
    Extract list of global tags from pickle file, and update the list of global tags.
    """
    with open(file_path, "rb") as pickle_file:
        serialized = pickle.load(pickle_file)
    try:
        b2.conditions.globaltags += serialized["globaltags"]
    except KeyError:
        pass

pickle_file_path = "load_gammas.pickle"

get_global_tags_from_file(pickle_file_path)
print(f"Global tags: {b2.conditions.globaltags}")
path = b2pp.get_path_from_file(pickle_file_path)

b2.print_path(path)
b2.process(path, max_event=100)

meliache commented 3 years ago

Looks good to me :+1:

The only issue I found is the one in my review comment above. I'm not sure whether we should document this elsewhere: I don't explicitly say anywhere that globaltags are not supported; it is only implicit in the caveats section of the documentation, where I write:

  • It can be used only for picklable basf2 paths, as it stores the path created by create_path in a python pickle file and runs that on the grid.

philiptgrace commented 3 years ago

This PR only implements one part of the ConditionsConfiguration object. If the BASF2StateRecorder solution in #35 doesn't work out, then we could capture more of the state by making the entire ConditionsConfiguration object picklable. From some quick testing, it seems we'd need to make a PR to basf2 to make this work.
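Short of making the object itself picklable, a different and simpler technique would be to snapshot selected attributes into a plain dict and restore them on the other side. The sketch below only illustrates that pattern: `FakeConditions` is a stand-in class, not the basf2 object, and the attribute names in `ATTRIBUTES_TO_SAVE` are assumptions chosen for the example.

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class FakeConditions:
    """Stand-in for a configuration object; the attribute names are hypothetical."""

    globaltags: list = field(default_factory=list)
    testing_payloads: list = field(default_factory=list)


# Hypothetical list of attributes worth carrying over to the grid job.
ATTRIBUTES_TO_SAVE = ("globaltags", "testing_payloads")


def snapshot(conditions):
    """Copy the chosen attributes into a plain dict that pickle can handle."""
    return {name: list(getattr(conditions, name)) for name in ATTRIBUTES_TO_SAVE}


def restore(conditions, state):
    """Write a snapshot back onto a fresh configuration object."""
    for name in ATTRIBUTES_TO_SAVE:
        if name in state:  # tolerate snapshots that lack newer attributes
            setattr(conditions, name, list(state[name]))


local = FakeConditions(globaltags=["tag_a"], testing_payloads=["payloads.txt"])
blob = pickle.dumps(snapshot(local))

remote = FakeConditions()
restore(remote, pickle.loads(blob))
```

The trade-off is that every new attribute has to be added to the list by hand, which is the incremental approach this PR already takes for the global tags.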

meliache commented 3 years ago

@philiptgrace Thanks for the quick fix of the naming/docs. If the checks from our new CI succeed, I'll merge this.

> If the BASF2StateRecorder solution in #35 doesn't work out, then we could capture more of the state by making the entire ConditionsConfiguration object picklable. From some quick testing, it seems we'd need to make a PR to basf2 to make this work.

Sounds like a good idea, but something for a separate PR. I remember that I didn't get the solution with the BASF2StateRecorder from pickable_basf2 to work for our use case, but it might just be me :man_shrugging:. I gave up on that, but feel free to try.

If you figure out how to pickle the entire ConditionsConfiguration I'd be happy. We can always try to add more and more code to pickle different parts of the basf2 state, but if possible I'd try to keep things simple (KISS).

I'm still not sure whether it would be simpler to let the user provide an additional Python file that is also sent to the grid via gbasf2 -f and can be imported from the steering file wrapper (the jinja template). The user could then have a file grid_basf2_setup.py with content like:

import basf2
basf2.conditions = ...
...

And then from the steering file wrapper we could run

try:
    import grid_basf2_setup
except ModuleNotFoundError:
    pass

I'm not sure whether this would work, and even if it does, it might lead to various problems that I can't predict now, so just pickling parts of the basf2 state separately seemed easier. But if we ever really need something that can't be added in any other way, this or something similar might be an approach.
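A slightly more defensive variant of the optional import could look like the sketch below. It is untested on the grid and only demonstrates the import pattern locally: `import_optional_setup` and `demo_grid_setup` are hypothetical names, and a temporary directory stands in for the directory the wrapper would run in.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_optional_setup(module_name="grid_basf2_setup"):
    """Import the user's setup module if it was shipped; return None otherwise."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as error:
        # Only swallow the error when the setup module itself is missing,
        # not when the setup module fails on one of its own imports.
        if error.name != module_name:
            raise
        return None


# Case 1: no setup file was shipped, so the wrapper just carries on.
missing = import_optional_setup("grid_setup_that_does_not_exist")

# Case 2: a setup file is available (simulated here with a temporary directory).
with tempfile.TemporaryDirectory() as tmp_dir:
    Path(tmp_dir, "demo_grid_setup.py").write_text("GLOBALTAGS = ['tag_a']\n")
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()  # make sure the freshly written file is seen
    try:
        found = import_optional_setup("demo_grid_setup")
    finally:
        sys.path.remove(tmp_dir)
```

Checking `error.name` avoids one of the harder-to-predict problems: with a bare `except ModuleNotFoundError: pass`, a typo in an import inside the user's setup file would be silently ignored and the job would run without the intended configuration.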