XENON1T / hax

Handy Analysis for XENON (reduce processed data)

Lichens as preselections, delayed lichen evaluation #229

Closed JelleAalbers closed 6 years ago

JelleAalbers commented 6 years ago

This improves support for lichen cuts from lax:

[1] Apply lichens as preselections. Since preselections are applied per dataset as they are loaded, this should make loading datasets with heavy lichen-based cuts RAM-friendlier.

For example:

hax.minitrees.load(dsets, preselection=['cs1 < 200', 'FiducialCylinder1T'])

will apply the fiducial volume cut as well as the usual cs1 < 200 preselection while loading.
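
To illustrate why per-dataset preselection helps with memory, here is a minimal sketch in plain pandas. It is not hax's implementation: `load_with_preselection` and the toy loaders are hypothetical, and the string `r < 40` merely stands in for a lichen-based fiducial cut.

```python
import pandas as pd


def load_with_preselection(loaders, preselection):
    """Load datasets one at a time, filtering each before concatenating.

    `loaders` is a list of zero-argument callables that each return one
    dataset as a DataFrame (hypothetical stand-ins for minitree loading).
    """
    pieces = []
    for load in loaders:
        df = load()  # only one unfiltered dataset is in memory at a time
        for selection in preselection:
            df = df.query(selection)  # drop rejected events right away
        pieces.append(df)
    return pd.concat(pieces, ignore_index=True)


# Two toy "datasets"; the column names are illustrative only.
loaders = [
    lambda: pd.DataFrame({"cs1": [50.0, 250.0, 120.0], "r": [10.0, 20.0, 45.0]}),
    lambda: pd.DataFrame({"cs1": [300.0, 10.0], "r": [5.0, 30.0]}),
]

result = load_with_preselection(loaders, ["cs1 < 200", "r < 40"])
print(result)
```

Because each dataset is reduced before it is appended, peak memory scales with the largest single dataset plus the surviving events, rather than with the sum of all unfiltered datasets.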

By default the lichens are drawn from the default lichen file. To change this, specify the lichen file followed by a colon and the preselection name, e.g. sciencerun0:FiducialCylinder1T.

[2] Apply lichens to a delayed (dask) dataframe. This was triggered by a question from @dcichon. Nothing special is needed for this: just use hax.cuts.apply_lichen on the delayed dataframe.

These changes also work together, so you can load a delayed dataset with lichen preselections.

[3] Switch the default lichen file to postsr1, see https://github.com/XENON1T/lax/pull/149. This will not impact frozen analyses for the last paper as long as these are run in their appropriate frozen hax+lax environment.

JelleAalbers commented 6 years ago

OK, I now use both the presence of a capitalized first character AND the absence of spaces to distinguish a lichen name from a preselection.
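
A rough sketch of that heuristic, for reviewers who want to see the rule spelled out. This is not the code in the PR: `parse_selection` is a hypothetical helper, and checking the capital letter on the part after the colon (so that `sciencerun0:FiducialCylinder1T` still counts as a lichen) is an assumption.

```python
DEFAULT_LICHEN_FILE = "postsr1"  # default lichen file named in this PR


def parse_selection(selection, default_file=DEFAULT_LICHEN_FILE):
    """Classify a preselection entry as a lichen name or a query string.

    Returns ("lichen", lichen_file, lichen_name) or ("query", None, selection).
    """
    # Split an optional "lichenfile:" prefix off the name.
    file_name, sep, name = selection.rpartition(":")
    if not sep:
        file_name = default_file
    # Lichen: capitalized first character AND no spaces anywhere.
    if name and name[0].isupper() and " " not in selection:
        return ("lichen", file_name, name)
    return ("query", None, selection)


for entry in ["cs1 < 200", "FiducialCylinder1T", "sciencerun0:FiducialCylinder1T"]:
    print(entry, "->", parse_selection(entry))
```

Note that a selection string such as `Cs1<200` (capitalized, no spaces) would be misclassified as a lichen under this rule.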

Anyone willing to review this?

tunnell commented 6 years ago

Why not just add the word Cut? I think that's how it's stored in DataFrame anyways.

JelleAalbers commented 6 years ago

Thanks! Hax actually passes a copy of the dataframe to lax to avoid getting those (and other) columns back. They're great for studying cuts, but when you're applying them you don't really want twelve columns that just have all False. Using Cut or some other identifier added to the lichen name would certainly work, though it would be inconsistent with how apply_lichen works at the moment. I'm guessing nobody will write a selection string that starts with a capital letter and has no spaces, and if they do, they will just get an error.
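
The copy trick mentioned above can be sketched in plain pandas. Everything here is illustrative: `toy_lichen` is a made-up stand-in for a lax lichen, and the `Cut`-prefixed column names are an assumption based on the comment above, not a statement about lax's actual output.

```python
import pandas as pd


def toy_lichen(df):
    """Hypothetical lichen: adds one boolean column per sub-cut, in place."""
    df["CutS1Low"] = df["cs1"] > 3
    df["CutRadius"] = df["r"] < 40
    df["CutToy"] = df["CutS1Low"] & df["CutRadius"]
    return df


def apply_cut_without_extra_columns(df, lichen, cut_column):
    """Evaluate the lichen on a copy, then use only the final mask."""
    processed = lichen(df.copy())   # bookkeeping columns land on the copy
    return df[processed[cut_column]]  # original columns only, selected rows


events = pd.DataFrame({"cs1": [1.0, 10.0, 20.0], "r": [10.0, 50.0, 20.0]})
selected = apply_cut_without_extra_columns(events, toy_lichen, "CutToy")
print(selected)
```

The caller's dataframe never gains the per-cut columns; they exist only on the temporary copy, at the cost of one extra copy of the data while the cut is evaluated.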