HEP-KBFI / stpol

Single top polarisation

step3 skimmer/projector #30

Open jpata opened 11 years ago

jpata commented 11 years ago

Make a standalone program that takes as input a step3 .root file containing an arbitrary number of TTrees under the directory "trees/" (this may be generalized) and a cut string, and writes only the events that pass the cut into a separate file. Every TTree must be reduced consistently, so that an event that fails the cut is dropped from every tree.

Example:

projectCut infile.root outfile_2J1T.root --cut "n_jets==2 && n_tags==1"

  • infile.root: trees/Events => 10k entries, trees/WJets_weights => 10k entries, trees/MVA => 10k entries
  • outfile_2J1T.root: trees/Events => 123 entries, trees/WJets_weights => 123 entries, trees/MVA => 123 entries
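
The requirement that every tree is reduced by the same cut can be sketched in plain Python, without ROOT. This is only an illustration of the logic, not the proposed projectCut program: each "tree" is a list of per-event dicts, the trees are assumed to be aligned by entry index, and the names `passing_entries`, `project` and `cut_2j1t`, as well as the toy data, are hypothetical.

```python
# Pure-Python stand-in for the projection logic (no ROOT involved).
# Assumption: all trees are aligned, i.e. entry i in every tree belongs
# to the same event.

def passing_entries(events, cut):
    """Return the indices of the entries for which cut(event) is true."""
    return [i for i, event in enumerate(events) if cut(event)]


def project(trees, main, cut):
    """Reduce every tree to the entries that pass `cut` on trees[main]."""
    n_main = len(trees[main])
    for name, entries in trees.items():
        if len(entries) != n_main:
            raise ValueError(f"tree {name!r} is not aligned with {main!r}")
    keep = passing_entries(trees[main], cut)
    return {name: [entries[i] for i in keep] for name, entries in trees.items()}


# Hypothetical toy input: 12 aligned entries in each of three trees.
trees = {
    "Events": [{"n_jets": 2 + (i % 2), "n_tags": i % 3} for i in range(12)],
    "WJets_weights": [{"weight_index": i} for i in range(12)],
    "MVA": [{"mva_index": i} for i in range(12)],
}


def cut_2j1t(event):
    """Python equivalent of the cut string "n_jets==2 && n_tags==1"."""
    return event["n_jets"] == 2 and event["n_tags"] == 1


reduced = project(trees, "Events", cut_2j1t)
for name, entries in reduced.items():
    print(name, len(entries))
```

The cut is evaluated only on the main tree, and the resulting index list is applied to all trees, so every output tree ends up with the same number of entries.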

toruonu commented 11 years ago

OK, I think this needs some background checking on RootTalk, because the methods ROOT provides for this (CopyTree etc.) keep the result tied to the original TTree. The only approach I know of is to clone the tree structure into a new file without data and refill it while looping over the original trees.

However, that raises the question of what to do when we are dealing with friend trees and want to filter those too, without merging the whole structure into the main Events tree (though that could be an option).

toruonu commented 11 years ago

Interesting idea from Philippe Canal:

For this purpose, you can use a TEntryList, which can be created by TTree::Draw itself. You can create the TEntryList to contain the entries that pass your cut and then call TTree::SetEntryList. After this call, TTree::Draw will restrict the reading and plotting to just those entries.

We should do some performance testing, but in this case we can run TTree::Draw once per cut to create the entry list and save it in some file (probably a pickle or something similar). Then for plotting we can just use the stored entry list. We'll have to see.
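
A minimal sketch of the "one entry list per cut, saved in a pickle" idea, again in plain Python with no ROOT calls. The function `load_or_build_entry_list`, the file name `entry_lists.pkl`, the `cuts` lookup table and the toy events are all hypothetical; in a real setup the `build` step would presumably be the TTree::Draw call that produces the TEntryList.

```python
import os
import pickle
import tempfile


def load_or_build_entry_list(cache_path, cut_string, build):
    """Return the cached entry list for cut_string, building it on a miss."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            cache = pickle.load(f)
    if cut_string not in cache:
        # Cache miss: run the (expensive) selection once and store the result.
        cache[cut_string] = build(cut_string)
        with open(cache_path, "wb") as f:
            pickle.dump(cache, f)
    return cache[cut_string]


# Hypothetical toy events and a lookup from cut string to a Python predicate.
events = [{"n_jets": 2 + (i % 2), "n_tags": i % 3} for i in range(12)]
cuts = {
    "n_jets==2 && n_tags==1": lambda e: e["n_jets"] == 2 and e["n_tags"] == 1,
}
calls = []  # records how often the selection actually runs


def build(cut_string):
    calls.append(cut_string)
    return [i for i, e in enumerate(events) if cuts[cut_string](e)]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "entry_lists.pkl")
    first = load_or_build_entry_list(path, "n_jets==2 && n_tags==1", build)
    second = load_or_build_entry_list(path, "n_jets==2 && n_tags==1", build)

print(first, second, len(calls))
```

The second lookup is served from the pickle, so the selection runs only once. A cached list of this kind is only meaningful for the exact input file it was built from, so a real cache would also need to be keyed on the file.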

jpata commented 11 years ago

Interesting and also old :). That's what I'm already doing in the new histogrammer, just on the fly. So if I need 100 histograms with a cut X, I make the TEntryList corresponding to cut X and then the 100 histograms. It may be beneficial to store a couple of TEntryLists inside the TFile itself, but one would then need some kind of logic inside sample.py to load and set those entry lists manually. It addresses the question of making a predefined set of operations on the files faster, but not the problem of making the files fit in memory or on a laptop.

jpata commented 11 years ago

See Issue #31 for the proposal.

jpata commented 10 years ago

No longer relevant.