elaird / supy

analyze events stored in TTrees in parallel

example data-sample with multiple sub-samples #145

Closed gerbaudo closed 12 years ago

gerbaudo commented 12 years ago

Hi Burt and Ted,

I cannot work out how to do the following: I would like to specify one data sample that is composed of several sub-samples, each of them defined by one call to SampleHolder.add(). I looked for examples in supy and susycaf, but I can only find the SampleHolder.addInclusiveGroup() function, which requires a 'ptHatMin' parameter and seems to be meant for MC only. Is there anything equivalent for data? If not, I can implement this feature.

Thanks,

davide

betchart commented 12 years ago

Hi Davide,

Currently there is no structure for specifying samples, except in the case you found, where an HT > 50 sample needs to be cut at HT 100 in order to convert it into a 50 < HT < 100 sample, so that the HT > 100 sample can also be used.
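The inclusive-group idea above can be sketched in plain Python. This is only an illustration of the bookkeeping, not supy's implementation: the function names `exclusive_windows` and `select_sample` are hypothetical, and the treatment of events exactly on a boundary is an assumption.

```python
def exclusive_windows(thresholds):
    """Turn inclusive lower thresholds (e.g. HT > 50, HT > 100) into
    non-overlapping (low, high) windows; the last window is open-ended."""
    ordered = sorted(thresholds)
    uppers = ordered[1:] + [None]
    return list(zip(ordered, uppers))


def select_sample(ht, windows):
    """Return the window an event with this HT belongs to, or None.
    Assumption: an event exactly at an upper edge stays in the lower window."""
    for low, high in windows:
        if ht > low and (high is None or ht <= high):
            return (low, high)
    return None


# Two inclusive samples, HT > 50 and HT > 100, become two exclusive windows.
windows = exclusive_windows([50, 100])
```

With these windows, each event is counted in exactly one sample, so the two inclusive samples can be used together without double counting.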

Instead, we simply merge the samples in the organizer after running. If you've got something in mind for a problem as yet unsolved, I'd be happy to discuss it.

Burt

gerbaudo commented 12 years ago

Hi Burt, thanks a lot. Can you point me to an example where you merge several samples in the organizer, please? What I was trying to do is add several data runs with SampleHolder.add() (adding them one by one so that I can process a single run if needed), and have the option of using a "super-sample" that is just the sum of all these samples.

Here is an example of what I tried: https://github.com/davidegerbaudo/supy-d3pdtrig/blob/master/analyses/fatJetTurnOn.py#L115 (and then L139 with SampleHolder.specify)

Thanks!

davide

betchart commented 12 years ago

Hi Davide,

For example, see: https://github.com/betchart/susycaf/blob/master/analyses/topAsymm.py#L625 There are several calls to mergeSamples().

I guess your other option is to provide a filesCommand in the sample specification such that all the files you want are simply listed together. This option is less nuanced, since it implies that events in all files are equivalent; you won't need to scale some by a different factor than others in the organizer.
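The difference between the two options can be sketched in plain Python. This is not supy's mergeSamples() (see the linked topAsymm.py for real calls); `merge_samples`, the run names, and the list-of-bin-contents histogram format are all hypothetical, chosen only to show why merging after running keeps the option of per-sample scale factors.

```python
def merge_samples(histograms, sources, scales=None):
    """Sum the bin contents of the named source samples.
    Each source may be scaled by its own factor (default 1.0)."""
    scales = scales or {}
    n_bins = len(histograms[sources[0]])
    merged = [0.0] * n_bins
    for name in sources:
        factor = scales.get(name, 1.0)
        for i, content in enumerate(histograms[name]):
            merged[i] += factor * content
    return merged


# Hypothetical per-run histograms, one list of bin contents per sample.
per_run = {"run_A": [1.0, 2.0, 3.0], "run_B": [4.0, 5.0, 6.0]}

# Plain sum: equivalent to having listed all files in one sample.
all_data = merge_samples(per_run, ["run_A", "run_B"])

# Per-sample scaling: only possible when the samples were kept separate.
rescaled = merge_samples(per_run, ["run_A", "run_B"], scales={"run_B": 0.5})
```

Keeping one sample per run and merging afterwards also leaves each run available for processing on its own.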

gerbaudo commented 12 years ago

Hi Burt,

Thanks a ton! I think that organizer.mergeSamples is actually perfect for this. Once more, supy is really well thought out :-) Thank you, davide