elaird / supy

analyze events stored in TTrees in parallel

Sampleholder add tuple #105

Closed gerbaudo closed 11 years ago

gerbaudo commented 12 years ago

I often get confused when passing all the arguments of SampleHolder.add as a string that is then evaluated. This change allows passing filesCommand=(function, dictOfParameters) instead.

I tested it with eos, and it works; if you agree with this change, we should define in 'sites' a function returning a dict with default parameters for the other storage commands (see supy.sites.eosPars). Example usage:

exampleDict.add("sampleName", (supy.utils.fileListFromEos, dict(supy.sites.eosPars().items() + {'location': "/eos/atlas/blah"}.items())), lumi = 1.0)

One bit that I don't like (but I couldn't come up with a better solution) is having to concatenate the dict of default parameters with the ones in the analysis file:

dict( dict1.items() + dict2.items() )
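For readers following along, the tuple form proposed above could be resolved roughly as sketched here. This is not supy's implementation: `resolve_files_command` and `fake_file_list` are hypothetical stand-ins, and the string branch only assumes that the legacy string is passed to `eval`, as the description suggests.

```python
def fake_file_list(location, pattern="*.root"):
    """Hypothetical stand-in for a storage helper such as fileListFromEos."""
    return ["%s/%s" % (location, pattern)]


def resolve_files_command(files_command):
    """Return the file list for either form of filesCommand (hypothetical helper)."""
    if isinstance(files_command, tuple):
        # New form: (function, dictOfParameters), called with keyword arguments.
        function, parameters = files_command
        return function(**parameters)
    # Legacy form: the whole call is written as a string and evaluated.
    return eval(files_command)


from_tuple = resolve_files_command((fake_file_list, {"location": "/some/dir"}))
from_string = resolve_files_command('fake_file_list(location="/some/dir")')
```

Both calls return the same list; the tuple form just avoids assembling a function call inside a string.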

Thanks,

davide

betchart commented 12 years ago

Hi Davide,

This seems like a nice feature. I agree that the function in a string is not very cool. One thing we had trouble with in the past was that function addresses can not be passed when trying to do parallel processing. This may be why we were passing the whole function as a string in this case, and it needs to be tested. If it turns out to be a problem, it could still be nice to pass the function name as a string, and the parameters as a dict.
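The concern about parallel processing can be checked with `pickle`, which Python's `multiprocessing` uses to ship objects between processes. The sketch below is only an illustration under that assumption; whether this was the actual reason supy passes strings is not established in this thread. `can_pickle` and `resolve_by_name` are hypothetical helpers, and `os.path.basename` stands in for a real storage function.

```python
import os.path
import pickle


def can_pickle(obj):
    """Report whether obj survives pickle.dumps (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def resolve_by_name(module, function_name):
    """Look a function up by name on the receiving side (hypothetical helper)."""
    return getattr(module, function_name)


# Importable module-level functions are pickled by reference to their name.
importable = can_pickle(os.path.basename)
# Lambdas have no importable name, so pickling them fails.
anonymous = can_pickle(lambda path: path)

# Fallback suggested above: send the function name as a string plus a dict.
name, parameters = ("basename", {"p": "/some/dir/file.root"})
result = resolve_by_name(os.path, name)(**parameters)
```

If the functions passed in `filesCommand` are always importable module-level functions, the tuple form should survive pickling; the name-plus-dict fallback only needs strings and dicts to cross the process boundary.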

Regarding your complaint about merging two dictionaries, your solution is the recommended one. I have two other options:

  1. We can create a function in utils, def updated(defaults,updates) : return defaults.update(updates) or defaults
  2. You can use a cpython 2.* implementation detail to write dict(defaults,**updates)
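A sketch of both options, with made-up parameter names. Note that the one-liner in option 1 updates `defaults` in place before returning it, so the variant below copies first. Note also that `dict1.items() + dict2.items()` relies on Python 2, where `items()` returns lists.

```python
def updated(defaults, updates):
    """Return a new dict with defaults overridden by updates; inputs are untouched."""
    merged = dict(defaults)  # shallow copy, so the caller's defaults survive
    merged.update(updates)
    return merged


# Hypothetical default parameters and per-sample overrides.
defaults = {"xrootd": True, "location": None}
overrides = {"location": "/eos/atlas/blah"}

merged = updated(defaults, overrides)

# Option 2: keyword expansion. In Python 3 this requires every key to be a string.
also_merged = dict(defaults, **overrides)
```

Both forms leave `defaults` unchanged, which matters if the same defaults dict is shared between several samples.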

Burt

betchart commented 11 years ago

Too old, closing. Reopen to request again.

Burt