arogozhnikov / hep_ml

Machine Learning for High Energy Physics.
https://arogozhnikov.github.io/hep_ml/

Using sWeights with GBReweighter #60

Open jcob95 opened 5 years ago

jcob95 commented 5 years ago

Hi,

I've noticed some issues with very large weights when using GBReweighter. I am trying to reweight data to look like some toy data I have. I have signal sWeights for the data but not for the toy data, so I trained the BDT passing only the original_weight argument and not target_weight. After reading previous issues, I've tried to ensure that my data and toy distributions overlap. I'm using 800k toy and 500k data events, which I think should be enough. Do I need sWeights for both datasets? Without the sWeights the reweighting is much better, but how should I account for background if I don't use sWeights?

Thanks

[Image: training with sWeights]

[Image: training without sWeights]

arogozhnikov commented 5 years ago

Hi, the reweighter supports sWeights. They can be provided for the original sample, the target sample, both, or neither; choose which to provide according to the problem at hand.

From your description, it sounds like the sWeights change the distribution so much that its effective support becomes smaller than that of the target (there are regions where the total 'weight' is too small or negative). If that's not the case, then try a simpler model (i.e. the default parameters).
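
One way to check for the situation described above is to histogram a single feature and look for bins where the target has events but the sWeighted original sum is zero or negative. The snippet below is a minimal sketch on synthetic data, using NumPy only. The helpers `kish_effective_size` and `uncovered_bins` are hypothetical names introduced here, not part of hep_ml, and the toy sWeights are invented for illustration. It is a one-dimensional check, so it can miss problems that only appear in combinations of features.

```python
import numpy as np


def kish_effective_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)


def uncovered_bins(original, original_weight, target, edges, min_weight=0.0):
    """Flag bins where the target is populated but the weighted original is not.

    Returns a boolean mask with one entry per bin. A bin is flagged when it
    contains target events while the summed original weight is <= min_weight.
    """
    orig_sum, _ = np.histogram(original, bins=edges, weights=original_weight)
    target_count, _ = np.histogram(target, bins=edges)
    return (target_count > 0) & (orig_sum <= min_weight)


# Synthetic illustration (assumed data, not from the issue):
rng = np.random.default_rng(0)
original = rng.uniform(0.0, 1.0, size=5000)
target = rng.uniform(0.0, 1.0, size=5000)

# Invented sWeight-like weights: positive below 0.7, negative above,
# so the weighted original effectively has no support above 0.7.
sweights = np.where(original < 0.7, 1.0, -0.2)

edges = np.linspace(0.0, 1.0, 11)  # 10 equal-width bins on [0, 1]
mask = uncovered_bins(original, sweights, target, edges)
n_eff = kish_effective_size(sweights)

print("flagged bins (lower edges):", edges[:-1][mask])
print("effective sample size:", n_eff, "of", len(sweights))
```

If bins are flagged for a feature used in training, the reweighter is being asked to produce target events from a region where the weighted original sample has none, which is consistent with very large predicted weights. The effective sample size gives a rough sense of how much statistical power remains after applying the sWeights.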