arogozhnikov / hep_ml

Machine Learning for High Energy Physics.
https://arogozhnikov.github.io/hep_ml/

sPlot returns NAN sWeights #58

Closed marthaisabelhilton closed 5 years ago

marthaisabelhilton commented 5 years ago

I am currently trying to use hep_ml's sPlot and am getting NaN for some sWeights. I am using a sample of ~1.4M events, and this seems to happen after event ~200k. I have checked the signal and background probabilities, and these look reasonable. I have also checked the sWeights before event ~200k, and these also look reasonable. The sWeighted signal and background distributions for a relevant pT variable also look OK.

So I am wondering: is there some reason the sWeights would not be calculated correctly after a certain event? Any help would be much appreciated.

alexpearce commented 5 years ago

What fraction of events get a NaN value? You should inspect the input data for the entries that get a NaN weight, as it might be that, for example, you're feeding in the log of some column which is negative for a handful of events.

arogozhnikov commented 5 years ago

I agree with @alexpearce

I'd expect a problem with the input. Please check the range of both the input probabilities and the weights, and check for NaN or infinite values.
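A check along these lines could be sketched as follows. This is only an illustration using pandas and NumPy, not part of hep_ml: the helper name `check_splot_inputs`, the column names `sig` and `bkg`, and the toy values are all assumptions made up for the example.

```python
import numpy as np
import pandas as pd


def check_splot_inputs(probs: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `probs` that look unsuitable as sPlot inputs.

    Hypothetical helper: flags rows containing NaN/inf or values outside [0, 1].
    """
    values = probs.to_numpy(dtype=float)
    # True for rows where at least one entry is NaN or +/-inf
    non_finite = ~np.isfinite(values).all(axis=1)
    # True for rows where at least one entry is not a valid probability
    out_of_range = ((values < 0) | (values > 1)).any(axis=1)
    return probs[non_finite | out_of_range]


# Toy probabilities (assumed column names): row 2 has a NaN, row 3 is out of range
probs = pd.DataFrame({
    "sig": [0.9, 0.2, np.nan, 1.3],
    "bkg": [0.1, 0.8, 0.5, -0.3],
})
bad_rows = check_splot_inputs(probs)
print(bad_rows)
```

If `bad_rows` comes back empty, the probabilities themselves are probably not the source of the NaNs, and it is worth looking at what happens to the weights after they are computed.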

marthaisabelhilton commented 5 years ago

Sorry for the spam. I realised the problem was with the indexing of the dataframes: I was trying to copy the sWeights dataframe into an existing dataframe, and the indexes were not matching, so the empty values were being filled with NaNs.
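For anyone who hits the same symptom, here is a minimal sketch of how mismatched indexes produce NaNs in pandas. The frames, column names (`pt`, `sig`, `sw_sig`) and values are invented for illustration; the `sweights` frame is a stand-in with a fresh 0..n-1 index, not actual hep_ml output.

```python
import pandas as pd

# Hypothetical event table whose index is not 0..n-1, e.g. after a selection cut
events = pd.DataFrame({"pt": [1.0, 2.0, 3.0, 4.0]}, index=[0, 1, 5, 7])

# Stand-in for an sWeights table that carries a fresh 0..n-1 index
sweights = pd.DataFrame({"sig": [0.8, 1.1, -0.2, 0.3]})

# Assigning a Series aligns on index labels: labels 5 and 7 have no
# counterpart in `sweights`, so those rows are filled with NaN
misaligned = events.copy()
misaligned["sw_sig"] = sweights["sig"]

# Assigning the underlying array copies values by position instead
aligned = events.copy()
aligned["sw_sig"] = sweights["sig"].to_numpy()

print(misaligned)
print(aligned)
```

Copying by position is only safe when both frames hold the same events in the same order. Resetting the index on the event table before computing the weights is another way to keep the labels in step.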