vhbb / cmssw

CMS Offline Software
cms-sw.github.io/cmssw

Top mass in VHbb ntuples #483

Open silviodonato opened 8 years ago

silviodonato commented 8 years ago

@capalmer85 @scooperstein
I think it would be very useful to add the leptonic top mass and the hadronic top mass in VHbb ntuples.

jpata commented 8 years ago

Do you mean the generated top? That's already there (GenTop_mass, GenTop_decayMode): https://github.com/vhbb/cmssw/blob/vhbbHeppy80X/VHbbAnalysis/Heppy/test/vhbb.py#L133

silviodonato commented 8 years ago

No, I don't. I meant the reconstructed top mass, used to reject TT in W(lv)H(bb) and Z(vv)H(bb). The leptonic top mass is built from the lepton, the MET, and the closest b-jet.

jpata commented 8 years ago

OK, never mind. I added GenTop_decayMode relatively recently, which is why I was wondering whether that was what you were looking for.

scooperstein commented 8 years ago

Hi Silvio,

Sorry for the late reply. We calculate the leptonic top mass using code borrowed from the top group.

https://github.com/capalmer85/AnalysisTools/blob/master/plugins/VHbbAnalysis.cc#L1586

I would be happy to help add the calculation of this variable to the Heppy code, but I am not really familiar with making significant additions like this to Heppy, so I think it would be much more efficient if someone with more experience could help me add it. Would that be possible?

Best, Stephane

arizzi commented 7 years ago

anyone working on this?

capalmer85 commented 7 years ago

Stephane asked for help in his last email. I don't think anyone offered him any. :(

I think Stephane basically needs to know where to put the code (we already have it written in our framework) and perhaps a pushing recipe if it is complicated.

Can you or someone else offer those two pointers?

On Tue, Jan 10, 2017 at 8:35 AM, arizzi wrote:

> anyone working on this?


arizzi commented 7 years ago

The code I see is in C++; can you write it in Python? Note that all the variables you use are already available as p4 objects in Heppy, under names like event.selectedLeptons.

The final location doesn't matter, we can cut and paste it anywhere, but you can implement a computeTopMass function in VHbbAnalyzer.py and then call it in the main function "process".

scooperstein commented 7 years ago

Okay, thanks for the info. I will start by implementing a computeTopMass(lep 4-vector, MET 4-vector, b-jet 4-vector) function in Python, and I'll put it somewhere in VHbbAnalyzer.py. If it is not clear to me where to go from there, I may consult someone more experienced with Heppy development to help with the final incorporation into the ntupler. I'm working on something else for tomorrow, but I intend to have this ready either tomorrow or Thursday.

scooperstein commented 7 years ago

I adapted the C++ code from the top group that we had incorporated into the WH(bb) framework into a self-contained python script.

https://github.com/scooperstein/PrincetonAnalysisTools/blob/master/scripts/getTopMass.py

As we discussed at the meeting today, it is now a matter of incorporating this into Heppy. Maybe we can iterate further via email on how best to do that.

arizzi commented 7 years ago

@scooperstein I thought it was just a few lines of Python to write, not 200 lines of math operations! What was supposed to be pythonized was the computeTopMass function that you pointed us to (I thought that was all that was needed; I did not notice it was calling other functions in the same file). Doing all that math in Python could indeed be quite slow (besides, I imagine you spent a day converting it!). Apologies for the misunderstanding. We can use this as it is if it is not too slow; otherwise we can use the pythonized computeTopMass calling external C++ functions.

scooperstein commented 7 years ago

Sorry for the misunderstanding. Well we now have the necessary pieces written in both python and C++. :) Please let me know how you think we should proceed. I can quickly provide the same pieces in C++ if needed.

arizzi commented 7 years ago

We can probably run a speed test of the Python solution. If it is OK we go with it; otherwise we can put the C++ class in src, make a dictionary, and use it the same way as the SoftActivity and ColorFlow classes.

scooperstein commented 7 years ago

Hi Andrea,

I did a test calling this python computeTopMass() function on 1 million randomly generated sets of lep, met, jet1, jet2 vectors and I get:

time python getTopMass.py

213.181u 17.460s 3:39.47 105.0%

maybe you or someone else can comment on whether this is reasonable. I am happy to perform any other tests if requested.
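
For scale, 213 s of user CPU time over 1 million calls is roughly 0.2 ms per call, and that figure presumably also includes generating the random inputs. A timing harness that separates the two could look like the sketch below. `stand_in_top_mass` is a hypothetical placeholder (a plain invariant mass of three summed four-vectors), not the computeTopMass() from getTopMass.py, and the momentum range is arbitrary.

```python
import math
import random
import time

def stand_in_top_mass(lep, met, jet):
    """Hypothetical placeholder: invariant mass of the summed (px, py, pz, E) vectors."""
    px, py, pz, e = (sum(c) for c in zip(lep, met, jet))
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # clamp tiny negative values from rounding

def random_p4(rng):
    """Massless four-vector with momentum components drawn uniformly (arbitrary range)."""
    px, py, pz = (rng.uniform(-100.0, 100.0) for _ in range(3))
    return (px, py, pz, math.sqrt(px * px + py * py + pz * pz))

def seconds_per_call(func, n_calls, seed=12345):
    """Average wall-clock time per call, excluding input generation."""
    rng = random.Random(seed)
    # Build all inputs first so that only the function calls are timed.
    inputs = [(random_p4(rng), random_p4(rng), random_p4(rng)) for _ in range(n_calls)]
    start = time.time()
    for lep, met, jet in inputs:
        func(lep, met, jet)
    return (time.time() - start) / n_calls
```

Calling `seconds_per_call(stand_in_top_mass, 100000)` gives a per-call baseline; swapping in the real function (with its own argument list) would show how much of the 0.2 ms is the top-mass math itself.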

RemKamal commented 7 years ago

Hi all, did this make it into vhbb? The version of the script posted above works when I run on small samples, but gives 'Problem (math domain error)' when I use all the available statistics. I guess this Python math exception comes from a negative (or complex?) number being passed to some math function. Have you seen anything like this? Thanks.

scooperstein commented 7 years ago

Hi,

As far as I understand, this has not been integrated into Heppy, since it was not viewed as a priority for the V25 campaign. Yes, there are indeed some rare cases where it gets domain errors due to negative inputs to pow(). If you change it slightly to include a protection like

    if (r+sqrt(Delta) > 0):
        s = complex(pow((r+sqrt(Delta)),(1./3)),0)
    else:
        s = complex(-pow(abs((r+sqrt(Delta))),(1./3)),0)
    if (r-sqrt(Delta) > 0):
        t = complex(pow((r-sqrt(Delta)),(1./3)),0)
    else:
        t = complex(-pow(abs((r-sqrt(Delta))),(1./3)),0)

that should fix it.
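
The same protection can be written more compactly with a sign-preserving real cube root. This is only a sketch: `real_cbrt` and `cardano_s_t` are hypothetical names, not functions from getTopMass.py, and the code assumes Delta >= 0 so that sqrt(Delta) is real.

```python
import math

def real_cbrt(x):
    """Real cube root that is valid for negative inputs as well."""
    # Take the root of |x|, then restore the sign of x.
    return math.copysign(math.pow(abs(x), 1.0 / 3.0), x)

def cardano_s_t(r, delta):
    """Return (s, t) as complex numbers with zero imaginary part.

    Assumption: delta >= 0, i.e. the branch where sqrt(delta) is real.
    """
    root = math.sqrt(delta)
    s = complex(real_cbrt(r + root), 0.0)
    t = complex(real_cbrt(r - root), 0.0)
    return s, t
```

For example, `real_cbrt(-8.0)` gives approximately -2.0, where `math.pow(-8.0, 1./3)` would raise ValueError.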

arizzi commented 7 years ago

it did not enter V25

On Thu, Feb 9, 2017 at 3:09 PM, RemKamal wrote:

> Hi all, did it get to vhbb? The version of the script that was posted way above, is okay, if I am running on low size samples, but gives ' Problem (math domain error)', if I use all available stats. I guess this python math exception comes from some negative number(or complex?), which screws some math function. Have you seen anything like this? Thanks.


RemKamal commented 7 years ago

@scooperstein thanks, I will try!

RemKamal commented 7 years ago

As math.pow(x, y) says: If both x and y are finite, x is negative, and y is not an integer then pow(x, y) is undefined, and raises ValueError.

For example:

    r, Delta = -3033136.74138, 8.63740408607e+12
    a, b, c, d, q, rho, theta = 1.0 -135.135850433 -18677.4330836 7090404.49876 -8254.88859146 0.0 0.0
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Problem (math domain error)

    r, Delta = -940610.687614, 8.79093981695e+11
    a, b, c, d, q, rho, theta = 1.0 281.726948841 21112.0456248 2207484.79714 -1781.54853655 0.0 0.0
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Problem (math domain error)
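
Both logged cases can be checked in isolation. In each one r + sqrt(Delta) comes out negative, so raising it to the power 1/3 with math.pow fails. The snippet below is a small reproduction using only the two (r, Delta) pairs printed above; `try_cube_root` is a hypothetical helper name.

```python
import math

# (r, Delta) pairs copied from the two failing events logged above.
logged = [
    (-3033136.74138, 8.63740408607e+12),
    (-940610.687614, 8.79093981695e+11),
]

def try_cube_root(r, delta):
    """Return (x, value, error), where x = r + sqrt(delta).

    value is math.pow(x, 1/3) if it succeeds, else None;
    error is the exception text if it fails, else None.
    """
    x = r + math.sqrt(delta)
    try:
        return x, math.pow(x, 1.0 / 3.0), None
    except ValueError as err:
        return x, None, str(err)

results = [try_cube_root(r, delta) for r, delta in logged]
```

Each entry of `results` has a negative x, no value, and an error string, which matches the documented behavior of math.pow for a negative base with a non-integer exponent.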