Missing PU mitigation in isolation

rmanzoni commented 3 years ago

Hi,

@friti and I noticed that every track passing the minimal kinematic requirements enters the isolation sum: https://github.com/CMSBParking/BParkingNANO/blob/master/BParkingNano/plugins/BToKLLBuilder.cc#L206-L239

This includes PU tracks, which are typically rejected with a requirement on the dz impact parameter.

Is this intended?

ottolau commented 3 years ago

Hi Riccardo,

I also spotted this problem. I was thinking of using the SelectedTracks collection produced by the tracksBPark producer instead of the full packedPFCandidate collection without any selection. SelectedTracks is built from packedPFCandidate with the same kinematic cuts (pT > 0.5, eta < 2.5) but with additional requirements, e.g. dz with respect to the trigger muon less than 1 to remove PU tracks, plus some cross-cleaning conditions. I think SelectedTracks is the more appropriate input.

I will do a little study on it.

Best, Otto

rmanzoni commented 3 years ago

Hi Otto,

we implemented this small selection here, and it improves the data-MC agreement dramatically: https://github.com/friti/BParkingNANO/blob/ul/BParkingNano/plugins/BTommmBuilder.cc#L605-L608

(the plots have been produced at different times, so no 1-to-1 comparison, yet it gives you an idea of the effect)

[plot: l1_iso03_rel before PU rejection]

[plot: l1_iso03_rel after PU rejection]
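To make the idea concrete, here is a minimal standalone sketch of a relative track isolation with a dz-based PU rejection. It is not the code behind the links above: the `Track` struct, the function names and the default cone size of 0.3 are hypothetical stand-ins for illustration, and the cut is assumed to be applied on the difference between the dz of the track and the dz of the lepton, both taken wrt the same reference point.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical minimal track record (the real ntuplizer works on packed candidates).
struct Track {
  double pt;   // transverse momentum
  double eta;  // pseudorapidity
  double phi;  // azimuthal angle [rad]
  double dz;   // longitudinal impact parameter wrt a common reference point
};

constexpr double kPi = 3.14159265358979323846;

// Azimuthal difference wrapped into [-pi, pi].
double deltaPhi(double a, double b) {
  return std::remainder(a - b, 2.0 * kPi);
}

double deltaR(const Track& a, const Track& b) {
  return std::hypot(a.eta - b.eta, deltaPhi(a.phi, b.phi));
}

// Sum the pt of tracks inside the cone around the lepton, keeping only those
// whose dz is within maxDeltaDz of the lepton dz, and normalise to the lepton pt.
// Assumption: `tracks` already excludes the lepton itself and the other signal legs.
double relIso(const Track& lepton, const std::vector<Track>& tracks,
              double coneSize = 0.3, double maxDeltaDz = 0.4) {
  double sumPt = 0.0;
  for (const Track& trk : tracks) {
    if (deltaR(lepton, trk) > coneSize) continue;                  // outside the cone
    if (std::abs(trk.dz - lepton.dz) > maxDeltaDz) continue;       // likely PU track
    sumPt += trk.pt;
  }
  return sumPt / lepton.pt;
}
```

Passing a very large `maxDeltaDz` reproduces the isolation without PU rejection, which is handy for a before/after comparison on the same events.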

ottolau commented 3 years ago

Hi Riccardo,

Thanks a lot! This is very helpful. Do you know why dz<0.4 was chosen? And do you think using dz(trk, SV) instead of dz(trk, lepton) would work as well?

Thanks, Otto

rmanzoni commented 3 years ago

Hi Otto,

dz<0.4 is a loose but sensible choice for our signal; it may be suboptimal for yours, but I guess it's reasonable. I am not sure what you mean by dz(trk, SV) vs dz(trk, lepton): the former is an IP wrt a vertex, while the latter is a trk-trk distance, so they are not really the same thing. In any case, I believe that if you don't cut too tight and you use differences rather than absolute positions, the different options are pretty much equivalent.

Along the same lines, I suspect that relying on the vertex assignment to compute the distance between two particles is somewhat risky: https://github.com/CMSBParking/BParkingNANO/blob/master/BParkingNano/python/BToKLL_cff.py#L11 Using the difference between two dz values computed wrt the same reference should be a safer option.

Just my 2 cents here ;-) your mileage may vary

Cheers

ottolau commented 3 years ago

Hi Riccardo,

Oh, I thought vz corresponds to the absolute z-position in detector coordinates. Would the reference change? Could you explain a bit? I think we use vz very often in the ntuplizer.

Thanks, Otto

rmanzoni commented 3 years ago

Hi Otto,

yes, vz is the absolute vertex position, but the way vertices are assigned to tracks (or tracks to vertices, if you like) may be suboptimal. IIRC, it relies on the probability / weight of a given track in the vertex fit. This might not work great, especially with PU tracks. https://arxiv.org/pdf/1405.6569.pdf

With dz, on the other hand, one gets the distance between a track and a reference point (be it (0,0,0), the beam spot, or a chosen vertex, depending on the argument you pass to it). In the difference of two dz values, the position of the reference point cancels and one is left with the distance in z between the two tracks.
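The cancellation can be written out in a simplified picture. As an assumption for illustration, take dz to be just the z of the track at its point of closest approach minus the z of the reference point P; the symbols z_i and z_P are my own notation, and the full expression also carries a transverse correction that should be small when P is close to the beam line.

```latex
\begin{align}
d_z(\mathrm{trk}_i, P) &\approx z_i - z_P, \\
d_z(\mathrm{trk}_1, P) - d_z(\mathrm{trk}_2, P) &\approx (z_1 - z_P) - (z_2 - z_P) = z_1 - z_2 .
\end{align}
```

So, to this approximation, the choice of P drops out as long as both dz values are computed wrt the same point.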

In my experience dz is to be preferred over vz, but if this is critical you may want to dig a bit deeper.

Cheers, Riccardo

gkaratha commented 3 years ago

Hi @rmanzoni,

Yes, it is intentional. The iso variable was added by @sarafiorendi. I can look for her presentation, but she can probably find it and send it much faster. Essentially, we use the same tracks as the "signal" tracks, if you wish.

Best, George

CMSBParking / BParkingNANO

Missing PU mitigation in isolation #96