umd-lhcb / lhcb-ntuples-gen

ntuples generation with DaVinci and in-house offline components
BSD 2-Clause "Simplified" License

Comparison between FullSim and tracker-only #64

Closed yipengsun closed 3 years ago

yipengsun commented 3 years ago

We should finish this comparison ASAP to finalize our QA for the MC samples so that we can proceed with the large MC request.

So far, my plan is to compute a chi2 for each branch, using FullSim as the template, and plot the branches with the largest chi2.
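A minimal sketch of that ranking, assuming each branch has already been histogrammed with identical binning in both samples. The Pearson chi2 form, the yield normalization, and the names `chi2_vs_template` and `rank_branches` are assumptions for illustration, not the code used in this repo:

```python
def chi2_vs_template(template, test):
    """Pearson chi2 of `test` bin counts against `template` scaled to the same yield."""
    scale = sum(test) / sum(template)
    chi2 = 0.0
    for t, o in zip(template, test):
        expected = t * scale
        if expected > 0:  # skip bins that the template leaves empty
            chi2 += (o - expected) ** 2 / expected
    return chi2


def rank_branches(fullsim_hists, tracker_only_hists):
    """Return (branch, chi2) pairs, worst agreement first."""
    scores = {
        name: chi2_vs_template(fullsim_hists[name], tracker_only_hists[name])
        for name in fullsim_hists
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical, hand-made bin counts (not real ntuple content).
fullsim = {"q2": [10, 20, 30], "mmiss2": [10, 20, 30]}
tracker_only = {"q2": [10, 20, 30], "mmiss2": [30, 20, 10]}
ranking = rank_branches(fullsim, tracker_only)
```

The branches at the front of `ranking` would be the ones to plot first.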

yipengsun commented 3 years ago

[Plots: b0_Hlt1TrackMVA_TOS_el, b0_Hlt1TrackMVA_TOS_mmiss2, b0_Hlt1TrackMVA_TOS_q2, b0_Hlt1TwoTrackMVA_TOS_el, b0_Hlt1TwoTrackMVA_TOS_mmiss2, b0_Hlt1TwoTrackMVA_TOS_q2]

yipengsun commented 3 years ago

@afernez The tracker-only ntuple is stored at:

ntuples/0.9.3-production_for_validation/Dst_D0-mc/Dst_D0--21_03_10--mc--tracker_only--MC_2016_Beam6500GeV-2016-MagDown-TrackerOnly-Nu1.6-25ns-Pythia8_Sim09j_Reco16_Filtered_11574021_D0TAUNU.SAFESTRIPTRIG.DST.root

yipengsun commented 3 years ago

@manuelfs @afernez I think @Svende and I found the problem that led to the poor agreement between the emulated and "real" Hlt1TrackMVA trigger: the cuts are different between the data TCK and the MC TCK.

Initially I was using data TCK, and the plots are here: https://github.com/umd-lhcb/lhcb-ntuples-gen/tree/master/archive/21_04_15/gen/rdx-sample-trigger-emu/plots_raster

I have now reverted to Simone's TCK, and the agreement for both triggers is better. Plots here: https://github.com/umd-lhcb/lhcb-ntuples-gen/tree/master/archive/21_04_16/gen/rdx-sample-trigger-emu/plots_raster

manuelfs commented 3 years ago

That agreement looks great, nice work! We will worry about full agreement at a later time, but if you can reproduce the procedure with the Tracker-only MC, then it looks like you've validated that TO has everything we need for HLT1, right?

(Sent by email in reply to Yipeng Sun's comment of Apr 15, 2021, 8:19 PM.)

yipengsun commented 3 years ago

> That agreement looks great, nice work! We will worry about full agreement at a later time, but if you can reproduce the procedure with the Tracker-only MC, then it looks like you've validated that TO has everything we need for HLT1, right?

By "reproduce ...", do you mean that I should also generate a comparison of the emulated triggers between FS and TO?

yipengsun commented 3 years ago

Also, I think that with a larger sample the agreement should be even better. Note that the points with smaller error bars actually show better agreement.

manuelfs commented 3 years ago

Yes, that was the ultimate goal, no?

Svende commented 3 years ago

Great Yipeng, thanks for posting these plots! Yes, I think it makes sense to check the HLT1 response in the tracker-only samples to see if it agrees with what we see when emulating FullSim.

Svende commented 3 years ago

Looking closer at these plots, it looks like the emulated efficiency is below the 'real response' for both q2 and m^2miss. It's not much, but we should keep an eye on it when we compare larger samples. Eventually it would also be good to have comparison plots for all the variables we cut on in HLT1. Posting here what Simone suggested:

1) Fit variables
2) All the variables used in the HLT1 selection, as they will have sharper selection shoulders and can help you debug your code better
3) The HLT1 MVA variables, for the same reason as before
4) Angular variables (eta, theta, phi), to check that the acceptance of the HLT1 cuts is correct

So in our case we need all plots for the muon, eta, phi, theta, as well as all variables for the Hlt1TwoTrackMVAUnit listed here: https://github.com/umd-lhcb/rdx-run2-analysis/blob/master/docs/triggers/hlt1.md
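For any of the variables above, the comparison comes down to a per-bin efficiency for the real and the emulated trigger decision. A rough sketch follows, assuming per-event values and boolean decisions have already been read from the ntuple. The helper names and the toy numbers are hypothetical and not taken from the repo:

```python
def binned_counts(values, decisions, edges):
    """Return per-bin (passed, total) counts for a boolean trigger decision."""
    counts = [[0, 0] for _ in range(len(edges) - 1)]
    for value, decision in zip(values, decisions):
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                counts[i][1] += 1
                counts[i][0] += int(decision)
                break  # each event falls in at most one bin
    return [tuple(c) for c in counts]


def efficiency(passed, total):
    """Plain efficiency; NaN for an empty bin."""
    return passed / total if total else float("nan")


# Hypothetical toy events: one kinematic variable plus real and emulated TOS flags.
q2 = [1.0, 2.5, 3.0, 7.5, 8.0, 9.9]
real_tos = [True, True, False, True, True, False]
emulated_tos = [True, False, False, True, True, False]
edges = [0.0, 5.0, 10.0]

real = binned_counts(q2, real_tos, edges)
emulated = binned_counts(q2, emulated_tos, edges)
real_eff = [efficiency(p, t) for p, t in real]
emulated_eff = [efficiency(p, t) for p, t in emulated]
```

Keeping the raw `(passed, total)` counts, rather than only the efficiencies, leaves room to attach binomial intervals to each point later.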

yipengsun commented 3 years ago

L0Hadron_TOS emulation, without a BDT:

B0

[Plots: b0_L0Hadron_TOS_el, b0_L0Hadron_TOS_mmiss2, b0_L0Hadron_TOS_q2]

B

[Plots: b_L0Hadron_TOS_el, b_L0Hadron_TOS_mmiss2, b_L0Hadron_TOS_q2]

yipengsun commented 3 years ago

@manuelfs @Svende @afernez I posted the naive L0Hadron TOS emulation above, without using a BDT.

yipengsun commented 3 years ago

L0Global_TIS emulation:

B0

[Plots: b0_L0Global_TIS_el, b0_L0Global_TIS_log_b0_true_pt, b0_L0Global_TIS_log_b0_true_pz, b0_L0Global_TIS_mmiss2, b0_L0Global_TIS_q2]

B

[Plots: b_L0Global_TIS_el, b_L0Global_TIS_log_b_true_pt, b_L0Global_TIS_log_b_true_pz, b_L0Global_TIS_mmiss2, b_L0Global_TIS_q2]

manuelfs commented 3 years ago

Can you also plot the ratio of these?

(Sent by email in reply to Yipeng Sun's comment of Apr 20, 2021, 2:14 PM.)

Svende commented 3 years ago

Yes, I agree, a ratio panel at the bottom of the plots would be helpful for all the trigger emulation plots. Also, for L0Hadron, the pT and momentum plots for the pions and kaons, as well as the D*0/+ pT, would be good to look at as a comparison. For L0Global_TIS you can combine some bins compared to the tracker-only note.

yipengsun commented 3 years ago

For the ratio plot, what should I do for error propagation? Given that the current errors are Clopper-Pearson intervals, the error bars are non-Gaussian and asymmetric, so I don't think naive Gaussian error propagation would work.
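For reference, a small sketch of where the asymmetry comes from. It computes a Clopper-Pearson interval by inverting the binomial CDF with bisection, using the standard library only. The function names and the 68.27% default confidence level are assumptions, the direct binomial sum is only practical for modest bin counts, and this is not the plotting code used for the plots in this thread:

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target):
    """Find p in [0, 1] with func(p) == target, for func increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(passed, total, cl=0.6827):
    """Return the (lower, upper) Clopper-Pearson bounds on the efficiency."""
    alpha = 1.0 - cl
    # Lower bound: P(X >= passed | p) = alpha / 2
    lower = 0.0 if passed == 0 else _bisect(
        lambda p: 1.0 - binom_cdf(passed - 1, total, p), alpha / 2)
    # Upper bound: P(X <= passed | p) = alpha / 2
    upper = 1.0 if passed == total else _bisect(
        lambda p: 1.0 - binom_cdf(passed, total, p), 1.0 - alpha / 2)
    return lower, upper


# Hypothetical bin: 45 of 50 events pass the trigger.
lower, upper = clopper_pearson(45, 50)
err_down, err_up = 45 / 50 - lower, upper - 45 / 50
```

For efficiencies close to 0 or 1 the two error bars differ noticeably, which is why symmetric Gaussian propagation is questionable for the ratio.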

yipengsun commented 3 years ago

Updated plots are available at: https://github.com/umd-lhcb/lhcb-ntuples-gen/tree/master/archive/21_04_23/gen/rdx-trigger-emu-nor/plots_raster

yipengsun commented 3 years ago

L0Hadron_TOS efficiency as a function of various pT:

B0

[Plots: b0_L0Hadron_TOS_d0_pt, b0_L0Hadron_TOS_k_pt, b0_L0Hadron_TOS_pi_pt]

B

[Plots: b_L0Hadron_TOS_d0_pt, b_L0Hadron_TOS_k_pt, b_L0Hadron_TOS_pi_pt]

Svende commented 3 years ago

> Updated plots are available at: https://github.com/umd-lhcb/lhcb-ntuples-gen/tree/master/archive/21_04_23/gen/rdx-trigger-emu-nor/plots_raster

Thanks, I assume those are the higher-stats plots we talked about. It seems that the emulated efficiency is now always below the FullSim response. For the muon I only see phi & theta; can you also plot p, pt and the rest of the variables for it?

yipengsun commented 3 years ago

By the other variables for the muon, do you mean track-quality variables such as GhostProb and the various chi2 variables?

Svende commented 3 years ago

For the muon I only see phi & theta, can you also plot p, pt and the rest of the variables for it?

yipengsun commented 3 years ago

I've managed to make the BDT script work, but currently the BDT vastly over-corrects the efficiencies: [Plot: b0_L0Hadron_TOS_d0_pt_step]

I have a good explanation for this. In both the RD+ ANA note and their training script, the regression BDT was trained on the difference between the true and emulated trigger ET (of the D0). But currently, my D0 true ET is just max(k_true_ET, pi_true_ET), which doesn't make much sense given that we applied various corrections to get the D0 emulated ET.

The problem now is: how would we find the D0 true ET? It is not available in either our ntuple or Simone's sample ntuple.

yipengsun commented 3 years ago

The trigger efficiency studies will be tracked in a separate issue. Every other comparison between FullSim and tracker-only looks good.