Sensitivity in mono-Z and mono-H sections

LHC-DMWG / DMWG-2HDM-whitepaper

2HDM whitepaper repository

Sensitivity in mono-Z and mono-H sections #5

Open urania277 opened 6 years ago

urania277 commented 6 years ago

Hi all,

could someone help me with a sentence or two on how the sensitivity formulas compare between the monoH and the monoZ sections?

It would be useful to give a simple (hand-wavy is fine) conversion factor between the two, and why we chose the values we did in order to state that Run-2 searches would be sensitive to this set of parameters.

One way to do it is to have one analyser from each group try the formula of the other group for a point, and see how the two compare.

Thanks, Caterina

urania277 commented 6 years ago

Hi all,

I'm back on this, and I had a short e-mail discussion with @AndreasAlbert, that clarifies things:

I think the main difference is that mono-H bases its result on what ATLAS mono-H recently published as "model-independent limits", i.e. a limit on the detector-level cross-section × A × eff in each bin. So they just have to calculate the generator cross-section, acceptance and efficiency, and do not have to worry about uncertainties or anything (see eq 6.1).

For Mono-Z we rely on the published background estimate rather than published XS limits. Therefore, the monster formula 6.4 has to be employed (one could argue that a simple s/sqrt(B) would also have sufficed).

For your actual question of how the two relate to one another: no idea.

We surely can chat about this some time also with Eiko. For mono-Z we could also ask Chris Anelli to join, who actually did the significance calculation and may therefore have additional insights to share.

One easy way to understand how the two relate would be to simply use the s/sqrt(B) formula on one of the mono-Higgs scans and see how close that gets to the current sensitivity based on the model-independent limits, or find a scaling factor between the two. Is this easy to do for @obrandt1 or @lhenkelm? The risk is that if we don't do this, it is not clear how the two searches can be compared to each other.

I am also happy to talk about this on Skype or in an informal DMWG meeting.

Thanks, Caterina

obrandt1 commented 6 years ago

Hi @urania277 and @AndreasAlbert, happy to help! Just to understand the scope of the question a bit better: is the question how the choice of parameters was made for the mono-Z and mono-h grids in view of complementarity? Or is the question really about the sensitivity formulae (if so, which)?

urania277 commented 6 years ago

Hi @obrandt1, thanks! Actually, it's a pretty simple question. The mono-H and mono-Z analyses choose a grid where the complementarity is shown based on two different sensitivity formulae. How do those two formulas compare?

If you take the Master branch, in the PDF in https://github.com/LHC-DMWG/DMWG-2HDM-whitepaper/blob/master/DMWG-2HDM-whitepaper_Main.pdf, monoH uses 4.2 while monoZ uses 4.6. It is clear why this difference is there (in the case of the H there are the model-independent limits), but what is not clear to the reader yet is how the two formulas compare.

A suggestion, if it doesn't take too long and if all samples are available, is to try a simple S/sqrt(B) approximation for both analyses as suggested by Andreas, and see how much that metric differs from the ones used in the paper and use that for comparison. This does not yet account for parton-to-particle effects, but it's a starting point. If mono-H doesn't have the background samples available, then we will have to think of something else.

Does this make sense to you?

Thanks, Caterina

lhenkelm commented 6 years ago

Hi @urania277, @obrandt1, @AndreasAlbert,

I am not convinced that setting up a toy analysis is the fastest or most consistent way to compare these two. I'd worry that the result is really then telling us about how the toy analysis compares to either estimate, after spending several weeks setting things up and validating them.

I instead tried a simple back-of-the-envelope estimate. It gives factors of ~4-11 (depending on MET region) between the monoHbb estimate and s/sqrt(b) (details below). I suspect that s/sqrt(b) is a decent enough approximation of the median Asimov significance estimator used for MonoZ (since there is no signal ...) ? @AndreasAlbert and @cranelli will know this much better than I.
So using these factors 4-11, a monoZ significance of 2 would roughly be a monoH sensitivity of 0.5-0.18 (and a monoH sensitivity of one would be significance 4-11 in monoZ terms), depending on the MET distribution.

Do you think an estimate like this would be sufficient?

Cheers, Lars

PS: How I guesstimated the 4-11:

The monoH estimate per MET bin is

S_mil = [partonLevelXsec * Acc*Eff * BR(h --> bb)] / (observed Xsec limit)

We get numbers of events by expanding above and below with the lumi:

S_mil = N_s / N_limit

which can be related to N_s/sqrt(N_b) like so:

S_mil = [sqrt(N_b) / N_limit] * [N_s / sqrt(N_b)]

which is the naive significance N_s/sqrt(N_b) times a correction factor which is entirely a property of the monoH search (i.e. signal-independent).

We get the correction factor for each bin, using
-- that the lumi is 36.1 fb⁻¹,
-- the MIL cross-sections from https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/EXOT-2016-25/tab_02.png,
-- the expected pre-fit bkg numbers per MET bin for two b-tags from https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/EXOT-2016-25/tabaux_06.png.

This gives us, from lowest to highest MET bin, these correction factors: 1/7, 1/6, 1/4, 1/11. It may be possible to get something a bit more motivated by using Poisson likelihoods, but I did not try that.
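The correction factor described in the PS can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the script that was actually used: the function names are hypothetical, the cross-section limit is assumed to be given in fb (to match the luminosity in fb⁻¹), and the input numbers in the usage lines are placeholders, not values read from the ATLAS tables.

```python
import math


def mil_correction_factor(n_bkg, xsec_limit_fb, lumi_fb_inv):
    """Return sqrt(N_b) / N_limit for one MET bin.

    N_limit is the model-independent cross-section limit converted to a
    number of events with the integrated luminosity (assumed units: fb
    and fb^-1).
    """
    n_limit = xsec_limit_fb * lumi_fb_inv
    return math.sqrt(n_bkg) / n_limit


def s_mil_from_naive(naive_significance, correction_factor):
    """Convert N_s/sqrt(N_b) into S_mil = N_s/N_limit for the same bin."""
    return correction_factor * naive_significance


# Placeholder inputs for illustration only (NOT taken from the ATLAS tables).
factor = mil_correction_factor(n_bkg=100.0, xsec_limit_fb=2.0, lumi_fb_inv=36.1)
s_mil = s_mil_from_naive(naive_significance=2.0, correction_factor=factor)
```

Because the factor depends only on the background and the published limit, it can be computed once per MET bin and reused for every signal point in the scan.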

cranelli commented 6 years ago

Hi @urania277 , @obrandt1, @AndreasAlbert,@lhenkelm ,

I agree with what has been said. Mono-Z and mono-H use different sensitivity formulas because Mono-Z works from published backgrounds and Mono-H from published MIL cross-sections.

Mono-Z has looked at s/sqrt(b) and I remember these results being similar to the Asimov significances. I can circulate the sensitivity grids for comparison.

@lhenkelm A quick s/sqrt(b) check should be possible using the tables you provide. I would have expected a s/sqrt(b) of 2 to correspond with a S_mil of 1, so a factor of 4 -11 sounds quite large.

Calculate S_mil and s/sqrt(b) for each MET bin:
-- s = [partonLevelXsec * BR(h --> bb) * Acc*Eff]
-- b = pre-fit bkg number

Doing a quick check for the (350, 500) MET bin, the observed limit cross-section of 2.4 is compatible with a s/sqrt(b) of ~2.

For the final MET bin, since the signal is no longer small compared to the background, the formula could be updated to the asimov significance or at least s/sqrt(s+b)

Best Chris

obrandt1 commented 6 years ago

Hi @urania277 , @obrandt1, @AndreasAlbert, @lhenkelm ,

my apologies for the delay. I agree with the general direction of the discussion. I am not sure it is interesting to justify how formulae 6.2 and 6.6 follow from one another in the paper; for the purpose of the paper they are just two different metrics which are convenient for a given use case.

For the mono-Z case, as @cranelli and @AndreasAlbert pointed out, one starts from the background estimate and works with significances; for a 2 sigma effect one would need S/sqrt(B) > 2. If a larger S is predicted, one can expect to exclude it at ~95% CL, at least in my limited understanding. By the way: one can show that the Asimov significance in Eq. 6.4 approximately reduces to S/sqrt(B) for S << B IIRC.
For the mono-h case where generic limits are available, it makes sense to use the sensitivity in Eq. 6.2, which is the ratio of the predicted cross section to the excluded one. If a larger S is predicted, it would be excluded at 95% CL. The limits are actually full-blown, i.e. S/sqrt(B) would be only an approximation.
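The statement that the Asimov significance reduces to S/sqrt(B) for S << B can be checked numerically with a short sketch. It assumes the standard Asimov form without a background-uncertainty term, Z = sqrt(2[(s+b) ln(1+s/b) - s]); Eq. 6.4 of the whitepaper may contain additional terms, so this is not necessarily the formula used there. The function names are illustrative.

```python
import math


def naive_significance(s, b):
    """s / sqrt(b): valid when the signal is small compared to the background."""
    return s / math.sqrt(b)


def s_over_sqrt_s_plus_b(s, b):
    """s / sqrt(s + b): a common variant when s is not small compared to b."""
    return s / math.sqrt(s + b)


def asimov_significance(s, b):
    """Asimov median significance, assuming no background uncertainty."""
    return math.sqrt(2.0 * ((s + b) * math.log1p(s / b) - s))


# Ratio of the Asimov estimate to s/sqrt(b) for a small and a large signal.
ratio_small_signal = asimov_significance(1.0, 1000.0) / naive_significance(1.0, 1000.0)
ratio_large_signal = asimov_significance(50.0, 50.0) / naive_significance(50.0, 50.0)
```

Expanding the logarithm in s/b gives Z ≈ (s/sqrt(b)) (1 - s/(6b)), so the two estimators agree to leading order and the Asimov value is the smaller one once s becomes comparable to b.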

So this question essentially boils down to: how well does S/sqrt(B) compare to the actual full blown limits for the mono-h analysis. We could answer this, but I am not sure this is really so interesting.

All that said, I agree with @urania277 that it may be good to say something about how the two approaches in mono-h and mono-Z are related to each other. Let me make a wording suggestion -- at the beginning of the Section "6.1.2 Studies of the mono-Z (leptonic) signature", right after "Expected significance" we could write: "In absence of generic limits on the anomalous production of Z(ll)+MET events, the expected sensitivity of the Mono-Z(ll) channel to 2HDM+a models is approximated using generator level signal samples and background estimates from recent Z(ll) + Emiss searches using 36.1 fb−1 of 13 TeV data [23], where twice the square root of the background estimate approximates a generic limit at 95% confidence level." To make sure it is clear what "generic limits" are I suggest we write "is based on generic limits on anomalous production" in the 2nd line of Section 6.1.1.

Does this sound reasonable to everybody?

Many thanks & all best,

 Oleg

lhenkelm commented 6 years ago

Hi all,

I agree with the main thrust of Oleg's suggestion, but we should reference the significance estimator actually used. Perhaps also break it up a bit, as otherwise it's a really long sentence: "In absence of generic limits on the anomalous production of Z(ll)+MET events, the expected sensitivity of the Mono-Z(ll) channel to 2HDM+a models is approximated using generator level signal samples and background estimates from recent Z(ll) + Emiss searches using 36.1 fb−1 of 13 TeV data [23]. Twice the Asimov [33] significance estimate in Equation (6.3) approximates a generic limit at 95% confidence level."

@cranelli Thanks a lot for taking a closer look! I'm not sure if I misunderstood you, but I think the number you want to calculate is what I found to be ~4-11 based on the 2-b-tag numbers?
Anyways, using the sum of 1 and 2 b-tags bkgs (which maybe I should have done from the first) instead of just the 2 b-tag number, gives 2.8, 2.4, 1.6, and 2.8 from lowest to highest MET bin. As an aside, since there is a lot of bkg in the 1 b-tag region, using s/sqrt(s+b) gives 2.6 instead of 2.8 for s/sqrt(b) in the highest MET bin. I'd say that's close enough to ~2 for a rough approximation of the (conservatively biased) generic limits.

Cheers, Lars

urania277 commented 6 years ago

Hi @obrandt1, @lhenkelm, @cranelli,

I agree that we shouldn't be redoing / re-validating an analysis; a comparison made sense only if there was already a setup for it and both estimators would appear in the paper. What we need is simply a sentence stating that we have checked the compatibility of the two in a region of sufficiently high statistics and (if it can be extracted easily and with sufficient confidence) also an approximate scaling factor like "s/sqrt(b) of A corresponds to a S_mil of 1".

Could you let me know what you'd prefer?

Thanks, Caterina

cranelli commented 6 years ago

Hi @urania277 , @obrandt1, @lhenkelm,

Since Mono-H has available cross-section limits, I will let Oleg and Lars comment on how best to check that an Asimov significance approximation (or s/sqrt(b)) of X corresponds to a S_mil of 1. Lars's new numbers look encouraging.

Oleg's proposed text with the edit to include Asimov significance may be sufficient (small changes): "In the absence of generic limits on anomalous production of Z(ll)+MET events, the expected sensitivity of the Mono-Z(ll) channel to 2HDM+a models is approximated using generator level signal samples and background estimates from recent Z(ll) + Emiss searches using 36.1 fb−1 of 13 TeV data [23]. An Asimov [33] significance estimate of 2, Equation (6.3), approximates 95% confidence level limits." *Technically Mono-Z is not approximating generic limits, we are approximating 2HDMa limits, which the generic limits also approximate.

Also, s/sqrt(b) figures are on the repository for reference, but not included in the paper: https://github.com/LHC-DMWG/DMWG-2HDM-whitepaper/blob/DMWG_edited/texinputs/04_grid/figures/mAma_SoN_ll_2HDMa.pdf

Best Chris

urania277 commented 6 years ago

Hi all, especially @lhenkelm, @AndreasAlbert, @cranelli,

we got a few comments during the review period regarding the harmonization of the plots in the limits section, fig 21 and 22. Would it be possible for you to make them consistent in terms of notation? It would also be nice if we could have the contours for the S_mil = 1 and Z_p = 2 highlighted, if it doesn't take you too much time - you can use the examples here: https://root-forum.cern.ch/t/th2d-contour-lines/19860/9.

Thanks, Caterina and Uli