Kyle Cranmer, Nov 19, 2015
Based on "Estimating the significance of a signal in a multi-dimensional search" by Ofer Vitells and Eilam Gross, http://arxiv.org/pdf/1105.4355v1.pdf
Thanks to Ruggero Turra for detailed checks and debugging.
Note: You can run the notebook from your browser right now by going to everware and pasting the URL to this repository.
@misc{kyle_cranmer_2015_34842,
author = {Kyle Cranmer},
title = {look-elsewhere-2d: v1.0},
month = dec,
year = 2015,
doi = {10.5281/zenodo.34842},
url = {http://dx.doi.org/10.5281/zenodo.34842}
}
import lee2d
You start with several 2d numpy arrays that represent the
It's up to you to threshold each scan to make 2d numpy arrays for the excursion sets.
This should be done at two different threshold levels $u_1$ and $u_2$, giving new numpy arrays with values 0. or 1. For example:
import numpy as np
q_scan = np.zeros((nx, ny))  # placeholder: replace with your 2d scan of shape (nx, ny)
# get excursion sets above those two levels
A_u1 = (q_scan > u1) + 0.  # add 0. to convert from bool to double
A_u2 = (q_scan > u2) + 0.
(The values for $u_1$ and $u_2$ are arbitrary. If there are enough toy scans, the choice shouldn't matter, but you may want to run some tests with other choices. A suggested starting point is something like $u_1=0.1$ and $u_2=1$.)
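When there are many toy scans, the thresholding can be applied to all of them at once. A minimal sketch, assuming the toys are stacked in a hypothetical array `q_toys` of shape `(n_toys, nx, ny)`; the random chi-square values are only placeholders standing in for real scans:

```python
import numpy as np

# Hypothetical stand-in for real toy scans: n_toys scans, each of shape (nx, ny).
n_toys, nx, ny = 5, 20, 30
rng = np.random.default_rng(0)
q_toys = rng.chisquare(df=1, size=(n_toys, nx, ny))

u1, u2 = 0.1, 1.0  # the two threshold levels suggested above

# The comparison acts on the whole stack; astype(float) yields arrays of 0. and 1.
A_u1_toys = (q_toys > u1).astype(float)
A_u2_toys = (q_toys > u2).astype(float)
```

Because $u_2 > u_1$, every excursion set in `A_u2_toys` is contained in the corresponding set in `A_u1_toys`, which is a quick sanity check on the thresholds.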
For each of these scans you calculate the Euler characteristic using this function in lee2d.py:
def calculate_euler_characteristic(a):
    """Calculate the Euler characteristic for level set a"""
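To illustrate what such a function computes: for a binary array, treat every pixel with value 1 as a closed unit square and count the vertices, edges and faces of their union. The Euler characteristic is then V - E + F, which equals the number of connected regions minus the number of holes. The sketch below is an illustration only and is not necessarily how `calculate_euler_characteristic` in lee2d.py is implemented; the name `euler_characteristic_sketch` is hypothetical, and under this convention diagonally touching pixels count as connected.

```python
import numpy as np

def euler_characteristic_sketch(a):
    """Euler characteristic V - E + F of the union of closed unit squares where a > 0.

    Hypothetical illustration, not the lee2d.py implementation.
    """
    a = np.asarray(a) > 0
    # Pad with one ring of "off" pixels so boundary edges and vertices are handled uniformly.
    p = np.pad(a, 1, constant_values=False)
    faces = int(a.sum())
    # An edge is present if either of the two pixels sharing it is on.
    edges_between_rows = int((p[:-1, 1:-1] | p[1:, 1:-1]).sum())
    edges_between_cols = int((p[1:-1, :-1] | p[1:-1, 1:]).sum())
    # A vertex is present if any of the four pixels meeting at it is on.
    vertices = int((p[:-1, :-1] | p[:-1, 1:] | p[1:, :-1] | p[1:, 1:]).sum())
    return vertices - (edges_between_rows + edges_between_cols) + faces

# A 3x3 ring: one connected region with one hole.
ring = np.array([[1., 1., 1.],
                 [1., 0., 1.],
                 [1., 1., 1.]])
chi_ring = euler_characteristic_sketch(ring)
```

For the ring, the count is 16 vertices, 24 edges and 8 faces, so `chi_ring` is 0 (one region minus one hole). Averaging such values over the toy scans at each threshold gives the expected Euler characteristic for that level.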
After calculating the expected (mean) value of the Euler characteristic at each of those two levels, you can correct the local significance with this function in lee2d.py, which takes the maximum local significance as its first argument:
def do_LEE_correction(max_local_sig, u1, u2, exp_phi_1, exp_phi_2):
    """
    Return the global p-value for an observed local significance
    after correcting for the look-elsewhere effect
    given expected Euler characteristic exp_phi_1 above level u1
    and exp_phi_2 above level u2
    """
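The arithmetic behind such a correction can be sketched as follows. The correction is based on the relation $E[\phi(A_u)] = P(\chi^2_1 > u) + e^{-u/2}(N_1 + \sqrt{u} N_2)$: the two expected Euler characteristics give two linear equations for $N_1$ and $N_2$, and the same relation evaluated at the observed maximum approximates the global p-value. This is a sketch under stated assumptions, not the code in lee2d.py: the function names are hypothetical, the threshold at the maximum is assumed to be `max_local_sig**2`, and clipping the result at 1 is a choice made here.

```python
import math

def chi2_1_sf(u):
    """P(chi^2_1 > u), written with erfc so only the standard library is needed."""
    return math.erfc(math.sqrt(u / 2.0))

def solve_n1_n2(u1, u2, exp_phi_1, exp_phi_2):
    """Solve the two linear equations for N1 and N2 (hypothetical helper)."""
    # Rearranged relation: N1 + sqrt(u) * N2 = (E[phi] - P(chi^2_1 > u)) * exp(u / 2)
    r1 = (exp_phi_1 - chi2_1_sf(u1)) * math.exp(u1 / 2.0)
    r2 = (exp_phi_2 - chi2_1_sf(u2)) * math.exp(u2 / 2.0)
    n2 = (r2 - r1) / (math.sqrt(u2) - math.sqrt(u1))
    n1 = r1 - math.sqrt(u1) * n2
    return n1, n2

def global_p_value_sketch(max_local_sig, u1, u2, exp_phi_1, exp_phi_2):
    """Approximate global p-value from the expected Euler characteristic."""
    n1, n2 = solve_n1_n2(u1, u2, exp_phi_1, exp_phi_2)
    u = max_local_sig ** 2  # assumption: local significance is the square root of the test statistic
    p = chi2_1_sf(u) + math.exp(-u / 2.0) * (n1 + math.sqrt(u) * n2)
    return min(p, 1.0)  # clipping is a choice made in this sketch
```

The two thresholds must differ, since otherwise the two equations are degenerate and $N_1$, $N_2$ cannot be separated.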
See an example using ROOT histograms
This is for the special case of a likelihood function of the form $L(\mu, \nu_1, \nu_2)$, where $\mu$ is a single parameter of interest and $\nu_1, \nu_2$ are two nuisance parameters that are not identified under the null. For example, $\mu$ is the signal strength of a new particle and $\nu_1, \nu_2$ are its unknown mass and width. Under the null hypothesis those parameters have no meaning; in statistics jargon, they "are not identified under the null". This introduces a 2-d look-elsewhere effect.
The LEE correction in this case is based on
\begin{equation} E[ \phi(A_u) ] = P(\chi^2_1 > u) + e^{-u/2} (N_1 + \sqrt{u} N_2) \, \end{equation} where