scailfin / MadGraph5-simulation-configs

MadGraph5_aMC@NLO source files and configuration files for event simulation
MIT License

Understanding MoMEMta examples and MoMEMta-MaGMEE #3

Open matthewfeickert opened 3 years ago

matthewfeickert commented 3 years ago

Hi @swertz @FlorianBury. I have a question that by definition is going to be pretty dumb RE: using MoMEMta and MoMEMta-MaGMEE that will probably be obvious once I have time to reread the "In depth" sections of https://momemta.github.io/ and https://arxiv.org/abs/1805.08555 and https://arxiv.org/abs/2008.10949. (Or maybe I'm confused on the Lua configuration process.)

If this doesn't make sense, please ask for clarification, as I'm writing this Issue somewhat quickly.

How does one practically go from the MadGraph5 process and building the matrix elements with MoMEMta-MaGMEE to producing simulated events to the computation of weights?

The examples that are given in the tutorial repo (cf. run_ttbar_tutorial.sh) start out with provided MatrixElements and a simulated event file to read in (Tutorials/TTbar_FullyLeptonic/tt_20evt.root). That all works fine, and those simulated events are generated using MG5_aMC@NLO, Pythia and Delphes.

However, as a starting example, I'd like to be able to start with just the MadGraph5 process and have MoMEMta-MaGMEE produce the matrix element:

generate p p > l+ l-
output MoMEMta pp_drell_yan

If I want to be able to use MoMEMta with these matrix elements, the in-between steps required aren't clear to me. Would I need to take that same MadGraph5 generation

generate p p > l+ l-

and then produce all the simulated events with the toolchain and then come back and have MoMEMta use that same generation process in combination with MoMEMta-MaGMEE to have MoMEMta know what was done?

I think this is probably rambly enough to have lost any clarity, but I guess I'm missing the connecting step of the requirements on event simulation and connecting simulation back to MoMEMta.

swertz commented 3 years ago

Hi Matthew,

> then produce all the simulated events with the toolchain and then come back and have MoMEMta use that same generation process in combination with MoMEMta-MaGMEE to have MoMEMta know what was done?

That is correct. In MoMEMta it is basically assumed that you already have your events (simulated or data), and that you need to compute weights for them.

In your case it indeed comes down to generating the process in MG5 and writing out the C++ matrix element for MoMEMta on the one hand, and generating events using the MG5 toolchain (madevent, then Pythia or Delphes or whatever), with the same original generate command in MG5, on the other.
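
As a rough illustration of the two branches described above, here is a minimal Python sketch that builds one MG5 command script per branch from the same `generate` line. The `generate` and `output MoMEMta pp_drell_yan` lines are taken from this thread; the event output directory name `drell_yan_events` and the helper function names are hypothetical, and the event branch is only an assumed minimal `output` plus `launch` sequence, not the configuration actually used in this repository.

```python
# The process definition shared by both branches (taken from this thread).
PROCESS = "generate p p > l+ l-"


def matrix_element_commands(output_name="pp_drell_yan"):
    """MG5 commands that export the C++ matrix element via MoMEMta-MaGMEE."""
    return "\n".join([PROCESS, f"output MoMEMta {output_name}"]) + "\n"


def event_generation_commands(output_dir="drell_yan_events"):
    """MG5 commands that generate events for the same process.

    The directory name is hypothetical; `launch` starts the event
    generation with whatever run settings are configured.
    """
    return "\n".join([PROCESS, f"output {output_dir}", "launch"]) + "\n"


# Print both scripts so the shared first line is easy to compare.
print("# matrix element branch")
print(matrix_element_commands())
print("# event generation branch")
print(event_generation_commands())
```

Each script could then be saved to a file and passed to the MG5 executable, assuming MG5 and the MoMEMta-MaGMEE plugin are installed.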

Note that in general there is no relationship between the two: you can compute weights under any process hypothesis, not necessarily the same one that was used to generate your events (in fact, when you compute weights on data, you don't know how the events were produced!). Very often the generator used is also different, i.e. you can very well compute weights under the hypothesis of leading-order ttbar production (with a matrix element coming from MoMEMta-MaGMEE), whereas the events were simulated using powheg at NLO.
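
To make the "any hypothesis" point concrete, here is a toy Python sketch. It does not use MoMEMta at all: 1D Gaussian densities stand in for the per-hypothesis weights, and all names and numbers are made up for illustration. The same set of events is weighted under two hypotheses, only one of which matches how the events were drawn, and the two weights are combined into a simple per-event ratio (one possible way to compare hypotheses, assumed here for the example).

```python
import numpy as np


def gaussian_density(x, mean, width):
    """Normalised 1D Gaussian, a toy stand-in for a hypothesis density."""
    norm = width * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * ((x - mean) / width) ** 2) / norm


rng = np.random.default_rng(seed=0)

# Toy "events": one observable per event, drawn under hypothesis A.
events = rng.normal(loc=91.0, scale=5.0, size=10_000)

# Weights of the same events under two hypotheses. Hypothesis B has
# nothing to do with how the events were produced.
weights_a = gaussian_density(events, mean=91.0, width=5.0)
weights_b = gaussian_density(events, mean=120.0, width=15.0)

# Per-event ratio in [0, 1]; values near 1 favour hypothesis A.
discriminant = weights_a / (weights_a + weights_b)

print(f"mean -log10(W_A): {np.mean(-np.log10(weights_a)):.3f}")
print(f"mean -log10(W_B): {np.mean(-np.log10(weights_b)):.3f}")
print(f"fraction of events favouring A: {np.mean(discriminant > 0.5):.3f}")
```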

Does this clarify things?

FlorianBury commented 3 years ago

I do not have much to add to what Sebastien already explained, except that I never used event generation myself. Instead I use centrally produced CMS samples in NanoAOD format from which I extract my own ntuples. Only after I have them do I produce the matrix elements for the different processes I want to obtain the MEM weights for, and run MoMEMta on these events.

matthewfeickert commented 3 years ago

Thanks @swertz and @FlorianBury. You've both been very helpful (truly appreciate it) and this does indeed help. Seems like I'll be on track now. :+1:

matthewfeickert commented 3 years ago

Hi again @swertz and @FlorianBury. I have some further questions which I'm hoping will be obvious once I think more about things, but I figured I'd ask in the case that I'm missing something incredibly obvious.

Simulation toolchain

In an effort to try to test the simplest (but uninteresting in reality) case scenario with MoMEMta, I wanted to test the hypothesis of Drell-Yan against Drell-Yan simulation. I started with the following MadGraph5 configuration (though for what I'll be showing later this used 1e4 events for a faster example)

https://github.com/scailfin/MadGraph5-simulation-configs/blob/9ec7c5cd9cc02c2e7a240f7949a80cd7e10c6357/configs/madgraph5/drell-yan.mg5#L1-L7

I then ran the HepMC file that PYTHIA produced through Delphes with the ATLAS card

https://github.com/scailfin/MadGraph5-simulation-configs/blob/9ec7c5cd9cc02c2e7a240f7949a80cd7e10c6357/bluewaters/drell-yan/delphes.pbs#L60-L63

and then did some preprocessing to move from the detector level event information in the Delphes output ROOT file to event selection level information where I could have the components of the particle 4-momentum

https://github.com/scailfin/MadGraph5-simulation-configs/blob/9ec7c5cd9cc02c2e7a240f7949a80cd7e10c6357/bluewaters/drell-yan/preprocessing.pbs#L60

which resulted in the preprocessing_output_10e4.root file in the attached example_files.zip.
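
For readers who have not done this kind of preprocessing, the core of going from detector-level kinematics to 4-momentum components can be sketched in a few lines of Python. This is not the preprocessing script linked above: it assumes the leptons are available as (pT, eta, phi) plus a mass, the function name is hypothetical, and the numerical values are made up.

```python
import numpy as np


def four_momentum(pt, eta, phi, mass):
    """Convert (pT, eta, phi, m) to (E, px, py, pz), all in the same units."""
    pt, eta, phi, mass = (
        np.asarray(value, dtype=float) for value in (pt, eta, phi, mass)
    )
    px = pt * np.cos(phi)
    py = pt * np.sin(phi)
    pz = pt * np.sinh(eta)
    energy = np.sqrt(px**2 + py**2 + pz**2 + mass**2)
    # One row per particle, columns ordered as (E, px, py, pz).
    return np.stack([energy, px, py, pz], axis=-1)


# Two hypothetical leptons (GeV); the values are made up for illustration.
leptons = four_momentum(
    pt=[45.0, 38.0],
    eta=[0.3, -1.1],
    phi=[0.5, -2.6],
    mass=[0.000511, 0.000511],
)

# Sum the two 4-vectors and compute the dilepton invariant mass.
dilepton = leptons.sum(axis=0)
m_ll = np.sqrt(max(dilepton[0] ** 2 - np.sum(dilepton[1:] ** 2), 0.0))
print(f"dilepton invariant mass: {m_ll:.2f} GeV")
```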

MoMEMta stage

If I then use the following Drell-Yan hypothesis with the MoMEMta-MaGMEE plugin

https://github.com/scailfin/MadGraph5-simulation-configs/blob/9ec7c5cd9cc02c2e7a240f7949a80cd7e10c6357/configs/momemta/drell-yan.mg5#L1-L2

and run it on preprocessing_output_10e4.root with the following example C++ script and Lua config, I'm able to produce the attached momemta_weights.root file with

$ git clone https://github.com/scailfin/MadGraph5-simulation-configs.git
$ cd MadGraph5-simulation-configs
$ docker pull neubauergroup/momemta-python-centos:1.0.1
$ docker run --rm -ti -v $PWD:$PWD -w $PWD neubauergroup/momemta-python-centos:1.0.1
[root@ac7e4ff8e23d MadGraph5-simulation-configs]# cd momemta/drell-yan/
[root@ac7e4ff8e23d drell-yan]# bash run_momemta.sh preprocessing_output_10e4.root
Full output:

```console
[root@ac7e4ff8e23d drell-yan]# bash run_momemta.sh preprocessing_output_10e4.root
Unable to download /cvmfs/sft.cern.ch/lcg/external/lhapdfsets/current/CT10nlo.tar.gz
CT10nlo.tar.gz: 10.1 MB [100.0%]
/home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan
************************************************************
*                                                          *
*                     W E L C O M E to                     *
*              M A D G R A P H 5 _ a M C @ N L O           *
*                                                          *
*                 *                       *                *
*                   *        * *        *                  *
*                     * * * * 5 * * * *                    *
*                   *        * *        *                  *
*                 *                       *                *
*                                                          *
*         VERSION 3.1.1                 2021-05-28         *
*                                                          *
*    The MadGraph5_aMC@NLO Development Team - Find us at   *
*    https://server06.fynu.ucl.ac.be/projects/madgraph     *
*                            and                           *
*            http://amcatnlo.web.cern.ch/amcatnlo/         *
*                                                          *
*               Type 'help' for in-line help.              *
*          Type 'tutorial' to learn how MG5 works          *
*    Type 'tutorial aMCatNLO' to learn how aMC@NLO works   *
*    Type 'tutorial MadLoop' to learn how MadLoop works    *
*                                                          *
************************************************************
load MG5 configuration from ../../../../../../../usr/local/venv/MG5_aMC/input/mg5_configuration.txt
set fastjet to fastjet-config
set lhapdf to lhapdf-config
set lhapdf to lhapdf-config
Using default text editor "vi". Set another one in ./input/mg5_configuration.txt
No valid eps viewer found. Please set in ./input/mg5_configuration.txt
No valid web browser found. Please set in ./input/mg5_configuration.txt
import /home/feickert/workarea/MadGraph5-simulation-configs/configs/momemta/drell-yan.mg5
The import format was not given, so we guess it as command
generate p p > l+ l-
No model currently active, so we import the Standard Model
INFO: Restrict model sm with file ../../../../../../../usr/local/venv/MG5_aMC/models/sm/restrict_default.dat .
INFO: Run "set stdout_level DEBUG" before import for more information.
INFO: Change particles name to pass to MG5 convention
Defined multiparticle p = g u c d s u~ c~ d~ s~
Defined multiparticle j = g u c d s u~ c~ d~ s~
Defined multiparticle l+ = e+ mu+
Defined multiparticle l- = e- mu-
Defined multiparticle vl = ve vm vt
Defined multiparticle vl~ = ve~ vm~ vt~
Defined multiparticle all = g u c d s u~ c~ d~ s~ a ve vm vt e- mu- ve~ vm~ vt~ e+ mu+ t b t~ b~ z w+ h w- ta- ta+
INFO: Checking for minimal orders which gives processes.
INFO: Please specify coupling orders to bypass this step.
INFO: Trying process: g g > e+ e- WEIGHTED<=4 @1
INFO: Trying process: g g > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: g g > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: g g > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: u u~ > e+ e- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: u u~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: u u~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: u u~ > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: u c~ > e+ e- WEIGHTED<=4 @1
INFO: Trying process: u c~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: u c~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: u c~ > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: c u~ > e+ e- WEIGHTED<=4 @1
INFO: Trying process: c u~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: c u~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: c u~ > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: c c~ > e+ e- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: c c~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: c c~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: c c~ > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: d d~ > e+ e- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: d d~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: d d~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: d d~ > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: d s~ > e+ e- WEIGHTED<=4 @1
INFO: Trying process: d s~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: d s~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: d s~ > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: s d~ > e+ e- WEIGHTED<=4 @1
INFO: Trying process: s d~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: s d~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: s d~ > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: s s~ > e+ e- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: s s~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: s s~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: s s~ > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Process u~ u > e+ e- added to mirror process u u~ > e+ e-
INFO: Process u~ u > mu+ mu- added to mirror process u u~ > mu+ mu-
INFO: Process c~ c > e+ e- added to mirror process c c~ > e+ e-
INFO: Process c~ c > mu+ mu- added to mirror process c c~ > mu+ mu-
INFO: Process d~ d > e+ e- added to mirror process d d~ > e+ e-
INFO: Process d~ d > mu+ mu- added to mirror process d d~ > mu+ mu-
INFO: Process s~ s > e+ e- added to mirror process s s~ > e+ e-
INFO: Process s~ s > mu+ mu- added to mirror process s s~ > mu+ mu-
8 processes with 16 diagrams generated in 0.046 s
Total: 8 processes with 16 diagrams
output MoMEMta pp_drell_yan
Output will be done with PLUGIN: MoMEMta-MaGMEE
INFO: Creating subdirectories in directory /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan
INFO: Organizing processes into subprocess groups
INFO: Generating Helas calls for process: u u~ > e+ e- WEIGHTED<=4 @1
INFO: Processing color information for process: u u~ > e+ e- @1
INFO: Combined process c c~ > e+ e- WEIGHTED<=4 @1 with process u u~ > e+ e- WEIGHTED<=4 @1
INFO: Generating Helas calls for process: d d~ > e+ e- WEIGHTED<=4 @1
INFO: Reusing existing color information for process: d d~ > e+ e- @1
INFO: Combined process s s~ > e+ e- WEIGHTED<=4 @1 with process d d~ > e+ e- WEIGHTED<=4 @1
INFO: Generating Helas calls for process: u u~ > mu+ mu- WEIGHTED<=4 @1
INFO: Processing color information for process: u u~ > mu+ mu- @1
INFO: Combined process c c~ > mu+ mu- WEIGHTED<=4 @1 with process u u~ > mu+ mu- WEIGHTED<=4 @1
INFO: Generating Helas calls for process: d d~ > mu+ mu- WEIGHTED<=4 @1
INFO: Reusing existing color information for process: d d~ > mu+ mu- @1
INFO: Combined process s s~ > mu+ mu- WEIGHTED<=4 @1 with process d d~ > mu+ mu- WEIGHTED<=4 @1
INFO: Creating files in directory /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/SubProcesses/P1_Sigma_sm_uux_epem
INFO: Created files P1_Sigma_sm_uux_epem.h and P1_Sigma_sm_uux_epem.cc in /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/SubProcesses/P1_Sigma_sm_uux_epem
INFO: Creating files in directory /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/SubProcesses/P1_Sigma_sm_uux_mupmum
INFO: Created files P1_Sigma_sm_uux_mupmum.h and P1_Sigma_sm_uux_mupmum.cc in /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/SubProcesses/P1_Sigma_sm_uux_mupmum
Generated helas calls for 4 subprocesses (8 diagrams) in 0.009 s
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates 5 routines in 0.262 s
INFO: Created files HelAmps_sm.h and HelAmps_sm.cc in directory
INFO: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/include and /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/src
INFO: Created files Parameters_sm.h and Parameters_sm.cc in directory
INFO: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/include and /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/src
quit
Checking if MG5 is up-to-date... (takes up to 5s)
impossible to update: local 966 web 964
/home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan
-- The CXX compiler identification is GNU 8.3.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rh/devtoolset-8/root/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found ROOT: /usr/local/root-cern/bin/root-config (Required is at least version "5.34.09")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/build
-- Configuring done
-- Generating done
-- Build files have been written to: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/build
-- Cache values
CMAKE_BUILD_TYPE:STRING=
CMAKE_INSTALL_PREFIX:PATH=/usr/local/venv
MoMEMta_DIR:PATH=/usr/local/venv/lib64/cmake/MoMEMta
ROOT_Cint_LIBRARY:FILEPATH=ROOT_Cint_LIBRARY-NOTFOUND
[ 20%] Building CXX object CMakeFiles/me_pp_drell_yan.dir/SubProcesses/P1_Sigma_sm_uux_epem/P1_Sigma_sm_uux_epem.cc.o
[ 40%] Building CXX object CMakeFiles/me_pp_drell_yan.dir/SubProcesses/P1_Sigma_sm_uux_mupmum/P1_Sigma_sm_uux_mupmum.cc.o
[ 60%] Building CXX object CMakeFiles/me_pp_drell_yan.dir/src/HelAmps_sm.cc.o
[ 80%] Building CXX object CMakeFiles/me_pp_drell_yan.dir/src/Parameters_sm.cc.o
[100%] Linking CXX shared library libme_pp_drell_yan.so
[100%] Built target me_pp_drell_yan
-- The C compiler identification is GNU 8.3.1
-- The CXX compiler identification is GNU 8.3.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rh/devtoolset-8/root/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rh/devtoolset-8/root/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found ROOT: /usr/local/root-cern/bin/root-config (Required is at least version "5.34.09")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/build
-- Configuring done
-- Generating done
-- Build files have been written to: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/build
-- Cache values
CMAKE_BUILD_TYPE:STRING=
CMAKE_INSTALL_PREFIX:PATH=/usr/local/venv
MoMEMta_DIR:PATH=/usr/local/venv/lib64/cmake/MoMEMta
ROOT_Cint_LIBRARY:FILEPATH=ROOT_Cint_LIBRARY-NOTFOUND
ROOT_TREEPLAYER_LIBRARY:FILEPATH=/usr/local/root-cern/lib/libTreePlayer.so
[ 50%] Building CXX object CMakeFiles/drell-yan_example.dir/drell-yan_example.cxx.o
[100%] Linking CXX executable drell-yan_example
[100%] Built target drell-yan_example
preprocessing_output_10e4.root
calculated weights for 1000 events
calculated weights for 2000 events
calculated weights for 3000 events
calculated weights for 3690 events
Info in : ROOT file momemta_weights.root has been created

real 13m59.021s
user 13m57.955s
sys 0m1.039s
```

This all seems fine. However, I'm having trouble telling whether my Lua config handles the hypothesis correctly: when I look at the distribution of the weights, it is very heavily skewed (and that doesn't appear to be an effect of the small number of events). As the weights are the integral result without normalisation, the spread of the weight values isn't meaningful by itself, but only in comparison with the weights of other hypotheses. Still, the distribution's highly peaked nature seems strange.

(venv) $ python -m pip install --upgrade pip setuptools wheel
(venv) $ python -m pip install uproot "hist[plot]"  # dependencies for below
from pathlib import Path

import numpy as np
import uproot
from hist import Hist
from matplotlib.figure import Figure

if __name__ == "__main__":
    input_file = Path.cwd().joinpath("momemta_weights.root")
    tree_path = "momemta"

    with uproot.open(f"{input_file}:{tree_path}") as tree:
        drell_yan_weight_values = tree["weight_DY"].array()

    log_10_weights = -np.log10(drell_yan_weight_values)

    hist_drell_yan_weights_log = Hist.new.Reg(
        50, 0.0, 10, name="weights", metadata="drell-yan"
    ).Double()
    hist_drell_yan_weights_log.fill(log_10_weights)

    fig = Figure()
    ax = fig.subplots()
    artists = hist_drell_yan_weights_log.plot(
        ax=ax, label=f"{len(log_10_weights)} weights"
    )
    ax.legend(loc="best", frameon=False)

    ax.set_xlabel(r"$-\log_{10}\,($Drell-Yan MoMEMta Weights$)$")
    ax.set_ylabel("Count")
    ax.set_yscale("log")

    fig.savefig("drell_yan_weights_log.png")

[Figure: drell_yan_weights_log.png, histogram of -log10 of the Drell-Yan MoMEMta weights, with counts on a log scale]
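
One practical caveat with a plot like this: `-np.log10` returns `inf` or `nan` for zero, negative, or non-finite weights, and those entries silently fall outside the histogram range. The following is a minimal sketch of a guard plus a quick asymmetry summary. The helper names and the example values are hypothetical and are not taken from momemta_weights.root.

```python
import numpy as np


def negative_log10_weights(weights):
    """Return -log10(w) for finite, strictly positive weights and the number dropped."""
    weights = np.asarray(weights, dtype=float)
    valid = np.isfinite(weights) & (weights > 0.0)
    return -np.log10(weights[valid]), int((~valid).sum())


def summarise(log_weights):
    """Median and distances to the 16% and 84% quantiles of the sample."""
    low, median, high = np.quantile(log_weights, [0.16, 0.5, 0.84])
    return {
        "median": median,
        "lower_width": median - low,
        "upper_width": high - median,
    }


# Made-up weights for illustration only, including two unusable entries.
example_weights = np.array([1e-3, 2e-3, 5e-3, 1e-4, 1e-7, 0.0, np.nan])
log_weights, n_dropped = negative_log10_weights(example_weights)
summary = summarise(log_weights)
print(f"dropped {n_dropped} entries; summary: {summary}")
```

An `upper_width` much larger than `lower_width` is a simple numerical indication of the long tail towards large -log10(weight).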

If you have time, can you look and let me know if I'm doing something wrong with the Lua config? Or am I missing something fundamental about the physics here? (I'll go and refresh myself with your papers of course in the meantime to try to answer this.)

(cc @mihirkatare)

matthewfeickert commented 3 years ago

Also @FlorianBury, to try to get a weight plot that would give me the ability to roughly compare to your Drell-Yan hypothesis weights plot for the llbb topology (in Figure 2 of your paper https://arxiv.org/abs/2008.10949) I made a first stab in PR #11 for a:

As the Lua config that you used is very similar to mine, would you be able to make any spot-check comments on what I'm doing wrong, if you have time? With

logging::set_level(logging::level::debug);

I can see there are some errors RE: the transfer function evaluation bits that I'm messing up on (running on branch fix/get-llbb-topology-working)

[2021-08-10 05:50:07.810] [warning] Warnings found during validation of parameters for module GaussianTransferFunctionOnEnergyEvaluator::tfEval_bjet1
[2021-08-10 05:50:07.810] [warning]     Unexpected parameter: ps_point
[2021-08-10 05:50:07.810] [warning] These parameters will never be used by the module, check your configuration file.
[2021-08-10 05:50:07.810] [error] Validation of parameters for module GaussianTransferFunctionOnEnergyEvaluator::tfEval_bjet1 failed:
[2021-08-10 05:50:07.810] [error]     Input not found: gen_particle
[2021-08-10 05:50:07.810] [error] Check your configuration file.
[2021-08-10 05:50:07.810] [warning] Warnings found during validation of parameters for module GaussianTransferFunctionOnEnergyEvaluator::tfEval_bjet2
[2021-08-10 05:50:07.810] [warning]     Unexpected parameter: ps_point
[2021-08-10 05:50:07.810] [warning] These parameters will never be used by the module, check your configuration file.
[2021-08-10 05:50:07.810] [error] Validation of parameters for module GaussianTransferFunctionOnEnergyEvaluator::tfEval_bjet2 failed:
[2021-08-10 05:50:07.810] [error]     Input not found: gen_particle
[2021-08-10 05:50:07.810] [error] Check your configuration file.
[2021-08-10 05:50:07.810] [fatal] Validation of modules' parameters failed. Check the log output for more details on how to fix your configuration file.
terminate called after throwing an instance of 'lua::invalid_configuration_file'
  what():  Validation of modules' parameters failed. Check the log output for more details on how to fix your configuration file.

swertz commented 3 years ago

Hi Matthew, about your first Drell-Yan example: I've had a look and things look pretty good to me, I could not spot any inconsistency.

The distribution also looks quite reasonable to me. We've always seen such skewed distributions of -log(W). I don't know of any argument that would justify whether those shapes are expected or not. In general, if x ~ p, then the distribution of p(x) (or of -log(p(x)) here, i.e. some "event entropy") is not "universal" and really depends on p in the first place, no?
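
A small numerical illustration of this point, as a sketch with toy densities rather than MoMEMta weights: for three simple choices of p, the quantity -log p(x) with x drawn from p has three quite different distributions. All names below are made up for the example. For a standard normal, -log p(x) is a shifted and scaled chi-square variable with one degree of freedom, which already shows a peak with a long tail, and its mean is the differential entropy of p.

```python
import numpy as np


def skewness(sample):
    """Sample skewness: the third standardised moment."""
    centred = sample - sample.mean()
    return np.mean(centred**3) / np.mean(centred**2) ** 1.5


rng = np.random.default_rng(seed=1)
n_events = 100_000

# Case 1: x ~ standard normal, so -log p(x) = x^2 / 2 + log(sqrt(2 pi)).
x_normal = rng.normal(size=n_events)
nlp_normal = 0.5 * x_normal**2 + 0.5 * np.log(2.0 * np.pi)

# Case 2: x ~ Exp(1), so -log p(x) = x.
nlp_exponential = rng.exponential(size=n_events)

# Case 3: x ~ Uniform(0, 1), so -log p(x) = 0 for every event.
nlp_uniform = np.zeros(n_events)

for name, values in [("normal", nlp_normal), ("exponential", nlp_exponential)]:
    print(f"{name:12s} mean={values.mean():.3f} skewness={skewness(values):.2f}")
print(f"{'uniform':12s} spread={nlp_uniform.std():.1f}")
```

The first two cases are strongly right-skewed while the third has no spread at all, so the shape of the -log p(x) distribution depends on p itself.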

You can compare with the shapes in pp. 107-108 of this thesis: https://inspirehep.net/files/94258ee627e914a1d48dd1c7e2c9a21e. Although not in the same phase space, the weight distributions all feature a peak and a long skewed tail.

matthewfeickert commented 3 years ago

> Hi Matthew, about your first Drell-Yan example: I've had a look and things look pretty good to me, I could not spot any inconsistency.

Thanks very much for taking the time to check @swertz — I appreciate it!

> The distribution also looks quite reasonable to me. We've always seen such skewed distributions of -log(W). I don't know of any argument that would justify whether those shapes are expected or not. In general, if x ~ p, then the distribution of p(x) (or of -log(p(x)) here, i.e. some "event entropy") is not "universal" and really depends on p in the first place, no?
>
> You can compare with the shapes in pp. 107-108 of this thesis: https://inspirehep.net/files/94258ee627e914a1d48dd1c7e2c9a21e. Although not in the same phase space, the weight distributions all feature a peak and a long skewed tail.

This is all good to hear. The more I think about it, the more this distribution makes sense: the topology that I've invented for the example is just two leptons, and so should be quite clean, and I'm comparing a physics hypothesis that directly matches the generating process for the observations. So having extremely peaked distributions under these conditions seems reasonable, as you have pointed out. (Though I will admit that I haven't developed much intuition about the distributions of -log(weight_hypothesis) beyond the obvious point that smaller values represent more compatibility between the physics hypothesis and the observations for the given topology.)

You are of course also correct in your point on the distribution not being universal.

Also thanks for the link to @BrieucF's thesis! I'll read over it in more depth, but Figures 4.2 and 4.3 are indeed nice references (especially seeing the distributions of simulation for various hypothesis weights). :+1: