tpgillam / mt2

Stransverse mass computation, compatible with numpy.
MIT License
4 stars 3 forks source link

Resolve Lally vs LesterNachman differences in test-suite #34

Open kesterlester opened 3 years ago

kesterlester commented 3 years ago

Reminder to self that it would be a good idea for @kesterlester to check out the events and MT2 discrepancies talked about in Appendix B of Lally's https://arxiv.org/pdf/1509.01831.pdf ....

kesterlester commented 3 years ago

To do this @kesterlester will need @tpgillam to have first solved issue #35 .

tpgillam commented 3 years ago

I have written a fuzz-test, which very quickly identified some very different cases.

For example, on MacOS I see:

bad_args = (
    29.359184426449335, -41.66748425813935, 2.4727555091581337, 
    68.86316293078755, 88.10079057588052, -39.0193751229873, 
    27.644883008122292, -29.00097683721164, 
    69.60314069891137, 28.917825385644313)
print(mt2(*bad_args))  # 101.71793913068785
print(mt2_lally(*bad_args))  # 100.58558700648662

Interestingly, varying the precision argument on mt2_lally makes no difference to the answer, even when set to very large answers. For mt2 the difference in final result can be quite significant.

kesterlester commented 3 years ago

When two “simplistic” (minimisation-based) MT2 calculators disagree over an MT2 value, then 9 times out of 10 it is the one predicting the smaller value which is correct. It is a minimisation problem, of course. So the bets here would be on Lally to have the best answer.

However the Lally calculator is not plain-old-simplistic minimisation based, so my prior is less useful.

In general it would be good to ensure that any “bad_args” (if/when they are found — or just a representative handul of them of there are loads) were collated in a bad-cases library for use in subsequent unit tests.

On 15 Feb 2021, at 22:06, tpgillam notifications@github.com wrote:

I have written a fuzz-test, which very quickly identified some very different cases.

For example, on MacOS I see:

bad_args = (

29.359184426449335, -41.66748425813935, 2.4727555091581337 ,

68.86316293078755, 88.10079057588052, -39.0193751229873 ,

27.644883008122292, -29.00097683721164 ,

69.60314069891137, 28.917825385644313 )

print(mt2(bad_args)) # 101.71793913068785 print(mt2_lally(bad_args)) # 100.58558700648662 Interestingly, varying the precision argument on mt2_lally makes no difference to the answer, even when set to very large answers. For mt2 the difference in final result can be quite significant.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub, or unsubscribe.

tpgillam commented 3 years ago

There are indeed loads. Currently my random sampler is almost guaranteed to fail after more than about 100 draws.

tpgillam commented 3 years ago

Here is the fuzz test, which is currently skipped as it would fail: https://github.com/tpgillam/mt2/blob/4aaf83c685e7e2c197e243bf713aec76f4cf9834/tests/test_mt2_lally.py#L36-L74

tpgillam commented 3 years ago

Here is a visualisation of the bad_args event above. I'll push the code to generate this plot soon, once it's tidied up a bit.

image

In this case the Lester method is correct.

kesterlester commented 3 years ago

One nill! [to be said with two syllables on each word]

kesterlester commented 3 years ago

5 hours ago is 3:45 am!

kesterlester commented 3 years ago

Lally makes a point somewhere in his paper that in some events the rate-of-change-of-size of ellipse wrt M is very large, and he comments that some of the places where he thinks I am wrong is in such cases. Is it possible to add a +- to your ellipses. I.e. show not only lester(M2T) but also lester(MT2+-small perturbation). to check that things just don't look unreasonably good for me merely because of a rounding coincidence.

tpgillam commented 3 years ago
image

This is +- 0.1% in MT2 for each method. Code to reproduce is now pushed:

PYTHONPATH=. python examples/plot_constraints.py
Rupt commented 3 years ago

In case it is of any use, here is a csv with some examples where test_collinear_endpoint_cases reports 1% relative error or more with the Lally calculator: lally_1em2.zip.

kesterlester commented 3 years ago

That is definitely going to be useful.

I am not doing a good job of staying on top of all the different topics I’m working on at the moment (I’m never good at splitting my time) but I’ve been meaning to acknowledge that I’m pleased/impressed that you’ve found a way of speeding up my alg. Though I have not actually looked yet ( :( ) to see what heavy lifiting you took out of the main loop. I am a bit embarassed that it was so easy for you to improve, though, as although I wrote the alg primarily to be “maintainable” rather than “fast” (i.e. I was satisfied that it was both faster and more accurate and more general than anything else available at the time I published it, so I didn’t then focus on making it faster still but instead focussed on making it (to my taste) as maintainable as I could — by which I mean the code had to speak to me (if not to others) in terms of saying what it was doing).

But, despite the fact that I was optimising on logical sense (by CGL’s definition rather than Tom’s, who found my code impenetrable!) if you had asked me whtehr I thought I was wasting a lot of cycles in the hottest bit of the code I’d have probably told you that I didn’t think I was. At least not a great deal. So I’m pleased/surprised/embarassed it was so easy to improve.

Can you add this zip file (or the CSVs within it) to the git rep? It will be safer and more useful long term there than in an email thread. Can you issue a pull request? Now I wound like tom! Last time he asked me to make one of those for somethign that he could have typed into vi and pressed save git add git commit in 10 seconds I quietly fumed. But actually since then I’ve come to see there is some merit in the tracability of who did what.

On 10 Mar 2021, at 21:23, Rupert Tombs notifications@github.com wrote:

In case it is of any use, here is a csv with some examples where test_collinear_endpoint_cases reports 1% relative error or more with the Lally calculator: lally_1em2.zip.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub, or unsubscribe.

Rupt commented 3 years ago

@kesterlester Very nice to hear - I certainly had the advantage of a fresh perspective.

The file is available on the git issue here https://github.com/tpgillam/mt2/issues/34#issuecomment-796136779, so I don't think it needs to be committed (but am open to persuasion.)

Yes Tom puts my keyboard skills to shame also ;)

kesterlester commented 3 years ago

As a result of your email I’ve wasted far too much time reading the C++17 standard, and I’ve concluded that the last two lines of this screengrab are the most bizarre part:

On 10 Mar 2021, at 22:09, Rupert Tombs notifications@github.com wrote:

@kesterlester Very nice to hear - I certainly had the advantage of a fresh perspective.

The file is available on the git issue here #34 (comment), so I don't think it needs to be committed (but am open to persuasion.)

Yes Tom puts my keyboard skills to shame also ;)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

kesterlester commented 3 years ago

image

Rupt commented 3 years ago

Vile heresy against our Lord Kahan!

This would be internally consistent with inf == inf, log(hypot(inf, inf), atan2(inf, inf) etc.

But my standard library returns inf +nan*i for both of these.

tpgillam commented 3 years ago

Ah, but which nan? 🙂

Looking at everywhere where pi is appearing, I was thinking at first that they’d switched to a polar representation... but then they hadn’t. So very strange.