Closed EdLeming closed 8 years ago
I'm not sure if you two get automatic notifications for pull requests? @ashleyrback @jwaterfield
Yes, I got a notification already. I was just having a look. Looks nice Ed. Thanks for having a look into this.
Two questions straight-off:
Hmmmm, ok. I probably should have checked this earlier....
The attached has 10keV bins. Execution time:
Pois = 520 s Gaus = 13 s
I think this is because of the factorial in the denominator of the poisson function. Once we get up to energies of 2 MeV = 400 photons. It's therefore calculating 400! at each function call. Not only is 400! an outrageously large number, it requires 400 multiplications (unless scipy is doing something clever).
That's odd, scipy is usually quite good at doing clever stuff to speed things up, but maybe not so much here.
The factorial in the denominator could become a major problem for say B8 where we want to go up like 10 MeV (well we might be able to get away with 5 MeV).
One solution might be to make a new class each for Gaussian and Poisson smearing, that both inherit from a common energy smearing class. You could then recombine them in a sort of combo smear class:
class ComboSmearEnergyLy(GaussianSmearEnergyLy, PoissonSmearEnergyLy):
# Do stuff
that calls the Poisson method for any bins below say 1 MeV, otherwise uses the Gaussian method. The only thing we will have to be careful of is making sure there are no problems in the transition from one method to the other.
That is an option. I would be hesitant to do that before we understand the discrepancy in tails.
Writing the poisson function myself I get identical results / speeds to scipy when I apply the Stirling approximation for the factorial calculation. However, this does require using the decimal module to round numbers to 10 decimal places - if you don't do this numbers become too large and overload the python floating point container. It may be that the tail anomaly is simply a result of the rounding error... Particularly as gaus should be equivalent to poisson at these high statistics levels.
I'll have a fresh look on Monday.
So I was having a look at this again today. I wanted to convince myself that, in our case, the gaussian approximation of the poission distribution at high NHit is a reasonable one.
I tested the gaussian-ness of smeared distributions by smearing delta functions (i.e. a load of events all of one energy) with both the methods. I then fitted gaussians to the resulting dists and took a chi2. The results are in the plot below. Interesting to note, the chi2 of the gaussian smeared distribution is ~3 orders of magnetude better than the poisson one at 5MeV (I can send on the root file with all the histos/fits etc if anyone's interested). I think this is strong motivation to use a poisson smear across the full energy range. After all, it's tails smearing to high energy that is our biggest worry for 2nuBB limit setting and it appears the gauss approx is underestimating this.
Considering this our best option is to find a way to speed up the computation time of the calc_poisson_energy method. After an afternoon of testing it appears to be limited by the multiplication of extemely large numbers - lambda**Nhits can be as large as 1e6500 for the largest energy bins. Does anyone have any tricks for speeds up that they can recommend @mjmottram? The gmpy2 module looks promising, but I'm having issues installing it.
I don't have anything to input on this I'm afraid, other than to mention that I can install gmpy2 (with some warnings) via pip.
Ok, I got it up to speed. It's now equivalent (even slightly faster... ) than the gaussian smear. @ashleyrback @jwaterfield
@EdLeming can you merge snoplusuk/master to resolve any conflicts please, and then I'll get this PR merged as well. I think @mjmottram is quite keen to start using it.
Should be good to go!
A few unittest problems:
======================================================================
ERROR: test_weight_energy_ly (test_smear.TestSmear)
Tests the weighted gaussian smearing method for energy
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ashley/snoplus/software/echidna/echidna-pr/echidna/test/test_smear.py", line 102, in test_weight_energy_ly
par="energy_mc")
File "echidna/core/smear.py", line 269, in weighted_smear
mean))
File "echidna/core/smear.py", line 144, in calc_poisson_energy
self._log_factorial[photons] = numpy.sum(numpy.log(np.arange(1,(photons+1))))
NameError: global name 'np' is not defined
======================================================================
ERROR: test_weight_energy_res (test_smear.TestSmear)
Tests the weighted gaussian smearing method for energy resolution.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ashley/snoplus/software/echidna/echidna-pr/echidna/test/test_smear.py", line 188, in test_weight_energy_res
par="energy_mc")
File "echidna/core/smear.py", line 508, in weighted_smear
for energy in np.arange(low, high, widths[i]):
NameError: global name 'np' is not defined
======================================================================
FAIL: test_random_energy_ly (test_smear.TestSmear)
Tests the random gaussian smearing method for energy light yield.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ashley/snoplus/software/echidna/echidna-pr/echidna/test/test_smear.py", line 154, in test_random_energy_ly
% (expected_sigma, sigma))
AssertionError: Expected sigma 0.111803398875, fitted sigma 0.10008393979
======================================================================
FAIL: test_random_energy_res (test_smear.TestSmear)
Tests the random gaussian smearing method for energy resolution.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ashley/snoplus/software/echidna/echidna-pr/echidna/test/test_smear.py", line 240, in test_random_energy_res
% (expected_sigma, sigma))
AssertionError: Expected sigma 0.0790569415042, fitted sigma 1.00681369374
----------------------------------------------------------------------
I think the np
Exceptions are my fault, these just need to be changed to numpy
- I thought I'd changed all of them.
The other failures are probably because poisson
is True
by default, so you'll need to set poisson
as False
for the expected numbers.
Should there be some additional tests to test the Poisson case explicitly?
Sorry to nag but there are also some pep8 errors, see below, it is important that we try and maintain the pep8 standard throughout echidna.
echidna/core/smear.py:113:53: W291 trailing whitespace
echidna/core/smear.py:118:80: E501 line too long (80 > 79 characters)
echidna/core/smear.py:119:44: W291 trailing whitespace
echidna/core/smear.py:138:17: W291 trailing whitespace
echidna/core/smear.py:143:31: W601 .has_key() is deprecated, use 'in'
echidna/core/smear.py:143:49: E712 comparison to False should be 'if cond is False:' or 'if not cond:'
echidna/core/smear.py:144:75: E231 missing whitespace after ','
echidna/core/smear.py:144:80: E501 line too long (89 > 79 characters)
echidna/core/smear.py:145:80: E501 line too long (101 > 79 characters)
echidna/core/smear.py:145:88: E261 at least two spaces before inline comment
echidna/core/smear.py:145:88: E262 inline comment should start with '# '
echidna/core/smear.py:147:1: W293 blank line contains whitespace
echidna/core/smear.py:267:52: E712 comparison to True should be 'if cond is True:' or 'if cond:'
echidna/core/smear.py:333:44: E712 comparison to True should be 'if cond is True:' or 'if cond:'
echidna/core/smear.py:336:80: E501 line too long (82 > 79 characters)
echidna/core/smear.py:352:53: W291 trailing whitespace
echidna/core/smear.py:357:80: E501 line too long (80 > 79 characters)
echidna/core/smear.py:358:44: W291 trailing whitespace
echidna/core/smear.py:377:17: W291 trailing whitespace
echidna/core/smear.py:382:31: W601 .has_key() is deprecated, use 'in'
echidna/core/smear.py:382:49: E712 comparison to False should be 'if cond is False:' or 'if not cond:'
echidna/core/smear.py:383:69: E231 missing whitespace after ','
echidna/core/smear.py:383:80: E501 line too long (83 > 79 characters)
echidna/core/smear.py:384:80: E501 line too long (98 > 79 characters)
echidna/core/smear.py:384:85: E261 at least two spaces before inline comment
echidna/core/smear.py:384:85: E262 inline comment should start with '# '
echidna/core/smear.py:509:52: E712 comparison to True should be 'if cond is True:' or 'if cond:'
echidna/core/smear.py:575:44: E712 comparison to True should be 'if cond is True:' or 'if cond:'
echidna/core/smear.py:578:80: E501 line too long (82 > 79 characters)
echidna/core/smear.py:592:53: W291 trailing whitespace
echidna/core/smear.py:596:80: E501 line too long (80 > 79 characters)
echidna/core/smear.py:597:44: W291 trailing whitespace
Once the unittests pass and pep8 errors are fixed, though it is good to go.
OK, should be good to go now. When writing the new unit tests I noticed a couple of issues with the random_smear poisson functionality. All works as expected now.
Is that a link to anywhere in particular @ashleyrback ? looks broken to me. Merge? :+1:
Looks good to me, merging!
Only added poisson smearing for energy spectra.. Not sure how to do so for radial as poisson is inherently statistical i.e. probability of no of events occurring. In energy this is obvious as events = Nhits, in radius I'm not sure how to make it work....
Also interesting is that using a poisson smear has increased the contribution of tails see attached... One to think about.