Closed malb closed 10 years ago
Branch pushed to git repo; I updated commit sha1. New commits:
8ebe5c7 | expanded documentation |
Based on discussions locally at SD59 and the latest check in I am setting this back to needs review.
Changed branch from u/malb/t15915_discrete_gaussians to u/saraedum/ticket/15915
A few questions (line numbers are all in the version I pushed):
Should it really be x-2 in line 518 of dgs_gauss.h?
What is the TODO in dgs_gauss_dp.c?
self->c_r = 0.0;
after an abort() in line 114 and 130 of dgs_gaus_dp.c?
What is "this B" in line 186 of discrete_gaussian_lattice.py?
Changed branch from u/saraedum/ticket/15915 to u/malb/t15915_discrete_gaussians
Author: Martin Albrecht
Reviewer: Julian Rueth
Changed branch from u/malb/t15915_discrete_gaussians to u/saraedum/ticket/15915
Replying to @saraedum:
A few questions (line numbers are all in the version I pushed):
- Should it really be x-2 in line 518 of dgs_gauss.h?
It should read x-c, fixed.
- What is the TODO in dgs_gauss_dp.c?
This was a left-over, removed.
self->c_r = 0.0;
after an abort() in line 114 and 130 of dgs_gaus_dp.c?
Removed.
- What is "this B" in line 186 of discrete_gaussian_lattice.py?
self.B, fixed this.
New commits:
da3efcf | fixed issues identified by Julian Rüth |
Changed branch from u/saraedum/ticket/15915 to u/malb/t15915_discrete_gaussians
PDF docs don't build
! Missing $ inserted.
<inserted text>
$
l.2746
?
! Emergency stop.
<inserted text>
$
l.2746
in ./src/doc/output/latex/en/reference/stats/stats.log
Branch pushed to git repo; I updated commit sha1. New commits:
025ac61 | the future isn't now: removed some utf-8 stuff + fixed latex error |
Fixed. Setting it back to "positive review" (?)
Changed branch from u/malb/t15915_discrete_gaussians to 025ac61
Various builbots report:
sage -t --long src/sage/stats/distributions/discrete_gaussian_integer.pyx
**********************************************************************
File "src/sage/stats/distributions/discrete_gaussian_integer.pyx", line 263, in sage.stats.distributions.discrete_gaussian_integer.DiscreteGaussianIntegerSampler.__init__
Failed example:
min(l), max(l), abs(mean(l)-2.5) < 0.01
Expected:
(-2, 7, True)
Got:
(-3, 7, True)
Also, on 32-bit linux:
sage -t --long src/sage/stats/distributions/discrete_gaussian_integer.pyx
**********************************************************************
File "src/sage/stats/distributions/discrete_gaussian_integer.pyx", line 38, in sage.stats.distributions.discrete_gaussian_integer
Failed example:
x=0; l.count(x), ZZ(round(n*exp(-x^2/(2*sigma^2))/norm_factor))
Expected:
(13350, 13298)
Got:
(13355, 13298)
**********************************************************************
File "src/sage/stats/distributions/discrete_gaussian_integer.pyx", line 40, in sage.stats.distributions.discrete_gaussian_integer
Failed example:
x=4; l.count(x), ZZ(round(n*exp(-x^2/(2*sigma^2))/norm_factor))
Expected:
(5543, 5467)
Got:
(5479, 5467)
**********************************************************************
File "src/sage/stats/distributions/discrete_gaussian_integer.pyx", line 42, in sage.stats.distributions.discrete_gaussian_integer
Failed example:
x=-10; l.count(x), ZZ(round(n*exp(-x^2/(2*sigma^2))/norm_factor))
Expected:
(46, 51)
Got:
(53, 51)
**********************************************************************
File "src/sage/stats/distributions/discrete_gaussian_integer.pyx", line 68, in sage.stats.distributions.discrete_gaussian_integer
Failed example:
mean(l).n() # long time
Expected:
0.48678000000...
Got:
0.486650000000000
**********************************************************************
1 item had failures:
4 of 21 in sage.stats.distributions.discrete_gaussian_integer
[88 tests, 4 failures, 7.06 s]
sage -t --long src/sage/schemes/toric/variety.py
[499 tests, 9.23 s]
sage -t --long src/sage/stats/distributions/discrete_gaussian_polynomial.py
**********************************************************************
File "src/sage/stats/distributions/discrete_gaussian_polynomial.py", line 70, in sage.stats.distributions.discrete_gaussian_polynomial.DiscreteGaussianPolynomialSampler
Failed example:
[gs() for _ in xrange(3)]
Expected:
[4*x^7 + 4*x^6 - 2*x^5 + x^4 + 4*x^2 - 2*x + 7, 5*x^7 - 4*x^6 + 3*x^4 - 4*x^3 + x^2 - 4, 2*x^7 + 2*x^6 + x^5 + 2*x^3 - 3*x^2 + x]
Got:
[4*x^7 + 4*x^6 - 4*x^5 + 2*x^4 + x^3 - 4*x + 7, -5*x^6 + 4*x^5 - 3*x^3 + 4*x^2 + x, 2*x^7 + 2*x^6 + 2*x^5 - x^4 - 2*x^2 + 3*x + 1]
**********************************************************************
File "src/sage/stats/distributions/discrete_gaussian_polynomial.py", line 97, in sage.stats.distributions.discrete_gaussian_polynomial.DiscreteGaussianPolynomialSampler.__init__
Failed example:
[gs() for _ in xrange(3)]
Expected:
[4*x^7 + 4*x^6 - 2*x^5 + x^4 + 4*x^2 - 2*x + 7, 5*x^7 - 4*x^6 + 3*x^4 - 4*x^3 + x^2 - 4, 2*x^7 + 2*x^6 + x^5 + 2*x^3 - 3*x^2 + x]
Got:
[4*x^7 + 4*x^6 - 4*x^5 + 2*x^4 + x^3 - 4*x + 7, -5*x^6 + 4*x^5 - 3*x^3 + 4*x^2 + x, 2*x^7 + 2*x^6 + 2*x^5 - x^4 - 2*x^2 + 3*x + 1]
**********************************************************************
2 items had failures:
1 of 5 in sage.stats.distributions.discrete_gaussian_polynomial.DiscreteGaussianPolynomialSampler
1 of 5 in sage.stats.distributions.discrete_gaussian_polynomial.DiscreteGaussianPolynomialSampler.__init__
[18 tests, 2 failures, 0.76 s]
Also some crypto stuff failed, though I don't know if it is related
4 items had failures:
4 of 15 in sage.crypto.lwe
1 of 7 in sage.crypto.lwe.RingLWE.__call__
2 of 7 in sage.crypto.lwe.balance_sample
3 of 7 in sage.crypto.lwe.samples
[100 tests, 10 failures, 0.60 s]
Replying to @vbraun:
Various builbots report:
sage -t --long src/sage/stats/distributions/discrete_gaussian_integer.pyx ********************************************************************** File "src/sage/stats/distributions/discrete_gaussian_integer.pyx", line 263, in sage.stats.distributions.discrete_gaussian_integer.DiscreteGaussianIntegerSampler.__init__ Failed example: min(l), max(l), abs(mean(l)-2.5) < 0.01 Expected: (-2, 7, True) Got: (-3, 7, True)
On what kind of machine is this on? I cannot reproduce it on either 64 or 32 bit Linux.
Also, on 32-bit linux:
I can reproduce those and am currenly testing a fix (some default value was machine dependent).
Also some crypto stuff failed, though I don't know if it is related
4 items had failures: 4 of 15 in sage.crypto.lwe 1 of 7 in sage.crypto.lwe.RingLWE.__call__ 2 of 7 in sage.crypto.lwe.balance_sample 3 of 7 in sage.crypto.lwe.samples [100 tests, 10 failures, 0.60 s]
These are related, thanks.
I should have fixed the 32-bit issues.
I would guess these additional patches need review again as they are not mere documentation/doctest fixes, but changes to the code.
I set the ticket to needs info because the first doctest failure is still unaccounted for (see above).
Last 10 new commits:
a17b818 | follow sage conventions for the author list |
7842276 | added long time to long running doctests |
1939239 | added sig_on()/sig_off() to dgs calls |
da3efcf | fixed issues identified by Julian Rüth |
025ac61 | the future isn't now: removed some utf-8 stuff + fixed latex error |
85fd777 | discrete_gaussian_integer depends on header files |
bae27bd | default choice is should be machine independent |
e791536 | all bern functions should return long only |
61359b0 | C documentation fix |
2c11c5e | doctest fixes due to changed code |
Changed branch from 025ac61
to u/malb/t15915_discrete_gaussians
Branch pushed to git repo; I updated commit sha1. New commits:
bc732a2 | fixed silly typo |
Hi Volker, any idea how I could reproduce that first doctest failure? I don't get it on the 64-bit and 32-bit Linux systems I tried.
Branch pushed to git repo; I updated commit sha1. New commits:
27af749 | Merge branch 'develop' of trac.sagemath.org:sage into u/malb/t15915_discrete_gaussians |
Replying to @malb:
Hi Volker, any idea how I could reproduce that first doctest failure? I don't get it on the 64-bit and 32-bit Linux systems I tried.
It seems my question slipped through the cracks.
Changed branch from u/malb/t15915_discrete_gaussians to caa8ccf
I got this failure now on arando (32-bit linux). It worked on the previous test run, so the failure is non-deterministic. Try running tests in a loop to reproduce.
sage -t --long src/sage/stats/distributions/discrete_gaussian_integer.pyx
**********************************************************************
File "src/sage/stats/distributions/discrete_gaussian_integer.pyx", line 251, in sage.stats.distributions.discrete_gaussian_integer.DiscreteGaussianIntegerSampler.__init__
Failed example:
min(l) == 0-2*1.0, max(l) == 0+2*1.0, abs(mean(l)) < 0.01
Expected:
(True, True, True)
Got:
(True, True, False)
**********************************************************************
1 item had failures:
1 of 48 in sage.stats.distributions.discrete_gaussian_integer.DiscreteGaussianIntegerSampler.__init__
[88 tests, 1 failure, 7.14 s]
Hi, thanks for providing this. I've just committed a "fix". With precision='dp' results are not reproducible (our random seed does not control C's random()). I clarified this a bit more in the documentation and increased the allowed bound. Unfortunately, on the only 32-bit machine I have access to, I cannot reproduce the doctest failure, so I can ask you to run it again?
The name DiscreteGaussianIntegerSampler
might be a bit confusing to some people: at first reading it seems you're sampling from the Gaussian integers (which are discrete).
Wouldn't DiscreteGaussianDistributionSampler
be a more descriptive name? I get the impression that Discrete Gaussian Distribution is normally used for the distributions on the integers you're referring to.
The fact that you truncate at tau*sigma
also seems to disqualify this from legitimately being called "sampling from a discrete gaussian distribution", but I hope this is standard abuse of terminology in the field (and the difference is hard to see).
It also suggests that if the rejection rate is high (it would be for larger values of tau
) it might be worth sampling from a binomial distribution with adapted rejection rates. The shape should be much closer to your desired distribution and hence should lead to a much lower rejection rate. It depends a bit on whether the new rejection probabilities are sufficiently easy to compute to be worth it.
Replying to @nbruin:
The name
DiscreteGaussianIntegerSampler
might be a bit confusing to some people: at first reading it seems you're sampling from the Gaussian integers (which are discrete). Wouldn'tDiscreteGaussianDistributionSampler
be a more descriptive name? I get the impression that Discrete Gaussian Distribution is normally used for the distributions on the integers you're referring to.
The name was chosen to allow for samplers over lattices and polynomial rings as well: DiscreteGaussianLatticeSampler
and DiscreteGaussianPolynomialSampler
.
I can change it to DiscreteGaussianDistributionIntegerSampler
or something if I get a commitment that this change is reviewed in a timely fashion. This patch has had a hard time getting reviewed so I am not opening it up again lightly.
The fact that you truncate at
tau*sigma
also seems to disqualify this from legitimately being called "sampling from a discrete gaussian distribution", but I hope this is standard abuse of terminology in the field (and the difference is hard to see).
The difference is hard to see due to the tail bound. Roughly: Pr[x <- D | |x| > Csigma] <= 2/(C sqrt(2pi)) * exp(-C2/2). This is also the standard approach in the literature.
It also suggests that if the rejection rate is high (it would be for larger values of
tau
) it might be worth sampling from a binomial distribution with adapted rejection rates. The shape should be much closer to your desired distribution and hence should lead to a much lower rejection rate. It depends a bit on whether the new rejection probabilities are sufficiently easy to compute to be worth it.
Sampling from other distributions has been considered in the literature. The "sigma2+logtable" algorithm is a result of such an investigation in:
[DDLL13] Léo Ducas, Alain Durmus, Tancrède Lepoint and Vadim Lyubashevsky. Lattice Signatures and Bimodal Gaussians; in Advances in Cryptology – CRYPTO 2013; Lecture Notes in Computer Science Volume 8042, 2013, pp 40-56 http://www.di.ens.fr/~lyubash/papers/bimodal.pdf
Running sage -t --global-iterations 100 src/sage/stats/distributions/discrete_gaussian_integer.pyx
on 32-bit Archlinux passes.
Here are the last circa 210 lines of output in abbreviated form:
...
sage -t src/sage/stats/distributions/discrete_gaussian_integer.pyx
[82 tests, 16.02 s]
----------------------------------------------------------------------
Doctests interrupted: 1/1 files tested
Doctests interrupted: 2/1 files tested
...
Doctests interrupted: 98/1 files tested
Doctests interrupted: 99/1 files tested
----------------------------------------------------------------------
Total time for all tests: 16.1 seconds
cpu time: 1625.9 seconds
cumulative wall time: 1627.1 seconds
sage -t src/sage/stats/distributions/discrete_gaussian_integer.pyx
[82 tests, 15.99 s]
----------------------------------------------------------------------
Doctests interrupted: 1/1 files tested
Doctests interrupted: 2/1 files tested
...
Doctests interrupted: 98/1 files tested
Doctests interrupted: 99/1 files tested
----------------------------------------------------------------------
Total time for all tests: 16.1 seconds
cpu time: 1641.9 seconds
cumulative wall time: 1643.1 seconds
Apart from this ticket's branch, I have also merged those of tickets #15316 and #16480, as I need those for building.
Thanks for running those. As far as I can tell, this means for you, too, doctests pass.
I renamed DiscreteGaussianIntegerSampler
to DiscreteGaussianDistributionIntegerSampler
. Nils, can you review this commit?
Last 10 new commits:
e791536 | all bern functions should return long only |
61359b0 | C documentation fix |
2c11c5e | doctest fixes due to changed code |
bc732a2 | fixed silly typo |
27af749 | Merge branch 'develop' of trac.sagemath.org:sage into u/malb/t15915_discrete_gaussians |
92e61fd | Merge branch 'develop' of trac.sagemath.org:sage into u/malb/t15915_discrete_gaussians |
caa8ccf | Merge branch 'develop' of trac.sagemath.org:sage into u/malb/t15915_discrete_gaussians |
2051027 | increase allowed bound in testcase and clarified documentation |
c9d3c9c | DiscreteGaussianIntegerSampler -> DiscreteGaussianDistributionIntegerSampler |
a1b711f | fixed bug in iter_vectors and wrong doctests which accepted its results |
Changed branch from caa8ccf
to u/malb/t15915_discrete_gaussians
Changed branch from u/malb/t15915_discrete_gaussians to a1b711f
See #16968 for a follow-up
This ticket adds samplers for discrete Gaussian distributions to Sage.
Component: statistics
Keywords: sd59
Author: Martin Albrecht
Branch:
a1b711f
Reviewer: Julian Rueth
Issue created by migration from https://trac.sagemath.org/ticket/15915