sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.44k stars 481 forks source link

add discrete Gaussian samplers to Sage #15915

Closed malb closed 10 years ago

malb commented 10 years ago

This ticket adds samplers for discrete Gaussian distributions to Sage.

Component: statistics

Keywords: sd59

Author: Martin Albrecht

Branch: a1b711f

Reviewer: Julian Rueth

Issue created by migration from https://trac.sagemath.org/ticket/15915

malb commented 10 years ago

Branch: u/malb/15915_discrete_gaussians

malb commented 10 years ago

Commit: 8872ac1

malb commented 10 years ago

New commits:

2978c29bernoulli distributions
14f95a6first discrete gaussian sampler implemented
3faa956slow uniform+online implementation of discrete gaussians added
6bb9421first version of mp versions of discrete gausians done
a94c2ddadded double precision variants, but these are not much faster
c6d3491added ZZ.random_element(gaussian) and fixed LWE to use new sampler
8872ac1added parameter c for discrete Gaussians
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 8872ac1 to 03c6dc4

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2413fcddiscrete gaussians (mp version) support arbitrary c now
37049f7more careful considerations of tau + documentation
03c6dc4fixed precision='dp'
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

853cff3documentation clarification
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 03c6dc4 to 853cff3

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 853cff3 to 1829e21

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1829e21Merge branch 'develop' of trac.sagemath.org:sage into discrete_gaussian
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

6aa658cDDLL23 -> DDLL13
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 1829e21 to 6aa658c

malb commented 10 years ago
comment:7

Daniel Cabarcas looked at the C part of the code, here is what he said (and my reply which I sent via e-mail)

I looked at the code and I think it is an important improvement over what we had before. I looked in detail at the uniform+table and the uniform+online versions and I > think they are correct. I don't know the details of the logtable algorithms so I didn't looked at the corresponding code. I have a few comments so far:

The name of the library is DGS for discrete Gaussian samplers. However, all implementations are over the integers, and not over a more general lattice. I was wondering if it would be wiser to call it discrete Gaussian over the integers, foreseeing implementations over other lattices.

My idea was to include the discrete Gaussian sampler over lattices (+ cosets) into this implementation. Hence the name. However, for that I needed to do #15976 first. I plan to go back to the samplers over the lattices once both tickets are merged (and I can rely on them).

Have you consider implementing other samplers? if you are looking for speed Knuth-Yao is quite impressive, and if you are looking for a time-memory tradeoff the > discrete Ziggurat algorithm is nice. Moreover, in terms of flexibility of the library > design, do you think it would be easy to add other samplers in the future which require perhaps other parameters?

I haven't thought about Knuth-Yao or Ziggurat, do you have any good references at hand? I'm happy to implement more samplers, might as well be comprehensive. I was also thinking about extending the logtable approach to arbitrary Gaussian parameters sigma.

As for the interface, I guess I should check the two algorithms you mentioned above to see what would need to be generalised. If other parameters are needed in the future I can simply add another constructor which accepts those parameters, that was my thinking anyway.

I was wondering about the source of randomness. You use standard c functions, which > is probably ok for many tests. However, you know for sure how critical is the source of randomness in crypto implementations. Moreover, there are huge differences in speed between the standard random c function and a secure random function, thus, for testing efficiency of Gaussian samplers it is crucial to have the flexibility to choose your source of randomness. For this reason, in some implementation we did, we had the source of randomness as a parameter. Perhaps, you'd like to consider doing that.

I currently have two versions, one using GMP's randstate and one using libc. GMP should give cryptographically strong pseudorandom numbers. In my experience the difference in running time is - as you write - pretty much down to how long it takes to sample random numbers, so I figured I offer one with a good random number generator and one with a fast which is good enough for testing.

malb commented 10 years ago
comment:8

So, any volunteer reviewers for the Sage part?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

01bdc19Merge branch 'develop' of trac.sagemath.org:sage into discrete_gaussian
0b87bffMerge branch 'develop' of trac.sagemath.org:sage into discrete_gaussian
dcfaf52document fact that some random bits are cached
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 6aa658c to dcfaf52

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from dcfaf52 to 32ba29a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

32ba29acheck for integer overflows
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 32ba29a to 9ea5bc4

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9ea5bc4documentation clean up
rwst commented 10 years ago
comment:13
sage -t --long src/sage/rings/integer_ring.pyx  # 1 doctest failed
sage -t --long src/sage/crypto/lwe.py  # 23 doctests failed
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f37edb6Merge branch 'u/malb/15915_discrete_gaussians' of trac.sagemath.org:sage into discrete_gaussian
0746a4efix doctest failures
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 9ea5bc4 to 0746a4e

malb commented 10 years ago
comment:15

Doctests fixed.

malb commented 10 years ago

Description changed:

--- 
+++ 
@@ -1 +1 @@
-
+This ticket adds samplers for discrete Gaussian distributions to Sage.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

853cff3documentation clarification
1829e21Merge branch 'develop' of trac.sagemath.org:sage into discrete_gaussian
6aa658cDDLL23 -> DDLL13
01bdc19Merge branch 'develop' of trac.sagemath.org:sage into discrete_gaussian
0b87bffMerge branch 'develop' of trac.sagemath.org:sage into discrete_gaussian
dcfaf52document fact that some random bits are cached
32ba29acheck for integer overflows
9ea5bc4documentation clean up
f15c339Merge branch 'develop' of trac.sagemath.org:sage into discrete_gaussian
fba8f61Merge branch 'develop' of trac.sagemath.org:sage into discrete_gaussian
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 0746a4e to fba8f61

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

6dcd131include stats dir
bf03fd7fixed doctest failures
8711050moved & renamed discrete gaussian samplers
7926faeadded Gaussian samplers for polynomials and lattices
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from fba8f61 to 7926fae

saraedum commented 10 years ago

Changed keywords from none to sd59

robertwb commented 10 years ago
comment:19

Some comments at https://github.com/sagemath/sage/pull/18 (I didn't look at the .c files yet.)

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 7926fae to 6ace4d7

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

6ace4d7pass polynomial ring explicitly to distributions over polynomial rings
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a910589addressed comments at https://github.com/sagemath/sage/pull/18
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 6ace4d7 to a910589

miguelmarco commented 10 years ago
comment:22

I don't feel able to review all the code, but testing it seems to work ok.

I just think it would make sense for the samplers over the polynomials to accept the polynomial ring as an input.

malb commented 10 years ago
comment:23

Replying to @miguelmarco:

I just think it would make sense for the samplers over the polynomials to accept the polynomial ring as an input.

Already taken care off :)

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from a910589 to 18279ac

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

042530dadded doctest for plotting histograms
18279acfaster discrete gaussians over lattices if lattice is trivial
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1112e41ZZ[x]/ via FLINT
45d0b44Merge branch 'discrete_gaussian' into flint_zz
d5cc4a2added random_element method
833723cMerge branch 'discrete_gaussian' into flint_zz
e6a80b2implemented lift()
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 18279ac to e6a80b2

saraedum commented 10 years ago
comment:26

Some comments/questions:

In the fmpz_poly linkage: What is the except -2 good for? Do these flint methods return -2 on error? Shouldn't you use sig_on()/sig_off() rather?

What is the reason for the following in Polynomial_integer_dense_flint?

- elif not isinstance(x, list):
-    x = [x] # constant polynomials
+ else:
+    try:
+       x = list(x)
+    except TypeError:
+       x = [x] # constant polynomials

In polynomial_quotient_ring.py, why don't you use r""" so you can write \ZZ? Also, there are now a lot of unicode characters; even though you put an encoding in the header, my vim does not understand these characters (strangely) and they also confused the patchbot.

I stopped reading at dgs.h. Would you mind relicensing this as GPLv2+? Should such things not be in an SPKG? I wonder if the amount of documentation is sufficient for including it directly in Sage and also if anybody can really review this (I cannot review that part I'm afraid.)

saraedum commented 10 years ago

Changed branch from u/malb/15915_discrete_gaussians to u/saraedum/ticket/15915

saraedum commented 10 years ago

Changed commit from e6a80b2 to d57f528

saraedum commented 10 years ago

New commits:

d57f528minor fixes in docstrings
malb commented 10 years ago
comment:29

It seems you were working off the wrong branch. Your branch has

polynomial_quotient_integer_dense_flint.pyx

in it, which it shouldn't ... oh, looks like I pushed to the wrong branch at some point. Sorry, I'll fix that.

malb commented 10 years ago
comment:30

Replying to @saraedum:

Some comments/questions:

In the fmpz_poly linkage: What is the except -2 good for? Do these flint methods return -2 on error? Shouldn't you use sig_on()/sig_off() rather?

It's the standard signature of clement_* functions used by polynomial templates. Specifiying -2 there allows to throw exceptions in C functions. This code wasn't meant to be in this ticket, though, see above. My apologies. The code in question belonged to #16574 but has been removed.

What is the reason for the following in Polynomial_integer_dense_flint?

- elif not isinstance(x, list):
-    x = [x] # constant polynomials
+ else:
+    try:
+       x = list(x)
+    except TypeError:
+       x = [x] # constant polynomials

This code wasn't meant to be in this ticket, my apologoies. It belongs in #16574. It allows R(tuple(x)) and R(vector(x)) to go through.

In polynomial_quotient_ring.py, why don't you use r""" so you can write \ZZ?

I could do that, I find the other notation more straight-forward, is there a particular reason why one is preferable to the other?

Also, there are now a lot of unicode characters; even though you put an encoding in the header, my vim does not understand these characters (strangely) and they also confused the patchbot.

I can remove them but that'd make the code less readable IMHO.

I stopped reading at dgs.h. Would you mind relicensing this as GPLv2+?

I have no strong feelings either way, what would be the advantage? I am quite happy with the code being BSD, it's not very sophisticated and BSD licensing it allows anyone to use it.

Should such things not be in an SPKG?

The C code really does only one thing, so an SPKG would be a bit of an overkill. Also, SPKG's take votes and extra bureaucracy. I'd like to avoid that, it's hard enough to get this stuff reviewed.

I wonder if the amount of documentation is sufficient for including it directly in Sage and also if anybody can really review this (I cannot review that part I'm afraid.)

Daniel Cabarcas reviewed the C part. I can add more documentation.

malb commented 10 years ago

Changed commit from d57f528 to 4681abe

malb commented 10 years ago

New commits:

4681abecherry picked doctest fix from u/saraedum/ticket/15915
malb commented 10 years ago

Changed branch from u/saraedum/ticket/15915 to u/malb/t15915_discrete_gaussians

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 4681abe to 8ebe5c7