sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.22k stars 428 forks source link

Improve formula for Bell numbers #17157

Closed jdemeyer closed 9 years ago

jdemeyer commented 9 years ago

Improve the error analysis for the implementation of Dobinski's formula to compute the Bell numbers leading to

  1. code which is actually consistent with the mathematical formulas
  2. code which is cleaner and simpler than before
  3. a complete error estimate including the final real approximation
  4. faster code:

Before:

sage: timeit('bell_number(200)', number=2000)
2000 loops, best of 3: 1.45 ms per loop

After:

sage: timeit('bell_number(200)', number=2000)
2000 loops, best of 3: 441 µs per loop

Depends on #16428 Depends on #15300

Component: combinatorics

Author: Jeroen Demeyer

Branch/Commit: 330617c

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/17157

jdemeyer commented 9 years ago

Description changed:

--- 
+++ 
@@ -5,6 +5,7 @@

This can be improved by -1. Removing the 1 / 4 which is actually 0 in Python 2. -2. Not using global functions like ceil() and exp(). -3. Using a RealField instead of the N() function. +1. Using a RealField instead of the N() function. +2. Removing the 1 / 4 which is actually 0 in Python 2 and rounding up. +3. Removing the -n which is nowhere mentioned in the mathematical formula. +4. Not using global functions like ceil() and exp().

jdemeyer commented 9 years ago

Description changed:

--- 
+++ 
@@ -1,11 +1 @@
-In the function `bell_number`, we have the line
-
-```
-return ZZ(ceil(N((b - n) / n2 * exp2(Integer(-1)) - 1 / 4, log(b, 2) + 3)))
-```
-
-This can be improved by
-1. Using a `RealField` instead of the `N()` function.
-2. Removing the `1 / 4` which is actually `0` in Python 2 and rounding up.
-3. Removing the `-n` which is nowhere mentioned in the mathematical formula.
-4. Not using global functions like `ceil()` and `exp()`.
+Improve the error analysis for the implementation of Dobinski's formula to compute the Bell numbers.
jdemeyer commented 9 years ago

Description changed:

--- 
+++ 
@@ -1 +1,5 @@
-Improve the error analysis for the implementation of Dobinski's formula to compute the Bell numbers.
+Improve the error analysis for the implementation of Dobinski's formula to compute the Bell numbers leading to
+
+1. code which is actually consistent with the mathematical formulas
+2. code which is cleaner and simpler than before, hopefully slightly faster
+3. a complete error estimate including the final real approximation
jdemeyer commented 9 years ago

Branch: u/jdemeyer/ticket/17157

jdemeyer commented 9 years ago

Commit: 0266cef

jdemeyer commented 9 years ago

Description changed:

--- 
+++ 
@@ -1,5 +1,20 @@
 Improve the error analysis for the implementation of Dobinski's formula to compute the Bell numbers leading to

 1. code which is actually consistent with the mathematical formulas
-2. code which is cleaner and simpler than before, hopefully slightly faster
+2. code which is cleaner and simpler than before
 3. a complete error estimate including the final real approximation
+4. faster code:
+
+Before:
+
+```
+sage: timeit('bell_number(200)', number=2000)
+2000 loops, best of 3: 1.45 ms per loop
+```
+
+After:
+
+```
+sage: timeit('bell_number(200)', number=2000)
+2000 loops, best of 3: 441 µs per loop
+```
jdemeyer commented 9 years ago

New commits:

0266cefImprove formula for Bell numbers
fredrik-johansson commented 9 years ago
comment:6

You might also consider wrapping flint's arith_bell_number. On this computer it seems to be about twice as fast as Sage's current bell_number(n, algorithm='dobinski') for n = 1000, 2000, 4000, 10000, 20000, 40000.

jdemeyer commented 9 years ago
comment:7

Replying to @fredrik-johansson:

Sage's current bell_number(n, algorithm='dobinski')

With or without this patch?

jdemeyer commented 9 years ago
comment:8

Replying to @fredrik-johansson:

You might also consider wrapping flint's arith_bell_number.

If I do so, will you review this ticket?

fredrik-johansson commented 9 years ago
comment:9

Replying to @jdemeyer:

Replying to @fredrik-johansson:

Sage's current bell_number(n, algorithm='dobinski')

With or without this patch?

Without. The difference is bigger for smaller n of course. Here is a more accurate comparison:

flint:

100 cpu/wall(s): 2.6e-05 2.68e-05 200 cpu/wall(s): 8.2e-05 8.57e-05 400 cpu/wall(s): 0.00039 0.000403 1000 cpu/wall(s): 0.004 0.00441 2000 cpu/wall(s): 0.023 0.026 4000 cpu/wall(s): 0.16 0.172 10000 cpu/wall(s): 0.91 0.949 20000 cpu/wall(s): 4.02 4.173 40000 cpu/wall(s): 18.12 21.109

sage (without patch):

100 5 loops, best of 3: 1.05 ms per loop 200 625 loops, best of 3: 1.52 ms per loop 400 125 loops, best of 3: 1.96 ms per loop 1000 125 loops, best of 3: 7.39 ms per loop 2000 5 loops, best of 3: 38 ms per loop 4000 5 loops, best of 3: 216 ms per loop 10000 5 loops, best of 3: 2.17 s per loop 20000 1 loops, best of 3: 13 s per loop 40000 1 loops, best of 3: 68 s per loop

fredrik-johansson commented 9 years ago
comment:10

You can just compute the sum in Dobinski's formula as an exact fraction (keeping the numerator and denominator separate, not actually dividing rational numbers by successive factorials). Indeed this is what flint does for n < 5000 (for larger n it uses a multimodular algorithm). This simplifies the error analysis.

fredrik-johansson commented 9 years ago
comment:11

Which reminds me that flint is missing two tricks: firstly, batch computing all the powers 1^n, 3^n, 5^n, ..., N^n using sieving (one multiplication for every odd index, then one shift for every even index), and secondly, only working with approximate values of the powers near the end of that list.

The first trick would use much more memory, of course, but that's fine as one can fall back to the multimodular algorithm for very large n anyway.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

58984afRound down instead of up
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 0266cef to 58984af

tscrim commented 9 years ago
comment:13

Replying to @jdemeyer:

Replying to @fredrik-johansson:

You might also consider wrapping flint's arith_bell_number.

If I do so, will you review this ticket?

I will review it if you do so. I'm for including access to multiple implementations.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ce22602Cleanup/reorganization of FLINT imports
fc17740Merge remote-tracking branch 'trac/u/jdemeyer/ticket/16428' into mpir
a669ec7Fix test in cython.py for new FLINT layout.
b69be0dFix some ctypedef in FLINT pxd files.
5851a85Merge commit 'b69be0da26e43f9454e676867afb54de530d2a58' into HEAD
c93c4e9Add FLINT implementation of Bell numbers
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 58984af to c93c4e9

tscrim commented 9 years ago

Dependencies: #16428

tscrim commented 9 years ago
comment:15

Waiting now for all my cython files to recompile.

tscrim commented 9 years ago
comment:16

Some timings:

sage: %timeit bell_number(200)
1000 loops, best of 3: 612 µs per loop
sage: %timeit bell_number(200, algorithm='dobinski')
100 loops, best of 3: 2.2 ms per loop
sage: %timeit bell_number(200, algorithm='gap')
1 loops, best of 3: 21 ms per loop
sage: %timeit bell_number(200, algorithm='mpmath')
10 loops, best of 3: 21.2 ms per loop

sage: %timeit bell_number(1000)
10 loops, best of 3: 40.1 ms per loop
sage: %timeit bell_number(1000, algorithm='dobinski')
10 loops, best of 3: 49.5 ms per loop
sage: %timeit bell_number(1000, algorithm='gap')
1 loops, best of 3: 1.58 s per loop
sage: %timeit bell_number(1000, algorithm='mpmath')
1 loops, best of 3: 737 ms per loop

sage: %timeit bell_number(10000, algorithm='flint')
1 loops, best of 3: 3.57 s per loop
sage: %timeit bell_number(10000, algorithm='dobinski')
1 loops, best of 3: 20 s per loop

So LGTM.

tscrim commented 9 years ago

Reviewer: Travis Scrimshaw

vbraun commented 9 years ago
comment:17

Conflicts with #15300 most likely

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

165f943Rewrite Weyl algebra and reviewer suggestions added.
34c9b4cChanges from the reviewers.
b47af6aA minor tweaking of the doc.
53a9d7eFamilies return output based on order of variable names. Fixed coercion issue.
339b71dFixed some typos.
3ce3f6aMerge branch 'public/algebras/weyl_clifford-15300' of git://trac.sagemath.org/sage into clifford
8698275lifting bilinear forms onto exterior algebras
a002f4bSome tweaks to lifted_bilinear_form and fixed doctest.
ff27bdcfurther fixes
330617cMerge commit 'ff27bdc5d87967d0e7fd5c1305cef4340db816c7' into ticket/17157
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from c93c4e9 to 330617c

jdemeyer commented 9 years ago

Changed dependencies from #16428 to #16428, #15300

jdemeyer commented 9 years ago

Last 10 new commits:

165f943Rewrite Weyl algebra and reviewer suggestions added.
34c9b4cChanges from the reviewers.
b47af6aA minor tweaking of the doc.
53a9d7eFamilies return output based on order of variable names. Fixed coercion issue.
339b71dFixed some typos.
3ce3f6aMerge branch 'public/algebras/weyl_clifford-15300' of git://trac.sagemath.org/sage into clifford
8698275lifting bilinear forms onto exterior algebras
a002f4bSome tweaks to lifted_bilinear_form and fixed doctest.
ff27bdcfurther fixes
330617cMerge commit 'ff27bdc5d87967d0e7fd5c1305cef4340db816c7' into ticket/17157
vbraun commented 9 years ago

Changed branch from u/jdemeyer/ticket/17157 to 330617c