danaj / Math-Prime-Util-GMP

Perl prime number module using XS/GMP
Other
17 stars 9 forks source link

[suggestion] Faulhaber's summation formula #20

Closed trizen closed 4 years ago

trizen commented 4 years ago

I would like to suggest a new function that implements Faulhaber's formula:

Sum_{k=1..n} k^p = 1/(p+1) * Sum_{j=0..p} binomial(p+1, j) * n^(p-j+1) * bernoulli(j)
                 = 1/(p+1) * Sum_{j=0..p} binomial(p+1, p-j) * n^(j+1) * bernoulli(p-j)

Name suggestion:

faulhaber_sum(n,k) -- sum of the k-th powers of the first n positive integers

Reference implementation:

use 5.020;
use warnings;
use experimental qw(signatures);

use Math::GMPz;
use Math::GMPq;

sub _tangent_numbers ($n) {

    state @cache;

    if ($n <= $#cache) {
        return @cache;
    }

    $n <<= 1 if ($n <= 512);

    my @T = (Math::GMPz::Rmpz_init_set_ui(1));

    foreach my $k (1 .. $n) {
        Math::GMPz::Rmpz_mul_ui($T[$k] = Math::GMPz::Rmpz_init(), $T[$k - 1], $k);
    }

    foreach my $k (1 .. $n) {
        foreach my $j ($k .. $n) {
            Math::GMPz::Rmpz_mul_ui($T[$j], $T[$j], $j - $k + 2);
            Math::GMPz::Rmpz_addmul_ui($T[$j], $T[$j - 1], $j - $k);
        }
    }

    push @cache, @T[@cache .. (@T <= 1024 ? $#T : 1024)];

    return @T;
}

sub _bernoulli_numbers ($n) {    # with B_1 = 1/2

    $n = ($n >> 1) + 1;

    state @cache;

    if ($n <= $#cache) {
        return @cache;
    }

    my @B;
    my @T = _tangent_numbers($n);

    my $t = Math::GMPz::Rmpz_init();

    foreach my $k (scalar(@cache) .. 2 * @T) {

        $k % 2 == 0 or $k == 1 or next;

        my $q = Math::GMPq::Rmpq_init();

        if ($k == 0) {
            Math::GMPq::Rmpq_set_ui($q, 1, 1);
            $B[$k] = $q;
            next;
        }

        if ($k == 1) {
            Math::GMPq::Rmpq_set_si($q, 1, 2);
            $B[$k] = $q;
            next;
        }

        # T_k
        Math::GMPz::Rmpz_mul_ui($t, $T[($k >> 1) - 1], $k);
        Math::GMPz::Rmpz_neg($t, $t) if ((($k >> 1) - 1) & 1);
        Math::GMPq::Rmpq_set_z($q, $t);

        # (2^k - 1) * 2^k
        Math::GMPz::Rmpz_set_ui($t, 0);
        Math::GMPz::Rmpz_setbit($t, $k);
        Math::GMPz::Rmpz_sub_ui($t, $t, 1);
        Math::GMPz::Rmpz_mul_2exp($t, $t, $k);

        # B_k = q
        Math::GMPq::Rmpq_div_z($q, $q, $t);

        $B[($k >> 1) + 1] = $q;
    }

    push @cache, @B[@cache .. (@B <= 1024 ? $#B : 1024)];

    return (@cache, (@B > @cache ? @B[@cache .. $#B] : ()));
}

sub faulhaber_sum ($n, $p) {

    # p is a native integer >= 0

    if (ref($n) ne 'Math::GMPz') {
        $n = Math::GMPz->new("$n");
    }

    my @B = _bernoulli_numbers($p);

    my $z = Math::GMPz::Rmpz_init();
    my $q = Math::GMPq::Rmpq_init();
    my $u = Math::GMPz::Rmpz_init_set_ui(1);

    my $sum = Math::GMPq::Rmpq_init();
    Math::GMPq::Rmpq_set_ui($sum, 0, 1);

    foreach my $j (0 .. $p) {

        Math::GMPz::Rmpz_mul($u, $u, $n);

        # Skip when bernoulli(p-j) == 0
        ($p - $j) % 2 == 0 or ($p - $j) == 1 or next;

        Math::GMPz::Rmpz_bin_uiui($z, $p + 1, $p - $j);

        Math::GMPz::Rmpz_mul($z, $z, $u);
        Math::GMPq::Rmpq_mul_z($q, ($p - $j) <= 1 ? $B[$p - $j] : $B[(($p - $j) >> 1) + 1], $z);
        Math::GMPq::Rmpq_add($sum, $sum, $q);
    }

    Math::GMPq::Rmpq_get_num($z, $sum);
    Math::GMPz::Rmpz_divexact_ui($z, $z, $p + 1);
    return $z;
}

foreach my $k (0 .. 800) {
    my $n = 1e10;
    say faulhaber_sum($n, $k);
}

__END__
% time perl faulhaber.pl | md5sum
764d3992c18171252bc56721e0cdd512  -
perl faulhaber.pl  1.33s user 0.00s system 99% cpu 1.333 total
md5sum  0.02s user 0.00s system 1% cpu 1.328 total

Alternative implementation, using bernfrac and not using mpq:

use 5.020;
use warnings;
use experimental qw(signatures);
use Math::Prime::Util::GMP qw(bernfrac);

use Math::GMPz;

sub faulhaber_sum ($n, $p) {

    # p is a native integer >= 0

    if (ref($n) ne 'Math::GMPz') {
        $n = Math::GMPz->new("$n");
    }

    my @B = map {   # should be cached
        [map { Math::GMPz->new($_) } bernfrac($_)]
    } 0 .. $p;

    my $t = Math::GMPz::Rmpz_init();
    my $u = Math::GMPz::Rmpz_init_set_ui(1);

    my $num = Math::GMPz::Rmpz_init_set_ui(0);
    my $den = Math::GMPz::Rmpz_init_set_ui(1);

    foreach my $j (0 .. $p) {

        Math::GMPz::Rmpz_mul($u, $u, $n);

        # Skip when bernoulli(p-j) == 0
        ($p - $j) % 2 == 0 or ($p - $j) == 1 or next;

        Math::GMPz::Rmpz_bin_uiui($t, $p + 1, $p - $j);

        my $B = $B[$p - $j];

        Math::GMPz::Rmpz_mul($t,   $t,   $u);
        Math::GMPz::Rmpz_mul($t,   $t,   $B->[0]);
        Math::GMPz::Rmpz_mul($num, $num, $B->[1]);
        Math::GMPz::Rmpz_addmul($num, $t, $den);
        Math::GMPz::Rmpz_mul($den, $den, $B->[1]);
    }

    Math::GMPz::Rmpz_mul_ui($den, $den, $p + 1);
    Math::GMPz::Rmpz_divexact($num, $num, $den);
    return $num;
}

foreach my $k (0 .. 800) {
    my $n = 1e10;
    say faulhaber_sum($n, $k);
}

__END__
% time perl faulhaber_bernfrac.pl | md5sum
764d3992c18171252bc56721e0cdd512  -
perl faulhaber_bernfrac.pl  8.27s user 0.00s system 99% cpu 8.278 total
md5sum  0.02s user 0.00s system 0% cpu 8.277 total

The function is particularly useful in computing partial sums of the Dirichlet convolution of an arithmetic function f(k) with the positive m-th powers (for a fixed m >= 0):

Sum_{k=1..n} Sum_{d|k} d^m * f(k/d) = Sum_{k=1..n} f(k) * Sum_{j=1..floor(n/k)} j^m
                                    = Sum_{k=1..n} f(k) * F_m(floor(n/k))

where F_m(x) = faulhaber_sum(x, m).

By applying the Dirichlet hyperbola method, we have:

Sum_{k=1..n} Sum_{d|k} d^m * f(k/d) = (Sum_{k=1..floor(sqrt(n))} (f(k) * F_m(floor(n/k)) + S(floor(n/k)) * k^m)) - S(floor(sqrt(n)))*F_m(floor(sqrt(n)))

where

S(n) = Sum_{k=1..n} f(k)

Thanks for considering!

danaj commented 4 years ago

Interesting. Pari/GP has a bernvec(n) call that returns B_0 .. B_2n as rationals. It points out that while it takes a lot of memory, it is much faster than calling bernfrac(k) repeatedly. It also caches them so bernfrac(k) within the range is immediate.

Looks like Pari/GP also has an internal faulhauber call added in 2018, used by the sumformal() command.

I'll look at this soon.

danaj commented 4 years ago

I have a version of bernvec(n) similar to Pari/GP, along with simple caching.

Timing (+/- 0.1s not unusual):

  11.5s   mpz/bernfrac version as shown
   5.7s   adding bernfrac caching in module
   2.6s   mpz/bernfrac version using 1 line caching B instead of recalc each time
   2.3s   in module, without caching
   1.7s   mpq version
   1.6s   mpz/bernfrac version added a gcd in the loop (similar to what mpq does)
   1.0s   in module, with caching (call bernvec once)

It in interesting to see that while caching in the module is useful, there is still a bunch of GMP/string/GMP etc. back and forth that takes up a lot of time. Clearly we want to not constantly make new Math::GMPz objects.

Once computing B is done only as needed, and the gcd is added in the loop, there isn't much time difference between the two versions (this is all before module changes).

Pari/GP's bernvec is ~10x faster, probably because their zeta function is a lot faster.

Putting faulhaber_sum(n,k) in the module results in 1.0s if bernvec is called first, 2.3s if not. The bernvec call notices it is a void call merely requesting precalculation and caching of the Bernoulli numbers (Pari does something similar).

I tried both mpz and mpq and the performance is essentially the same if a gcd reduction is added, which is, I believe, something the mpq calls do internally.

danaj commented 4 years ago

API for bernvec, similar to Pari/GP for simplicity: bernvec(n) returns B[0], B[2], ..., B[2n]. E.g. gp:

? bernvec(5)
[1, 1/6, -1/30, 1/42, -1/30, 5/66]

So .... the question I have is what the return value should be for MPU. I am thinking something like: [ [1,1], [1,6], [-1,30], [1,42], [-1,30], [5,66] ] Which seems reasonably easy to parse. Easiest would be (1,1,1,6,-1,30,1,42,-1,30,5,66) but that seems a bit obscure.

faulhaber_sum(n,p) is easy -- just one huge number. I am restricting p to an unsigned long.

trizen commented 4 years ago

I prefer [ [1,1], [1,6], [-1,30], [1,42], [-1,30], [5,66] ], which allows easy map-ing of the result.

danaj commented 4 years ago

I just implemented and documented something close.

mpu 'say join ", ", map { join "/" ,@$_; } Math::Prime::Util::GMP::bernvec(8);'
1/1, 1/6, -1/30, 1/42, -1/30, 5/66, -691/2730, 7/6, -3617/510

The difference being it returns an array rather than an array reference. It should be easy to map.

At some point I should revive my experiment with direct objects, so everything would just be a Math::GMPz object directly if requested.

danaj commented 4 years ago

Pushed.

trizen commented 4 years ago

Great work as always. Thank you very much!