danaj / Math-Prime-Util-GMP

Perl prime number module using XS/GMP
Other
17 stars 9 forks source link

[suggestion] Two new factorization methods. #19

Open trizen opened 4 years ago

trizen commented 4 years ago

If n is a Carmichael number or a Lucas-Carmichael number, it can be factorized very fast, using the following methods.

These methods may also work for Fermat or Lucas pseudoprimes.

Miller-Rabin factorization method:

use 5.020;
use warnings;

use Math::GMPz;
use ntheory qw(:all);
use experimental qw(signatures);

sub miller_rabin_factor ($n, $tries = 100) {

    if (ref($n) ne 'Math::GMPz') {
        $n = Math::GMPz->new("$n");
    }

    my $D = $n - 1;
    my $s = valuation($D, 2);
    my $r = $s - 1;
    my $d = $D >> $s;

    for (1 .. $tries) {

        my $p = random_prime(1e7);
        my $x = powmod($p, $d, $n);

        for (0 .. $r) {

            last if (($x == 1) || ($x == $D));

            foreach my $i (1, -1) {
                my $g = gcd($x + $i, $n);
                if ($g > 1 and $g < $n) {
                    return $g;
                }
            }

            $x = mulmod($x, $x, $n);
        }
    }

    return 1;
}

say miller_rabin_factor("1729");
say miller_rabin_factor("335603208601");
say miller_rabin_factor("30459888232201");
say miller_rabin_factor("162021627721801");
say miller_rabin_factor("1372144392322327801");
say miller_rabin_factor("7520940423059310542039581");
say miller_rabin_factor("8325544586081174440728309072452661246289");
say miller_rabin_factor("181490268975016506576033519670430436718066889008242598463521");
say miller_rabin_factor("57981220983721718930050466285761618141354457135475808219583649146881");
say miller_rabin_factor("131754870930495356465893439278330079857810087607720627102926770417203664110488210785830750894645370240615968198960237761");

Lucas-Miller factorization method:

use 5.020;
use warnings;

use Math::GMPz;
use ntheory qw(:all);
use experimental qw(signatures);

sub lucas_miller_factor ($n, $I = 1, $k = 100) {

    if (ref($n) ne 'Math::GMPz') {
        $n = Math::GMPz->new("$n");
    }

    my $D = $n + $I;
    my $s = valuation($D, 2);
    my $r = $s - 1;
    my $d = $D >> $s;

    foreach my $i (1 .. $k) {

        my $P = vecmin(1+int(rand(1e6)), urandomm($n));
        my $Q = vecmin(1+int(rand(1e6)), urandomm($n));

        $Q *= -1 if (rand(1) < 0.5);

        next if is_square($P*$P - 4*$Q);

        my ($U, $V, $T) = lucas_sequence($n, $P, $Q, $d);

        foreach my $z (0 .. $r) {

            foreach my $g (gcd($U, $n), gcd($V, $n), gcd(subint($V, $P), $n)) {
                if ($g > 1 and $g < $n) {
                    return $g;
                }
            }

            $U = mulmod($U, $V, $n);
            $V = mulmod($V, $V, $n);
            $V = submod($V, addint($T, $T), $n);
            $T = mulmod($T, $T, $n);
        }
    }

    return 1;
}

say lucas_miller_factor("16641689036184776955112478816668559");
say lucas_miller_factor("17350074279723825442829581112345759");
say lucas_miller_factor("61881629277526932459093227009982733523969186747");
say lucas_miller_factor("122738580838512721992324860157572874494433031849", -1);
say lucas_miller_factor("173315617708997561998574166143524347111328490824959334367069087");
say lucas_miller_factor("2425361208749736840354501506901183117777758034612345610725789878400467");

(The Lucas-Miller method can also factorize Carmichael numbers, with gcd($V-$P, $n)).

Both methods take less than 100 ms to find a factor of n.

Additionally, I think the factor method should try more built-in special-purpose factorization methods, such as fermat_factor, holf_factor and pplus1_factor, after ECM failed for some small parameters.

For example: n = 2425361208749736840354501506901183117777758034612345610725789878400467:

[*] Factoring 2425361208749736840354501506901183117777758034612345610725789878400467 (70 digits)...
[*] Trial division...
[*] No small factors found...
[*] Trying to find more small factors...

[*] Factoring 2425361208749736840354501506901183117777758034612345610725789878400467 (70 digits)...

=> Perfect power check...
=> Miller-Rabin method...
=> Lucas-Miller method...
`-> prime factor: 19760381716819645293083

[*] Factoring 122738580838512721992324860157572874494433031849 (48 digits)...

=> Perfect power check...
=> Lucas-Miller method...
=> Miller-Rabin method...
=> Phi finder method...
=> Fermat's method...
=> HOLF method...
`-> prime factor: 151496259828950613913643

Prime factors: 19760381716819645293083, 151496259828950613913643, 810175650389605457016443

...can be factorized in ~300 ms.

trizen commented 4 years ago

When applied recursively, the Miller factor method is quite powerful and can be used in efficiently factoring Carmichael numbers and Fermat pseudoprimes, considerably improving the performance of is_carmichael() for numbers with large prime factors, as illustrated the example below:

use 5.020;
use strict;
use warnings;

use experimental qw(signatures);

use Math::GMPz;
use Time::HiRes qw(tv_interval gettimeofday);
use Math::Prime::Util::GMP qw(is_carmichael random_prime factor);

my @carmichael = qw(
  1001666920738109212575902709396360286042514352001 1012110594434569157062429115537584128764140943941 1045014306533373443979426951497622020057923922017 1053339635277542901010899345783338291109248790397 1053400344885599997752089151986769357812758525937 1081871379459884096368946874775537563065875391017 1093652437904507213581290745663274524416531797041 1111326326575364503929853737779578834322052994561 1160105468636348772676470950966332780191766612381 1164853402616941470960823486334903792568286730581 1169478576761298055203485556414021530255275077301 1206645039289292193855857798633467625911705635841 1230709981603318945493795541656197641321698480761 1249858227955388552172621952068950260834579657669 1295089465568864771332443758603601283497875890441 1304838383343296301351332718210730910418540099841 1323723065416353981603375428865098514128121783361 1334507029210111630558768567633722157546251884101 1362629169634282454513599267651272391228846209217 1442880177970774675837842334714527207189923633869 1475655025160546730551622606442021644112270679581 1515580920309959359771265671703087896731277559641 1517819344312497001926180544481026663014928099129 1521926932974129829697776969925549190313181060557 1542336381807576068282709165067727735884810110289 1546598611732749529621283418149656197571268187553 1551261910023680514454645273402649048862287955721 1574731995185743884014821308633032960819814719569 1630865563514131020833319250139971979917800117421 1641038479262279231165707500928375696326358962157 1648461229604117978436818144599029070464779903929 1681831447460265429243596072201635808698096875649 1752948163568730942357822638245274530189527486541 1825610392070365728978372712765290230661440955241 1847951910976157931493211389688772255047951193121 1903138663218555298168351000223679063729582633301 1911209222087226616683569618504858539960992626101 1919913241024684439596685877567403973543282058301 1938095786159585297692641714373477374208086585361 1941344911803440835628614263078243789281592097241 1994885910112892417160543141259765571778805614661 2041567579338692112276253813884159202251426969997 2049783314896564916876868117066699501952225423981 2129839922067134866556494380192369883070575450501 2146561696218436056136655650699878282366454748281 2153921550813763658169839760863028717577035689833 2203624044194404575862922180971779264731500762717 2218247317880600512417294260919048247299113511381 2233944309943718013191824622665369160096378169781 2291877460499899632700511492044371874891777971457 2465029062723695554153007324360348162804475515017 2473037106959925756860239057942267677618019774681 2745277156012825602281556284656738040889783464701 2763640213912576944085072620932873625647341154521 2781525878829477717443462491849770567577046257909 2812870984645267378355596963245124494600425521657 2834837872265689221159591973972051090944846803221 2933914115067531211877403255056483242129505340781 2958588558258092039064183649324442745172348999681 3036331783303707471021998086475979984828778730281 3044622894467691361709235607515266112644215559401 3045755349211185116524480655187176820199370046001 3112655986018761636081928502811960304256216053141 3140038650652974048084264669324947767490180759401 3146004728993938349512018215525318403366721524501 3183885840035571334933370555973155489870344027021 3274940120537645324110917760321840317634101174301 3369819519162690706759804374277989704673651979201 3374608711503425903437598648571614798488745349001 3410124705414476996252755235856820453742375953189 3413063914973605807767216535616372816114558577001 3506839809456090964601454508584009982749680377813 3545055407797824215369380536693988929811860367501 3582219872788049654049720891506379398755515294517 3723279336104042954266934930116023300418284215149 3734331384397031772649162052629549747209940645597 3778765017211070644104814022496993730588597900621 3978356265983157260557363449101493759316437206221 4070921399205694514991085455665919531988660987681 4114935346333563199713313135889899302903995980909 4605732378143429117200763026985694835368808136041 4627009145422728201125662517453763977431509554461 4628597382885307102176436079395061085576851777161 4674231013678999828735773677094106988680428995521 4688118307742112297259322681536588394082067317661 4808719728984752515646350421803730734219961062177 5281045407121556179684012203107820632629218193957 5533756586028463407915768413007157188396306058441 5556631632876822517900101350388735174138454102201 6037689292710372179297233631303294449526892176881 6069645758796601723856289876082924098356748498001 6341296674807458499093637099287011591861445826381 6534429999507898673000017407903557467204510707301 6568761156248056352259954810963289694632636018681 6771386480990775580312179715003490759761855921981 6894696646475660252604601736929063995956008036921 7341562369327860894456251510215830820576307362873 7532257207318136830404778705770895003274463258697 7655567065331550492775196642353711733296786692061 7851619899896776641436224550272316252088841723541 7993976331927052192557544467545338223696924278321 8280251106061137584586220625003138647763274951181 8449685033958212387156307576687726254356269361741 8485260771445329398023832674936352842514705514661 8564163810400836894868532956777516038429156343081 8969663843722926216153844851511049959695340879201 9023992928207623160019920586218562439943289627781 9080532131427038524494692299923514782247186452601 9655715733910468728935484639465791449163329439161 9771126748633927168995755453641722237353691505501 3769265196358872226828609335838020920585068076701
  );

sub miller_factor ($n, $tries = undef) {

    # Factor directly when n is a native integer (optional)
    if (Math::GMPz::Rmpz_fits_ulong_p($n)) {
        return factor(Math::GMPz::Rmpz_get_ui($n));
    }

    my $D = Math::GMPz::Rmpz_init();    # n-1
    Math::GMPz::Rmpz_sub_ui($D, $n, 1);

    my $s = Math::GMPz::Rmpz_scan1($D, 0) || return ($n);
    my $r = $s - 1;

    my $d = Math::GMPz::Rmpz_init();    # D >> s
    Math::GMPz::Rmpz_div_2exp($d, $D, $s);

    $tries //= 1 + CORE::int(200 / $s);

    my $x = Math::GMPz::Rmpz_init();
    my $g = Math::GMPz::Rmpz_init();

    for (1 .. $tries) {

        my $p = random_prime(1e7);

        Math::GMPz::Rmpz_set_ui($g, $p);
        Math::GMPz::Rmpz_powm($x, $g, $d, $n);

        foreach my $k (0 .. $r) {

            last if (Math::GMPz::Rmpz_cmp_ui($x, 1) == 0);
            last if (Math::GMPz::Rmpz_cmp($x, $D) == 0);

            foreach my $i (1, -1) {

                ($i < 0)
                  ? Math::GMPz::Rmpz_sub_ui($g, $x, -$i)
                  : Math::GMPz::Rmpz_add_ui($g, $x, +$i);

                Math::GMPz::Rmpz_gcd($g, $g, $n);

                if (Math::GMPz::Rmpz_cmp_ui($g, 1) > 0 and Math::GMPz::Rmpz_cmp($g, $n) < 0) {

                    Math::GMPz::Rmpz_divexact($x, $n, $g);

                    my @g_factors = (Math::GMPz::Rmpz_probab_prime_p($g, 5) ? $g : __SUB__->($g));
                    my @x_factors = (Math::GMPz::Rmpz_probab_prime_p($x, 5) ? $x : __SUB__->($x));

                    return (@g_factors, @x_factors);
                }
            }

            Math::GMPz::Rmpz_powm_ui($x, $x, 2, $n);
        }
    }

    return ($n);
}

sub is_carmichael_fast ($n) {

    if (ref($n) ne 'Math::GMPz') {
        $n = Math::GMPz->new($n);
    }

    state $nm1 = Math::GMPz::Rmpz_init_nobless();
    state $pm1 = Math::GMPz::Rmpz_init_nobless();

    Math::GMPz::Rmpz_sub_ui($nm1, $n, 1);

    my $check_conditions = sub {
        my ($factors) = @_;

        my %seen;
        foreach my $p (@$factors) {

            # Check the Korselt criterion: p-1 | n-1, for all p|n.
            if ($p < ~0) {
                Math::GMPz::Rmpz_divisible_ui_p($nm1, $p - 1) || return;
            }
            else {
                Math::GMPz::Rmpz_set_str($pm1, $p, 10);
                Math::GMPz::Rmpz_sub_ui($pm1, $pm1, 1);
                Math::GMPz::Rmpz_divisible_p($nm1, $pm1) || return;
            }

            if ($seen{$p}++) {    # not squarefree
                return;
            }
        }

        return 1;
    };

    my $omega     = 0;
    my $remainder = $n;

    if (!Math::GMPz::Rmpz_fits_ulong_p($n)) {
        # Optional: run some trial division on n and check the Korselt criterion for small p.
    }

    my @factors = map { ref($_) ? Math::GMPz::Rmpz_get_str($_, 10) : $_ } miller_factor($remainder);

    if (scalar(@factors) > 1) {

        my @primes = grep { Math::Prime::Util::GMP::is_prob_prime($_) } @factors;

        if (@primes) {
            $check_conditions->(\@primes) || return 0;
        }

        if (scalar(@primes) == scalar(@factors)) {
            return 1;
        }
    }

    @factors = map { Math::Prime::Util::GMP::factor($_) } @factors;

    $omega += scalar(@factors);
    $omega >= 3 and $check_conditions->(\@factors);
}

my $time = [gettimeofday];

foreach my $n (@carmichael) {
    is_carmichael_fast($n) || die "error for n = $n";
}

say "is_carmichael_fast() took: ", tv_interval($time), " seconds";

$time = [gettimeofday];

foreach my $n (@carmichael) {
    is_carmichael($n) || die "error for n = $n";
}

say "MPU_GMP is_carmichael() took: ", tv_interval($time), " seconds";

__END__
is_carmichael_fast() took: 0.013925 seconds
MPU_GMP is_carmichael() took: 29.858434 seconds

The Miller factorization method can also be used in some other pseudoprime-testing methods that rely on the prime factorization of n, such as:

use 5.020;
use strict;
use warnings;

use experimental qw(signatures);

use Math::GMPz;
use Time::HiRes qw(tv_interval gettimeofday);
use Math::Prime::Util::GMP qw(is_carmichael random_prime factor);

# Insert the miller_factor() subroutine here.

sub is_super_pseudoprime ($z, @bases) {

    if (ref($z) ne 'Math::GMPz') {
        $z = Math::GMPz->new($z);
    }

    my $n = Math::GMPz::Rmpz_get_str($z, 10);

    # Make sure n is a Fermat psp to the given bases
    Math::Prime::Util::GMP::is_pseudoprime($n, @bases) || return 0;

    my $check_conditions = sub {
        my ($factors) = @_;

        # Using Thomas Ordowski's criterion from A050217.
        my $gcd = Math::Prime::Util::GMP::gcd(map { Math::Prime::Util::GMP::subint($_, 1) } @$factors);

        foreach my $base (@bases) {
            Math::Prime::Util::GMP::powmod($base, $gcd, $n) eq '1'
              or return;
        }

        return 1;
    };

    my @factors = map { ref($_) ? Math::GMPz::Rmpz_get_str($_, 10) : $_ } miller_factor($z);

    if (scalar(@factors) > 1) {

        my @primes = grep { Math::Prime::Util::GMP::is_prob_prime($_) } @factors;

        if (@primes) {
            $check_conditions->(\@primes) || return 0;
        }

        if (scalar(@primes) == scalar(@factors)) {
            return 1;
        }

        # All divisors d|n must satisfy powmod(b, d, n) == b.
        # Return faster if we have a divisor without this property.
        foreach my $base (@bases) {
            foreach my $factor (@factors) {
                Math::Prime::Util::GMP::powmod($base, $factor, $n) eq $base
                  or return 0;
            }
        }
    }

    @factors = map { Math::Prime::Util::GMP::factor($_) } @factors;

    $check_conditions->(\@factors);
}

my @super_psp_2 = qw(
  1000000000000000000014541904243333611971694400533 1000000000046133150000709422509638636432783314409 1087491955144835963716834922169761184625520709721 1186997478763310635838600255029985193860836204801 1279691266253381033792291515734935282222978263681 1357769704637023632269206467756957004490475018373 1441230007443941504875581459735855729415433464651 1544793294505423455619692000782465287943420110801 1679442833746302608970116291630117785554039967291 1899210256601562066805097164461886596419283025421 2152370546436110072334433237746895629118729288321 2327820748742306600342763843713400122692793022001 2828427124769942104793311589800907206123450993069 2854954783483562964208143860790362912027397006761 3207618333880806477874185039690829397021693783041 3364668884964565561176885326417619002641858214053 3433508180124259855950192177458493057315012480001 3537274476659334447831870932179364291623819697161 3563434892028918363271349766773367698319212413201 3613804982222614221276391090439089617206565036961 3695743498595789720953631907098901817764894102689 3974163683013832880578839979920279669106466140801
);

my @super_psp_2_3 = qw(
  1000000000046133150000709422509638636432783314409 2828427124769942104793311589800907206123450993069 9000000000000000000003764611206599380599110122081 9900000000000000000036109628485787187246343073581 10000000000000000000080883777771685521728242499961 90000000000000000000053336763269438143725463532653 99000000000000000000145653723076649944203209769661 99999999999999999900365744898176440759304202468721 99999999999999999990414266131635905024511279919201 99999999999999999990432011029388636164550852426401 99999999999999999999928303019687159579392009826653 100000000000203543939552779491065637628704190723881 900000000000000000000080600572343674710226417355641 999999999999999999991019956609277832468506658103801 999999999999999999999053701638839076889745600836141 1000000000000000000000149553833252173824161590389153 1000000000002561315000002186778176350622336416051009
);

my $time = [gettimeofday];

foreach my $n (@super_psp_2) {
    is_super_pseudoprime($n, 2) || die "error for n = $n";
}

foreach my $n (@super_psp_2_3) {
    is_super_pseudoprime($n, 2, 3) || die "error for n = $n";
}

say "is_super_pseudoprime() took: ", tv_interval($time), " seconds";

$time = [gettimeofday];

foreach my $n (@super_psp_2, @super_psp_2_3) {
    my @f = factor($n);
}

say "GMP_MPU factor() took: ", tv_interval($time), " seconds";

__END__
is_super_pseudoprime() took: 0.069692 seconds
GMP_MPU factor() took: 41.220255 seconds
trizen commented 3 years ago

Another quite interesting factorization method, inspired by Fermat's Little Theorem, which can factorize an integer n when the factors are close to each other or a near-miss multiple of each other (similar to HOLF, but extended to multiple factors) or when n has a factor p for which the multiplicative order znorder(base, p) is small (similar to p-1 method).

The idea is to try to find a factor of n by checking gcd(b^n - b^k, n) for some small fixed base b >= 2 and some odd k >= 1.

use 5.014;
use warnings;
use Math::GMPz;

use ntheory qw(:all);
use POSIX qw(ULONG_MAX);

sub flt_factor {
    my ($n, $base, $reps) = @_;

    # base: a native integer <= sqrt(ULONG_MAX)
    # reps: how many tries before giving up

    if (ref($n) ne 'Math::GMPz') {
        $n = Math::GMPz->new("$n");
    }

    $base = 2   if (!defined($base) or $base < 2);
    $reps = 1e6 if (!defined($reps));

    my $z = Math::GMPz::Rmpz_init();
    my $t = Math::GMPz::Rmpz_init();
    my $g = Math::GMPz::Rmpz_init();

    Math::GMPz::Rmpz_set_ui($z, $base);
    Math::GMPz::Rmpz_set_ui($t, $base);

    Math::GMPz::Rmpz_powm($z, $z, $n, $n);

    # Cannot factor Fermat pseudoprimes
    if (Math::GMPz::Rmpz_cmp_ui($z, $base) == 0) {
        return ($n);
    }

    my $multiplier = $base * $base;

    if ($multiplier > ULONG_MAX) {    # base is too large
        return ($n);
    }

    for (my $j = 1 ; $j <= $reps ; $j += 1) {

        Math::GMPz::Rmpz_mul_ui($t, $t, $multiplier);
        Math::GMPz::Rmpz_mod($t, $t, $n) if ($j % 10 == 0);
        Math::GMPz::Rmpz_sub($g, $z, $t);
        Math::GMPz::Rmpz_gcd($g, $g, $n);

        if (Math::GMPz::Rmpz_cmp_ui($g, 1) > 0) {

            if (Math::GMPz::Rmpz_cmp($g, $n) == 0) {
                return ($n);
            }

            my $x = Math::GMPz::Rmpz_init();
            my $y = Math::GMPz::Rmpz_init();

            Math::GMPz::Rmpz_set($y, $g);
            Math::GMPz::Rmpz_divexact($x, $n, $g);

            return sort { Math::GMPz::Rmpz_cmp($a, $b) } ($x, $y);
        }
    }

    return $n;
}

my $p = random_ndigit_prime(30);

say join ' * ', flt_factor("173315617708997561998574166143524347111328490824959334367069087");
say join ' * ', flt_factor("2425361208749736840354501506901183117777758034612345610725789878400467");

say join ' * ', flt_factor(vecprod($p, next_prime($p),      next_prime(next_prime($p))));
say join ' * ', flt_factor(vecprod($p, next_prime(13 * $p), next_prime(123 * $p)));
say join ' * ', flt_factor(vecprod($p, next_prime($p),      next_prime(next_prime($p)), powint(2, 128) + 1));

Sample output:

173823271649325368927 * 997079482306880298554023570609858604518081
19760381716819645293083 * 122738580838512721992324860157572874494433031849
983057551814228496370786324543 * 966402150178984544856323228375193910346239498958826069519263
983057551814228496370786324339 * 1545277038136196287225260841895788297993709288833630781936530463
340282366920938463463374607431768211457 * 950028931822958927963036177811858421972164231097542188041798952975726949125506530708171809
danaj commented 6 months ago

I haven't forgotten about this.

https://github.com/trizen/sidef/commit/74c59016173d5ee82cc51e254807a7596fe089cb