JuliaMath / Primes.jl

Prime numbers in Julia
Other
99 stars 32 forks source link

Factor is terribly slow #49

Open enedil opened 7 years ago

enedil commented 7 years ago

Hello. I believe that after finding every consecutive prime factor, performing isprime on resultant is harmful for performance of such function. Consider such snippet:

number = reduce(*, 1, Primes.PRIMES)
tic(); factor(number); toc()
# already running for 15 minutes, still didn't halt

Compare it with equivalent Python:

from operator import mul
from sympy import primerange, factorint
number = reduce(mul, primerange(1, 2**16), 1)
%timeit factorint(number)
# 1 loops, best of 3: 16.8 s per loop

Considering the fact that sympy is written in pure Python, this is pathetic.

Profiling Julia version brings such result (this time taking the product of first 3000 primes):

WARNING: The profile data buffer is full; profiling probably terminated
before your program finished. To profile for longer runs, call Profile.init
with a larger buffer and/or larger delay.
18816 ./event.jl:68; (::Base.REPL.##3#4{Base.REPL.REPLBackend})()
 18816 ./REPL.jl:95; macro expansion
  18816 ./REPL.jl:64; eval_user_input(::Any, ::Base.REPL.REPLBackend)
   18816 ./<missing>:?; anonymous
    18816 ./profile.jl:16; macro expansion;
     18816 /home/enedil/.julia/v0.5/Primes/src/Primes.jl:312; factor(::BigInt)
      1     ./dict.jl:701; factor!(::BigInt, ::Dict{BigInt,Int64})
      11    /home/enedil/.julia/v0.5/Primes/src/Primes.jl:267; factor!(::BigInt, ::Dict{BigInt,Int64})
       11 ./gmp.jl:110; convert
      3     /home/enedil/.julia/v0.5/Primes/src/Primes.jl:268; factor!(::BigInt, ::Dict{BigInt,Int64})
       1 ./gmp.jl:255; rem
       2 ./gmp.jl:256; rem
      2     /home/enedil/.julia/v0.5/Primes/src/Primes.jl:269; factor!(::BigInt, ::Dict{BigInt,Int64})
       1 ./dict.jl:644; setindex!(::Dict{BigInt,Int64}, ::Int64, ::BigInt)
      3     /home/enedil/.julia/v0.5/Primes/src/Primes.jl:270; factor!(::BigInt, ::Dict{BigInt,Int64})
       3 ./gmp.jl:256; div
      6     /home/enedil/.julia/v0.5/Primes/src/Primes.jl:271; factor!(::BigInt, ::Dict{BigInt,Int64})
       6 ./gmp.jl:255; rem
      18789 /home/enedil/.julia/v0.5/Primes/src/Primes.jl:276; factor!(::BigInt, ::Dict{BigInt,Int64})
       18789 /home/enedil/.julia/v0.5/Primes/src/Primes.jl:190; isprime
        18789 /home/enedil/.julia/v0.5/Primes/src/Primes.jl:190; isprime

Ha! I suspected that checking is number is prime after excluding one particular factor is superfluous, and I know it's basically one of the worst cases for such design, but after all, how such difference in time (with sympy) could be explained?

I propose the call to isprime to be supressed.

enedil commented 7 years ago

https://github.com/JuliaMath/Primes.jl/pull/50

this is pull request.

rfourquet commented 7 years ago

number = reduce(*, 1, Primes.PRIMES)

Did you mean number = reduce(*, big(1), Primes.PRIMES)? Note that I have an optimized version of the pollardfactors! for BigInt which I just need to review before submitting a PR.

enedil commented 7 years ago

@rfourquet that's what I did on my machine, but I forgo to change it on GitHub, where I was preparing this issue. This number however avoids calling Pollard Rho at all.

trizen commented 3 years ago

Currently, factor(big"2"^128 + 1) also takes a very long time. But we can do better!

As an experiment, I implemented the Continued Fraction Factorization Method (CFRAC) in Julia, which is able to factorize 2^128 + 1 in just 4.5 seconds. Some optimizations may be possible.

The code is available at: https://github.com/trizen/julia-scripts/blob/master/Math/continued_fraction_factorization_method.jl

It works best for numbers with 30-50 digits in size that do not have small factors.

Additionally, a very simple implementation of the Elliptic Curve Factorization Method (ECM) in Julia: https://github.com/trizen/julia-scripts/blob/master/Math/elliptic-curve_factorization_method.jl

It can find very quickly factors of about 10-17 digits in size, without strongly depending on the size of n. For example, it takes just 1.5 seconds to find a factor of 2^128 + 1.

Feel free to use both methods inside the Primes module if you want.

More special-purpose factorization algorithms are described at: https://trizenx.blogspot.com/2019/08/special-purpose-factorization-algorithms.html

oscardssmith commented 2 years ago

I think we can close this now that https://github.com/JuliaMath/Primes.jl/pull/93 is merged. Using either of the strategies mentioned by @trizen would probably be good, but at this point, the initial issue from this PR gets solved in 0.45 seconds and other numbers with a few big factors are still somewhat reasonable.

oscardssmith commented 2 years ago

@trizen would you be willing to re-license ecm to an MIT license? I would be interested in adding them to Primes.jl for factoring numbers.

trizen commented 2 years ago

@trizen would you be willing to re-license ecm to an MIT license?

I removed any license restrictions. Life is too short to care about licenses and arbitrary restrictions. :)

oscardssmith commented 2 years ago

Can you confirm I haven't done anything especially stupid here? https://github.com/JuliaMath/Primes.jl/pull/104

oscardssmith commented 2 years ago

Also, how hard do you think it would be to add the second stage of the algorithm? According to https://www.rieselprime.de/ziki/Elliptic_curve_method it can significantly increase the success chance.

trizen commented 2 years ago

Can you confirm I haven't done anything especially stupid here? #104

Looks good to me.

Also, how hard do you think it would be to add the second stage of the algorithm? According to https://www.rieselprime.de/ziki/Elliptic_curve_method it can significantly increase the success chance.

Should be doable, although I haven't tried implementing the second stage. Maybe @danaj can provide some help here.

See also: https://github.com/danaj/Math-Prime-Util-GMP/blob/master/ecm.c#L487

trizen commented 11 months ago

Regarding the second stage of the ECM, the SymPy library has very nice and readable code for it: https://github.com/sympy/sympy/blob/master/sympy/ntheory/ecm.py

Just for fun, I also made two translations of the Python code (a slightly older version), which can also be used for reference:

oscardssmith commented 11 months ago

I got most of the way time with a Julia implementation in https://github.com/JuliaMath/Primes.jl/pull/104