Closed oscardssmith closed 2 years ago
I get errors using this branch. But you just need to tell me what is @btime factor.(1:300)
.
spelling function
is hard.
There are still errors:
MIN_FACTOR_SIZE not defined
Trying to fix it, I get 18μs for @btime factor.(1:300)
which improves by a factor of 2 on the previous value. However with my
cache of small factorisations I got 2μs
Errors fixed. When you call factor
with the cache you wrote, are you copying the result? I'm seeing
@btime factor.(1:300);
13.029 μs (301 allocations: 40.00 KiB)
@btime copy.(getfield.(factor.(1:300), :pe));
23.262 μs (902 allocations: 72.48 KiB)
Which implies that if your program was copying the data out of the cache (which you should because Factorization
is mutable), you could only be 25% faster. 13
for factor
vs 10.2
for the copy.
OK, I think this is now ready to review. This now only stores the min factor for odd numbers (since even numbers have a trivial min factor of 2).
For a quick benchmark, on master, factoring all numbers up to 2^20 took .339
seconds, while on this branch, it takes .213
seconds. The biggest changes are for numbers <2^16 which run 4x faster, and for numbers who's largest factor is <2^16 and who's second largest factor is <<2^16 where this will now run in O(second_largest_factor) rather than O(largest_factor)
This also obsoletes https://github.com/JuliaMath/Primes.jl/pull/113 by bringing the isprime
time down to 4ns for all numbers <2^16
I still cannot test since there are bugs. For instance, factor(6) returns 6 (instead of 2*3).
OK I could test. On a benchmark test of my package CyclotomicsNumber using Primes.factor instead of my cached version caused an overhead of 20% (380ms instead of 320 ms). The new version takes 350ms so the overhead is half, 10%. It is late in Paris. I will test tomorrow night some of my other packages...
By the way in this test I do not copy the factorizations. I just loop on the pairs (prime,power).
I just made another test with my package factoring into cyclotomic polynomials with somewhat better results. Choosing Primes.factor official version instead of my cached version increases time by 10%. With the new version the time is increased only by 3%.
I think I can get the overhead even lower by defining eachfactor(n)
that returns an iterator of factor, exponent
pairs. Since almost all of the time (for small numbers) is spent allocating the factorization object, skipping that should be a big win.
any objections to me merging? (if not, I will soonish)
No objection. If not the ultimate improvement I was looking for, it is clearly an improvement
Have you seen https://github.com/JuliaMath/Primes.jl/pull/118? That gives you another 2x if you don't need to save the factors (i.e. for computing arithmetic functions)
One cannot collect your iterator since it yields tuples of ints but the eltype
is int.
oops. Fixed. (that said, collecting it is removes the point)
collecting an iterator is not a good way to use it, but a good way to test it.
I just took the time to see the effect of using eachfactor
in my package CyclotomicNumbers
. Since I can use it in the critical path all the overhead is gone and I can forget my cache of factor
for this package.
This caches the smallest factor for odd
n<2^16
(which is reasonable because this number can be stored in aUInt8
, and therefore takes less space thanPRIMES
took before.The advantage of storing this information rather than the (arguably more obvious) list of small primes is that it allows for O(1) primality testing as well as faster factoring. @jmichel7 can you test if this fixes your performance problems?