Open Bodigrim opened 6 years ago
I've been messing around with this on a branch, and there are some really nice implementations that fall out of this approach.
`primeCount` falls out immediately as `primeLucy (const 1)`
`sumOfPrimesUpTo` is simply `primeLucy id`
It's still quite messy, but it's extremely fast once it has warmed up.
`primeCount 100000` is slowish on my machine, every time I run it. `primeCount' 100000` on my machine is slow the first time, but the result is cached and instantaneous for every `n <= 100000` thereafter. And that shared allocation of `primes` carries over to every other function that uses `primeLucy` under the hood, so I can instantly get the sum of primes up to 100000.
I'll probably meditate on this and see if I can reimplement in terms of `Vector` (since that's another open task). I'll likely also rework the API so there's an option to provide your own summatory function for the case where `summatory f` isn't what you actually want, as in the original formulation.
@jhenahan
If we are talking about the `primeLucy` I originally submitted, I don't entirely understand.
My `primeLucy` not only allows you to supply, but requires, a summatory function. While calculating the summatory function from the principal function is possible, that would take `O(n)` time and hence defeat the purpose.
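To make the role of the summatory function concrete, here is a minimal sketch of this kind of sieve. It is not the code submitted in the PR: `lucySum`, `isqrt`, `primeCountSketch` and `primeSumSketch` are names made up for illustration, and the sketch assumes `f` is completely multiplicative and nonzero on primes. It uses a `Data.Map` for brevity rather than the optimized array layout discussed here.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- | Integer square root by Newton iteration (for non-negative input).
isqrt :: Integer -> Integer
isqrt n
  | n < 2 = n
  | otherwise = go n
  where
    go x =
      let y = (x + n `div` x) `div` 2
      in if y >= x then x else go y

-- | Hypothetical stand-in for primeLucy (NOT the submitted implementation).
-- lucySum f bigF n = sum of f p over primes p <= n, assuming
--   * f is completely multiplicative and nonzero on primes,
--   * bigF v = f 1 + f 2 + ... + f v (the summatory function of f).
lucySum :: (Integer -> Integer) -> (Integer -> Integer) -> Integer -> Integer
lucySum f bigF n
  | n < 2 = 0
  | otherwise = M.findWithDefault 0 n (foldl' step table0 [2 .. r])
  where
    r = isqrt n
    -- Only the distinct values of n `div` i are ever looked up.
    keys = [n `div` i | i <- [1 .. r]] ++ [1 .. n `div` r - 1]
    -- Start from the sum of f over 2..v; this is where bigF is needed.
    table0 = M.fromList [(v, bigF v - f 1) | v <- keys]
    step s p
      | sp == sp1 = s -- p is composite: nothing to remove
      | otherwise = M.mapWithKey update s
      where
        sp = s M.! p
        sp1 = s M.! (p - 1)
        -- Remove the contribution of numbers whose least prime factor is p.
        update v sv
          | v >= p * p = sv - f p * (s M.! (v `div` p) - sp1)
          | otherwise = sv

-- | Prime counting: f = const 1, whose summatory function is the identity.
primeCountSketch :: Integer -> Integer
primeCountSketch = lucySum (const 1) id

-- | Sum of primes: f = id, whose summatory function is v * (v + 1) / 2.
primeSumSketch :: Integer -> Integer
primeSumSketch = lucySum id (\v -> v * (v + 1) `div` 2)
```

The initial table is filled by calling `bigF` on roughly `2 * sqrt n` arguments, which is why a closed form for the summatory function matters: computing each of those values by summing `f` term by term would cost `O(n)`.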
As for switching from the Array interface to the Vector interface, far be it from me to defend the former's elegance. That said, the code I submitted was highly optimized using explicit offsets and the unsafe functions. That ought to compile down to close to optimal x86-64 assembly, so I doubt that using the nicer Vector interface will buy you much. But if you want to do it for elegance's sake and achieve the same (or better!) performance, please do!
@CarlEdman Thanks very much for the fairly critical point about the summatory function. :) That would absolutely defeat the purpose. I'll have to think on it more, in that case.
@jhenahan My pleasure. Let me know if I can help.
With regard to `array` vs. `vector`: there are unboxed mutable vectors with a low-level interface, providing decent performance: `unsafeRead`, `unsafeWrite`, etc. I feel like nowadays Haskell developers (or at least myself :) are more acquainted with `Vector` and it is a default choice, but in the course of this issue we should try to benchmark both data structures.
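For reference, the low-level interface looks roughly like this. It is a plain sieve of Eratosthenes, not `primeLucy`, and only a sketch: it assumes the `vector` package is available and that `n >= 0`, and `sieveFlags` and `primeCountSieve` are made-up names.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.ST (runST)
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as MU

-- | Sieve of Eratosthenes; index i holds True iff i is prime.
-- Assumes n >= 0. The unsafe* calls skip bounds checks, so every
-- index used below is kept within [0, n] by construction.
sieveFlags :: Int -> U.Vector Bool
sieveFlags n = runST $ do
  flags <- MU.replicate (n + 1) True
  -- 0 and 1 are not prime (guarded so that n = 0 stays in bounds).
  forM_ (takeWhile (<= n) [0, 1]) $ \i -> MU.unsafeWrite flags i False
  forM_ (takeWhile (\p -> p * p <= n) [2 ..]) $ \p -> do
    isP <- MU.unsafeRead flags p
    -- Cross out multiples of each prime, starting at p * p.
    when isP $
      forM_ [p * p, p * p + p .. n] $ \m -> MU.unsafeWrite flags m False
  U.unsafeFreeze flags

-- | Count primes up to n by counting the True flags.
primeCountSieve :: Int -> Int
primeCountSieve = U.length . U.filter id . sieveFlags
```

Whether this beats the `Array` version with explicit offsets is exactly what the benchmark would have to show.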
@bodigrim I think that unboxing could be a big win. For starters, it ought to reduce GC time to just about 0. The reason I didn't use unboxed arrays was that for my problem I needed more than 64-bit int precision, so I had to go with Integers which can't really be unboxed.
Given how large some of the summatory function results can be, I think keeping Integers available for `primeLucy` would be useful. But an alternative version that uses unboxed ints (perhaps with modular arithmetic support at every stage) could be a big win for certain users.
Would you have any suggested literature on “useful” summatory functions? While I’m thinking on it, it couldn’t hurt to build in more special cases like `primeCount`.
@jhenahan
I don't really have any literature to recommend, but a few pointers:
The Faulhaber polynomials (for which I submitted code at Issue #70) give the summatory functions for all series of the form $x^n$. Hence, by trivial extension, they also give the summatory functions for any polynomial.
I have not tried this, but I believe `primeLucy` can be used to generate summatory functions which can be fed back to itself. This should allow you to efficiently calculate, e.g., the sum of all composite numbers up to n which are the product of exactly two primes.
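As an illustration of the first pointer, the low-degree cases can be written out by hand. This is only a sketch covering degrees 0 to 3 (it is not the Faulhaber code from Issue #70, and `sumPow` and `sumPoly` are made-up names); the polynomial case follows by linearity.

```haskell
-- | Closed-form summatory functions of x^k for k = 0..3:
-- sumPow k n = 1^k + 2^k + ... + n^k.
sumPow :: Int -> Integer -> Integer
sumPow 0 n = n
sumPow 1 n = n * (n + 1) `div` 2
sumPow 2 n = n * (n + 1) * (2 * n + 1) `div` 6
sumPow 3 n = (n * (n + 1) `div` 2) ^ (2 :: Int)
sumPow k _ = error ("sumPow: degree " ++ show k ++ " not covered by this sketch")

-- | Summatory function of the polynomial c0 + c1 x + c2 x^2 + c3 x^3,
-- given as the coefficient list [c0, c1, c2, c3], summed over x = 1..n.
sumPoly :: [Integer] -> Integer -> Integer
sumPoly cs n = sum [c * sumPow k n | (k, c) <- zip [0 ..] cs]
```

For example, `sumPoly [1, 0, 1] 3` sums `1 + x^2` over `x = 1, 2, 3`, i.e. `2 + 5 + 10`.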
@jhenahan any news? Feel free to ping me, if you need to discuss anything.
Nothing yet, but I’ll be setting aside some time to look into it further next week. Work’s been eating my OSS time.
`Math.NumberTheory.Primes.Counting.Impl.primeCount` implements an algorithm to compute π(n), the number of primes not exceeding n, in O(n^0.7) time and O(n^(1/2)) space. One can view π(n) as a sum of `f p = 1` over primes `p` up to `n`.
As was pointed out by @CarlEdman in #60 (`primeLucy`), this approach allows a generalization to any function `f` with the same time and space requirements. Moreover, there is a (difficult) way to improve its time complexity to O(n^(2/3)).
That said, it seems desirable to productionalize `primeLucy`, provide a nice API, write decent tests and benchmarks, and then express `primeCount` as a special case.
Additional resources:
https://en.wikipedia.org/wiki/Meissel–Lehmer_algorithm
http://sweet.ua.pt/tos/bib/5.4.pdf
https://projecteuler.net/thread=10;page=5#111677
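For orientation, the sieve behind this generalization is usually stated as the recurrence below. This is a sketch in notation introduced here, not taken from the issue, and it assumes $f$ is completely multiplicative with summatory function $F(v) = \sum_{k=1}^{v} f(k)$. Write $\operatorname{lpf}(k)$ for the least prime factor of $k$, and let $S(v, p)$ be the sum of $f(k)$ over those $2 \le k \le v$ that are prime or have $\operatorname{lpf}(k) > p$.

$$
\begin{aligned}
S(v, 1) &= F(v) - f(1), \\
S(v, p) &= \begin{cases}
S(v, p-1) - f(p)\,\bigl[\, S(\lfloor v/p \rfloor,\, p-1) - S(p-1,\, p-1) \,\bigr] & \text{if } p \text{ is prime and } p^2 \le v, \\
S(v, p-1) & \text{otherwise,}
\end{cases} \\
\sum_{\substack{q \le n \\ q \text{ prime}}} f(q) &= S\bigl(n, \lfloor \sqrt{n} \rfloor\bigr).
\end{aligned}
$$

Taking $f(k) = 1$, so that $F(v) = v$, gives $\pi(n)$; taking $f(k) = k$, so that $F(v) = v(v+1)/2$, gives the sum of primes up to $n$. Only the values $v = \lfloor n/i \rfloor$ are ever needed, which is where the $O(n^{1/2})$ space bound comes from.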