BeScala / project-euler

4 stars 2 forks source link

Problem 3: largest prime factor of 600851475143 #9

Open ghost opened 9 years ago

ghost commented 9 years ago

One-liner solution oneLiner.

mverbist commented 9 years ago

I had something similar, but started to doubt. Luc, have you tried your solution with e.g. '34' i.s.o. the big number. That should yield '17', right ?

ghost commented 9 years ago

@mverbist : you're absolutely right ...

of course one can think of a solution allong the following lines

  val someSolution = s"some solution for problem ${3}"

  euler(problem(3), someSolution) {

    import math._

    def squareRootOf(z: Long) =
      round(floor(sqrt(z.toDouble)))

    def isPrime(z: Long): Boolean =
      (2L to squareRootOf(z)).forall(z % _ != 0)

    def natsFrom(z: Long): Stream[Long] = {
      def loop(z: Long): Stream[Long] =
        z #:: loop(z + 1)
      loop(z)
    }

    val primes =
      natsFrom(2L).filter(isPrime)

    primes.take(882).filter(600851475143L % _ == 0).toList.last

  }

but where does my magic number 882 come from (try 881)?

??? is there an upperbound for the number of primes below a given number ???

that is a hard mathematical problem

maasg commented 9 years ago

Well, I must admit I was not very productive during the hacking eve, but I could approach this one with a different angle.

This solution simply tries to reduce the number by the smallest divisor possible. The largest number in that sequence will be by definition the largest prime factor of the given number or the number itself if it's a prime number.

  euler(problem(3)) {
    def maxPrime(n:Long):Long = {
        def nextDivisor(n: Long, current: Long):Long = {
            if (current > n) n else if (n%current == 0) current else nextDivisor(n, current+1)
        }
        def loop(current:Long, max:Long, n:Long):Long ={
            if (n==1) max else {
                if (current>n) n else {
                    val next = nextDivisor(n, current)
                    loop(next, math.max(next,max), n/next)
                }
            }
        }
        loop(2,1,n)
    }
    maxPrime(600851475143L)
  }
eloots commented 9 years ago

Gerard,

An easy optimization can be applied to the code: the algorithm starts with the smallest possible prime factor which is 2. The next higher (potential) prime factors are 3, 5, 7, … which are all odd numbers. Hence, once you’ve eliminated 2 as a (potential) prime factor, you can skip all the even numbers. Hence the line:

if (current > n) n else if (n%current == 0) current else nextDivisor(n, current+1)

can be changed to:

if (current > n) n else if (n % current == 0) current else if (current != 2) nextDivisor(n, current + 2) else nextDivisor(n, current + 1)

which speeds up your calculation by a factor of two. Another way to look at the optimization is that you skip over multiples of the primes you already encountered during the calculations.

For prime number 2, this means that, once past 2, you skip 4, 6, 8 , 10, … For prime number 3, this means that, once past 3, you skip 6, 9, 12, 15, … … etc …

Applying the sieve in an optimized way is not as simple as it looks at first sight. It implies that you need some efficient way to keep track of what to skip ‘next’.

Here’s an implementation of the sieve as I found on http://voidmainargs.blogspot.be/2010/12/lazy-sieve-eratosthenes-in-scala-and.html http://voidmainargs.blogspot.be/2010/12/lazy-sieve-eratosthenes-in-scala-and.html:

object PrimeFast { import scala.collection.mutable.Map

def fastPrimes() : Stream[Long] = { 2 #:: sieve(3, Map{ 9L -> 6L }) }

private def sieve(p: Long, pQ: Map[Long, Long]): Stream[Long] = { p #:: sieve(nextPrime(p + 2, pQ), pQ ) } private def nextCompositeNumber(x:Long, step: Long, pQ: Map[Long, Long]): Unit = { pQ.get(x) match { case Some(_) => nextCompositeNumber(x + step, step, pQ) case None => pQ(x) = step } } private def nextPrime(candidate: Long, pQ: Map[Long, Long]): Long = { pQ.get(candidate) match { case Some(step) => pQ -= candidate nextCompositeNumber(candidate + step, step, pQ) nextPrime(candidate + 2, pQ) case None => pQ(candidate * candidate) = candidate * 2 candidate } } }

You can apply it as follows:

scala> import PrimeFast. import PrimeFast.

scala> val primes = fastPrimes() primes: Stream[Long] = Stream(2, ?)

scala> primes.take(100).mkString(", ") res0: String = 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541

Once we have fastPrimes, we can calculate the largest prime factor of a number as follows:

@annotation.tailrec def largestPrimeFactor(n: Long)(implicit primes: Stream[Long]) : Long = { if (n == 1) primes.head else { if (n % primes.head == 0) largestPrimeFactor(n / primes.head) else largestPrimeFactor(n)(primes.tail) } }

Quite concise; almost reads like a newspaper ;-)

Just for those who are not familiar with implicits and currying; largestPrimeFactor contains two arguments, the second one is a separate (second) argument list and is marked implicit. The concepts of having multiple arguments lists is also know as currying. Marking an argument as implicit signifies that, when this method is used, a variable of that type, also marked implicit, needs to be in the scope where largestPrimeFactor is executed. An example:

scala> implicit val myFavoritePrimes_NameDoesntMatter = fastPrimes() myFavoritePrimes_NameDoesntMatter: Stream[Long] = Stream(2, ?)

// Define a timer function for demo purposes... scala> def timeIt(f: => Long): Unit = { val start = System.nanoTime val result = f println(s"${result} - ${(System.nanoTime-start)/1000} µs") } timeIt: (f: => Long)Unit

scala> (1 to 10).foreach(n => timeIt(largestPrimeFactor(2))) 2 - 227 µs 2 - 3 µs 2 - 1 µs 2 - 1 µs 2 - 1 µs 2 - 1 µs 2 - 1 µs 2 - 1 µs 2 - 1 µs 2 - 1 µs

scala> (1 to 10).foreach(n => timeIt(largestPrimeFactor(600851475143L))) 6857 - 5274 µs 6857 - 181 µs 6857 - 143 µs 6857 - 161 µs 6857 - 147 µs 6857 - 126 µs 6857 - 142 µs 6857 - 135 µs 6857 - 148 µs 6857 - 142 µs

scala> (1 to 10).foreach(n => timeIt(maxPrime(600851475143L))) 6857 - 47 µs 6857 - 37 µs 6857 - 36 µs 6857 - 46 µs 6857 - 36 µs 6857 - 36 µs 6857 - 37 µs 6857 - 55 µs 6857 - 37 µs 6857 - 37 µs

scala> (1 to 10).foreach(n => timeIt(largestPrimeFactor(600851475143L))) 6857 - 41 µs 6857 - 15 µs 6857 - 14 µs 6857 - 14 µs 6857 - 14 µs 6857 - 14 µs 6857 - 16 µs 6857 - 25 µs 6857 - 14 µs 6857 - 14 µs

scala> (1 to 10).foreach(n => timeIt(largestPrimeFactor(600851475140L))) 10976461 - 2390513 µs 10976461 - 14411 µs 10976461 - 13449 µs 10976461 - 14745 µs 10976461 - 15674 µs 10976461 - 15655 µs 10976461 - 14188 µs 10976461 - 14193 µs 10976461 - 15403 µs 10976461 - 15217 µs

scala> (1 to 10).foreach(n => timeIt(maxPrime(600851475140L))) 10976461 - 59061 µs 10976461 - 55943 µs 10976461 - 57850 µs 10976461 - 59774 µs 10976461 - 57836 µs 10976461 - 58945 µs 10976461 - 57903 µs 10976461 - 56455 µs 10976461 - 57931 µs 10976461 - 55872 µs

As can be seen, because we declare myFavoritePrimes_NameDoesntMatter as a val, the Stream is ‘captured’ and ‘sticks’ around. You also notice that, when largestPrimeFactor needs a prime number that hasn’t yet been generated, it takes extra time to generate it, otherwise, memoization kicks in, and the result is generated quite fast.

Of course, as usual, every upside has its downside; in this case, a speed increase results in increased memory consumption (to memoize the prime numbers).

Also, if one needs to make a calculation only once or a limited number of times, the straightforward implementation may be the appropriate one to choose.

Notice something odd with the two executions marked in green: Identical runs, but the second one is significantly faster. Not sure what causes this (commands executed in the Scala REPL. May be due to Java JIT optimization?; not sure). Always be careful with (micro) benchmarks ;-)

Cheers, Eric

On 2-dec.-2014, at 00:50, Gerard Maas notifications@github.com wrote:

Well, I must admit I was not very productive during the hacking eve, but I could approach this one with a different angle.

This solution simply tries to reduce the number by the smallest divisor possible. The largest number in that sequence will be by definition the largest prime factor of the given number.

euler(problem(3)) { def maxPrime(n:Long):Long = { def nextDivisor(n: Long, current: Long):Long = { if (current > n) n else if (n%current == 0) current else nextDivisor(n, current+1) } def loop(current:Long, max:Long, n:Long):Long ={ if (n==1) max else { if (current>n) n else { val next = nextDivisor(n, current) loop(next, math.max(next,max), n/next) } } } loop(2,1,n) } maxPrime(600851475143L) } — Reply to this email directly or view it on GitHub https://github.com/BeScala/project-euler/issues/9#issuecomment-65160384.

samdebacker commented 9 years ago

Personally I don't like the use of an implicit so much in the @eloots solution, because you need to kick off the function call with a correct initialised implicit val, and the implicit mechanism here is a bit contrived (it makes the algorithm even less understandable).

implicit val myFavoritePrimes_NameDoesntMatter = PrimeFast.fastPrimes()
largestPrimeFactor(600851475143L)

So here my alternative derived from Eric's solution but without implicits. Calling the function is simple:

largestPrimeFactor(600851475143L)
eloots commented 9 years ago

Sam,

The utilization of implicits is the subject of many discussions… I personally like to use them when I feel it makes sense, but I also agree, they have to be used ‘with taste’ and not arbitrarily…

In any case, there’s a catch with the code you’ve posted.

When you run the version using implicits you get:

scala> (1 to 10).foreach(n => timeIt(largestPrimeFactor(600851475140L))) 10976461 - 2390513 µs 10976461 - 14411 µs 10976461 - 13449 µs 10976461 - 14745 µs 10976461 - 15674 µs 10976461 - 15655 µs 10976461 - 14188 µs 10976461 - 14193 µs 10976461 - 15403 µs 10976461 - 15217 µs

The version you posted produces the following:

scala> (1 to 10).foreach(n => timeIt(largestPrimeFactor(600851475140L))) 10976461 - 1473478 µs 10976461 - 1478456 µs 10976461 - 1397570 µs 10976461 - 1513218 µs 10976461 - 1408404 µs 10976461 - 1462982 µs 10976461 - 1698070 µs 10976461 - 1584329 µs 10976461 - 1427257 µs 10976461 - 1380784 µs

Which is not what we wanted to achieve.

The reason is that you’re creating a new Stream of prime numbers every time you invoke largestPrimeFactor. The root cause is that fastPrimes is defined as follows:

def fastPrimes() : Stream[Long] = { 2 #:: sieve(3, Map{ 9L -> 6L }) }

If this is changed to:

val fastPrimes : Stream[Long] = { 2 #:: sieve(3, Map{ 9L -> 6L }) }

… all is well again.

Note that in this case, in largestPrimeFactors, the following small change needs to be applied too:

largestPrimeFactor_(n, PrimeFast.fastPrimes())

must be changed to:

largestPrimeFactor_(n, PrimeFast.fastPrimes)

Re-running the test:

scala> (1 to 10).foreach(n => timeIt(largestPrimeFactor(600851475140L))) 10976461 - 1744599 µs 10976461 - 20990 µs 10976461 - 20176 µs 10976461 - 22629 µs 10976461 - 22302 µs 10976461 - 20554 µs 10976461 - 22657 µs 10976461 - 24123 µs 10976461 - 21709 µs 10976461 - 21978 µs

(Note that it make more sense to define fastPrimes as a val instead of a def.)

Regards, Eric

On 2-dec.-2014, at 14:19, Sam De Backer notifications@github.com wrote:

Personally I don't like the use of an implicit so much in the @eloots solution https://github.com/BeScala/project-euler/blob/samdebacker/src/test/scala/org/bescala/projecteuler/problems/Problem003.scala#L45, because you need to kick off the function call with a correct initialised implicit val, and the implicit mechanism here is a bit contrived (it makes the algorithm even less understandable).

implicit val myFavoritePrimes_NameDoesntMatter = PrimeFast.fastPrimes() largestPrimeFactor(600851475143L) So here my alternative derived from Eric's solution but without implicits https://github.com/BeScala/project-euler/blob/samdebacker/src/test/scala/org/bescala/projecteuler/problems/Problem003.scala#L58. Calling the function is simple:

largestPrimeFactor(600851475143L) — Reply to this email directly or view it on GitHub https://github.com/BeScala/project-euler/issues/9#issuecomment-65229116.

samdebacker commented 9 years ago

Thanks Eric (@eloots) I adopted it in my code, but for clarity:

eloots commented 9 years ago

Hi Sam,

Very interesting series of articles.

Maybe I should have been more clearer in some respect.

Obviously, in the context of the stated problem, it’s a moot issue; we only need to calculate the largest prime factor, so, memoization doesn’t add any value and hence assigning the prime number Stream to a val doesn’t bring any added value and is better left out, if only to demonstrate general good practice.

Looking at other problems though, it depends on … the problem.

Suppose a Stream is used (with memoization) just to calculate some stuff that is needed up to some point, but is not needed afterwards, it is clear that the memory used by memoization is wasted afterwards and should be reclaimed as soon as possible. The techniques described in the articles apply.

If, on the other hand, a memoized implementation is crucial (to minimize calculation times), it should be applied. Some of the solutions of some problems in the Euler series require the calculation of the prime factors of a large number of Ints or Longs. In that case, a memoized implementation will help achieving that goal. It may even be the justification to assign a Stream to a val...

Now, it may be true that one can know beforehand what range of prime numbers are needed during the execution of an algorithm. In that case, do use the techniques described in the article.

In any case, there’s more to this topic than meets the eye. Maybe a good topic to delve into during a future meetup ?

Regards, Eric

On 2-dec.-2014, at 17:15, Sam De Backer notifications@github.com wrote:

Thanks Eric (@eloots https://github.com/eloots) I adopted it in my code https://github.com/BeScala/project-euler/blob/samdebacker/src/test/scala/org/bescala/projecteuler/problems/Problem003.scala#L74, but for clarity:

The performance has nothing to do with using or not using implicit, as you already demonstrated above. Defining a val allPrimes = PrimeFast.fastPrimes can be considered to be a memory leak, in long living applications, exactly due to the memoization, as was pointed out by Tom De Coninck (@tdeconin https://github.com/tdeconin) with this blogpost http://blog.dmitryleskov.com/programming/scala/stream-hygiene-i-avoiding-memory-leaks/. — Reply to this email directly or view it on GitHub https://github.com/BeScala/project-euler/issues/9#issuecomment-65256259.

ghost commented 9 years ago

Hi all,

I changed my solution to a simple sieve based one sieve.

I looked at the algorithm that @eloots to base mine upon.

... 36ms on my machine ... maybe not the fastest solution on earth ...

Luc