scala-native / scala-native

Your favorite language gets closer to bare metal.
http://scala-native.org
Other
4.49k stars 364 forks source link

Bubblesort is extremely slow #1266

Closed thehappycoder closed 6 years ago

thehappycoder commented 6 years ago

On JVM and native build on GraalVM it takes about 12 seconds, on Scala Native it takes about 2.5 minutes.

import util.Random
import scala.annotation.tailrec

object BubbleSort {
  def imperativeBubbleSort[T <% Ordered[T]](source: Array[T]): Array[T] = {
    for (i <- 0 until source.length - 1; j <- 0 until source.length - 1 - i) {
      if (source(j) > source(j + 1)) {
         val temp = source(j)
         source(j) = source(j + 1)
         source(j + 1) = temp
      }
    }

    source
   }

  def main(args: Array[String]) {
    println("Sorting!")
    imperativeBubbleSort(Array.fill(30000)(Random.nextInt))
    println("Sorted!")
  }
}

Am I missing some configuration or Scala Native isn't there yet?

densh commented 6 years ago

So first of all make sure to always add following lines to build.sbt :

nativeMode := "release"
nativeGC := "immix"

Whenever you benchmark. By default we use debug (non-optimized) mode and boehm gc (slower but more stable).

I wrapped the main method you provided into a while loop that prints nanosecond over every iteration, and run it on JDK8 (-Xms/-Xmx 1024M), SN 0.3.7 and SN with Interflow. Numbers reported here are best (i.e. minimum) out of 20 iterations in the same run:

Implementation Time
JDK8 4.194390191 s
SN 0.3.7 25.201764658 s
SN + Interflow 1.261149289 s

If you can't wait for Interflow to be released, you can get a major speedup by removing generics and hand-specializing this for integers on a specific ordering and iterate over them using while loops instead of ranges. In the future Interflow will automatically do specialization for you.

EDIT: running on 7900x.

LeeTibbert commented 6 years ago

And pattern defeating quicksort (pdqsort) is, with any good fortune, on the way.

What you need/want is what you need/want. That said, was there some reason that you were using bubble sort or would something else do?

Very preliminary data, with nativeMode = debug, boehm gc, on a 3Ghz Pentium 32 bit 1-core, 2-ht machine: Using pdqsort with an (American) million integers, I got times of ScalaJVMsort (quicksort) approx 10.53 seconds. Pdqsort took approx 0.168 seconds. As with all experience points & benchmarks (lies, d*mn lies, & benchmarks), your mileage may vary.

I have proof of concept of fast ascending & descending sorts of both signed & unsigned primitives/AnyVals (where < and > exist. i.e. no sort of Unit).

Hope this helps.

thehappycoder commented 6 years ago

@densh Indeed, immix made a difference! Looking forward to interflow. I wanted to try it but after adding ScalaNative's master branch as a dependency, enablePlugins(ScalaNativePlugin) began yelling at me that such plugin doesn't exist.

@LeeTibbert I just whatever algorithm to play with Scala Native and GraalVM native compilation.

densh commented 6 years ago

@LeeTibbert I think this issue was mostly 'why is SN so much slower on this code', it's not about sorting specifically.

@thehappycoder Interflow is still not in master yet, but stay tuned for the updates in coming month(s). Regarding the error: you have to build snapshots locally (i.e. clone the repo and run sbt rebuild), we don't publish them publicly.

I'm closing this issue for now. There is nothing we can do short term before Interflow to improve performance here. Once Interflow is out, If you find that performance is still lacking, feel free to reopen then.