measuring Java startup costs

frol / completely-unscientific-benchmarks

Naive performance comparison of a few programming languages (JavaScript, Kotlin, Rust, Swift, Nim, Python, Go, Haskell, D, C++, Java, C#, Object Pascal, Ada, Lua, Ruby)

Apache License 2.0


measuring Java startup costs #64

Open igouy opened 6 years ago

igouy commented 6 years ago

The programs don't do much work, so about 15% of the Java time is just startup cost, which is amortised away if the program runs longer.

So performing 10 times the work and dividing the time by 10 gives a magical performance boost:

class Main {
    public static void program_main(String[] args) {
        Tree tree = new Tree();
        int cur = 5;
        int res = 0;

        for (int i = 1; i < 1000000; i++) {
            int a = i % 3;
            cur = (cur * 57 + 43) % 10007;
            if (a == 0) {
                tree.insert(cur);
            } else if (a == 1) {
                tree.erase(cur);
            } else if (a == 2) {
                boolean hasVal = tree.hasValue(cur);
                if (hasVal)
                    res++;
            }
        }
        System.out.println(res);
    }

    public static void main(String[] args) {
        // Repeat the workload 10 times so the one-off JVM startup cost is amortised.
        for (int i = 0; i < 10; ++i) {
            Main.program_main(args);
        }
    }
}

frol commented 6 years ago

I encourage you to play with the solutions on your own computer and you can even fork or create your own benchmark from scratch!

While I agree that it is unfair to compare runtime performance including the startup time, it is still a valid benchmark if you consider it to be a CLI application. I am considering increasing the number of operations by a factor of 10, but even then the measurement will include the startup time. For the next benchmark iteration I will consider measuring the time inside the solutions to exclude startup, but there is no ETA on that.
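For anyone who wants to try the "measure inside the solution" idea, here is a minimal sketch, not the benchmark's actual code. It assumes `java.util.TreeSet` as a stand-in for the repository's `Tree` class, and the class name `InProcessTiming` is made up for illustration. It times only the workload with `System.nanoTime()`, so JVM launch is excluded, although JIT warmup during the run is still included.

```java
import java.util.TreeSet;

public class InProcessTiming {
    // Same operation mix as the benchmark loop. TreeSet is an assumption:
    // it stands in for the repository's Tree class, which is not shown here.
    static int runWorkload() {
        TreeSet<Integer> tree = new TreeSet<>();
        int cur = 5;
        int res = 0;
        for (int i = 1; i < 1000000; i++) {
            int a = i % 3;
            cur = (cur * 57 + 43) % 10007;
            if (a == 0) {
                tree.add(cur);
            } else if (a == 1) {
                tree.remove(cur);
            } else if (tree.contains(cur)) {
                res++;
            }
        }
        return res;
    }

    public static void main(String[] args) {
        // Start the clock only after the JVM is already running.
        long start = System.nanoTime();
        int res = runWorkload();
        long elapsedNanos = System.nanoTime() - start;

        System.out.println(res);
        // Report timing on stderr so stdout still carries only the result.
        System.err.printf("in-process time: %.3f ms%n", elapsedNanos / 1e6);
    }
}
```

Comparing this figure with the `real` time reported by `time java InProcessTiming` gives a rough idea of how much of the wall-clock time is spent outside the workload.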

igouy commented 6 years ago

…or create your own benchmark from scratch!

Been there; done that.

…even then it will include the startup time.

Which might be very significant or very insignificant, and as long as the difference is shown, you can let others decide whether it seems significant to them.

igouy commented 6 years ago

fwiw the same experiment with C# seems to show a smaller difference (dotnet --version 2.1.302)

        static void Program_Main() {
            var tree = new Tree();
            var cur = 5;
            var res = 0;

            for (var i = 1; i < 1000000; i++) {
                var a = i % 3;
                cur = (cur * 57 + 43) % 10007;
                if (a == 0) {
                    tree.insert(cur);
                } else if (a == 1) {
                    tree.erase(cur);
                } else if (a == 2) {
                    if (tree.hasValue(cur))
                        res++;
                }
            }
            Console.WriteLine(res);
        }

        static void Main() {
            // Note: i runs from 1 to 99, so the workload executes 99 times.
            for (var i = 1; i < 100; i++) {
                Program_Main();
            }
        }
    }

1
real    0m1.264s
user    0m1.231s
sys     0m0.032s

100
real    1m55.594s
user    1m55.365s
sys     0m0.200s
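To put a number on "smaller difference", the arithmetic can be sketched as below. This is only a back-of-the-envelope calculation using the `real` times above, and it assumes the run labelled 100 performed exactly 100 iterations. The class and method names are hypothetical.

```java
public class AmortizedCost {
    // Average wall-clock seconds per iteration of a multi-iteration run.
    static double perIteration(double totalSeconds, int iterations) {
        return totalSeconds / iterations;
    }

    // Share of the single-run time that disappears once the cost is
    // averaged over many iterations (one-off costs such as startup).
    static double oneOffFraction(double singleRunSeconds, double totalSeconds, int iterations) {
        return (singleRunSeconds - perIteration(totalSeconds, iterations)) / singleRunSeconds;
    }

    public static void main(String[] args) {
        double single = 1.264;           // "real" for the run labelled 1
        double total = 1 * 60 + 55.594;  // "real" 1m55.594s for the run labelled 100
        int iterations = 100;            // assumption: taken from the label

        System.out.printf("per iteration: %.3f s%n", perIteration(total, iterations));
        System.out.printf("one-off share of a single run: %.1f%%%n",
                100 * oneOffFraction(single, total, iterations));
    }
}
```

Under that assumption the amortised cost is about 1.16 s per iteration, so roughly 8.5% of the single-run time is one-off cost, compared with the roughly 15% quoted for Java.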

Aivean commented 2 years ago

My two necroposting cents.

It's completely unfair to benchmark JIT-compiled languages, such as all JVM languages, Python (PyPy), and JavaScript, against native binaries without proper warmup. While I agree that startup time is an important metric in certain scenarios, this project claims to measure performance, not startup time! Paragraphs like this:

This turned out to be a good benchmark of memory-intensive operations, which should have been pushed memory management implementations to their edge.

are completely misleading when you include startup and JIT compilation time in the measurement.

The argument:

it is still a valid benchmark if you consider it to be a CLI application

doesn't really hold, as there are ways to eliminate startup cost for JIT languages.
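One such way, sketched here under stated assumptions, is to run the workload a few times untimed and only then start measuring. The class `WarmupTiming`, the `TreeSet` stand-in for the benchmark's `Tree`, and the run counts (5 warmup, 10 measured) are all illustrative choices, not anything from the repository. A fixed warmup count does not guarantee the JIT has finished compiling, and a purpose-built harness such as JMH is the more rigorous option.

```java
import java.util.TreeSet;
import java.util.function.IntSupplier;

public class WarmupTiming {
    // Results are written to a volatile field so the JIT cannot discard the work.
    static volatile int sink;

    // Runs the task warmupRuns times untimed, then measuredRuns times timed.
    // Returns the wall-clock time of each measured run in nanoseconds.
    static long[] timeAfterWarmup(IntSupplier task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.getAsInt();
        }
        long[] nanos = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            sink = task.getAsInt();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    // Stand-in workload: the benchmark's operation mix on a TreeSet (an assumption).
    static int workload() {
        TreeSet<Integer> tree = new TreeSet<>();
        int cur = 5;
        int res = 0;
        for (int i = 1; i < 1000000; i++) {
            int a = i % 3;
            cur = (cur * 57 + 43) % 10007;
            if (a == 0) {
                tree.add(cur);
            } else if (a == 1) {
                tree.remove(cur);
            } else if (tree.contains(cur)) {
                res++;
            }
        }
        return res;
    }

    public static void main(String[] args) {
        long[] nanos = timeAfterWarmup(WarmupTiming::workload, 5, 10);
        for (long n : nanos) {
            System.out.printf("%.3f ms%n", n / 1e6);
        }
    }
}
```

Printing every measured run, instead of a single average, makes it easier to see whether the times have actually settled.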

frol commented 2 years ago

It is a completely unscientific benchmark. My initial goal was to compare naïve implementations of exactly the same problem in different languages. This benchmark is good enough for me, and I never advocated using it for anything serious. Keep in mind that there cannot be an ultimate benchmark. I also believe there is no ultimate programming language.