belm0 opened this issue 2 years ago
More runs look better, though some benchmarks are still slower.
$ time ../../python.exe fannkuch.py 20
real 0m29.641s
$ time ../../python.exe -X install-strict-loader -X jit fannkuch_static.py 20
real 0m17.892s
$ time ../../python.exe richards.py 200
real 0m30.914s
$ time ../../python.exe -X install-strict-loader -X jit richards_static.py 200
real 0m7.475s
$ time ../../python.exe nqueens.py 20
real 0m5.657s
$ time ../../python.exe -X install-strict-loader -X jit nqueens_static.py 20
real 0m12.813s
Better with the jit-list constraints (static nqueens is still much slower overall):
$ time ../../python.exe -X install-strict-loader -X jit -X jit-list-file=jitlist_richards_static.txt -X jit-enable-jit-list-wildcards richards_static.py 200
JIT: Jit/pyjit.cpp:925 -- Enabling wildcards in JIT list
JIT: Jit/jit_list.cpp:33 -- Jit-list file: jitlist_richards_static.txt
real 0m5.802s
$ time ../../python.exe nqueens.py 40
real 0m11.296s
$ time ../../python.exe -X install-strict-loader -X jit -X jit-list-file=jitlist_nqueens_static_basic.txt -X jit-enable-jit-list-wildcards nqueens_static_basic.py 40
JIT: Jit/pyjit.cpp:925 -- Enabling wildcards in JIT list
JIT: Jit/jit_list.cpp:33 -- Jit-list file: jitlist_nqueens_static_basic.txt
real 0m9.917s
$ time ../../python.exe -X install-strict-loader -X jit -X jit-list-file=jitlist_nqueens_static.txt -X jit-enable-jit-list-wildcards nqueens_static.py 40
JIT: Jit/pyjit.cpp:925 -- Enabling wildcards in JIT list
JIT: Jit/jit_list.cpp:33 -- Jit-list file: jitlist_nqueens_static.txt
real 0m24.740s
I suggest adding a benchmarks/README with some example invocations like these.
Yes, we definitely need a README for the benchmarks, thanks for the report! You seem to have converged on the right way to run them yourself, though. Static nqueens is new and under active development, so I wouldn't worry too much about it yet.

Richards, deltablue, and fannkuch should all be much faster under Static Python and with the JIT (and faster with SP+JIT than with the JIT alone). SP without the JIT is more of a mixed bag: some of the arithmetic-heavy benchmarks (e.g. fannkuch) use primitives a lot in the static version, and we only actually keep primitives unboxed in the JIT.

Fannkuch also has uncharacteristic performance without the JIT because our bytecode quickening currently operates at the function level, based on number of calls, and fannkuch is just one very expensive function that is called only once, so bytecode quickening never kicks in.
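The fannkuch point above can be illustrated with a toy model of call-count-based quickening. This is a hypothetical sketch, not Cinder's actual implementation: the threshold value, the bookkeeping dict, and the function names are all invented for illustration. The idea is just that a hotness trigger keyed on call counts never fires for a single expensive function called once.

```python
# Toy model of call-count-based bytecode quickening. Hypothetical:
# the threshold and bookkeeping are invented, not Cinder's internals.
QUICKEN_THRESHOLD = 50  # assumed: quicken after this many calls


class QuickeningRuntime:
    def __init__(self):
        self.call_counts = {}
        self.quickened = set()

    def call(self, func, *args):
        name = func.__name__
        self.call_counts[name] = self.call_counts.get(name, 0) + 1
        if self.call_counts[name] >= QUICKEN_THRESHOLD:
            # A real runtime would rewrite the bytecode here.
            self.quickened.add(name)
        return func(*args)


rt = QuickeningRuntime()


def fannkuch_like(n):
    # One very expensive function, called exactly once.
    return sum(range(n))


def richards_like():
    # A cheap function called many times in a hot loop.
    return 1


rt.call(fannkuch_like, 10**6)  # called once: never reaches the threshold
for _ in range(100):
    rt.call(richards_like)     # crosses the threshold and gets "quickened"

assert "fannkuch_like" not in rt.quickened
assert "richards_like" in rt.quickened
```

Under this model, all of fannkuch's work happens before its call count can matter, which matches the behavior described above.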
I'll keep this open to track getting both a README on running the benchmarks, and our results from running them, added to the repo.
About managing jit-lists: it seems tedious in general for applications, and wildcards have known performance problems (#29).
I wonder if it would be easier all around to have a mode where only functions in static modules are jitted.
We do actually have that mode too: -X jit-all-static-functions. I think the only reason I used a wildcard jit list for these benchmarks is that I wanted a fair comparison with running the non-static benchmark under the JIT, and it didn't seem fair to expose only one of them to the wildcard jit-list overhead.
Long-term for many applications the right answer is probably a dynamic mode where hot functions are jitted once they become hot in the process. It just hasn't been a priority because our application is a prefork webserver, so that mode wouldn't work for us. But we are picking up more workloads now, so it might happen sometime soon.
(Oh, one gotcha for -X jit-all-static-functions: it's additive, so -X jit -X jit-all-static-functions is the same as -X jit; to jit only static functions you need -X jit -X jit-list-file=/dev/null -X jit-all-static-functions.)
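The gotcha above can be modeled in a few lines. This is a hypothetical sketch of how the flags compose, not Cinder's actual option handling: the function and parameter names are invented, and the only grounded behavior is what the thread states (no list means everything is a candidate; the flag is additive on top of the list; an empty list from /dev/null leaves only the static functions).

```python
# Hypothetical model of how the JIT flags compose; invented names,
# behavior taken only from the gotcha described in this thread.
def should_jit(func_name, is_static, jit_list=None, jit_all_static=False):
    """jit_list=None models 'no -X jit-list-file given': everything is
    a JIT candidate, so jit-all-static-functions adds nothing. An empty
    set models -X jit-list-file=/dev/null; jit_all_static then
    additively re-enables only the static functions."""
    if jit_list is None:
        return True  # -X jit alone JITs everything
    return func_name in jit_list or (jit_all_static and is_static)


# -X jit -X jit-all-static-functions: same as plain -X jit.
assert should_jit("f", is_static=False, jit_list=None, jit_all_static=True)

# -X jit -X jit-list-file=/dev/null -X jit-all-static-functions:
# only static functions get JITted.
assert should_jit("g", is_static=True, jit_list=set(), jit_all_static=True)
assert not should_jit("h", is_static=False, jit_list=set(), jit_all_static=True)
```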
A script to run the benchmarks was added recently: https://github.com/facebookincubator/cinder/commit/77d5d1f55a50b9e099238c9e4f177ee8d668c646
When I build Cinder and run the programs in Tools/benchmarks, the static and static_basic variants seem to be slower than the originals. Am I doing something wrong?
(Update: -X jit helps, but full static is only about 15% faster?)