Closed inducer closed 1 year ago
Pypy is 0.90s.
For posterity, could also report the runtimes for the rust-port on your machine?
cargo new --bin symoxide_contest
cd symoxide_contest
cargo add symoxide
curl https://gist.githubusercontent.com/kaushikcfd/74c442a075557dad466cd3daea9c151f/raw/d593ce7d5a6de6764e541921472456c8528efb3b/symoxide_expr_traversal.rs > src/main.rs
cargo run --release
Just ran this. Somewhere around .28s.
nightly-x86_64-unknown-linux-gnu (default)
rustc 1.66.0-nightly (432abd86f 2022-09-20)
So pypy times seem super variable. Just reran, with
Python 3.8.13 (7.3.9+dfsg-4, Aug 09 2022, 12:51:24)
[PyPy 7.3.9 with GCC 12.1.0]
(nominally unchanged to before) and got 0.63s.
I have an idea for a Python-side AST-to-AST transform that would eliminate the bounces to rec
in the bulk of cases, by effectively inlining the common case of rec
. IDK whether I'll be desperate enough this weekend to do it, but I think that might do something, because it would cut the number of frames set up in a pymbolic traversal by ~2.
So pypy times seem super variable. Just reran, with
Maybe earlier the JIT costs were also included? Also, not sure if tracing JIT is mature enough to compete with AOT compilers for such workloads in terms of both -- compilation overheads and quality of generated code.
not sure if tracing JIT is mature enough to compete with AOT compilers
What does "mature" mean to you here? For me, I'm interested in correctness and speed, in that order. :slightly_smiling_face:
What does "mature" mean to you here? For me, I'm interested in correctness and speed, in that order. slightly_smiling_face
Correctness is the bare minimum [^1]. I was mainly referring about the JIT overheads (#recompliations and compilation costs)
[^1]: not saying it is trivial :)
The "mapper optimizer" is now a thing. It succeeds at removing passed-through *args, **kwargs
when those aren't needed, and that's clearly profitable.
It also contains a code path for inlining self.rec
and/or the cache retrieval, but those are curiously unprofitable. Not sure I understand why, given that I thought we were bound by frame setup.
The doc failure is https://github.com/sphinx-doc/sphinx/issues/10861.
Bumped the baseline by a bit, see https://github.com/kaushikcfd/symoxide/pull/3 :).
The final tally here:
main
) takes 3.9s.There is a loopy failure, but that's just a doctest fail because this turns off flatten-by-default in the IdentityMapper
.
There's actually a path towards merging this now.
The mapper optimizer needs documentation.(Decided against this; it requires a fair bit of care to use, and the sharp edges may not be obvious to users. Added a comment.)The root of this is this contest set up by @kaushikcfd to show that we should all abandon Python for Rust (or some other AOT statically-typed language).
Here's my fork of @kaushikcfd's gist: https://gist.github.com/inducer/25e3372fd5c82384d437c777ed4153e9
Latest results:
print
statements removed.