Closed sim642 closed 3 years ago
Thanks for the very clear PR that looks high-quality (I haven't looked at the code yet).
A quick comment:
The quoting mechanism introduces functions with
()
argument. There is no explanation why, but I'm guessing it's to be lazy: in case the quoted expression involves some computation, then don't perform it immediately at the beginning (it might not be necessary due to short-circuiting, etc) but perform it each time it's actually used, just like if the quoted expression where independently at each use.
In general, suppose we are transforming an expression C[e]
(a subexpression e
inside a wider context C[...]
), where e
is some source expression that we want to transform into a desugared/transformed piece of code e´
, where e'
should be evaluated in the current scope (for example, it refers to some identifier x
that we know scopes over C[e]
. Transforming this into C[e']
would be incorrect (un-hygienic), because the context C[...]
written by the user could contain binders that shadow this x
with a different definition; for example C[...]
could be fun x -> ...
or let open Foo in ...
, where Foo
defines an x
. So instead one may want to translate this term into let __private_identifier = e' in C[__private_identifier]
, but this is incorrect because it changes the order of evaluation (or the number of types e'
is evaluated, etc.). On the other hand, the translation let __private_identifier () = e' in C[__private_identifier ()]
is always correct, scope-wise and evaluation-wise. Of course, as you point out it introduces constant-factor inefficiencies in general, and is not necessary when e´
is known to be a constant or an identifier (its evaluation order does not matter, and evaluating it when it is unnecessary does not degrade performances).
(Other translations are possible, such as let c x = C[x] in c e'
, or let x = e' and c x = C[x] in c x
, which do not require generating "private" identifiers; but we digress, and those introduce a function call that is harder to simplify away with local reasoning.)
I have now read the code and I believe it is correct, thanks again!
One minor remark: you left some comments (* eta-expanded if outermost *)
in two places in eq and ord each. I'm not sure those leftover comments (in one case a trace of a previous workaround, in the other a bug opportunity that you noticed) really help code understanding. Putting myself in the shoes of a newcomer, it's not clear what those comments mean, and they seem redundant with the comment that exists at the place where eta-expansion does actually take place. I would remove those extraneous comments, and enrich the comment on the actual eta-expansion site with a reference to the present PR that contains a detailed description for more context.
Thanks for the feedback! I updated both the eta-expansion and the quoter comments to provide some explanation for the special optimized case.
in the other a bug opportunity that you noticed
Indeed, when trying to understand what this eta-expansion and quoting is about, I realized that eta-expansion on the outermost level was missing if a non-fun
expression was explicitly provided via [@equal …]
or [@compare …]
. It took me a while to engineer the added tests because it only matters in the extremely obscure case when such expression refers to another derived one in the same mutually recursive definition (otherwise it's statically constructive by the simplest case) and that requires explicitly referencing the other to-be derived function.
A couple of additional remarks:
Pexp_ident
probably isn't the only case where the quote
optimization can be done. For example, I think Pexp_constant
and maybe some other combinations could also allow it. But I didn't want to attempt to implement a function to determine the purity of an expression by syntax and Pexp_ident
covers the most likely use case of quoting.This repository doesn't contain any benchmark code but if there's interest, I could also include that to continue living in this repository, since as it turns out small optimizations to derived code can make a big impact.
Yes, but how would you robustly check for performance regression, in a way that does not depend on the system on which the tests are run? One way to do this would be to count bytecode instructions, but I don't know of an easy, practical way to track this. (You can run a bytecode binary with the instrumented runtime, ocamlrund -t foo.bc | wc -l
, but it's a bit cumbersome to setup.)
Pexp_ident
probably isn't the only case where thequote
optimization can be done. For example, I thinkPexp_constant
and maybe some other combinations could also allow it. But I didn't want to attempt to implement a function to determine the purity of an expression by syntax andPexp_ident
covers the most likely use case of quoting.
I wondered the same and I agree. Your optimization makes sense because it corresponds to a well-understood use case (we are not optimized, rather de-pessimizing). I would wait until a clear use-case comes for constants (or functions, etc.) before going further.
Yes, but how would you robustly check for performance regression, in a way that does not depend on the system on which the tests are run? One way to do this would be to count bytecode instructions, but I don't know of an easy, practical way to track this. (You can run a bytecode binary with the instrumented runtime,
ocamlrund -t foo.bc | wc -l
, but it's a bit cumbersome to setup.)
Automating it is nontrivial indeed. I suppose one way would be to have an explicit copy of the last known good (i.e. with expected performance) derived code in the benchmark and have another one be derived via the current deriver implementation. Then check if the results significantly differ (beyond some margin of error) to automatically catch pessimization or optimization. And when intentional optimizations are made, just update the explicit copy from there on.
Anyway, this goes beyond what currently would be needed, just a thought about how it might be done.
Yes, having to update the "baseline" on a semi-regular basis is okay, but we want to avoid flaky tests that bring the CI down irregularly. If you want to try to work something out and submit a PR, that would be warmly welcome, but I am not sure how to get a robust test -- besides the ocamlrund
trick I mentioned.
(Maybe a simpler approach would be to just store the -dparsetree
output of the transformation, so that the test fails and has to be promoted whenever the generated code changes. When people change this test output we can think about the performance difference, possibly by running some benchmarks.)
After switching a large project from manual
equal
andcompare
implementations to ones derived by ppx_deriving (https://github.com/goblint/analyzer/pull/227), we noticed significant performance decreases in in those functions (https://github.com/goblint/analyzer/issues/265).Benchmarking the generated
equal
functions directly (https://github.com/goblint/analyzer/blob/a21f33511183074c693c15269990210b283ae045/bench/deriving/benchEq.ml, executable independently from our project) showed up to 3 times slowdown. For example, when the old manual implementationwas replaced with one derived for
Int.t * String.t
. The modulesInt
andString
are just chosen as example, we often use functors where the modules come abstractly from arguments, so using primitive types isn't an option for deriving.Inefficient derived code
Currently, the following implementation is derived:
The benchmarks confirm that this literal code performs the same as the derived one.
Optimizations
This PR suggests two optimizations for forwarding the
equal
calls.Avoid unnecessary eta-expansion
PR #55 introduced eta-expansion to forwarded calls. This was necessary since derived implementations for mutual recursion in the generated
let rec
must be statically constructive (https://ocaml.org/manual/letrecvalues.html), so expanding a forwarding call into afun
works around the restriction.The problem in this case is that the eta-expansion is added to all forwarding calls, including those deeper in the resulting
equal
implementation, where the restriction doesn't apply any more because the outermost expression is afun
already. AFAIK, eta-expansion is only needed at the top level if it's not afun
already (most cases ofexpr_of_typ
are anyway). Therefore this PR implements exactly that. The linked benchmarks show a 50%…300% speedup from this change.Avoid unnecessary closures in quoting
Issue #57 and commit https://github.com/ocaml-ppx/ppx_deriving/commit/c3bee7b96fb048c1a2431629109b532637c30df7 introduced quoting to forwarded calls. The quoting mechanism introduces functions with
()
argument. There is no explanation why, but I'm guessing it's to be lazy: in case the quoted expression involves some computation, then don't perform it immediately at the beginning (it might not be necessary due to short-circuiting, etc) but perform it each time it's actually used, just like if the quoted expression where independently at each use.The problem in this case is that forward calls being quoted are just identifier expressions, which AFAIK cannot be performing any computation on their own. Therefore this PR implements a special case of quoting just identifiers to be without the
()
argument. The linked benchmarks show a 200%…400% speedup from this change on the pair examples. Just forwarding to a single function doesn't seem to gain any speedup, nor slowdown.Efficient derived code
As a result of these changes, the new derived code for the example above is:
And overall the derived implementations seem to be at least 3 times faster, bringing them mostly on par with the manual implementations.