aclements opened 1 year ago
The https://github.com/golang/go/issues/61515 proposal does include a pointer to the latest version in the top post. We tend not to do significant rewrites of the top post in a proposal because then it makes it hard to follow the conversation that follows it, and instead add updates to it linking to the comment explaining the latest version. There's no really ideal way to do this.
@aclements Adding an hr divider at the end, followed by an "Edit: Changed to [...], see these comments [...]" line(s) would be sufficient. This is comparable to the practice of reserving the first comment for FAQs and proposal updates that I've seen the Go team use elsewhere recently, which worked well.
However, not exposing Keep limits refactoring opportunities
Can you demonstrate an example using Keep?
Oops, I guess his example doesn't technically show partial optimization since there's only one argument to the function under test. Partial optimization would mix one (or more) arguments passed in the b.Loop body to the intermediate closure with one (or more) arguments passed directly from the intermediate closure. The former would not be constant-propagated, while the latter could be.
Can you demonstrate an example using Keep?
The auto-Keep during b.Loop could be applied the same in any loop that counts from 0 to b.N where b has type *testing.B. Then b.Loop is less special - the compiler just applies it to both kinds of benchmark loops.
If we make Keep auto-apply inside b.N loops, then b.Loop is no longer special, and converting a b.N loop to a b.Loop loop is not a performance change at all, so it seems like we should make Keep auto-apply inside b.N loops. That will also fix a very large number of Go microbenchmarks, although it will look like it slowed them down.
That leaves the question of whether to auto-Keep at all. If we don't, we leave a big mistake that users, even experienced ones, will make over and over. The compiler can do the right thing for us here.
Maybe it would help if someone could try a compiler implementation and see how difficult this is.
However, not exposing Keep limits refactoring opportunities
Can you demonstrate an example using Keep?
Suppose you have a function like
func Benchmark(b *testing.B) {
    for b.Loop() {
        ... complicated body ...
    }
}
In this form, we're proposing that we automatically apply Keep to "complicated body". However, if you were to refactor out parts of the complicated body into a new function:
func Benchmark(b *testing.B) {
    for b.Loop() {
        ... less complicated body, calls helper ...
    }
}

func helper() {
    ... parts of complicated body ...
}
The code in helper no longer gets Keep applied automatically, so the programmer would need to call Keep themselves.
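To make this concrete, here is a minimal sketch of the refactored shape. Since testing.Keep doesn't exist yet, it uses the runtime.KeepAlive-based keep stand-in that appears later in this thread; the helper body and its argument are invented for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"testing"
)

// keep is a best-effort stand-in for the proposed testing.Keep,
// borrowed from the keep example later in the thread.
func keep[T any](x T) T {
	runtime.KeepAlive(&x)
	return x
}

// helper is the refactored-out piece. Its body is no longer lexically
// inside the benchmark loop, so the proposed auto-Keep would not apply;
// it must keep its own result explicitly.
func helper(n int) {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i * i
	}
	keep(sum) // without this, the whole loop could be proved dead
}

// Benchmark uses a classic b.N loop here only because b.Loop is part of
// the proposal under discussion; the point is the same either way.
func Benchmark(b *testing.B) {
	for i := 0; i < b.N; i++ {
		helper(100)
	}
}

func main() {
	helper(100)
	fmt.Println("ok")
}
```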
Partial optimization would mix one (or more) arguments passed in the b.Loop body to the intermediate closure with one (or more) arguments passed directly from the intermediate closure.
Can you demonstrate an example using Keep?
I guess you don't actually need Keep to express partial optimization. Here's an example of partial optimization, where one argument to the benchmarked function can be constant-propagated, but the other is treated as a variable even though it's a constant.
func BenchmarkPowPartialConstant(b *testing.B) {
    xPow20 := func(a float64) float64 {
        return math.Pow(a, 20)
    }
    for b.Loop() {
        // Compute 5 ** 20. This allows the '20' to be constant-propagated,
        // but treats the '5' as a variable.
        xPow20(5)
    }
}
Another way to express this does use Keep:
func BenchmarkPowPartialConstant(b *testing.B) {
    xPow := func() float64 {
        return math.Pow(testing.Keep(5), 20)
    }
    for b.Loop() {
        // Compute 5 ** 20. This is defined separately to avoid auto-Keep
        // so the '20' can be constant-propagated.
        xPow()
    }
}
In both of these, xPow returns the result of math.Pow, so auto-Keep in the loop keeps the result. Otherwise, the call to math.Pow could be eliminated entirely since it's pure.
Here is an ergonomic thing that doesn't seem to have been considered in the above discussion. When the benchmarked function has multiple input parameters and return values (of different types):
for i := 0; i < b.N; i++ {
    bb, _ = bbhash.New(gamma, partitions, keys)
}
It would be nice to be able to write this...
for i := 0; i < b.N; i++ {
    testing.Keep(bbhash.New(gamma, partitions, keys))
}
However, since the return types may be different, the proposed Keep function won't work for the above scenario.
A variadic variant like the one below would work for the return types:
func Keep(v ...any) []any {
    return v
}
But it doesn't work for the input arguments if one wants to avoid testing.Keep for each argument. Maybe that's not a relevant concern?
Just curious: is it required to let Keep return values?
[edit] okay, it looks like the answer is yes.
@meling, that's an interesting point. Unfortunately, while I agree that makes Keep a little more ergonomic for that one case, it seems to make it less ergonomic for any case that needs to use the result of Keep, which includes using it on arguments to functions and using the return value from functions.
For example, to fully write out your example with the signature you proposed, you'd need to write:
for i := 0; i < b.N; i++ {
    r := testing.Keep(bbhash.New(gamma, partitions, keys))
    bb = r[0].(something)
}
That type assertion will also add a little benchmarking overhead.
With the original proposed signature, you would write this as:
for i := 0; i < b.N; i++ {
    bb, _ = bbhash.New(gamma, partitions, keys)
    testing.Keep(bb)
}
That doesn't seem very bad to me.
Also, the idea with auto-Keep is that in these cases you wouldn't have to write Keep at all.
Yeah, I agree. testing.Keep(bb) is definitely reasonable.
The main reason I brought it up was that it didn't seem like it was discussed. Moreover, I had some initial idea that perhaps it would be valuable to have some form of "type matching" (for lack of a better term):
func Keep[T any{0,3}](v T) T
Where the T would match 0-3 input and corresponding output types, which could be different types (not sure this notation makes that clear enough, though). Something like this could also be helpful for the iter.Seq definition being considered.
However, I removed it since I didn't want to distract the first message with such a "controversial" proposal 😅... Don't know if this is worth writing up as a separate proposal.
Placed on hold. — rsc for the proposal review group
To be clear, this is on hold pending an experimental implementation, per https://github.com/golang/go/issues/61179#issuecomment-1728164629
Just to loop back here, the proposal is to both:
1. Add Keep to the testing package.
2. Automatically apply Keep in some situations.
Number 1 seems pretty easy. Number 2 could be pretty difficult. This proposal is on hold for someone to prototype 2.
The question is, would we want to do 1 without also doing 2? Particularly, I think we could do 1 for 1.22 if we want. Getting 2 in for 1.22 looks more speculative.
But then there is a problem: having 1 and not 2 will encourage people to add lots of explicit testing.Keep calls that are unnecessary once 2 lands (because they would have been auto-added). So it sounds like a recipe for unnecessary churn. Hence, my thought is we hold on 1 as well until 2 is ready.
But then there is a problem: having 1 and not 2 will encourage people to add lots of explicit testing.Keep calls that are unnecessary once 2 lands (because they would have been auto-added). So it sounds like a recipe for unnecessary churn. Hence, my thought is we hold on 1 as well until 2 is ready.

Isn't this the case for runtime.KeepAlive already today? Benchmarks can call it or just use custom sink solutions to thwart optimizations.
IMHO we can do (1) first, in 1.22. Whatever calls to testing.Keep people will add are likely to fix many benchmarks, which is a positive development overall. Unless we plan to do (2) without doing (1) at all (exposing it to users), I vote for (1) in 1.22.
Isn't this the case for runtime.KeepAlive already today?
It is for function results, true. It can't be used to solve the function arg problem.
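The asymmetry is easy to see in code: runtime.KeepAlive returns nothing, so it can mark a result live after the fact but can't wrap an argument expression inline the way the proposed Keep can. A small sketch (the double function is invented for illustration):

```go
package main

import (
	"fmt"
	"runtime"
)

func double(x int) int { return 2 * x }

func main() {
	r := double(10)
	runtime.KeepAlive(r) // fine: forces the result to be considered used

	// double(runtime.KeepAlive(10)) // does not compile: KeepAlive has no result,
	// so it cannot shield an argument from constant propagation.

	fmt.Println(r) // 20
}
```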
I'm concerned about losing momentum on this. Is there someone who is planning on doing the prototype work for auto-Keep?
If it is going to be years before we get auto-Keep, I'd much rather we decouple these and land Keep now. Then we at least have a path forward to eliminate the hacky workarounds (KeepAlive, sinks) while we wait for the better thing which may or may not happen.
Isn't this the case for runtime.KeepAlive already today?
It is for function results, true. It can't be used to solve the function arg problem.
Yes, my argument isn't that runtime.KeepAlive gives us what we want; on the contrary, I'm arguing that the problem of "people will just splat Keep all over the place" already exists today with runtime.KeepAlive, so I don't think this is a good reason to delay shipping testing.Keep as soon as we can.
@eliben and @cespare, I hear your concern, but there are only a few weeks left in the Go 1.22 cycle, Austin is away, and we don't have consensus on the actual design yet, nor an implementation. This will have to wait for Go 1.23.
In rare cases, this affects test output also, where optimizations in the test code cause subtle changes in floating-point output. This is a reduced test case from a much larger program:
package main

import (
    "fmt"
    "runtime"
)

func keep[T any](x T) T {
    runtime.KeepAlive(&x)
    return x
}

func main() {
    fmt.Printf("%g\n", foo(keep(0.85)))
    fmt.Printf("%g\n", foo(0.85))
}

func foo(x float64) float64 {
    return 3*x - 2
}
On ARM64, this outputs two different numbers (depending on whether keep is used):
0.5499999999999999
0.5499999999999998
It can't be used to solve the function arg problem.

runtime.KeepAlive(&arg) seems to work, as in the example above.
Benchmarks frequently need to prevent certain compiler optimizations that may optimize away parts of the code the programmer intends to benchmark. Usually, this comes up in two situations where the benchmark use of an API is slightly artificial compared to a “real” use of the API. The following example comes from @davecheney's 2013 blog post, How to write benchmarks in Go, and demonstrates both issues:
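The example itself didn't survive above; it is along these lines (reconstructed from the blog post's well-known recursive Fib benchmark, so treat the details as approximate). The main wrapper is only there to make the sketch self-contained outside "go test":

```go
package main

import (
	"fmt"
	"testing"
)

// Fib computes Fibonacci numbers recursively, as in the blog post.
func Fib(n int) int {
	if n < 2 {
		return n
	}
	return Fib(n-1) + Fib(n-2)
}

// BenchmarkFib10 discards Fib's result and passes a constant argument,
// exhibiting both problems described below.
func BenchmarkFib10(b *testing.B) {
	for n := 0; n < b.N; n++ {
		Fib(10)
	}
}

func main() {
	fmt.Println(Fib(10)) // 55
}
```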
1. Most commonly, the result of the function under test is not used because we only care about its timing. In the example, since Fib is a pure function, the compiler could optimize away the call completely. Indeed, in "real" code, the compiler would often be expected to do exactly this. But in benchmark code, we're interested only in the side-effect of the function's timing, which this optimization would destroy.
2. An argument to the function under test may be unintentionally constant-folded into the function. In the example, even if we addressed the first issue, the compiler may compute Fib(10) entirely at compile time, again destroying the benchmark. This is more subtle because sometimes the intent is to benchmark a function with a particular constant-valued argument, and sometimes the constant argument is simply a placeholder.
There are ways around both of these, but they are difficult to use and tend to introduce overhead into the benchmark loop. For example, a common workaround is to add the result of the call to an accumulator. However, there’s not always a convenient accumulator type, this introduces some overhead into the loop, and the benchmark must then somehow ensure the accumulator itself doesn’t get optimized away.
In both cases, these optimizations can be partial, where part of the function under test is optimized away and part isn’t, as demonstrated in @eliben’s example. This is particularly subtle because it leads to timings that are incorrect but also not obviously wrong.
Proposal
I propose we add the following function to the testing package:
(This proposal is an expanded and tweaked version of @randall77’s comment.)
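The proposed function and the corrected example didn't survive above; here is a sketch consistent with the discussion. The signature matches what the thread describes (a generic identity function), but the stand-in body using runtime.KeepAlive is my approximation: a real testing.Keep would need compiler support.

```go
package main

import (
	"fmt"
	"runtime"
	"testing"
)

// Keep sketches the proposed API: it returns v unchanged but is opaque
// to the optimizer. This body is only a best-effort stand-in.
func Keep[T any](v T) T {
	runtime.KeepAlive(&v)
	return v
}

func Fib(n int) int {
	if n < 2 {
		return n
	}
	return Fib(n-1) + Fib(n-2)
}

// BenchmarkFib10 sketches the corrected example: Keep on the argument
// prevents constant-folding the 10, and Keep on the result prevents the
// pure call from being eliminated as dead code.
func BenchmarkFib10(b *testing.B) {
	for n := 0; n < b.N; n++ {
		Keep(Fib(Keep(10)))
	}
}

func main() {
	fmt.Println(Keep(Fib(10))) // 55
}
```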
The Keep function can be used on the result of a function under test, on arguments, or even on the function itself. Using Keep, the corrected version of the example would be:

(Or testing.Keep(Fib)(10), but this is subtle enough that I don't think we should recommend this usage.)

Unlike various other solutions, Keep also lets the benchmark author choose whether to treat an argument as constant or not, making it possible to benchmark expected constant folding.

Alternatives
Keep may not be the best name. This is essentially equivalent to Rust's black_box, and we could call it testing.BlackBox. Other options include Opaque, NoOpt, Used, and Sink.

#27400 asks for documentation of best practices for avoiding unwanted optimization. While we could document workarounds, the basic problem is Go doesn't currently have a good way to write benchmarks that run afoul of compiler optimizations.
#48768 proposes testing.Iterate, which forces evaluation of all arguments and results of a function, in addition to abstracting away the b.N loop, which is another common benchmarking mistake. However, its heavy use of reflection would be difficult to make zero or even low overhead, and it lacks static type-safety. It also seems likely that users would often just pass a func() with the body of the benchmark, negating its benefits for argument and result evaluation.

runtime.KeepAlive can be used to force evaluation of the result of a function under test. However, this isn't the intended use and it's not clear how this might interact with future optimizations to KeepAlive. It also can't be used for arguments because it doesn't return anything. @cespare has some arguments against KeepAlive in this comment.