Closed noughtmare closed 6 months ago
Should whnf in this case be defined as whnf = funcToBench seq?
I don't think whnf needs to be changed. Only nf, nfIO, and nfAppIO.
Well, it would be more symmetric to change both.
I'm sorry, I'm terribly overloaded at the moment. If you can take a bench suite of, say, bytestring or containers and check the effect of nf = funcToBench rnf, I'll be very grateful.
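For readers following along, the difference between the two candidates comes down to what the forcing function returns: force hands back the fully evaluated value, while rnf traverses it and returns unit. The sketch below is a toy model only. runLoop, nfForce and nfRnf are hypothetical names invented for illustration and are not tasty-bench's real funcToBench, which takes extra care that this simple loop does not (for example, an optimizing compiler may share f x across iterations here).

```haskell
import Control.DeepSeq (NFData, force, rnf)
import Control.Exception (evaluate)

-- Hypothetical stand-in for a benchmark loop (NOT tasty-bench's funcToBench):
-- apply f to x, pass the result to the forcing function frc, and evaluate
-- that to weak head normal form, n times in a row.
runLoop :: (b -> c) -> (a -> b) -> a -> Int -> IO ()
runLoop frc f x = go
  where
    go n
      | n <= 0 = pure ()
      | otherwise = evaluate (frc (f x)) >> go (n - 1)

-- force :: NFData b => b -> b
-- Returns the fully evaluated structure itself.
nfForce :: NFData b => (a -> b) -> a -> Int -> IO ()
nfForce = runLoop force

-- rnf :: NFData b => b -> ()
-- Traverses the structure and returns only unit.
nfRnf :: NFData b => (a -> b) -> a -> Int -> IO ()
nfRnf = runLoop rnf

main :: IO ()
main = do
  let build n = map (* 2) [1 .. n :: Int]
  nfForce build 100000 10
  nfRnf build 100000 10
  putStrLn "both loops finished"
```

Both variants evaluate the result completely; they differ only in whether the evaluated structure is returned to the loop afterwards.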
OK, I'll make the change and try it out on your existing suites.
Here are the raw results for bytestring with GHC 9.4.4:
rnf-bytestring-after.csv rnf-bytestring-before.csv
And the full output of the command line (including the baseline comparisons):
The most remarkable result is that unpack consistently takes about 200% more time. I will investigate that further.
Update: unpack allocates much more for some reason.
Andreas Klebinger has convinced me that using force is actually more representative of (worst-case) real-world behavior. And it is easy to get the other behavior by just including rnf in your benchmark. So perhaps all that needs to be done is to add some documentation explaining this.
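As a minimal sketch of what "including rnf in your benchmark" could look like, assuming an nf that forces its result with force: composing rnf into the benchmarked function means nf only ever sees a unit result. The countdown workload and the benchmark names are hypothetical and chosen purely for illustration.

```haskell
import Control.DeepSeq (rnf)
import Test.Tasty.Bench (bench, defaultMain, nf)

-- Hypothetical workload that produces a list.
countdown :: Int -> [Int]
countdown n = [n, n - 1 .. 1]

main :: IO ()
main = defaultMain
  [ -- The list itself is the benchmark result, so nf forces the list.
    bench "result forced by nf" $ nf countdown 10000
    -- rnf consumes the list inside the benchmarked function, so the
    -- result that nf has to force is just ().
  , bench "result consumed by rnf" $ nf (rnf . countdown) 10000
  ]
```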
Documentation patch is welcome.
I've updated documentation in 461919e.
In light of #48, I'm thinking that maybe it's worth changing nf after all.
-nf = funcToBench force
+nf = funcToBench rnf
Sometimes you do benchmark operations on lists and it makes sense to measure their allocation as well. However, what happens much more often in my practice is that a list is just an implementation detail to run a computation N times, like in
nf (\n -> map func [1..n]) 100
In such a case you really do not want to allocate that list; it just defeats the purpose by introducing more noise into the measurements of func.
However, this change would have to wait until tasty-bench-0.4, which in turn awaits the release of tasty-1.5, so probably sometime in September.
Changed in 3ce8baf8260582d10b0894bc2bbd6b3105b41a5a.
See my Discourse thread for the full story.
The short version is that this issue is reproduced by this program:
The result is:
Changing to criterion yields these wildly different results:
I've tracked it down in Core to this difference:
This shows that benchmark 2 retains x2 in memory during the normalization ($wgo).
I think this can be solved by using rnf instead of force: