Avoid only a subset of benchmarks being specialized by SpecConstr.

haskell / criterion

A powerful but simple library for measuring the performance of Haskell code.

http://www.serpentine.com/criterion

BSD 2-Clause "Simplified" License

503 stars 86 forks source link

Avoid only a subset of benchmarks being specialized by SpecConstr. #270

Closed AndreasPK closed 1 year ago

AndreasPK commented 1 year ago

Instead of copying the contents I will just link to the tasty-bench ticket I opened about this issue:

https://github.com/Bodigrim/tasty-bench/issues/44

I believe criterion suffers from the same issue and could likely adopt a similar solution.

RyanGlScott commented 1 year ago

I'm not sure that I completely follow the details of that issue. Some questions:

It sounds like the functions that would be affected in criterion are nf' and whnf'. Is that right?
Would it suffice to add SPEC arguments to the inner go loops of each functions, similarly to how it is done in https://github.com/Bodigrim/tasty-bench/commit/9c8699014b9ab449e442c15a9ef11473a730a831?
How can I verify that these changes fix the underlying issue?

AndreasPK commented 1 year ago

It sounds like the functions that would be affected in criterion are nf' and whnf'. Is that right?

That seems right.

Would it suffice to add SPEC arguments to the inner go loops of each functions, similarly to how it is done in Bodigrim/tasty-bench@9c86990?

That also sounds right.

How can I verify that these changes fix the underlying issue?

I observed the differences with the bytestring benchmarks, in particular the Module BenchIndices. https://github.com/haskell/bytestring/blob/0bd68caff04db33923f6da532c6b87420a3b98b7/bench/BenchIndices.hs#L4 While in bytestring this module relies on tasty-bench I assume the same benchmark could be compiled using Criterion as a test case. The only real way to then check if it's affected/fixed is by inspecting the resulting core.

RyanGlScott commented 1 year ago

I've opened #271 with a patch. I also modified bytestring's benchmarks to depend on criterion and recorded the -ddump-simpl output for the BenchIndices.hs file before and after these changes. I saw identical Core in both cases, however:

--- BenchIndices.before.dump-simpl      2023-02-16 08:01:09.809717979 -0500
+++ BenchIndices.after.dump-simpl       2023-02-16 08:03:46.005978864 -0500
@@ -1,6 +1,6 @@

 ==================== Tidy Core ====================
-2023-02-16 12:58:58.386526068 UTC
+2023-02-16 13:03:13.701533361 UTC

 Result size of Tidy Core
   = {terms: 1,112, types: 889, coercions: 216, joins: 1/23}

Perhaps using criterion alters optimization just enough to where the differences in specialization no longer occur. Still, it seems like it would be worthwhile to land #271 just in case.

RyanGlScott commented 1 year ago

I've uploaded criterion-measurement-0.2.1.0 to Hackage with a fix.

AndreasPK commented 1 year ago

TLDR: There is no harm in this patch but I looked closer today and it seems criterion wasn't affected by the same issue as tasty.

GHC doesn't do cross-module SpecConstr. Unlike in tasty where https://github.com/Bodigrim/tasty-bench/blob/fd30b627a6d6c12f98f071e8372315b81dd4b57e/src/Test/Tasty/Bench.hs#LL1227C9-L1229C8 funcToBench is marked as INLINE in criterion the equivalent function whnf' (as far as I can tell) is marked as NOINLINE.

If it's never inlined it will never be specialized to any particular function because GHC doesn't do cross-module spec-constr so the whole issue doesn't arise. Sorry for the noise.