haskell / criterion

A powerful but simple library for measuring the performance of Haskell code.
http://www.serpentine.com/criterion
BSD 2-Clause "Simplified" License
503 stars 86 forks source link

Avoid only a subset of benchmarks being specialized by SpecConstr. #270

Closed AndreasPK closed 1 year ago

AndreasPK commented 1 year ago

Instead of copying the contents I will just link to the tasty-bench ticket I opened about this issue:

https://github.com/Bodigrim/tasty-bench/issues/44

I believe criterion suffers from the same issue and could likely adopt a similar solution.

RyanGlScott commented 1 year ago

I'm not sure that I completely follow the details of that issue. Some questions:

AndreasPK commented 1 year ago
  • It sounds like the functions that would be affected in criterion are nf' and whnf'. Is that right?

That seems right.

  • Would it suffice to add SPEC arguments to the inner go loops of each functions, similarly to how it is done in Bodigrim/tasty-bench@9c86990?

That also sounds right.

  • How can I verify that these changes fix the underlying issue?

I observed the differences with the bytestring benchmarks, in particular the Module BenchIndices. https://github.com/haskell/bytestring/blob/0bd68caff04db33923f6da532c6b87420a3b98b7/bench/BenchIndices.hs#L4 While in bytestring this module relies on tasty-bench I assume the same benchmark could be compiled using Criterion as a test case. The only real way to then check if it's affected/fixed is by inspecting the resulting core.

RyanGlScott commented 1 year ago

I've opened #271 with a patch. I also modified bytestring's benchmarks to depend on criterion and recorded the -ddump-simpl output for the BenchIndices.hs file before and after these changes. I saw identical Core in both cases, however:

--- BenchIndices.before.dump-simpl      2023-02-16 08:01:09.809717979 -0500
+++ BenchIndices.after.dump-simpl       2023-02-16 08:03:46.005978864 -0500
@@ -1,6 +1,6 @@

 ==================== Tidy Core ====================
-2023-02-16 12:58:58.386526068 UTC
+2023-02-16 13:03:13.701533361 UTC

 Result size of Tidy Core
   = {terms: 1,112, types: 889, coercions: 216, joins: 1/23}

Perhaps using criterion alters optimization just enough to where the differences in specialization no longer occur. Still, it seems like it would be worthwhile to land #271 just in case.

RyanGlScott commented 1 year ago

I've uploaded criterion-measurement-0.2.1.0 to Hackage with a fix.

AndreasPK commented 1 year ago

TLDR: There is no harm in this patch but I looked closer today and it seems criterion wasn't affected by the same issue as tasty.

GHC doesn't do cross-module SpecConstr. Unlike in tasty where https://github.com/Bodigrim/tasty-bench/blob/fd30b627a6d6c12f98f071e8372315b81dd4b57e/src/Test/Tasty/Bench.hs#LL1227C9-L1229C8 funcToBench is marked as INLINE in criterion the equivalent function whnf' (as far as I can tell) is marked as NOINLINE.

If it's never inlined it will never be specialized to any particular function because GHC doesn't do cross-module spec-constr so the whole issue doesn't arise. Sorry for the noise.