Bodigrim / tasty-bench

Featherlight benchmark framework, drop-in replacement for criterion and gauge.
https://hackage.haskell.org/package/tasty-bench
MIT License
80 stars 11 forks source link

Spurious allocations reported with --csv option #19

Closed harendra-kumar closed 3 years ago

harendra-kumar commented 3 years ago

From https://github.com/composewell/streamly/pull/1042.

It seems when I use --csv option it shows more allocations and bytesCopied:

$ cabal run bench:Data.Fold -- --stdev 1000000 +RTS -T -K36K -M36M -RTS -p /elimination.product/ --csv=fold.csv
Up to date
All
  Data.Fold/o-1-space
    elimination
      product: OK
         70 μs ± 112 μs,  13 KB allocated, 5.9 KB copied

All 1 tests passed (0.01s)
$ cabal run bench:Data.Fold -- --stdev 1000000 +RTS -T -K36K -M36M -RTS -p /elimination.product/
Up to date
All
  Data.Fold/o-1-space
    elimination
      product: OK
         58 μs ±  88 μs,   0 B  allocated,   0 B  copied

All 1 tests passed (0.01s)
Bodigrim commented 3 years ago

Hmm. Surprisingly, I cannot reproduce:

$ git last
* 3feeec2e 2021-06-08 | Check for GNU awk (HEAD -> master, origin/master, origin/HEAD) [Harendra Kumar]

$ git st
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

$ cabal update
Downloading the latest package list from hackage.haskell.org
Updated package list of hackage.haskell.org to the index-state 2021-06-08T22:16:53Z
To revert to previous state run:
    cabal v2-update 'hackage.haskell.org,2021-06-06T19:14:45Z'

$ cabal run bench:Data.Fold -- --stdev 1000000 +RTS -T -K36K -M36M -RTS -p /elimination.product/ --csv=fold.csv
Up to date
All
  Data.Fold/o-1-space
    elimination
      product: OK (0.04s)
         48 μs ± 108 μs,   0 B  allocated,   0 B  copied

All 1 tests passed (0.05s)

$ cabal run bench:Data.Fold -- --stdev 1000000 +RTS -T -K36K -M36M -RTS -p /elimination.product/
Up to date
All
  Data.Fold/o-1-space
    elimination
      product: OK (0.05s)
         49 μs ± 110 μs,   0 B  allocated,   0 B  copied

All 1 tests passed (0.05s)
Bodigrim commented 3 years ago

This was with GHC 8.10.4. GHC 9.0.1 allocates a ton anyway.

FWIW I've added a few strict dashes (e. g., Data.IntMap.Strict) in the last commits.

harendra-kumar commented 3 years ago

I am able to reproduce it consistently, did it again:

$ cabal run bench:Data.Fold -- --stdev 1000000 +RTS -T -K36K -M36M -RTS -p /elimination.product/ --csv=fold.csv
Up to date
All
  Data.Fold/o-1-space
    elimination
      product: OK
         70 μs ± 113 μs,  13 KB allocated, 5.8 KB copied

All 1 tests passed (0.01s)
$ cabal run bench:Data.Fold -- --stdev 1000000 +RTS -T -K36K -M36M -RTS -p /elimination.product/
Up to date
All
  Data.Fold/o-1-space
    elimination
      product: OK
         58 μs ±  87 μs,   0 B  allocated,   0 B  copied

All 1 tests passed (0.01s)
$ cabal run bench:Data.Fold -- --stdev 1000000 +RTS -T -K36K -M36M -RTS -p /elimination.product/ --csv=fold.csv
Up to date
All
  Data.Fold/o-1-space
    elimination
      product: OK
         69 μs ± 112 μs,  13 KB allocated, 5.8 KB copied

All 1 tests passed (0.01s)

I am using Linux/Nix, GHC-8.8.3. Need to do more digging.

harendra-kumar commented 3 years ago

Tried the latest master branch. Same results:

$ cabal run bench:Data.Fold -- --stdev 1000000 +RTS -T -K36K -M36M -RTS -p /elimination.product/ --csv=fold.csv
Up to date
All
  Data.Fold/o-1-space
    elimination
      product: OK
        69.8 μs ± 114 μs,  13 KB allocated, 5.8 KB copied, 2.0 MB peak memory

All 1 tests passed (0.01s)

I can see the peak-memory now, nice! Thanks!

Bodigrim commented 3 years ago

Checked on Ubuntu machine with ghc-8.8.3, still no luck to reproduce. Any specific flags you used to build streamly? Here is my cabal.project.freeze:

active-repositories: hackage.haskell.org:merge
constraints: any.Cabal ==3.0.1.0,
             any.HUnit ==1.6.2.0,
             any.QuickCheck ==2.14.2,
             QuickCheck -old-random +templatehaskell,
             any.abstract-deque ==0.3,
             abstract-deque -usecas,
             any.ansi-terminal ==0.11,
             ansi-terminal -example,
             any.ansi-wl-pprint ==0.6.9,
             ansi-wl-pprint -example,
             any.array ==0.5.4.0,
             any.async ==2.2.3,
             async -bench,
             any.atomic-primops ==0.8.4,
             atomic-primops -debug,
             any.base ==4.13.0.0,
             any.base-orphans ==0.8.4,
             any.binary ==0.8.7.0,
             any.bytestring ==0.10.10.0,
             any.cabal-doctest ==1.0.8,
             any.call-stack ==0.4.0,
             any.clock ==0.8.2,
             clock -llvm,
             any.colour ==2.3.5,
             any.containers ==0.6.2.1,
             any.deepseq ==1.4.4.0,
             any.directory ==1.3.6.0,
             any.exceptions ==0.10.4,
             exceptions +transformers-0-4,
             any.filepath ==1.4.2.1,
             any.fusion-plugin-types ==0.1.0,
             any.ghc ==8.8.3,
             any.ghc-boot ==8.8.3,
             any.ghc-boot-th ==8.8.3,
             any.ghc-heap ==8.8.3,
             any.ghc-prim ==0.5.3,
             any.ghci ==8.8.3,
             any.hashable ==1.3.2.0,
             hashable +integer-gmp -random-initial-seed,
             any.heaps ==0.4,
             any.hpc ==0.6.0.3,
             any.hsc2hs ==0.68.7,
             hsc2hs -in-ghc-tree,
             any.hspec ==2.8.2,
             any.hspec-core ==2.8.2,
             any.hspec-discover ==2.8.2,
             any.hspec-expectations ==0.8.2,
             any.integer-gmp ==1.0.2.0,
             any.lockfree-queue ==0.2.3.1,
             any.monad-control ==1.0.2.3,
             any.mtl ==2.2.2,
             any.network ==3.1.2.1,
             network -devel,
             any.optparse-applicative ==0.16.1.0,
             optparse-applicative +process,
             any.parsec ==3.1.14.0,
             any.pretty ==1.1.3.6,
             any.primitive ==0.7.1.0,
             any.process ==1.6.8.0,
             any.quickcheck-io ==0.2.0,
             any.random ==1.2.0,
             any.rts ==1.0,
             any.setenv ==0.1.1.3,
             any.splitmix ==0.1.0.3,
             splitmix -optimised-mixer,
             any.stm ==2.5.0.0,
             streamly -debug -dev -doctests -fusion-plugin -has-llvm -inspection -no-fusion +opt -streamk -use-c-malloc,
             streamly-benchmarks -debug -dev -fusion-plugin -has-llvm -inspection -limit-build-mem -no-charts +opt -use-gauge,
             streamly-tests -dev -fusion-plugin -has-llvm -limit-build-mem +opt,
             any.tagged ==0.8.6.1,
             tagged +deepseq +transformers,
             any.tasty ==1.4.1,
             tasty +clock +unix,
             any.tasty-bench ==0.2.5,
             any.template-haskell ==2.15.0.0,
             any.temporary ==1.3,
             any.terminfo ==0.4.1.4,
             any.text ==1.2.4.0,
             any.tf-random ==0.5,
             any.time ==1.9.3,
             any.transformers ==0.5.6.2,
             any.transformers-base ==0.4.5.2,
             transformers-base +orphaninstances,
             any.transformers-compat ==0.6.6,
             transformers-compat -five +five-three -four +generic-deriving +mtl -three -two,
             any.typed-process ==0.2.6.0,
             any.unbounded-delays ==0.1.1.1,
             any.unix ==2.7.2.2,
             any.unliftio-core ==0.2.0.1,
             any.wcwidth ==0.0.2,
             wcwidth -cli +split-base
index-state: hackage.haskell.org 2021-06-10T09:42:45Z
Bodigrim commented 3 years ago

@harendra-kumar any ideas how to reproduce?

harendra-kumar commented 3 years ago

Tried again, not able to reproduce any more. I will reopen it if I am able to reproduce it again.