Bodigrim / tasty-bench

Featherlight benchmark framework, drop-in replacement for criterion and gauge.
https://hackage.haskell.org/package/tasty-bench
MIT License

Report RTSStats.max_mem_in_use_bytes #20

Closed harendra-kumar closed 3 years ago

harendra-kumar commented 3 years ago

It would be useful to know how much memory a benchmarked application holds at peak. Ideally we want maxrss as seen by the OS, but I guess we can trust the GHC RTS stats; more importantly, it's easy to implement.

Bodigrim commented 3 years ago

Is it RTSStats.max_mem_in_use_bytes? Sorry, I did not read the title :)
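
For reference, a minimal sketch of reading this field directly from `GHC.Stats`. It is not the tasty-bench implementation; the helper names `peakMemInUse` and `bytesToMiB` and the workload in `main` are hypothetical, chosen only for illustration. It assumes the program is built with `-rtsopts` and run with `+RTS -T` (or built with `"-with-rtsopts=-T"`), since the RTS does not collect these statistics otherwise.

```haskell
import Data.Word (Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Convert a byte count to mebibytes (hypothetical helper).
bytesToMiB :: Word64 -> Double
bytesToMiB b = fromIntegral b / (1024 * 1024)

-- | Peak memory in use as recorded by the RTS, or Nothing when
-- statistics collection is disabled (hypothetical helper).
peakMemInUse :: IO (Maybe Word64)
peakMemInUse = do
  enabled <- getRTSStatsEnabled
  if enabled
    then Just . max_mem_in_use_bytes <$> getRTSStats
    else pure Nothing

main :: IO ()
main = do
  -- Placeholder workload standing in for the benchmarked code.
  print (sum [1 .. 1000000 :: Int])
  peak <- peakMemInUse
  case peak of
    Nothing -> putStrLn "RTS stats disabled; rerun with +RTS -T"
    Just b  -> putStrLn ("peak memory in use: " ++ show (bytesToMiB b) ++ " MiB")
```

Checking `getRTSStatsEnabled` first avoids the exception that `getRTSStats` throws when statistics are not being collected.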

Bodigrim commented 3 years ago

@harendra-kumar could you please give 22c1fde a try?

harendra-kumar commented 3 years ago

You may want to point out in the doc notes that this is the memory used by the application code; the real memory used by the process may be higher than this (usually up to double) because some memory is held as garbage until GC runs.

I compared what gauge reports (which is the same as what the time -v command reports) with tasty-bench.

gauge:

Prelude.Serial(maxrss)
Benchmark                                                          default(MiB)
------------------------------------------------------------------ ------------
Prelude.Serial/o-n-stack/iterated/takeAll (n/10 x 10)                      9.55
Prelude.Serial/o-n-stack/iterated/tail                                     8.96
Prelude.Serial/o-n-stack/iterated/scanl1' (n/10 x 10)                      9.50
Prelude.Serial/o-n-stack/iterated/scanl' (quadratic) (n/100 x 100)         8.41
Prelude.Serial/o-n-stack/iterated/nullHeadTail                             8.85
Prelude.Serial/o-n-stack/iterated/mapM (n/10 x 10)                         9.50
Prelude.Serial/o-n-stack/iterated/filterEven (n/10 x 10)                   9.45
Prelude.Serial/o-n-stack/iterated/dropWhileTrue (n/10 x 10)                9.73
Prelude.Serial/o-n-stack/iterated/dropWhileFalse (n/10 x 10)               9.66

tasty-bench:

Prelude.Serial(maxrss)
Benchmark                                                          default(MiB)
------------------------------------------------------------------ ------------
Prelude.Serial/o-n-stack.iterated.takeAll (n/10 x 10)                      4.00
Prelude.Serial/o-n-stack.iterated.tail                                     4.00
Prelude.Serial/o-n-stack.iterated.scanl1' (n/10 x 10)                      4.00
Prelude.Serial/o-n-stack.iterated.scanl' (quadratic) (n/100 x 100)         3.00
Prelude.Serial/o-n-stack.iterated.nullHeadTail                             3.00
Prelude.Serial/o-n-stack.iterated.mapM (n/10 x 10)                         4.00
Prelude.Serial/o-n-stack.iterated.filterEven (n/10 x 10)                   4.00
Prelude.Serial/o-n-stack.iterated.dropWhileTrue (n/10 x 10)                4.00
Prelude.Serial/o-n-stack.iterated.dropWhileFalse (n/10 x 10)               4.00

Bodigrim commented 3 years ago

Dunno, tasty-bench results are pretty consistent with Activity Monitor on my machine. I clarified that the data is reported according to RTSStats.