Closed harendra-kumar closed 3 years ago
Is it `RTSStats.max_mem_in_use_bytes`? Sorry, I did not read the title :)
@harendra-kumar could you please give a try to 22c1fde?
You may want to point out in the doc notes that this is the memory used by the application code; the real memory used by the process may be higher than this (usually up to double) because some memory is held as garbage until GC runs.
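For reference, a minimal sketch of reading these counters directly from `GHC.Stats`. It assumes the program is run with `+RTS -T` (or built with `-with-rtsopts=-T`) so that statistics are collected; `bytesToMiB` is a hypothetical helper, not something from tasty-bench. Printing `max_live_bytes` next to `max_mem_in_use_bytes` shows how far live data and the memory held by the RTS can diverge.

```haskell
import Data.Word (Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Hypothetical helper: convert a byte count to MiB.
bytesToMiB :: Word64 -> Double
bytesToMiB b = fromIntegral b / (1024 * 1024)

main :: IO ()
main = do
  -- getRTSStats throws if statistics were not enabled with +RTS -T.
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats are disabled; rerun with +RTS -T"
    else do
      stats <- getRTSStats
      -- Peak memory in use by the RTS over the run so far.
      putStrLn $ "max_mem_in_use_bytes: "
        ++ show (bytesToMiB (max_mem_in_use_bytes stats)) ++ " MiB"
      -- Peak live data, updated after major GCs.
      putStrLn $ "max_live_bytes:       "
        ++ show (bytesToMiB (max_live_bytes stats)) ++ " MiB"
```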
I compared what gauge reports (which is the same as what the `time -v` command reports) with `tasty-bench`.
gauge:
Prelude.Serial(maxrss)
Benchmark                                                             default(MiB)
--------------------------------------------------------------------  ------------
Prelude.Serial/o-n-stack/iterated/takeAll (n/10 x 10)                 9.55
Prelude.Serial/o-n-stack/iterated/tail                                8.96
Prelude.Serial/o-n-stack/iterated/scanl1' (n/10 x 10)                 9.50
Prelude.Serial/o-n-stack/iterated/scanl' (quadratic) (n/100 x 100)    8.41
Prelude.Serial/o-n-stack/iterated/nullHeadTail                        8.85
Prelude.Serial/o-n-stack/iterated/mapM (n/10 x 10)                    9.50
Prelude.Serial/o-n-stack/iterated/filterEven (n/10 x 10)              9.45
Prelude.Serial/o-n-stack/iterated/dropWhileTrue (n/10 x 10)           9.73
Prelude.Serial/o-n-stack/iterated/dropWhileFalse (n/10 x 10)          9.66
tasty-bench:
Prelude.Serial(maxrss)
Benchmark                                                             default(MiB)
--------------------------------------------------------------------  ------------
Prelude.Serial/o-n-stack.iterated.takeAll (n/10 x 10)                 4.00
Prelude.Serial/o-n-stack.iterated.tail                                4.00
Prelude.Serial/o-n-stack.iterated.scanl1' (n/10 x 10)                 4.00
Prelude.Serial/o-n-stack.iterated.scanl' (quadratic) (n/100 x 100)    3.00
Prelude.Serial/o-n-stack.iterated.nullHeadTail                        3.00
Prelude.Serial/o-n-stack.iterated.mapM (n/10 x 10)                    4.00
Prelude.Serial/o-n-stack.iterated.filterEven (n/10 x 10)              4.00
Prelude.Serial/o-n-stack.iterated.dropWhileTrue (n/10 x 10)           4.00
Prelude.Serial/o-n-stack.iterated.dropWhileFalse (n/10 x 10)          4.00
Dunno, `tasty-bench` results are pretty consistent with Activity Monitor on my machine. I clarified that data is reported according to `RTSStats`.
It will be useful to know how much memory an application being benchmarked could be holding at peak. Ideally we want `maxrss` as seen by the OS. But I guess we can trust the GHC RTS stats; more importantly, it's easy to implement.
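For comparison, one way to get the peak resident set size as seen by the OS might look like the sketch below. It assumes a Linux system, where `/proc/self/status` contains a `VmHWM` line (peak RSS in kB); it will not work on macOS or Windows, and `parseVmHWM` is a hypothetical helper rather than anything gauge or tasty-bench provides.

```haskell
import Data.List (isPrefixOf)
import Data.Maybe (listToMaybe)
import Text.Read (readMaybe)

-- | Hypothetical helper: extract the peak resident set size (in kB)
-- from the contents of /proc/self/status (Linux only).
parseVmHWM :: String -> Maybe Int
parseVmHWM contents = do
  line <- listToMaybe (filter ("VmHWM:" `isPrefixOf`) (lines contents))
  case words line of
    (_ : kb : _) -> readMaybe kb   -- e.g. ["VmHWM:", "4096", "kB"]
    _            -> Nothing

main :: IO ()
main = do
  -- Assumption: Linux procfs is available; readFile throws otherwise.
  status <- readFile "/proc/self/status"
  case parseVmHWM status of
    Just kb -> putStrLn $ "peak RSS: "
                 ++ show (fromIntegral kb / 1024 :: Double) ++ " MiB"
    Nothing -> putStrLn "VmHWM not found in /proc/self/status"
```

Unlike the RTS counters, this figure covers the whole process, so it should be closer to what `time -v` reports.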