NightMachinery closed this 2 years ago
:wave:
Ha! The lack of benchmarking tools in Python is one of the banes of my existence... so much so that one of my long-term plans/ideas is taking the reusable parts of benchee, putting them into a reusable binary, and then just reimplementing the runner for different languages.
Ok, sorry.
Is it possible? Yes! You can totes just call System.cmd
Should you do it? Probably not.
For 2. and 3., I go into more detail here: https://pragtob.wordpress.com/2017/08/29/careful-what-you-measure-2-1-times-slower-to-4-2-times-faster-mjit-versus-truffle-ruby/ ("What time are we measuring?" section).
Hope this helps!
Ah, one more: A very simple benchmarking implementation isn't too hard, see for instance https://github.com/PragTob/rubykon/tree/main/lib/benchmark which is less than 200 LOC for what I think of as a passable, minimally viable benchmarking library :)
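To make "not too hard" concrete, here is a rough sketch in Python of what a bare-minimum harness for an external command could look like: a few unmeasured warmup runs, then timed runs, then summary statistics. This is not the rubykon code linked above; the names bench_command and summarize are made up for illustration, and the measured wall-clock time includes process startup.

```python
import statistics
import subprocess
import sys
import time


def bench_command(cmd, warmup=1, runs=5):
    """Run cmd `warmup` times unmeasured, then `runs` times measured.

    Returns a list of wall-clock durations in seconds. Each duration
    includes process startup, not just the work the command does.
    """
    for _ in range(warmup):
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)

    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Aggregate raw durations into a few basic statistics."""
    return {
        "runs": len(durations),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
        # stdev needs at least two samples
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical usage: the command from the original question
    cmd = [sys.executable, "-c", "8*9"]
    print(summarize(bench_command(cmd)))
```

For commands that run for a minute or so, the per-run overhead of spawning the process is small relative to the measurement; for very short commands it dominates.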
My use cases take relatively long times on the order of a minute, so the overhead is negligible for me.
benchee's warmup executions won't do you any good here
Why not?
hyperfine
It doesn’t support peak memory consumption.
PS: Will the memory of these forked processes be benchmarked correctly with benchee?
Ah.
time -v command on a Unix system (might be /usr/bin/time -v on Mac). Benchee also won't give you memory measurements here, as benchee measures the memory of the running Elixir/Erlang process and not of the operating-system-level process spawned by System.cmd.
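If shelling out to time is too bare-bones, one way to get the same number programmatically is to ask the OS for the child's resource usage when it exits. Below is a sketch in Python, assuming a Unix system; peak_rss_of is a hypothetical helper name, not part of benchee or any library. Note that ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.

```python
import os
import subprocess
import sys


def peak_rss_of(cmd):
    """Run cmd and return (exit_code, peak_rss) as reported by the OS.

    Unix only. peak_rss is ru_maxrss: kilobytes on Linux, bytes on macOS.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
    # wait4 reaps the child and returns its resource usage in one call
    _, status, rusage = os.wait4(proc.pid, 0)

    if os.WIFEXITED(status):
        exit_code = os.WEXITSTATUS(status)
    else:
        # Killed by a signal: mirror subprocess's negative-signal convention
        exit_code = -os.WTERMSIG(status)

    # Tell the Popen object the child has already been reaped
    proc.returncode = exit_code
    return exit_code, rusage.ru_maxrss


if __name__ == "__main__":
    # Hypothetical usage: a child that allocates roughly 50 MB
    code, rss = peak_rss_of([sys.executable, "-c", "x = bytearray(50 * 10**6)"])
    print(code, rss)
```

Calling this in a loop and aggregating the results gives repeated memory measurements without parsing the text output of time.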
I am not using a JITed language here, so that doesn't matter. But the inability to measure the memory of forked processes makes benchee unsuitable. Can't you add an option to measure those as well?
I am currently using time, but it is pretty bare-bones. I had to write the logic of running the measurements multiple times and aggregating the results myself.
Even in a non-JITed language, warmup matters for many scenarios. Erlang was non-JITed for the majority of benchee's lifetime. I still implemented warmup before even the first release (iirc).
Measuring the memory of an externally spawned process is way different from what benchee does. I don't think I'll ever add this, as benchee just isn't the tool for the job here. I wouldn't even know how to do it properly. Before we go there, there are many things to improve around the memory measurement benchee has (it only measures the one process you spawned, not other processes that process may spawn or other processes in the system, for instance).
Is it possible to benchmark a subprocess call like python -c "8*9"? I need to benchmark some Python code, and I could not find a good in-Python solution. I am wondering if I can use benchee. (I might want to benchmark some Julia code as well, so a language-agnostic benchmarking regime is beneficial for me in any case.)