[WIP] Extensible jobs - Githubissues

kbrock commented 6 years ago

Hi @evanphx

I find I often want to extend benchmark-ips to add number of objects allocated or number of queries performed and others.

There are a few memory alternatives like https://github.com/michaelherold/benchmark-memory (there was another, can't find it), but to be honest, it would be nice if there were just one base for this mostly common code. I was hoping to keep benchmark-ips as the base glue that would be able to run multiple types of benchmarks. (probably not your main goal)

I think the first 2 commits are no-brainers. Possibly the 3rd is as well. If you prefer, I can pull some of those out into a separate PR and go from there.

evanphx commented 6 years ago

This is great! Really no qualms about adding it. Do you have an memory one maybe we could add to the release too?

kbrock commented 6 years ago

@evanphx sweet - I put this together to determine if it was down the right alley.

The original code I used was more of a monkey patch / hack. I'll change it to work with this abstraction. That will also test out that it works with the abstraction as well.

I'll see what I can add. It isn't as extensive as the benchmark-memory but it allows a single set of examples to be run across a number of measurements - As I read that code, I'm realizing what a nice job @michaelherold did with that code.

michaelherold commented 6 years ago

Thanks for the compliment, @kbrock!

I've had on my project list for a long time (I spoke with Evan about it at RubyConf 2016, hah!) to make a wrapper gem that combines benchmark-ips and benchmark-memory into a whole. I even have a name for it: benchmark-time_and_space! 😆

The one big thing I would need for that is a change to the "hold" serialization format to be fully JSON-based instead of line-based.

Thinking out loud here - I like the idea of making benchmark-ips extensible and would be happy to make benchmark-memory easier to use too.

kbrock commented 6 years ago

I do wondering about the overall concept of hold.

You can run 2 kinds of benchmarks (probably more but...)

A. 1 file with a bunch of implementations of the same method. e.g.: fast-ruby B. 1 file with a bunch of implementations of different methods. e.g.: ruby bench

It seems that hold is for case B. But it becomes very manual. Improving a method and then running with 2 implementations seems to add quite a few steps. May be nice to run each version of the library from a CI.

Does it makes more sense to just use an api (i.e.: benchmark.fyi) for running across multiple implementations? This is essentially what the hold.csv file is. Right? common data across multiple runs. it can be a save file or a process that has an API.

We would need more blob/json based stats to store memory, queries, timings. We would need meta-data on the implementation that was run for this particular test. e.g.: ruby version, library version, time run. But besides that, it is very similar to benchmark.fyi

Think I'm getting ahead of myself though. And maybe I'm just reinventing ruby-bench.

evanphx commented 6 years ago

The hold system could definitely be better. It was originally design to help benchmark ruby implementation changes, that's why it saves and requires you to run it again.

kbrock commented 6 years ago

@evanphx Hmm. This is a great conversation. But I'm sorry I conflated these 2 conversation here. I'll open another issue or pr with some thoughts.

@michaelherold I would like to define the examples once, and then use them across both / other test types. I had thought this could be extensible through the Suite interface. But realized the Suite is more focused on what happens between tests, and this is more focused on what happens in an individual test.

michaelherold commented 6 years ago

@evanphx It does that job very well -- that's what I use it for, in fact. 😄