airspeed-velocity / asv

Airspeed Velocity: A simple Python benchmarking tool with web-based reporting
https://asv.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Comparative Benchmarking #478

Open MSeifert04 opened 7 years ago

MSeifert04 commented 7 years ago

I'm currently working on a benchmarking project where I want to benchmark different implementations of the same function. Writing the benchmarks isn't hard, but it's not so easy (impossible?) to display them as comparative benchmarks in one graph.

I know that it's possible to parametrize the benchmarks, but these get weird names (they use the full name of the function, and if the function is defined in the benchmark file it isn't displayed at all), and different implementations of the same functionality can of course differ (argument names, argument order, ...) in a way that makes parametrizing the test really ugly.

Would it be possible to either:

MSeifert04 commented 7 years ago

Sorry for closing and reopening this.

It is possible to get comparative benchmarks by combining params and param_names and then choosing the appropriate "x-axis" value in the graph.
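For context, a minimal sketch of that approach (the median-filter functions and sizes here are illustrative, not taken from the original gist): a dict maps a display name to a small wrapper that adapts each implementation's signature, and params/param_names expose the names to asv so either parameter can be chosen as the x-axis in the web UI.

```python
import numpy as np
from scipy.signal import medfilt
from scipy.ndimage import median_filter

# Map a display name to a wrapper that papers over the differing
# signatures, so the benchmark body can call every implementation
# the same way.
IMPLEMENTATIONS = {
    "signal.medfilt": lambda image: medfilt(image, kernel_size=3),
    "ndimage.median_filter": lambda image: median_filter(image, size=3),
}


class TimeMedianFilter:
    params = [sorted(IMPLEMENTATIONS), [64, 256]]
    param_names = ["implementation", "size"]

    def setup(self, implementation, size):
        self.image = np.random.rand(size, size)

    def time_filter(self, implementation, size):
        IMPLEMENTATIONS[implementation](self.image)
```

Selecting "implementation" as the x-axis in the asv web interface should then give the side-by-side view that "comparative benchmark" presumably refers to here.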

However, this suffers (at least on version 0.2) from several drawbacks:


Besides fixing the first point (or did I do something wrong?), would it be possible to have a specialized class for comparative benchmarks? That class would be displayed like a parametrized benchmark, but instead of using the parametrization as parameters it would use the methods.

I don't know if that's within the scope of asv; if not, feel free to ignore the enhancement request.

pv commented 7 years ago

The "plotting options" thing is a bug (probably best discussed in a separate issue ticket).

I'm not completely sure from the above description what the problem is with using parameterized benchmarks for what you are trying to do. Could you post some example code to gist.github.com, showing what you are currently doing and what you'd like to do, to clarify this?

If you are trying to have each method in a benchmark class define a parameter value and an associated benchmark, I think that can be achieved generically with a class decorator or a metaclass. I'm not sure it would be a good idea to include such a wrapper in asv itself, as it introduces multiple ways to declare the same thing.
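As a rough illustration of that suggestion (the comparative decorator and the impl_ prefix are made up for this sketch and are not part of asv): the decorator collects the impl_* methods, exposes their names as parameter values, and generates a single parameterized time_ method that dispatches to them.

```python
import numpy as np
from scipy.signal import medfilt
from scipy.ndimage import median_filter


def comparative(cls):
    # Gather every impl_* method; each one becomes a single parameter
    # value of one generated, parameterized time_ benchmark.
    impls = {name[len("impl_"):]: func
             for name, func in vars(cls).items()
             if name.startswith("impl_")}
    cls.params = sorted(impls)
    cls.param_names = ["implementation"]

    def time_implementation(self, implementation):
        impls[implementation](self)

    cls.time_implementation = time_implementation
    return cls


@comparative
class MedianFilters:
    def setup(self, implementation):
        self.image = np.random.rand(256, 256)

    def impl_medfilt(self):
        medfilt(self.image, kernel_size=3)

    def impl_median_filter(self):
        median_filter(self.image, size=3)
```

Since asv discovers benchmarks by method-name prefix, the impl_* methods themselves should be ignored and only the generated time_implementation would show up, displayed like any other parameterized benchmark.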

MSeifert04 commented 7 years ago

@pv I haven't used class decorators/metaclasses to that extent yet, so I don't know how I would go about implementing this.

But currently I have an implementation like this: https://gist.github.com/MSeifert04/d2d4013093362c9e71c60b16ef53e355.

For example, I have different functions, say scipy.signal.medfilt, scipy.signal.medfilt2d, scipy.ndimage.filters.median_filter, and another implementation from another package. The function signatures differ slightly, so I either have to put a lot of if ... elif ... else ... branches inside each test or map the different calls to different arguments (as in the gist).

I hope it's not too much of a mess, and I also think this is not really general enough to be useful for lots of people. 😄

pv commented 7 years ago

The behavior of "plot settings" is a bug (probably best filed as a separate issue). I'm not sure I understand from this description what you mean by a "comparative benchmark", if it is different from a parameterized benchmark. Giving example code showing how you did it now could clarify what problem you are facing.

MSeifert04 commented 7 years ago

@pv I don't understand. Did you unintentionally copy portions of your last message? Or was it intentional, so that I should open an issue and ... include more code? Or include the actual code I'm using?

pv commented 7 years ago

No, GitHub appears to be having some technical issues with their email interface: that's the message I sent yesterday, which didn't appear, so I wrote a second one via the web interface. Apparently the emails were only delayed, not lost.

MSeifert04 commented 7 years ago

ok 😄