evanphx / benchmark-ips

Provides iteration per second benchmarking for Ruby
MIT License
1.72k stars 97 forks source link

More understandable ips comparison #49

Closed bquorning closed 8 years ago

bquorning commented 9 years ago

When two pieces of code run the same number of iterations in the same amount of time, that shouldn't be reported as one being "1.00x slower" than the other. By comparing all slower results against the fastest result, and not the other way around, we get some more realistic reporting.

evanphx commented 9 years ago

Good call on it not being clear. I worry about changing the number displayed there only because it might confuse existing users, which seem to have understood my intention despite my bad wording. Perhaps we leave the number the same and just remove he word "slower"?

schneems commented 9 years ago

:+1: I always do (faster - slower) / faster and never use the x.compare! feature.

If you're worried about backwards compat, we could preserve an option call it legacy_compare!, release a minor version with deprecations around compare! telling people it will change and then a major version bump with the actual change.

evanphx commented 9 years ago

@schneems ok! I'm happy to display more info as well, to try and make it clearer. So we could show the current value as well as a subtracted one. I want the feature to be clear, so whatever we have to do to make that possible, we should do it.

schneems commented 9 years ago

How about something like this?

          addition2: 24011974.8 i/s - Fastest
          addition3: 23958619.8 i/s - 0.22 % slower (1.00x)
addition-test-long-label:  5014756.0 i/s - 79.06 % slower (4.79x)
            addition:  4955278.9 i/s - 79.317 % slower (4.85x)
bquorning commented 9 years ago

I was considering doing the comparisons against the slowest run, and showing how many percent faster the other runs were:

          addition2: 24011974.8 i/s - 384.6 % faster
          addition3: 23958619.8 i/s - 383.5 % faster
addition-test-long-label:  5014756.0 i/s - 1.2 % faster
            addition:  4955278.9 i/s - Slowest

It seems to me that when doing performance work, you want to know how much faster you can make a piece of code. Not necessarily how much slower the old code was. Correct me if I’m wrong.

schneems commented 9 years ago

I like that, I'm usually looking for speed improvements. Not sure about others though.

kbrock commented 9 years ago

Thanks.

Maybe related or maybe not.

Sometimes, 2 different methods are about the same speed. standard deviation may be able to tell us this.

What algorithm can we use to have them say "basically the same"?

evanphx commented 9 years ago

Use the stddev. If the difference is within the the stddev, they are the same speed.

On Tue, Aug 11, 2015 at 1:30 PM, Keenan Brock notifications@github.com wrote:

Thanks. Maybe related or maybe not. Sometimes, 2 different methods are about the same speed. standard deviation may be able to tell us this.

What algorithm can we use to have them say "basically the same"?

Reply to this email directly or view it on GitHub: https://github.com/evanphx/benchmark-ips/pull/49#issuecomment-130057310

evanphx commented 9 years ago

I'd prefer to keep it showing the fastest and then how much slower the others are. Change that would be definitely confusing to existing users.

kbrock commented 9 years ago

I had looked into including the std dev in the calculations. So I could see if some were the same. But had trouble making the information clear when showing a range.

Almost feel like it could be: ((slower - slower.stdev) - (fast + fast.stddev)) / (fast + fast.stddev). And then swap the signs for the stddev padding.

bquorning commented 9 years ago

I would like to move the discussion back to the main topic again:

The current x = best.ips.to_f / report.ips.to_f calculation means “The fastest time was x times faster than this one (except, you should subtract 1x)”. I understand that changing the existing reporting might confuse existing users, but the current implementation apparently does also confuse existing users.

I think that it would be most confusing for existing users to only slightly change the existing output. Keeping the _.__x slower output, but giving the number a different meaning – that would be confusing. (That is what was originally proposed by this PR, and I was wrong.)

Less confusing would be to completely change the output. I think changing it to either of the proposed _.__ % slower or _.__ % faster would be less confusing for existing users.

Since it has already been mentioned that such changes might cause a major version bump: I’m wondering if there may be a need for custom output formatters? Should it be possible to choose between the “% faster than baseline” or “% slower than baseline” formatter by using a command line flag?

schneems commented 9 years ago

We could use x.compare!(:faster) and have it default to x.compare!(:slower) for a flag? Ultimately @evanphx is the gatekeeper here, I'm just throwing ideas at the wall.

evanphx commented 9 years ago

Take a look at https://github.com/evanphx/benchmark-ips/commit/7961c61fbee2ebd313c6d9a5ff1d1698db2538a9 and give me your feedback please!

bquorning commented 8 years ago

Why was this closed?

evanphx commented 8 years ago

No one responded to my feedback request 8 months ago.

evanphx commented 8 years ago

Or, rather, you did but the change seemed fine.