PragTob opened this issue 6 years ago
I’m curious - what do you think this would tell us about our code? Will we be able to clearly see from the results if our function is O(n) vs O(n^2) or something like that?
The only thing I could see this for is asserting that a function is O(1) time or O(1) memory, since that’s the only property that should be measurable for any random input of any size. But you could also assert that with two inputs that differ by an order of magnitude, without the need for stream_data.
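For example, Benchee's existing `inputs` support already covers that case: run the same function with two inputs an order of magnitude apart and compare the times (sketch below; `MyModule.lookup` is just a stand-in). If the two measurements come out roughly equal, that supports an O(1) claim.

```elixir
Benchee.run(
  %{"lookup" => fn input -> MyModule.lookup(input) end},
  inputs: %{
    "1k elements"  => Enum.to_list(1..1_000),
    "10k elements" => Enum.to_list(1..10_000)
  }
)
```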
I haven't looked at it from the perspective of O-notations. That might be doable though.
I've looked at it more from the perspective of:
Also, in case anything hits a cache on any layer (even at the CPU level...), more varied inputs ensure more accurate results :)
Ok, so now I see how this might work - or at least I have some idea of how I think this could work. I was just totally missing it at first.
We could run each randomly generated input a certain number of times (configurable by the user), and after that number of runs we'd move on to the next input. Then, after the total benchmark time is up (also configurable by the user as it is now), we'd show the n fastest and slowest inputs and the stats for those inputs, and maybe the median as well? Is that along the lines of what you're thinking?
I just totally wasn't thinking about it in those terms at first, so the whole concept escaped me! I can certainly see value in that, although it does sound like a somewhat significant amount of work.
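Purely as illustration, the first idea might look something like this in configuration (a sketch only - `runs_per_input` and `report_inputs` are invented option names, nothing like them exists in Benchee today):

```elixir
Benchee.run(
  %{
    "some important business code" => fn num -> Business.make_money(num) end
  },
  inputs: %{"random integers" => StreamData.integer()},
  # hypothetical: measure each generated input this many times before moving on
  runs_per_input: 50,
  # hypothetical: afterwards report stats for the n fastest/slowest inputs and the median
  report_inputs: [fastest: 3, slowest: 3, median: true],
  time: 10
)
```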
The second thing though - getting a performance profile across a wide array of inputs - can be done now. Take the following example:
```elixir
input_stream = StreamData.integer()
# draw a single value from the generator
get_num = fn -> Stream.take(input_stream, 1) |> Enum.map(& &1) |> hd() end

Benchee.run(%{
  "some important business code" => fn -> Business.make_money(get_num.()) end
}, time: 10)
```
One of the awesome composability benefits of having benchee's API be "just a function" :tada:
Yeah I was thinking in both ways :) And I'm not quite sure which one I like better.
E.g. cases:
Your example could be improved I think - right now `get_num` has an impact on the measured run time. We could stick it into a `before_each` hook and feed its result to the function for maximum benefit. That's sort of what I imagine the implementation would look like (probably as a separate step though).
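Roughly like this, perhaps (a sketch; it leans on Benchee's documented hook behaviour where the return value of `before_each` is passed to the benchmarking function as its argument, so generation stays outside the measured time):

```elixir
input_stream = StreamData.integer()

Benchee.run(
  %{
    # the function now receives the freshly generated number as its argument
    "some important business code" => fn num -> Business.make_money(num) end
  },
  # runs before every measurement; its return value becomes the function's input
  before_each: fn _ -> input_stream |> Enum.take(1) |> hd() end,
  time: 10
)
```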
(note to self: property based benchmarking is probably a horrible name, as we're not checking properties of the benchmarks - it's more like stream_data based benchmarking, random data based benchmarking, fuzzing benchmarking or something like that)
This had been in my mind for a long time but I guess it's worth getting out there:
We already have inputs. Property based testing is great and support for it in the Elixir ecosystem is growing a lot. Why not just put in a generator as an input and let it generate the inputs? The statistics will become more interesting!
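For illustration, usage could read roughly like this (hypothetical - Benchee does not currently do anything special with a StreamData generator passed as an input):

```elixir
# hypothetical: Benchee itself would draw fresh values from the generator
Benchee.run(
  %{"flat_map" => fn list -> Enum.flat_map(list, &[&1, &1 * 2]) end},
  inputs: %{"random integer lists" => StreamData.list_of(StreamData.integer())},
  time: 10
)
```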
As generators usually generate more complex data as they go on, it might be a bit weird: the first recorded run times will be fast and the later recorded run times (or memory measurements) will be higher. We might even need a different type of generator - only time will tell.
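One existing knob that might help here: `StreamData.resize/2` pins a generator's size, so the generated values stay in a comparable range for the whole run (just an idea, not necessarily the answer):

```elixir
# keep the generation size constant instead of letting it grow over time
fixed_size_lists = StreamData.resize(StreamData.list_of(StreamData.integer()), 20)
```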
We need to save and show the seed to ensure reproducibility.
Once we have that there are a lot of optional things we could do:
This is definitely a >= 1.x or 2.x feature even ;)