Open tmbb opened 6 years ago
:wave:
Sorry, I must have somehow missed this issue :disappointed:
Sounds like a good idea. I'd like to just specify a "special" input that includes information about the size or a way to get the size. Specifying the size, seems to be hard because any kind of input data structure is a valid input for benchmarking inputs - who am I to judge?
The best think that I could come up with right now is an "input quantifier" function (name pending) that receives the inputs and spits out a number that would be used to compare and graph it. Like in your example it'd be like input_quantifier: fn file_name -> count_lines(file) end
or something.
That's a big addition though - not just for graphing but we could even try to make a guess if a function is linear or not given 3 or more input data points. This is probably harder than I imagine right now though :grin:
Definitely past 1.0 (which should land soon) - I'll keep it in mid though.
Thanks for the excellent idea :tada:
This seems like something that could work with the proper support for using property testing generators with Benchee. For any input that has a size we could generate inputs of increasing size and use that size to graph against the result of the benchmark.
There should be a way of graphing the execution time as a function of the size of the input, so that one can demonstrate the time complexity of the algorithm empirically in graphs like this:
The size of the input can't be determined automatically, but it could given by the user in the form of a tuple:
{input_size, input_value}
or some similar API. I understand that this might require a new top-level Benchee function, and that might be a change too big to implement.But even if this change isn't possible, just plotting the execution time of all the inputs in the same graph would be a big improvement.
In my concrete example, I want to show that a an implementation of a programming language lexer (let's say, implementation 1) has complexity roughly linear on the size of the file while the alternative impelmentation (say, implementation 2) has supralinear complexity. I can see how, as the input size rows, implementation 2 gets slower and slower than implementation 1, but I can't compare the execution time of the implementation 1 with itself as the input grows.