haskell-perf / graphs

BSD 3-Clause "New" or "Revised" License
39 stars 2 forks source link

Select right arguments to test functions #2

Open nobrakal opened 6 years ago

nobrakal commented 6 years ago

Currently, a function like hasEdge is tested with arguments from take 2 edgesNotInGraph. But this take the two first edges not in the graphs, likely (0,0) and (0,1). This is not a good thing, we want to test for edges everywhere and the graph.

How can we select them to have a representative population, and how many ?

snowleopard commented 6 years ago

I think you shouldn't automate this. A benchmark for hasEdge should include separate lists of existing and non-existing edges, that should be specified manually and purposefully.

nobrakal commented 6 years ago

Ok, I agree with you. About a separate list, I think it can be generated from a pattern per graph kind, using the inputs field from the Suite data. It will certainly require that inputs become of type

GenericGraph -> [(Name,i)]

instead of

Edges -> [(Name,i)]

On the same topic, is benchmarking hasEdge over one edge we find representative is a good idea ? Maybe it can be good to make an average from a little list a of benchmark concerning similar edges. It will allow to answer questions like "How many time will take searching for an edge I know it can only be in beginning of this path" ?

snowleopard commented 6 years ago

@nobrakal If you look at https://github.com/snowleopard/alga/issues/14#issuecomment-370242263, you'll see that I originally suggested to make vertex/edge access patterns part of the graph descriptions, not algorithms.

This means you don't need inputs in Suite, but instead a few new fields in the GenericGraph record.

nobrakal commented 6 years ago

Oh sorry, I had forgotten that.

I am not happy with it, because as there is no standard pattern for access and as you said, they had to be choose purposefully by the one who benchmark, so I don't want to include our patterns in the GenericGraph data.

I think inputs are a good thing, because users can write something like

edgesNotInGraph :: GenericGraph -> [Edges]
edgesNotInGraph (GenericGraph name edges) = case name of
  "path" -> [...]
  "circuit" -> [...]

And thus test only what they want per graph

snowleopard commented 6 years ago

I think inputs are a good thing, because users can write something like

Ouch!

Doing a quadratic amount of work (number of benchmark graphs times number of algorithms) will not scale in many ways: it is hard to write, it is hard to explain, it is hard to understand. I don't think this is a viable option. I advise to think about how you can decompose benchmarking into orthogonal concerns.

nobrakal commented 6 years ago

It looks orthogonal to me... An Algorithm is needing a graph and arguments to produce something, so we need to map each algorithm to each graph to each arguments to bench all possible cases. Arguments can be produced from the graph, and thus are not orthogonal with graphs, so my idea is to provide this relation between graphs and arguments.

I can't see how you can do less than

number of benchmark graphs times number of algorithms

But I have certainly missed something if you are thinking this is not even viable ^^ !