JuliaCI / BenchmarkTools.jl

A benchmarking framework for the Julia language

Dict -> OrderedDict in BenchmarkGroup? #52

Closed CarloLucibello closed 1 year ago

CarloLucibello commented 7 years ago

I don't know if anyone else feels the same, but I find it a little annoying that when I display the results of some benchmarks, they are not shown in insertion order. For example, the lack of ordering below makes it visually difficult for me to compare the performance of different graph types in Erdos:

julia> @show res["generators"];
res["generators"] = 16-element BenchmarkTools.BenchmarkGroup:
  tags: []
  ("rrg","Net(500, 750) with [] graph, [] vertex, [] edge properties.") => Trial(632.408 μs)
  ("rrg","Graph{Int64}(100, 150)") => Trial(121.221 μs)
  ("rrg","Net(100, 150) with [] graph, [] vertex, [] edge properties.") => Trial(115.033 μs)
  ("rrg","Graph{Int64}(500, 750)") => Trial(677.647 μs)
  ("complete","Net(100, 4950) with [] graph, [] vertex, [] edge properties.") => Trial(896.223 μs)
  ("complete","DiGraph{Int64}(100, 9900)") => Trial(617.122 μs)
  ("complete","DiNet(20, 380) with [] graph, [] vertex, [] edge properties.") => Trial(42.104 μs)
  ("erdos","Graph{Int64}(500, 1500)") => Trial(405.240 μs)
  ("erdos","Net(100, 300) with [] graph, [] vertex, [] edge properties.") => Trial(71.516 μs)
  ("complete","DiGraph{Int64}(20, 380)") => Trial(23.721 μs)
  ("complete","Net(20, 190) with [] graph, [] vertex, [] edge properties.") => Trial(20.845 μs)
  ("complete","Graph{Int64}(100, 4950)") => Trial(159.900 μs)
  ("complete","DiNet(100, 9900) with [] graph, [] vertex, [] edge properties.") => Trial(1.861 ms)
  ("erdos","Net(500, 1500) with [] graph, [] vertex, [] edge properties.") => Trial(297.167 μs)
  ("complete","Graph{Int64}(20, 190)") => Trial(7.340 μs)
  ("erdos","Graph{Int64}(100, 300)") => Trial(88.091 μs)

Would it be reasonable, and not too disruptive, to use an OrderedDict instead of a Dict in the BenchmarkGroup type?

Yes, I could write down some more appropriate comparison methods, but asking doesn't hurt :)

Cheers, Carlo

jrevels commented 7 years ago

@shashi has also requested this at one point, and I agree it would be nice.

The main barrier is the need to ensure BenchmarkTools is as forward-compatible as possible for testing performance changes to Julia Base, which means being very careful about what BenchmarkTools takes on as a dependency. If we did end up making this change, rolling our own OrderedDict might be better than pulling in a larger dependency like DataStructures.
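For illustration only, a hand-rolled insertion-ordered dict could be quite small; here is a sketch, not actual BenchmarkTools code, where MiniOrderedDict and all of its methods are hypothetical names:

# Hypothetical sketch of a tiny insertion-ordered dict; not BenchmarkTools code.
struct MiniOrderedDict{K,V}
    keys::Vector{K}   # remembers insertion order
    data::Dict{K,V}   # actual key => value storage
end

MiniOrderedDict{K,V}() where {K,V} = MiniOrderedDict{K,V}(K[], Dict{K,V}())

function Base.setindex!(d::MiniOrderedDict, v, k)
    haskey(d.data, k) || push!(d.keys, k)   # record keys the first time they appear
    d.data[k] = v
    return d
end

Base.getindex(d::MiniOrderedDict, k) = d.data[k]
Base.haskey(d::MiniOrderedDict, k) = haskey(d.data, k)
Base.length(d::MiniOrderedDict) = length(d.keys)

# Iterate (and therefore display) entries in insertion order.
Base.iterate(d::MiniOrderedDict, i = 1) =
    i > length(d.keys) ? nothing : (d.keys[i] => d.data[d.keys[i]], i + 1)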

At that point the question becomes: Is ensuring that display order matches insertion order worth the development burden of maintaining an OrderedDict implementation (or otherwise adopting a new dependency)? Personally, my answer leans towards "no", but if the wider community keeps expressing a desire for this feature then I might be swayed.

In a lot of cases, including the example you gave, it seems like more explicit organization via subgrouping can help solve these kinds of problems. For example, it looks like "erdos", "complete", and "rrg" could all be their own subgroups.
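Roughly, that kind of organization could look like this with the existing BenchmarkGroup API (build_graph is a placeholder for whatever generator call is actually being timed):

using BenchmarkTools

suite = BenchmarkGroup()
suite["generators"] = BenchmarkGroup()

# One subgroup per generator instead of one flat group keyed by tuples.
for gen in ("erdos", "complete", "rrg")
    suite["generators"][gen] = BenchmarkGroup()
end

# build_graph is a placeholder for the actual generator call being timed.
suite["generators"]["erdos"]["Graph{Int64}(100, 300)"] =
    @benchmarkable build_graph("erdos", 100, 300)

results = run(suite; verbose = false)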

CarloLucibello commented 7 years ago

> In a lot of cases, including the example you gave, it seems like more explicit organization via subgrouping can help solve these kinds of problems. For example, it looks like "erdos", "complete", and "rrg" could all be their own subgroups.

I haven't tried again lately, but as far as I remember subgroups are not ideal either, because they force you to go one step down in the hierarchy to see the benchmark times displayed. Maybe something can be done in this regard?

I agree that the solutions could turn out to be more problematic than this very minor problem, so we can leave the issue open just to gauge interest in resolving it.

jrevels commented 7 years ago

> because they force you to go one step down in the hierarchy to see the benchmark times displayed.

The default behavior is to only show the first level, but you can use showall to display the whole thing.
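For example, on the Julia 0.6-era versions this thread was written against (showall was part of Base back then; it no longer exists on current Julia releases):

julia> res                  # default display: only the first level of the group

julia> showall(res)         # recursively prints every nesting level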

garrison commented 5 years ago

+1 to this. I am benchmarking different system sizes, and it is a bit annoying to make sense of this:

          "EV" => 6-element BenchmarkTools.BenchmarkGroup:
              tags: []
              "L = 8" => Trial(23.448 μs)
              "L = 16" => Trial(46.961 μs)
              "L = 10" => Trial(29.194 μs)
              "L = 14" => Trial(40.954 μs)
              "L = 6" => Trial(17.881 μs)
              "L = 12" => Trial(35.076 μs)

garrison commented 4 years ago

I'm finding myself back here with the same wish I had back in September. I believe this would be a really useful feature. Now that Julia has reached 1.0, I don't think the maintenance overhead of having an OrderedDict in this package would be a huge burden. But #60 managed to solve this problem without even introducing a package dependency (thanks @shashi), so such drastic measures don't even seem necessary. What needs to happen to get this merged (aside from resolving conflicts and updating to Pkg3)? Failing that, even a recipe that outputs all benchmarks in lexicographic order would be most welcome.
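For what it's worth, such a recipe could be as simple as sorting the keys at display time; a sketch, where show_sorted is just an ad-hoc helper name:

using BenchmarkTools

# Ad-hoc helper: print one level of a BenchmarkGroup with keys in lexicographic order.
# (Swap `by` for a numeric-aware key extractor if "L = 10" sorting before "L = 8" bothers you.)
function show_sorted(group::BenchmarkGroup)
    for k in sort!(collect(keys(group)); by = string)
        println(rpad(string(k), 30), " => ", group[k])
    end
end

show_sorted(res["EV"])   # `res` being a results group like the one above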

gdalle commented 1 year ago

I think this would be more easily fixed by pretty-printing BenchmarkGroup results / exporting them to markdown than by changing the underlying data structure?
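As a rough sketch of that direction (not an actual implementation), one could walk the tree with the existing leaves iterator and emit a key-sorted markdown table; to_markdown is just an ad-hoc name here, and reporting the minimum-time estimate in nanoseconds is an arbitrary choice:

using BenchmarkTools
using Printf

# Ad-hoc sketch: walk every (key path, Trial) pair and emit a key-sorted markdown table.
function to_markdown(io::IO, group::BenchmarkGroup)
    rows = [(join(path, " / "), time(minimum(trial))) for (path, trial) in leaves(group)]
    sort!(rows; by = first)
    println(io, "| benchmark | min time (ns) |")
    println(io, "|---|---|")
    for (name, t) in rows
        @printf(io, "| %s | %.1f |\n", name, t)
    end
end

to_markdown(stdout, results)   # `results` being any BenchmarkGroup of Trials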

gdalle commented 1 year ago

See #171