TuringLang / papers

A Bayesian zoo

Benchmark infrastructure is broken #1

Open xukai92 opened 6 years ago

xukai92 commented 6 years ago

It's probably better for us to redesign it.

xukai92 commented 6 years ago

Requirements for the new benchmark infrastructure (benchmarks/benchmarkhelper.jl)

xukai92 commented 6 years ago

The mLab point can be changed if we have a better way to automatically generate benchmark results that can be tracked between different commits.

mohamed82008 commented 6 years ago

This may be relevant: https://github.com/mohamed82008/ComputExp.jl

xukai92 commented 6 years ago

Can you give me a brief intro to what that repo does?

mohamed82008 commented 6 years ago

It is a database approach to managing computational experiments with a focus on reproducibility. An example is given here: https://github.com/mohamed82008/ComputExp.jl/blob/master/test/TestComputationalExperiments.ipynb. This was a pet project a few months ago, but I never actually completed it.

So we can define programming languages, with versions and specific commits. We can define problems, algorithms and implementations of algorithms. An implementation is associated with a package dependency (can be extended to more), a specific commit of that package, a script file and a function name to be called when running the implementation. Then there are problems with problem features/parameters, and there are experiments that compare implementations on problems. Each experiment has many runs, where the run is for a specific implementation and a specific problem setting. Then finally each run has some results.

So once the database is generated, one can save it, load it again and do some stats or plotting on the results. At least that's the intention, but I haven't gotten around to making it fully functional yet.
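The data model described above can be sketched roughly as follows. This is not ComputExp.jl's actual API (that package is written in Julia). It is a minimal Python illustration, and every class, field, and method name here is an assumption made up for the example. A JSON file stands in for the database.

```python
import json
from dataclasses import dataclass, field, asdict
from statistics import mean


@dataclass
class Implementation:
    # Hypothetical fields mirroring the description in the comment above.
    algorithm: str   # algorithm being implemented
    package: str     # package dependency
    commit: str      # specific commit of that package
    script: str      # script file holding the entry point
    function: str    # function to call when running the implementation


@dataclass
class Problem:
    name: str
    parameters: dict = field(default_factory=dict)  # problem features/parameters


@dataclass
class Run:
    # One run = one implementation on one problem setting, plus its results.
    implementation: Implementation
    problem: Problem
    results: dict = field(default_factory=dict)


@dataclass
class Experiment:
    name: str
    runs: list = field(default_factory=list)

    def save(self, path):
        # asdict() recursively converts nested dataclasses to plain dicts.
        with open(path, "w") as f:
            json.dump(asdict(self), f, indent=2)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            raw = json.load(f)
        runs = [
            Run(Implementation(**r["implementation"]),
                Problem(**r["problem"]),
                r["results"])
            for r in raw["runs"]
        ]
        return cls(raw["name"], runs)

    def mean_result(self, key):
        # Example statistic: average one result field over all runs.
        return mean(run.results[key] for run in self.runs)
```

Saving and reloading an `Experiment` should give back an equal object, after which `mean_result("seconds")` (with `"seconds"` being whatever result key the runs happen to record) is one example of the stats step.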

xukai92 commented 6 years ago

This is interesting, but I guess it goes far beyond what we need here. The reason we keep the automatically generated benchmark results is to track performance changes between commits. We probably want something pretty lightweight for this purpose.
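As a rough idea of what "lightweight" could mean here, one option is a single JSON file keyed by commit hash. The sketch below is written in Python purely for illustration and is not the repo's tooling. The function names, the file layout, and the 10% regression tolerance are all assumptions, and the commit hash is expected to be supplied by the caller (for example from `git rev-parse HEAD`).

```python
import json
import time
from pathlib import Path


def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def record(path, commit, name, seconds):
    """Store one timing under results[commit][name] in a JSON file."""
    path = Path(path)
    results = json.loads(path.read_text()) if path.exists() else {}
    results.setdefault(commit, {})[name] = seconds
    path.write_text(json.dumps(results, indent=2, sort_keys=True))
    return results


def compare(results, old_commit, new_commit, tolerance=0.10):
    """Map each shared benchmark to (ratio, regressed), with ratio = new / old."""
    old, new = results[old_commit], results[new_commit]
    report = {}
    for name in sorted(old.keys() & new.keys()):
        ratio = new[name] / old[name]
        # Flag a regression when the new time exceeds the old by more than `tolerance`.
        report[name] = (ratio, ratio > 1.0 + tolerance)
    return report
```

A CI job could call `record` once per benchmark on each commit and then `compare` against the previous commit's entry. Whether wall-clock timing on shared CI hardware is stable enough for a fixed tolerance is an open question for this approach.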