Closed yebai closed 2 years ago
@xukai92 are these models available somewhere? Perhaps we can add them to https://github.com/TuringLang/Turing.jl/tree/master/benchmarks
It seems they are available in an old branch here: https://github.com/TuringLang/TuringExamples/tree/old-models/old-models.
For the benchmark suite, can we add the Stan version as well?
> For the benchmark suite, can we add the Stan version as well?

I think so. GitHub Actions is quite generous with build time compared to Travis, so we can run these benchmarks all together and then produce a table on the fly.
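A minimal sketch of how such a table could be produced on the fly, assuming each benchmark job leaves behind a list of repeated timings per backend. The function name `format_table` and the input layout are illustrative assumptions, not taken from the TuringExamples scripts, and the numbers in `example` are made up.

```python
from statistics import mean, stdev

def format_table(timings, backends=("Stan", "Turing")):
    """Render {model: {backend: [run times]}} as a Markdown table of mean +/- std."""
    lines = ["| Model | " + " | ".join(backends) + " |",
             "|---|" + "---|" * len(backends)]
    for model, runs in timings.items():
        cells = []
        for backend in backends:
            samples = runs[backend]
            # The sample standard deviation needs at least two runs.
            sd = stdev(samples) if len(samples) > 1 else 0.0
            cells.append(f"{mean(samples):.3f} +/- {sd:.3f}")
        lines.append(f"| {model} | " + " | ".join(cells) + " |")
    return "\n".join(lines)

# Hypothetical run times, for illustration only (not measured results).
example = {"Toy model": {"Stan": [1.0, 2.0, 3.0], "Turing": [2.0, 2.0, 2.0]}}
print(format_table(example))
```

Keeping the formatting in one pure function means a CI job only has to collect the raw timings and write the returned string to a file or job summary.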
Sounds good. I will take a look after finishing the remaining issues for AABI in AHMC.
Will be fixed via https://github.com/TuringLang/TuringExamples/pull/22
Here is a new table we can use
| Model | Stan | Turing |
|---|---|---|
| Gaussian with Unknown Parameters | 0.342 +/- 0.015 | 2.211 +/- 0.061 |
| Hierarchical Poisson | 0.134 +/- 0.068 | 0.325 +/- 0.013 |
| High Dimensional Gaussian | 11.609 +/- 0.306 | 9.766 +/- 0.222 |
| Semi-supervised HMM | 5.033 +/- 0.058 | 463.213 +/- 26.045 |
| LDA | 43.888 +/- 0.504 | 378.762 +/- 7.910 |
| Logistic Regression | 56.15 +/- 2.274 | 3.942 +/- 1.331 |
| Naive Bayes | 13.677 +/- 0.142 | 6.848 +/- 0.144 |
| Stochastic Volatility | 0.918 +/- 0.014 | 75.026 +/- 30.579 |
What's the best place to host it? Not sure if we still want it on the wiki page.
Let's make a nice table and put it on the front page of turing.ml, with a link to the script that reproduces all the numbers.
Slightly improved the table. Any other changes to make?
cc @trappmartin and @cpfiffer, who might have ideas/suggestions regarding how to format and publish this benchmarking result on the front page.
here is an example for Julia's benchmarking page: https://julialang.org/benchmarks/
Thanks for the pointer. I can make the visualization. I will also improve the table a bit more; I've got an idea.
It's a bit hard to make the Markdown table look nice because whitespace is ignored. Plain text actually looks nice.
PPL                               Turing            Stan
Model
10,000D Gaussian           9.766 ± 0.222  11.609 ± 0.306
Gaussian Unknown           2.211 ± 0.061   0.342 ± 0.015
Hierarchical Poisson       0.325 ± 0.013   0.134 ± 0.068
LDA                      378.762 ± 7.910  43.888 ± 0.504
Logistic Regression        3.942 ± 1.331   56.15 ± 2.274
Naive Bayes                6.848 ± 0.144  13.677 ± 0.142
Semi-Supervised HMM     463.213 ± 26.045   5.033 ± 0.058
Stochastic Volatility    75.026 ± 30.579   0.918 ± 0.014
UPDATES
Turing should probably be the first column, and we should order them by which models Turing performs better in.
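One way to get that ordering is to sort by the Turing-to-Stan ratio. The sketch below uses the mean values from the table in this thread and assumes lower is better (the thread does not state the unit, but the values read as run times). The variable names are illustrative.

```python
# (model, Turing mean, Stan mean), copied from the table in this thread.
results = [
    ("10,000D Gaussian", 9.766, 11.609),
    ("Gaussian Unknown", 2.211, 0.342),
    ("Hierarchical Poisson", 0.325, 0.134),
    ("LDA", 378.762, 43.888),
    ("Logistic Regression", 3.942, 56.15),
    ("Naive Bayes", 6.848, 13.677),
    ("Semi-Supervised HMM", 463.213, 5.033),
    ("Stochastic Volatility", 75.026, 0.918),
]

# A ratio below 1 means Turing's number is the smaller of the two.
ranked = sorted(results, key=lambda row: row[1] / row[2])

for model, turing, stan in ranked:
    print(f"{model:<22} {turing:>9.3f} {stan:>9.3f} {turing / stan:>8.2f}")
```

Sorting by ratio rather than by absolute difference keeps small and large models comparable, which matches the log-scale plot discussed later in the thread.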
Also made a plot
Y-axis is in log scale
Maybe you could add error bars of the standard deviation to the plot?
Would it be possible to add some benchmarks that evaluate how Stan and Turing perform with an increasing number of observations? Basically a line plot with the number of observations on the x-axis.

> Maybe you could add error bars of the standard deviation to the plot?
Sure
> Would it be possible to add some benchmarks that evaluate how Stan and Turing perform with an increasing number of observations? Basically a line plot with the number of observations on the x-axis.

Sure. But let's first improve the models where we are slow. Otherwise they are hard to benchmark (inference time is too long).
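A rough sketch of what a scaling benchmark harness could look like, assuming the sampler call can be wrapped in a function of the number of observations. `time_scaling` and `dummy_inference` are hypothetical names, and the workload here is a stand-in, not a Stan or Turing call.

```python
import time
from statistics import mean, stdev

def time_scaling(run, sizes, repeats=3):
    """Time run(n) `repeats` times for each n; return (n, mean, std) tuples."""
    rows = []
    for n in sizes:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(n)
            samples.append(time.perf_counter() - start)
        # The sample standard deviation needs at least two repeats.
        sd = stdev(samples) if repeats > 1 else 0.0
        rows.append((n, mean(samples), sd))
    return rows

# Stand-in workload: a real harness would call the Stan or Turing sampler here.
def dummy_inference(n):
    return sum(i * i for i in range(n))

for n, avg, sd in time_scaling(dummy_inference, [1_000, 10_000, 100_000]):
    print(f"{n:>7} observations: {avg:.6f} +/- {sd:.6f} s")
```

The returned tuples map directly onto a line plot: `n` on the x-axis, the mean on the y-axis, and the standard deviation as the error bar.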
I've copied and pasted the current table and figure to the wiki.
The benchmark numbers on the wiki are seriously out of date and probably give a misleading picture of Turing's performance. It would be better to update them using the current releases.
https://github.com/TuringLang/Turing.jl/wiki