szilard / benchm-ml

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
MIT License

Upgrade to VW v8.0 #23

Closed — trufanov-nok closed this 8 years ago

trufanov-nok commented 8 years ago

It seems that VW 7.10 is used in the current tests. It would be great if you could update it to v8.0: https://github.com/JohnLangford/vowpal_wabbit/archive/8.0.zip. I don't think it will change the ROC results, but it would be interesting to see whether any performance regressions have happened.

szilard commented 8 years ago

As I mentioned in the README: "The linear models are not the primary focus of this study because of their not so great accuracy vs the more complex models (on this type of data)." Also: "The main conclusion here is that it is trivial to train linear models even for n = 10M rows virtually in any of these tools on a single machine in a matter of seconds." and "the differences in memory efficiency and speed will start to really matter only for larger sizes and beyond the scope of this study."

However, feel free to run VW 7.10 (to reproduce my results) and 8.0, and let me know if the results are significantly different. A rough sketch of how such a comparison could be run is below.
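For illustration only, a minimal sketch of timing the two VW versions on the same data (the binary names `vw-7.10` / `vw-8.0` and the file names `train.vw` / `test.vw` are hypothetical; it assumes the data is already in VW input format with a logistic loss setup similar to the benchmark):

```bash
# Hypothetical sketch: time training with each VW binary on the same data,
# then score the test set, to compare speed (and sanity-check predictions).
for vw_bin in vw-7.10 vw-8.0; do
  echo "== $vw_bin =="
  # train a logistic-regression model and save it
  time $vw_bin -d train.vw --loss_function logistic -b 25 -f model.vw
  # test-only pass: load the model and write predictions
  $vw_bin -t -d test.vw -i model.vw -p preds_$vw_bin.txt
done
```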

szilard commented 8 years ago

Closing this issue, feel free to reopen if you have any results.