szilard / benchm-ml

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).

running your benchmarks from beginning to end #35

Open · vinhdizzo opened this issue 8 years ago

vinhdizzo commented 8 years ago

Hey Szilard,

I'd like to replicate your code from beginning to end, perhaps on Google Compute Engine (GCE), mainly to test out GCE with Vagrant. Do you have a sense of how long the entire process would take, assuming a similar server size as what you used on EC2?

Is there a convenient way to run all your scripts from folders 0 to 4? That is, is there a master script that executes them all?

I notice that the results are written out to the console. Do you have a script that scrapes all the AUCs for your comparison analysis?

Thanks!

szilard commented 8 years ago

Hi Vinh:

That would be great. I'm a big fan of reproducible data analysis/research, and it would be nice to have this project in a fully automated form (installation, runs, presentation of results etc.). The project grew very organically, with a lot of experimentation and many iterations, so I never invested the time to make it fully automated/reproducible, but if you want to take on the task, I'll be happy to help a bit.

To answer your questions:

1) Do you have a sense of how long the entire process would take, assuming a similar server size as what you used on EC2?

Hard to say; the runtime depends on the tool/algo, but based on my results maybe you can take a step back and prioritize/simplify which runs you actually repeat.

2) Is there a convenient way to run all your scripts from folders 0 to 4? That is, is there a master script that executes them all?

No, though the scripts run out of the box, no weird configs etc.
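A minimal driver would not be hard to write, though. Just as a sketch (not code from the repo; the folder pattern and file extensions are assumptions, adapt them to the actual layout):

```python
# Hypothetical driver: walk the numbered folders in order and run each
# R / Python script found there. Folder/file patterns are assumptions.
import glob
import os
import subprocess

RUNNERS = {".R": ["Rscript"], ".py": ["python"]}

for folder in sorted(glob.glob("[0-4]*/")):
    for script in sorted(glob.glob(os.path.join(folder, "*"))):
        ext = os.path.splitext(script)[1]
        if ext not in RUNNERS:
            continue  # skip data files, notes, etc.
        print("running", script, flush=True)
        # check=False: keep going even if one tool fails or is not installed
        subprocess.run(RUNNERS[ext] + [script], check=False)
```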

3) I notice that the results are written out to the console. Do you have a script that scrapes all the AUCs for your comparison analysis?

No, but it would probably not be difficult for you to log the results in a file.
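For example, if you redirect each run's console output to a log file, something along these lines could collect the AUC values into one CSV (the `AUC` pattern and the `logs/*.log` layout are assumptions about how you save the output, not something the repo provides):

```python
# Hypothetical post-processing: pull the first number after "AUC" from each
# log line into a single CSV. Log location and output format are assumptions.
import csv
import glob
import re

rows = []
for logfile in glob.glob("logs/*.log"):
    with open(logfile) as f:
        for line in f:
            m = re.search(r"AUC\D*([0-9]*\.[0-9]+)", line)
            if m:
                rows.append({"run": logfile, "auc": float(m.group(1))})

with open("auc_results.csv", "w", newline="") as out:
    writer = csv.DictWriter(out, fieldnames=["run", "auc"])
    writer.writeheader()
    writer.writerows(rows)
```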

On the other hand, the repo contains all the code needed to reproduce the results, and the code base is relatively small (since it mostly uses high-level APIs).

I've seen several projects that automated simple benchmarks of their own ML tool, but unfortunately almost everyone focuses only on their own tool. A fully automated benchmark across many tools (maybe similar to the famous TPC benchmarks in the SQL world) would be great.