dimroc / etl-language-comparison

Count the number of times certain words were said in a particular neighborhood. Performed as a basic MapReduce job against 25M tweets. Implemented in different programming languages as an educational exercise.
http://blog.dimroc.com/2015/11/14/etl-language-showdown-pt3/

Create an (automatic?) process for updating the benchmark results #18

Open kevin-hanselman opened 9 years ago

kevin-hanselman commented 9 years ago

It would be nice to have the "Results" table in the README kept up to date. For instance, a Rust implementation was recently added (#14), and there's a pull request to improve said implementation (#5).

Obviously, this could be a CI task, but maybe the simpler thing to do is to update the README in any pull request that changes or adds an implementation. For the numbers to be consistent, the whole benchmark results table would probably need to be regenerated on a single machine each time, perhaps with the machine specs noted as well.

This is just a thought and is open for discussion.

dimroc commented 9 years ago

It's a great idea, and I have thought about automating the calculation of each implementation's runtime. I will be manually updating the README later this week.

We have come to a point, though, where many of these implementations are no longer apples-to-apples comparisons. They vary in some subtle and not-so-subtle ways:

This repo is becoming more a source of idiomatic implementations than a fair speed comparison. I'll be including this information in the README.

Your idea is still very valid, though. We could still use an automatic way of running and tracking benchmarks across languages, and it could spur more community involvement.
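For discussion, here is a minimal sketch of what such an automated run could look like: time each implementation's command on one machine, then print a Markdown table along with basic machine info. This is only an illustration, not the repo's actual tooling. The `COMMANDS` entries are hypothetical placeholders (they just launch a no-op Python process) and would need to be swapped for each implementation's real build/run command.

```python
import platform
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # check=True makes a failing implementation abort the benchmark run.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def render_table(results):
    """Render a {name: seconds} dict as a Markdown table, fastest first."""
    lines = ["| Implementation | Time (s) |", "| --- | --- |"]
    for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
        lines.append(f"| {name} | {seconds:.2f} |")
    return "\n".join(lines)


def machine_specs():
    """Return a one-line description of the machine the numbers came from."""
    return f"{platform.system()} {platform.machine()}, {platform.processor() or 'unknown CPU'}"


# Hypothetical placeholders: replace with each implementation's real command.
COMMANDS = {
    "placeholder-a": [sys.executable, "-c", "pass"],
    "placeholder-b": [sys.executable, "-c", "sum(range(100000))"],
}

if __name__ == "__main__":
    results = {name: time_command(cmd) for name, cmd in COMMANDS.items()}
    print(f"Measured on: {machine_specs()}")
    print(render_table(results))
```

Taking the best of several runs reduces noise from other processes. Since the implementations are no longer strictly comparable, a table generated this way would still need a note about how each one differs.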