adoptium / aqa-tests

Home of test infrastructure for Adoptium builds
https://adoptium.net/aqavit
Apache License 2.0
130 stars 310 forks source link

Add Spark Benchmarks for Perf Testing #1255

Closed piyush286 closed 2 years ago

piyush286 commented 5 years ago

It would be good to measure performance on various Spark workloads. We can run benchmarks such as Hibench and Databricks.

HiBench is a big data benchmark suite that helps evaluate different big data frameworks in terms of speed, throughput and system resource utilizations. It contains a set of Hadoop, Spark and streaming workloads, including Sort, WordCount, TeraSort, Sleep, SQL, PageRank, Nutch indexing, Bayes, Kmeans, NWeight and enhanced DFSIO, etc. It also contains several streaming workloads for Spark Streaming, Flink, Storm and Gearpump.

Packages to Use

We'll need the following packages in order to run perf tests.

Hadoop: https://hadoop.apache.org/

Spark: https://spark.apache.org/

HiBench: https://github.com/Intel-bigdata/HiBench

Supported Hadoop & Spark releases for HiBench

I have experience in using the following: hadoop-2.7.3, spark-dk-2.0.2.0 and hibench-6.0 and tests work without any issues. Though it would be preferred to use the latest compatible binaries.

TODO

I wrote a script few years back that we still use for our internal perf analysis. We'll need to create the build and playlist files in order to get all the dependencies needed to runs the tests.

smlambert commented 2 years ago

Close this as tracking addition of additional perf benchmarks via #1112.