[ML-3585] Added benchmarks to mllib-large.yaml for clustering

databricks / spark-sql-perf

Apache License 2.0

[ML-3585] Added benchmarks to mllib-large.yaml for clustering #149

Closed lu-wang-dl closed 6 years ago

lu-wang-dl commented 6 years ago

Adds clustering benchmarks to mllib-large.yaml: GaussianMixture, KMeans, and LDA. BisectingKMeans is currently missing from spark-sql-perf, so it is not included here; that needs to be fixed in the follow-up JIRA: https://databricks.atlassian.net/browse/ML-3834. The parameters are based on the previous benchmarks for the Spark 2.2 QA.
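
For readers unfamiliar with the config, a rough sketch of what clustering entries in a benchmark YAML file might look like follows. The key layout (`benchmarks`, `name`, `params`) and every parameter name and value below are illustrative assumptions, not the entries or values from this PR; check mllib-large.yaml itself for the real ones.

```yaml
# Illustrative sketch only: key names and all values are placeholders,
# not the parameters actually added in this PR.
benchmarks:
  - name: clustering.GaussianMixture
    params:
      numExamples: 100000   # placeholder
      numFeatures: 10       # placeholder
      k: 5                  # placeholder
      maxIter: 10           # placeholder
  - name: clustering.KMeans
    params:
      numExamples: 100000   # placeholder
      numFeatures: 10       # placeholder
      k: 5                  # placeholder
      maxIter: 10           # placeholder
  - name: clustering.LDA
    params:
      numExamples: 100000   # placeholder
      vocabSize: 1000       # placeholder
      k: 5                  # placeholder
      maxIter: 10           # placeholder
  # clustering.BisectingKMeans is intentionally absent; see ML-3834.
```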

mengxr commented 6 years ago

LGTM