hail-is / hail

Cloud-native genomic dataframes and batch computing
https://hail.is
MIT License

Use MLlib to do clustering #877

Closed: danking closed this issue 7 years ago

jbloom22 commented 8 years ago

http://spark.apache.org/docs/latest/mllib-clustering.html

jbloom22 commented 7 years ago

@lfrancioli This may be possible with a PySpark approach rather than being baked into Hail. Do you have a random forest example?

jbloom22 commented 7 years ago

Laurent has posted his random forest example. Closing, since clustering can be done the same way in PySpark. http://discuss.hail.is/t/using-spark-ml-to-create-and-apply-random-forests/204