amplab / spark-ec2

Scripts used to set up a Spark cluster on EC2
Apache License 2.0

How can I launch Spark 2.0.1 with Hadoop 2.7? #64

Open biolearning opened 7 years ago

biolearning commented 7 years ago

Right now it seems the latest Hadoop version supported by spark-ec2 is 2.4, but the Spark download page offers builds for Hadoop versions up to 2.7, and these are also available on AWS S3 at http://s3.amazonaws.com/spark-related-packages. So the question is: how can I launch such a Spark cluster, and if that is not supported, is there any workaround? For reference, the current help text for the relevant option reads:

--hadoop-major-version=HADOOP_MAJOR_VERSION Major version of Hadoop. Valid options are 1 (Hadoop 1.0.4), 2 (CDH 4.2.0), yarn (Hadoop 2.4.0) (default: yarn)

shivaram commented 7 years ago

See #56

biolearning commented 7 years ago

Thanks. When is that improvement expected to be merged into the 2.0 branch?