amplab / spark-ec2

Scripts used to set up a Spark cluster on EC2
Apache License 2.0

Using spark_version='1.6.2' results in partial installation (?) #57

Open · arokem opened this issue 7 years ago

arokem commented 7 years ago

Specifically, I get these messages during launch of the cluster, and these files are indeed not in place once the cluster starts up:

./spark-ec2/spark-standalone/setup.sh: line 22: /root/spark/bin/stop-all.sh: No such file or directory
./spark-ec2/spark-standalone/setup.sh: line 27: /root/spark/bin/start-master.sh: No such file or directory    

Indeed, there is no Spark web interface on port 8080 either.

shivaram commented 7 years ago

Is this from branch-2.0? I think the problem is that we didn't backport the change that added 1.6.1 and 1.6.2 to branch-2.0, as seen in [1]. Can you check whether adding 1.6.2 there fixes the problem?

[1] https://github.com/amplab/spark-ec2/blob/06f5d2bc7c222aecb56e2f7bb8b8e160bc501104/spark_ec2.py#L78
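
If you have a checkout of branch-2.0, a quick sanity check (a sketch; the exact line in spark_ec2.py varies by branch) is to grep the valid-versions list for the release:

$ git checkout branch-2.0
$ grep -n '1\.6\.2' spark_ec2.py || echo "1.6.2 is not in the valid-versions list"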

arokem commented 7 years ago

This is from branch-1.6.

shivaram commented 7 years ago

Hmm, that means SPARK_VERSION isn't being parsed correctly somehow, because as seen in [2] the default should be sbin, not bin.

[2] https://github.com/amplab/spark-ec2/blob/4b57900a24e25accd9c3f14c867920730813bf11/spark-standalone/setup.sh#L5
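
For context, the check in question is along these lines (a paraphrase of [2], not the verbatim script); early pre-1.0 releases kept the daemon scripts in bin/, and everything later uses sbin/:

# Sketch of the version check in spark-standalone/setup.sh [2].
SCRIPTS_DIR="/root/spark/sbin"
if [[ "0.7.3 0.8.0 0.8.1" =~ $SPARK_VERSION ]]; then
  SCRIPTS_DIR="/root/spark/bin"   # pre-1.0 layout kept the scripts in bin/
fi
"$SCRIPTS_DIR/stop-all.sh"

One pitfall with this style of check: if SPARK_VERSION reaches the script empty, the =~ match succeeds vacuously (an empty regex matches anything) and the script falls back to bin/, which would line up with the stop-all.sh and start-master.sh errors reported above.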

arokem commented 7 years ago

As far as I can tell, there's neither a bin nor an sbin directory under /root/spark. The only thing under /root/spark is /root/spark/conf/spark-env.sh.

shivaram commented 7 years ago

That means Spark wasn't downloaded properly. My guess is this has to do with the tar.gz files for hadoop1 not being found on S3. You could try --hadoop-version=yarn as a workaround.
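
A launch sketch with that workaround (cluster name, key pair, and identity file are placeholders; the option appears to be spelled --hadoop-major-version in spark_ec2.py's parser):

$ ./spark-ec2 --key-pair=my-key --identity-file=my-key.pem \
    --spark-version=1.6.2 --hadoop-major-version=yarn \
    launch my-cluster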

arokem commented 7 years ago

Thanks! For now, I have resorted to setting spark_version to 1.6.0, which seems to work, but I'll try that too. Feel free to close this issue unless you want to keep it open for tracking. And thanks again.

etrain commented 7 years ago

I ran into something similar recently.

It appears that for Spark 1.6.2 only a subset of the binaries was uploaded to S3:

$ s3cmd ls s3://spark-related-packages/spark-1.6.2*
2016-06-27 23:47 241425242   s3://spark-related-packages/spark-1.6.2-bin-cdh4.tgz
2016-06-27 23:47 230444067   s3://spark-related-packages/spark-1.6.2-bin-hadoop1-scala2.11.tgz
2016-06-27 23:48 271799224   s3://spark-related-packages/spark-1.6.2-bin-hadoop2.3.tgz
2016-06-27 23:49 273797124   s3://spark-related-packages/spark-1.6.2-bin-hadoop2.4.tgz
2016-06-27 23:50 278057117   s3://spark-related-packages/spark-1.6.2-bin-hadoop2.6.tgz
2016-06-27 23:50 196142809   s3://spark-related-packages/spark-1.6.2-bin-without-hadoop.tgz
2016-06-27 23:51  12276956   s3://spark-related-packages/spark-1.6.2.tgz

While for 1.6.0 the full set is present:

$ s3cmd ls s3://spark-related-packages/spark-1.6.0*
2015-12-27 23:07 252549861   s3://spark-related-packages/spark-1.6.0-bin-cdh4.tgz
2015-12-27 23:15 241526957   s3://spark-related-packages/spark-1.6.0-bin-hadoop1-scala2.11.tgz
2015-12-27 23:23 243448482   s3://spark-related-packages/spark-1.6.0-bin-hadoop1.tgz
2015-12-27 23:31 282904569   s3://spark-related-packages/spark-1.6.0-bin-hadoop2.3.tgz
2015-12-27 23:41 244381359   s3://spark-related-packages/spark-1.6.0-bin-hadoop2.4-without-hive.tgz
2015-12-27 23:48 284903527   s3://spark-related-packages/spark-1.6.0-bin-hadoop2.4.tgz
2015-12-28 00:00 289160984   s3://spark-related-packages/spark-1.6.0-bin-hadoop2.6.tgz
2015-12-28 00:08 201549664   s3://spark-related-packages/spark-1.6.0-bin-without-hadoop.tgz
2015-12-28 00:16  12204380   s3://spark-related-packages/spark-1.6.0.tgz

shivaram commented 7 years ago

I think the problem here is that the artifacts are missing from the release itself, not just from S3; e.g. http://www-us.apache.org/dist/spark/spark-1.6.2/spark-1.6.2-bin-hadoop1.tgz gives me a 404.

sabman commented 7 years ago

As well as:

http://s3.amazonaws.com/spark-related-packages/spark-1.6.3-bin-hadoop1.tgz
http://s3.amazonaws.com/spark-related-packages/spark-1.6.2-bin-hadoop1.tgz
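
A quick way to verify from anywhere, without s3cmd, is a HEAD request against each URL; per the listings above, the hadoop1 build should return 404 and an existing build (e.g. hadoop2.6) 200:

$ curl -sI -o /dev/null -w '%{http_code}\n' \
    http://s3.amazonaws.com/spark-related-packages/spark-1.6.2-bin-hadoop1.tgz
404
$ curl -sI -o /dev/null -w '%{http_code}\n' \
    http://s3.amazonaws.com/spark-related-packages/spark-1.6.2-bin-hadoop2.6.tgz
200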

RudyLu commented 7 years ago

Seems like the same issue as #43.