amplab / spark-ec2

Scripts used to set up a Spark cluster on EC2
Apache License 2.0

Error when using a more up-to-date AMI #80

Open · mmistroni opened this issue 7 years ago

mmistroni commented 7 years ago

Hi all, I am trying to launch an EC2 cluster using a more up-to-date AMI: ami-c928c1a9. Here's my command:

```
root@9f2c58d4fbe6:/spark-ec2# ./spark-ec2 -k ec2AccessKey -i ec2AccessKey.pem -s 2 --ami=ami-c928c1a9 --region us-west-2 launch MMTestCluster4
```

I am launching this from a Docker container running Ubuntu 16.04, and I am getting this exception:

```
Connection to ec2-54-187-145-15.us-west-2.compute.amazonaws.com closed.
Deploying files to master...
Warning: Permanently added 'ec2-54-187-145-15.us-west-2.compute.amazonaws.com,54.187.145.15' (ECDSA) to the list of known hosts.
protocol version mismatch -- is your shell clean?
(see the rsync man page for an explanation)
rsync error: protocol incompatibility (code 2) at compat.c(176) [sender=3.1.1]
Traceback (most recent call last):
  File "./spark_ec2.py", line 1534, in <module>
    main()
  File "./spark_ec2.py", line 1526, in main
    real_main()
  File "./spark_ec2.py", line 1362, in real_main
    setup_cluster(conn, master_nodes, slave_nodes, opts, True)
  File "./spark_ec2.py", line 846, in setup_cluster
    modules=modules
  File "./spark_ec2.py", line 1121, in deploy_files
    subprocess.check_call(command)
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['rsync', '-rv', '-e', 'ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ec2AccessKey.pem', '/tmp/tmp
```
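To see the raw rsync failure outside the script, the same command can be re-run by hand. This is only a debugging sketch: the source temp directory is truncated in the traceback above, so `<tmpdir>` and the destination path are placeholders.

```bash
# Re-run the rsync step that spark_ec2.py's deploy_files wraps, so the
# "is your shell clean?" message is visible directly (placeholder paths).
rsync -rv \
  -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ec2AccessKey.pem" \
  /tmp/<tmpdir>/ root@ec2-54-187-145-15.us-west-2.compute.amazonaws.com:/
```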

I am willing to help sort out this issue, as I am skilled in Python and I am a user of the Scala/Python AWS APIs. Please give me some hints / starting points, and also, if possible, a test environment, as it is going to cost me a lot of money to keep creating large instances (and then destroying them) on my AWS account.

Thanks in advance and regards, Marco

shivaram commented 7 years ago

I think the error is happening due to some ssh output that comes up when we try to run rsync - https://www.centos.org/forums/viewtopic.php?t=53369 seems relevant

Also I think the error is happening when we rsync from the client (i.e. in this case your Ubuntu machine) to the master node. So one shouldn't need a big cluster to debug this - launching 1 master, 1 slave with t1.micro might be enough
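A quick way to check whether the remote shell is clean, following the rsync man page's suggestion (a sketch; the key path and hostname are taken from the log above and should be replaced with your own):

```bash
# A "clean" shell must print nothing for non-interactive commands;
# any output corrupts the rsync protocol stream.
ssh -o StrictHostKeyChecking=no -i ec2AccessKey.pem \
    root@ec2-54-187-145-15.us-west-2.compute.amazonaws.com /bin/true > out.dat
wc -c out.dat   # should report 0 bytes; anything else means the shell is not clean
```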

mmistroni commented 7 years ago

Hello, thanks... I will try it out. Good to know I can use micro instances. Kr


milad181 commented 7 years ago

Hello,

I have the same issue when I am using a new AMI. Is there any workaround for this? All I want is to run a Spark cluster using the latest Amazon Linux AMI.

Thanks
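As a side note on the diagnosis above: one commonly suggested fix for the "protocol version mismatch -- is your shell clean?" error (not confirmed in this thread, so treat it as an assumption) is to make sure the remote account's startup files print nothing for non-interactive shells, e.g. in the remote `~/.bashrc`:

```bash
# Exit early for non-interactive shells (the kind rsync spawns over ssh),
# so no banner or echo output ever reaches the rsync protocol stream.
case $- in
    *i*) ;;      # interactive: continue sourcing the rest of .bashrc
    *) return ;; # non-interactive: stop here and stay silent
esac
echo "Welcome!"  # anything below the guard only runs interactively
```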