nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0

Exception if python binary != python2.7 on nodes #89

Closed ereed-tesla closed 8 years ago

ereed-tesla commented 8 years ago

I'm using a modified AMI Amazon Linux AMI release 2015.09, which seems to default to Python 2.6 when running python. On cluster launch I get:

[52.37.134.99] Configuring ephemeral storage...
At least one node raised an error: Traceback (most recent call last):
  File "/tmp/setup-ephemeral-storage.py", line 158, in <module>
    non_root_block_devices = get_non_root_block_devices()
  File "/tmp/setup-ephemeral-storage.py", line 53, in get_non_root_block_devices
    block_devices_raw = subprocess.check_output([
AttributeError: 'module' object has no attribute 'check_output'

This is from using python 2.6:

[root@ip-172-31-30-154 tmp]$ python26 --version
Python 2.6.9
[root@ip-172-31-30-154 tmp]$ python26 setup-ephemeral-storage.py
Traceback (most recent call last):
  File "setup-ephemeral-storage.py", line 158, in <module>
    non_root_block_devices = get_non_root_block_devices()
  File "setup-ephemeral-storage.py", line 53, in get_non_root_block_devices
    block_devices_raw = subprocess.check_output([
AttributeError: 'module' object has no attribute 'check_output'

Python 2.7 is on the PATH, but it isn't the one that gets called:

[root@ip-172-31-30-154 tmp]$ python27 --version
Python 2.7.10
[root@ip-172-31-30-154 tmp]$ python27 /tmp/setup-ephemeral-storage.py
{"ephemeral": ["/media/ephemeral0", "/media/ephemeral1"], "root": "/media/root"}

Python 3.5 also fails for me:

[root@ip-172-31-30-154 tmp]$ python3.5 setup-ephemeral-storage.py
Traceback (most recent call last):
  File "setup-ephemeral-storage.py", line 171, in <module>
    format_devices(ephemeral_devices)
  File "setup-ephemeral-storage.py", line 102, in format_devices
    "Format process returned non-zero exit code: {c}".format(c=return_code))
Exception: Format process returned non-zero exit code: 1

Based on https://github.com/nchammas/flintrock/blob/master/flintrock/core.py#L481 and these exceptions, it seems that the python found on the PATH must be 2.7 for setup-ephemeral-storage.py to work.

nchammas commented 8 years ago

Thanks for the report @ereed-tesla.

I did assume that setup-ephemeral-storage.py would run under Python 2.7, though I didn't call that out explicitly enough.

The problem with Python 2.6 is that it does not have subprocess.check_output(). That was added only in Python 2.7.
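For illustration only, here is a minimal sketch of how a script could tolerate the missing function by falling back to subprocess.Popen on Python 2.6. The helper name check_output_compat is hypothetical and is not part of Flintrock or of setup-ephemeral-storage.py.

```python
import subprocess
import sys


def check_output_compat(args):
    """Return the command's stdout as bytes; raise on a non-zero exit code.

    Hypothetical helper, not taken from Flintrock.
    """
    if hasattr(subprocess, "check_output"):
        # Python 2.7+ and Python 3: use the standard library function.
        return subprocess.check_output(args)
    # Python 2.6: emulate check_output() with Popen.
    process = subprocess.Popen(args, stdout=subprocess.PIPE)
    output, _ = process.communicate()
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, args)
    return output


# Example: run the current interpreter and capture what it prints.
result = check_output_compat([sys.executable, "-c", "print('ok')"])
```

The fallback branch only runs when subprocess.check_output is absent, so behavior on 2.7 and later is unchanged.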

I'm using a modified AMI Amazon Linux AMI release 2015.09

I normally test Flintrock against vanilla Amazon Linux, where the default Python for ec2-user is 2.7.

Did you explicitly change the default Python when you created your custom AMI?

nchammas commented 8 years ago

Immediate mitigation:

Longer-term resolution:

What I'd really like to avoid:

ereed-tesla commented 8 years ago

Aha, I didn't see that comment. The base AMI is the preset from around spark_ec2.py ~v1.4 (maybe earlier). Anaconda is installed, so the default is python --version -> Python 3.5.1 :: Anaconda 2.4.1 (64-bit)

However, .bashrc/.profile is not sourced by the cluster launch script, so the default becomes the AMI default of 2.6:

$ ssh root@XYZ python --version
Python 2.6.9

nchammas commented 8 years ago

Ah, the spark-ec2 AMIs.

They are packed with lots of goodies, but they have not been updated in 2 years, lack a slew of security updates (no biggie for short-lived clusters, perhaps), and have many other rough edges like depending on you logging in as root.

My strong recommendation is to use something else. Is there something in particular about the spark-ec2 AMIs that you are looking for?

ereed-tesla commented 8 years ago

I agree. The main reason I'm testing with a variant of the spark-ec2 AMI is that it has all the dependencies/packages for my workflow setup (unrelated to Spark), and since it works, I of course haven't changed it. Thanks for the advice -- closing this issue.

nchammas commented 8 years ago

You may be able to recreate what you need with Flintrock pretty easily using vanilla Amazon Linux and a combination of Flintrock's copy-file (e.g. to upload a script) and run-command commands to set up what you need on the cluster.

I'll keep this issue open to address the error message and Python 3 compatibility that should be added, per my earlier comment.
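As a rough sketch of what a clearer error message could look like, a script can compare the running interpreter against the version it expects before doing any work. The function name, the SUPPORTED constant, and the message wording below are all assumptions for illustration; this is not the fix that was applied in Flintrock.

```python
import sys

# Assumption: the script is written for Python 2.7 only.
SUPPORTED = (2, 7)


def describe_version_problem(version_info, supported=SUPPORTED):
    """Return an error message if the interpreter is unsupported, else None."""
    major_minor = tuple(version_info[:2])
    if major_minor == supported:
        return None
    return (
        "This script requires Python {s[0]}.{s[1]}, but is running under "
        "Python {v[0]}.{v[1]}.".format(s=supported, v=major_minor)
    )


# Example: the message a Python 2.6.9 interpreter would get.
message = describe_version_problem((2, 6, 9))
```

A script would call describe_version_problem(sys.version_info) at startup and exit with the returned message, so that the user sees the version mismatch rather than an AttributeError from deep inside the script.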