Closed ereed-tesla closed 8 years ago
Thanks for the report @ereed-tesla.
I did assume that setup-ephemeral-storage.py would run under Python 2.7, though I didn't call that out explicitly enough.
The problem with Python 2.6 is that it does not have subprocess.check_output(). That was added only in Python 2.7.
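For illustration, here is a minimal sketch of how a script could fall back on Python 2.6 when subprocess.check_output() is missing. The helper name check_output_compat is hypothetical and is not taken from setup-ephemeral-storage.py; it only shows one way to emulate the call with subprocess.Popen.

```python
import subprocess


def check_output_compat(cmd):
    """Run cmd and return its stdout, raising CalledProcessError on failure."""
    # Python 2.7+ and 3.x provide subprocess.check_output() directly.
    if hasattr(subprocess, "check_output"):
        return subprocess.check_output(cmd)
    # Python 2.6 fallback: emulate check_output() with Popen.
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    output, _ = process.communicate()
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, cmd)
    return output
```

The return value is a byte string on Python 3, so callers that need text would still have to decode it.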
I'm using a modified AMI Amazon Linux AMI release 2015.09
I normally test Flintrock against vanilla Amazon Linux, where the default Python for ec2-user is 2.7.
Did you explicitly change the default Python when you created your custom AMI?
Immediate mitigation: add an error message to setup-ephemeral-storage.py about supported Python versions.
Longer-term resolution: make setup-ephemeral-storage.py compatible with Python 2.7+ and 3.4+. Forward-looking Linux distributions like Fedora have Python 3.4+ as the default and don't even have Python 2 installed.
What I'd really like to avoid:
Ah, I didn't see that comment. The base AMI is the preset from around spark_ec2.py ~v1.4 (maybe earlier). Anaconda is installed, so the default is python --version -> Python 3.5.1 :: Anaconda 2.4.1 (64-bit)
However, .bashrc/.profile is not sourced by the cluster launch script, so the default becomes the AMI default of 2.6:
$ ssh root@XYZ python --version
Python 2.6.9
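To make the PATH effect concrete, here is a small sketch (Python 3.3+, POSIX assumed) that reports which executable a command name resolves to for a given list of PATH directories. The function name resolve_command is hypothetical, and any directories passed to it are stand-ins, not the actual layout of the AMI.

```python
import os
import shutil


def resolve_command(name, path_entries):
    """Return the executable that `name` resolves to for these PATH entries.

    Directories are searched in order, so an entry that is only added by
    .bashrc/.profile has no effect when those files are never sourced.
    Returns None if no matching executable is found.
    """
    search_path = os.pathsep.join(path_entries)
    return shutil.which(name, path=search_path)
```

Calling it once with the directories from an interactive login and once with the directories a bare ssh command sees would show the two different interpreters.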
Ah, the spark-ec2 AMIs.
They are packed with lots of goodies, but they have not been updated in 2 years, lack a slew of security updates (no biggie for short-lived clusters, perhaps), and have many other rough edges like depending on you logging in as root.
My strong recommendation is to use something else. Is there something in particular about the spark-ec2 AMIs that you are looking for?
I agree. The main reason I'm testing using a variant of the spark-ec2 AMI is because it has all the dependencies/packages for my workflow setup (unrelated to spark), and since it works, I of course haven't changed it. Thanks for the advice -- closing this issue.
You may be able to recreate what you need with Flintrock pretty easily using vanilla Amazon Linux and a combination of Flintrock's copy-file (e.g. to upload a script) and run-command commands to set up the stuff you need on the cluster.
I'll keep this issue open to address the error message and Python 3 compatibility that should be added, per my earlier comment.
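As a rough sketch of what such an error message could look like, the check below exits early with a readable message on unsupported interpreters. The function names and the message wording are assumptions for illustration, not Flintrock's actual code; the version bounds are the 2.7+ and 3.4+ ranges proposed earlier in this thread. It avoids syntax that Python 2.6 cannot parse, so the check itself can run there.

```python
import sys

# Assumption: the supported ranges proposed in this thread.
MIN_PY2 = (2, 7)
MIN_PY3 = (3, 4)


def is_supported(version_info):
    """Return True for Python 2.7+ on the 2.x line or 3.4+ on the 3.x line."""
    major_minor = tuple(version_info[:2])
    if major_minor[0] == 2:
        return major_minor >= MIN_PY2
    return major_minor >= MIN_PY3


def require_supported_python(version_info=None):
    """Exit with a clear message instead of failing later with a traceback."""
    if version_info is None:
        version_info = sys.version_info
    if not is_supported(version_info):
        sys.stderr.write(
            "This script requires Python 2.7+ or 3.4+, but is running under "
            "Python {0}.{1}.\n".format(version_info[0], version_info[1]))
        sys.exit(1)
```

Calling require_supported_python() at the top of the script, before anything touches subprocess.check_output(), would turn the AttributeError seen on Python 2.6 into an actionable message.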
I'm using a modified AMI, Amazon Linux AMI release 2015.09, which seems to default to Python 2.6 when invoking python.
On cluster launch I get:
This is from using Python 2.6:
Python 2.7 is in the path but it isn't called
Python 3.5 also fails for me:
Based on https://github.com/nchammas/flintrock/blob/master/flintrock/core.py#L481 and these exceptions, it seems like python from the PATH must be 2.7 specifically in order for setup-ephemeral-storage.py to work.