Amazon Linux 2 AMIs (e.g. ami-0beafb294c86717a8) still launch with Python 2 as the default and only Python. Python 2 is EOL, and so is Spark's support for Python 2.
We should start installing Python 3 on launched clusters and setting it as the default Python for PySpark. The pattern would roughly follow what we do to ensure that the cluster has a recent enough version of Java installed (e.g. #316).
#334 ensures Python 3 is available on the cluster, but this issue is about making sure it's the default Python for Spark.
In Spark 3.1+ this isn't an issue, since Spark specifically looks for python3 (at least according to what I wrote on #334), so I think over time this problem basically solves itself.
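Until then, a minimal sketch of what the cluster setup could do, assuming the AMI uses yum and that we set Spark's standard `PYSPARK_PYTHON` / `PYSPARK_DRIVER_PYTHON` environment variables (e.g. via `$SPARK_HOME/conf/spark-env.sh`); the exact install step would mirror how we handle Java in #316:

```shell
# Amazon Linux 2 ships only Python 2; install Python 3 from the default repos
sudo yum install -y python3

# Make Python 3 the interpreter PySpark uses on both executors and the driver.
# These lines could live in $SPARK_HOME/conf/spark-env.sh on each node.
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=python3
```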