Nogbit closed this issue 3 years ago
Sorry, let me edit that. That log was from the worker and not the master; much more info to come in a sec.
Updated the original description and title
The Python issue above was because conda was installing its latest version, which brings Python 3.9 with it. So that was not the AMI's fault.
However, even when using conda with Python 3.8 you will still have issues, since the AWS Amazon Linux 2 AMI does not have initctl and instead uses systemctl.
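For anyone adapting their own bootstrap, here is a rough sketch of how a script could pick between the two init tools instead of hard-coding initctl. The function name, the injectable `which` argument and the service name in the usage line are all made up for illustration; this is not what the linked gist does.

```python
import shutil


def pick_service_command(service, action="start", which=shutil.which):
    """Build the argv for whichever init tool is on PATH.

    `which` is injectable only so the logic can be exercised without
    a real systemd or upstart host (an assumption of this sketch).
    """
    # Amazon Linux 2 ships systemd, so prefer systemctl when present.
    if which("systemctl"):
        return ["systemctl", action, service]
    # Older Amazon Linux AMIs use upstart, which provides initctl.
    if which("initctl"):
        return ["initctl", action, service]
    raise RuntimeError("neither systemctl nor initctl found on PATH")
```

Something like `subprocess.run(pick_service_command("my-service"), check=True)` would then run it, where "my-service" is a placeholder for whatever the bootstrap registers.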
Also, you will need to make the boot volume larger than the 10GB default. The bootstrap action will finish, but Hive will fail to install afterwards because you run out of disk space. 20GB will suffice.
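If you launch the cluster from code rather than the console, the root volume size can be set in the request. Below is a minimal sketch of the arguments for boto3's `run_job_flow`, using the release label, applications and log URI from this issue. The cluster name, instance types, instance count, default role names and the bootstrap script path are assumptions, so swap in your own.

```python
def build_cluster_request(bootstrap_script_uri, root_volume_gb=20):
    """Return keyword arguments for boto3's EMR run_job_flow call."""
    return {
        "Name": "dask-cluster",  # hypothetical cluster name
        "ReleaseLabel": "emr-6.3.0",
        "LogUri": "s3://cooldask-emr/logs/",
        "Applications": [{"Name": n} for n in ("Hive", "Pig", "Hue")],
        # Root (boot) volume size in GiB; the 10GB default ran out of space.
        "EbsRootVolumeSize": root_volume_gb,
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance type
            "SlaveInstanceType": "m5.xlarge",   # assumed instance type
            "InstanceCount": 3,                 # assumed cluster size
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "BootstrapActions": [
            {
                "Name": "bootstrap",
                "ScriptBootstrapAction": {"Path": bootstrap_script_uri},
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumes the default roles exist
        "ServiceRole": "EMR_DefaultRole",
    }
```

With boto3 installed and credentials configured, this would be passed as `boto3.client("emr").run_job_flow(**build_cluster_request("s3://your-bucket/bootstrap.sh"))`, where the S3 path is a placeholder.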
This bootstrap worked for me. https://gist.github.com/Nogbit/f15e1c2be59bcc4ad122171b2e56cdeb
EMR doesn't start, as it fails on the bootstrapping step. It looks like the EC2 instances currently used with EMR 6.3.0 all have Python 3.9, and that might be too high. I've tried all the EMR versions of 6.x, 5.3x and 5.20.0.
According to the docs, "Each Amazon EMR release version is 'locked' to the Amazon Linux AMI version to maintain compatibility." Though I'm not experiencing that. Every time I start a cluster I get the same error below, even on EMR versions that came before the official release of Python 3.9.
Release label: emr-6.3.0
Hadoop distribution: Amazon 3.2.1
Applications: Hive 3.1.2, Pig 0.17.0, Hue 4.9.0
Log URI: s3://cooldask-emr/logs/
Logs
s3://cooldask-emr/logs/j-1X4JY6LKIYQFZ/node/i-059b9f6dbcce4c3e5/bootstrap-actions/1/controller.gz