Open rootAvish opened 6 years ago
Run the following on the Deep Learning AMI to open up the log directory:
sudo chmod 777 /var/log
If you care about tighter permissions, only execute permission (for all users) is required. Then take an image of that instance and use it.
EMR copies the log directory and creates files in it, but the permissions set on the AMI are drwx------ and need to be at least drwx-----x.
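For anyone who would rather not use 777, here is a minimal sketch of the narrower change described above: going from drwx------ to drwx-----x. It works on a scratch directory (the `demo_dir` name is just for illustration) so it is safe to try anywhere, and it assumes GNU `stat`, which is what Amazon Linux ships. I have not verified this narrower mode against EMR provisioning myself, so treat it as an illustration of the permission bits rather than a confirmed fix.

```shell
#!/bin/sh
# Scratch directory standing in for /var/log (hypothetical stand-in).
demo_dir="$(mktemp -d)"

# Reproduce the restrictive mode reported on the Deep Learning AMI: drwx------
chmod 700 "$demo_dir"

# Grant execute (traverse) permission to "other" users only: drwx-----x
chmod o+x "$demo_dir"

# Read the resulting mode string back (GNU stat syntax).
perms="$(stat -c '%A' "$demo_dir")"
echo "$perms"

# Clean up the scratch directory.
rmdir "$demo_dir"
```

On the AMI itself the equivalent would be `sudo chmod o+x /var/log` (or `a+x` if the group needs to traverse it too), followed by `stat -c '%A' /var/log` to confirm the mode before taking the image.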
Hi,
I was using the command below to create a test EMR cluster:
The Deep Learning AMI ID I'm using is the us-east-1 (N. Virginia) AMI ID for https://aws.amazon.com/marketplace/pp/B076T8RSXY. I've also tried several other AMIs, including the base Deep Learning AMI (https://aws.amazon.com/marketplace/pp/B077GFM7L7), every EMR version starting from 5.8.0, and every GPU instance type I could use to bring up the cluster.
However, the cluster always fails while executing the Amazon-defined bootstrap actions (not the user-defined ones), and there is no stderr in the bootstrap actions folder. When I looked under provision-node/<node-id>/stderr.gz, all the attempts had failed with the same error:
There are no details about why exactly the historyserver is failing to start. Is there an installation step this guide is missing when using the DL AMIs? This error never occurs when using the default EMR AMI. I'm using a 60 GB EBS root volume and 35 GB of attached EBS storage, if that matters. I tried the p3.2xlarge and g2.2xlarge instance types with just one master instance and 0 core and 0 task nodes.