Yelp / mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services
http://packages.python.org/mrjob/

EMR: TERMINATED_WITH_ERRORS: The given SSH key name was invalid #2178

Open cicerojmm opened 4 years ago

cicerojmm commented 4 years ago

I'm having trouble running an example on EMR in AWS.

The job fails with the following error:

Using configs in /home/ciceromoura/.mrjob.conf
Creating temp directory /tmp/MR-DataMining-3.ciceromoura.20200606.202114.850991
writing master bootstrap script to /tmp/MR-DataMining-3.ciceromoura.20200606.202114.850991/b.sh
uploading working dir files to s3://datalake-exemplo/tmp/MR-DataMining-3.ciceromoura.20200606.202114.850991/files/wd...
Copying other local files to s3://datalake-exemplo/tmp/MR-DataMining-3.ciceromoura.20200606.202114.850991/files/
Created new cluster j-3342SIBA7GY23
Added EMR tags to cluster j-3342SIBA7GY23: __mrjob_label=MR-DataMining-3, __mrjob_owner=ciceromoura, __mrjob_version=0.7.3
Waiting for Step 1 of 2 (s-2Z88F1LWZ8HPL) to complete...
  CANCELLED (Job terminated)
Cluster j-3342SIBA7GY23 was TERMINATED_WITH_ERRORS: The given SSH key name was invalid
Step 1 of 2 failed
Terminating cluster: j-3342SIBA7GY23

My configuration file (mrjob.conf):

runners:
  emr:
    aws_access_key_id: xxxxxxxxxxx
    aws_secret_access_key: xxxxxxxxxxxxx
    ec2_key_pair: EMR
    ec2_key_pair_file: ~/.ssh//EMR.pem
    ssh_tunnel: true
    instance_type: m5.xlarge
    num_core_instances: 3

The command I ran:

python3 MR-DataMining-3.py -r emr s3://bucket/file.txt --output-dir=s3://bucket/output/ --cloud-tmp-dir=s3://bucket/tmp

I have already checked the SSH key, replaced it, and generated a new one, but the error persists. The cluster is created automatically, right? What am I doing wrong? Do I need to specify an AMI?
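
One thing that may be worth ruling out: EC2 key pairs are region-specific, and the config above does not set a region, so the cluster may be launching in a different region from the one where the EMR key pair was created. The sketch below is only an assumption about the cause, not a confirmed diagnosis. It checks whether a key pair name is visible to a given EC2 client; `key_pair_exists` is a hypothetical helper, and the client is expected to be a boto3 EC2 client (or anything with a compatible `describe_key_pairs()` method).

```python
def key_pair_exists(ec2_client, key_name):
    """Return True if key_name is among the key pairs visible to ec2_client.

    ec2_client is assumed to behave like a boto3 EC2 client, i.e.
    describe_key_pairs() returns {"KeyPairs": [{"KeyName": ...}, ...]}.
    The result only reflects the region the client was created for.
    """
    response = ec2_client.describe_key_pairs()
    names = {pair["KeyName"] for pair in response.get("KeyPairs", [])}
    return key_name in names


# Example usage (requires boto3 and AWS credentials; the region is a
# placeholder and should be the one the cluster is launched in):
#
#   import boto3
#   client = boto3.client("ec2", region_name="us-east-1")
#   print(key_pair_exists(client, "EMR"))
```

If this returns False for the region the cluster runs in but True for another region, setting `region` under `emr:` in mrjob.conf to the region that holds the key pair might be enough.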