I am trying to find keywords in the CommonCrawl archive. When I run the script on a single wet.gz file, it works fine. But when I run it on the entire set of WET archive files, I get the following error:
Using s3://mrjob-1c535120e37b953d/tmp/ as our temp dir on S3
Copying local files to s3://mrjob-1c535120e37b953d/tmp/email_address.ec2-user.20181007.080547.129600/files/...
Adding our job to existing cluster j-119SGKPP3LAQ2
Creating temp directory /tmp/email_address.ec2-user.20181007.080547.129600
Connect to resource manager at: http://localhost:40548/cluster
Waiting for Step 1 of 1 (s-KV7OE1GS86IV) to complete...
RUNNING for 0:00:05
Oops, ssh subprocess exited with return code 255, restarting...
Connect to resource manager at: http://localhost:40548/cluster
RUNNING for 0:02:44
Oops, ssh subprocess exited with return code 255, restarting...
Connect to resource manager at: http://localhost:40548/cluster
RUNNING for 0:05:26
Oops, ssh subprocess exited with return code 255, restarting...
Connect to resource manager at: http://localhost:40548/cluster
RUNNING for 0:08:08
Oops, ssh subprocess exited with return code 255, restarting...
Connect to resource manager at: http://localhost:40548/cluster
RUNNING for 0:10:50
Oops, ssh subprocess exited with return code 255, restarting...
Connect to resource manager at: http://localhost:40548/cluster
RUNNING for 0:13:31
Oops, ssh subprocess exited with return code 255, restarting...
Connect to resource manager at: http://localhost:40548/cluster
RUNNING for 0:16:13
Oops, ssh subprocess exited with return code 255, restarting...
Connect to resource manager at: http://localhost:40548/cluster
RUNNING for 0:18:55
Oops, ssh subprocess exited with return code 255, restarting...
Connect to resource manager at: http://localhost:40548/cluster
RUNNING for 0:21:37
FAILED
Cluster j-119SGKPP3LAQ2 is WAITING: Cluster ready after last step failed.
Why is this happening? How do I resolve this issue?
I am running the mrjob script from one of our EC2 instances, not from my Mac.