Open aglsharad opened 8 years ago
On further investigation, I found below error in stderr. Could anyone please help.
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Cannot run program "/mnt/var/lib/hadoop/steps/s-2RDLYQ8DGFAFN/./post-install-config.sh" (in directory "."): error=2, No such file or directory
at com.amazon.elasticmapreduce.scriptrunner.ProcessRunner.exec(ProcessRunner.java:143)
at com.amazon.elasticmapreduce.scriptrunner.ScriptRunner.main(ScriptRunner.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: Cannot run program "/mnt/var/lib/hadoop/steps/s-2RDLYQ8DGFAFN/./post-install-config.sh" (in directory "."): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at com.amazon.elasticmapreduce.scriptrunner.ProcessRunner.exec(ProcessRunner.java:96)
... 6 more
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.
Interesting. These things are hard to debug when I can't replicate. Here's my controller log:
2016-08-17T23:11:44.944Z INFO startExec 'hadoop jar /mnt/var/lib/hadoop/steps/s-1QF5MGR5WOUEJ/script-runner.jar s3://tessera-emr/scripts/post-install-config.sh tessera-user tessera'
2016-08-17T23:11:44.947Z INFO Environment:
TERM=linux
HADOOP_PREFIX=/home/hadoop
CONSOLETYPE=serial
JAVA_HOME=/usr/java/latest
PIG_CONF_DIR=/home/hadoop/pig/conf
HBASE_HOME=/home/hadoop/hbase
HADOOP_YARN_HOME=/home/hadoop
HIVE_HOME=/home/hadoop/hive
HADOOP_ROOT_LOGGER=INFO,DRFA
YARN_HOME=/home/hadoop
MAIL=/var/spool/mail/hadoop
SPARK_CONF_DIR=/home/hadoop/spark/conf
IMPALA_CONF_DIR=/home/hadoop/impala/conf
PWD=/
HOSTNAME=ip-172-31-9-91
LESS_TERMCAP_mb=[01;31m
LESS_TERMCAP_me=[0m
NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat
LESS_TERMCAP_md=[01;38;5;208m
AWS_AUTO_SCALING_HOME=/opt/aws/apitools/as
HISTSIZE=1000
HADOOP_COMMON_HOME=/home/hadoop
HADOOP_LOGFILE=syslog
PATH=/home/hadoop/pig/bin:/usr/local/cuda/bin:/usr/java/latest/bin:/home/hadoop/bin:/home/hadoop/mahout/bin:/home/hadoop/hive/bin:/home/hadoop/hbase/bin:/home/hadoop/impala/bin:/home/hadoop/spark/bin:/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/sbin:/opt/aws/bin:/home/hadoop/cascading/tools/multitool-20140224/bin:/home/hadoop/cascading/tools/load-20140223/bin:/home/hadoop/cascading/tools/lingual-client/bin:/home/hadoop/cascading/driven/bin
HADOOP_LOG_DIR=/mnt/var/log/hadoop/steps/s-1QF5MGR5WOUEJ
HIVE_CONF_DIR=/home/hadoop/hive/conf
AWS_DEFAULT_REGION=us-west-2
HADOOP_CONF_DIR=/home/hadoop/conf
IMPALA_HOME=/home/hadoop/impala
SHLVL=5
HADOOP_IDENT_STRING=hadoop
LANGSH_SOURCED=1
XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt
AWS_CLOUDWATCH_HOME=/opt/aws/apitools/mon
UPSTART_JOB=rc
SPARK_HOME=/home/hadoop/spark
HADOOP_HOME_WARN_SUPPRESS=true
EC2_AMITOOL_HOME=/opt/aws/amitools/ec2
AWS_RDS_HOME=/opt/aws/apitools/rds
PIG_CLASSPATH=/home/hadoop/pig/lib
LESS_TERMCAP_se=[0m
MAHOUT_CONF_DIR=/home/hadoop/mahout/conf
LOGNAME=hadoop
UPSTART_INSTANCE=
HBASE_CONF_DIR=/home/hadoop/hbase/conf
YARN_CONF_DIR=/home/hadoop/conf
AWS_PATH=/opt/aws
_=/usr/java/latest/bin/java
HADOOP_HOME=/home/hadoop
runlevel=3
LD_LIBRARY_PATH=/home/hadoop/lib/native:/usr/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib:
UPSTART_EVENTS=runlevel
MAHOUT_LOG_DIR=/mnt/var/log/apps
previous=N
EC2_HOME=/opt/aws/apitools/ec2
PIG_HOME=/home/hadoop/pig
LESS_TERMCAP_ue=[0m
AWS_ELB_HOME=/opt/aws/apitools/elb
RUNLEVEL=3
USER=hadoop
HADOOP_CLIENT_OPTS=-Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/s-1QF5MGR5WOUEJ/tmp
RUBYOPT=rubygems
PREVLEVEL=N
HADOOP_HDFS_HOME=/home/hadoop
HOME=/home/hadoop
HISTCONTROL=ignoredups
LESSOPEN=||/usr/bin/lesspipe.sh %s
MAHOUT_HOME=/home/hadoop/mahout
LANG=en_US.UTF-8
LESS_TERMCAP_us=[04;38;5;111m
HADOOP_MAPRED_HOME=/home/hadoop
2016-08-17T23:11:44.947Z INFO redirectOutput to /mnt/var/log/hadoop/steps/s-1QF5MGR5WOUEJ/stdout
2016-08-17T23:11:44.947Z INFO redirectError to /mnt/var/log/hadoop/steps/s-1QF5MGR5WOUEJ/stderr
2016-08-17T23:11:44.947Z INFO Working dir /mnt/var/lib/hadoop/steps/s-1QF5MGR5WOUEJ
2016-08-17T23:11:44.990Z INFO ProcessRunner started child process 20264 :
hadoop 20264 2408 0 23:11 ? 00:00:00 bash /home/hadoop/bin/hadoop jar /mnt/var/lib/hadoop/steps/s-1QF5MGR5WOUEJ/script-runner.jar s3://tessera-emr/scripts/post-install-config.sh tessera-user tessera
2016-08-17T23:11:44.992Z INFO Synchronously wait child process to complete : hadoop jar /mnt/var/lib/hadoop/steps/s-1QF5MGR5...
2016-08-17T23:12:00.994Z INFO waitProcessCompletion ended with exit code 0 : hadoop jar /mnt/var/lib/hadoop/steps/s-1QF5MGR5...
2016-08-17T23:12:00.994Z INFO total process run time: 16 seconds
2016-08-17T23:12:01.259Z INFO Step created jobs:
2016-08-17T23:12:01.259Z INFO Step succeeded with exitCode 0 and took 16 seconds
And stderr:
Command exiting with ret '0'
And stdout:
Downloading 's3://tessera-emr/scripts/post-install-config.sh' to '/mnt/var/lib/hadoop/steps/s-1QF5MGR5WOUEJ/.'
If you can compare, maybe that will help narrow down the problem. Are you sure these scripts are being copied to/from your s3 bucket?
Yes, the scripts seems to be copied to/from s3 bucket as I can see the scripts in my S3 bucket: s3://dmar-emr My stdout is same as yours except the S3 bucket name and the ID
Downloading 's3://dmar-emr/scripts/post-install-config.sh' to '/mnt/var/lib/hadoop/steps/s-CTIH7VZHVAFZ/.'
My stderr is giving exception as shared earlier.
Controller has difference on following two I used AWS_DEFAULT_REGION=us-west-1
I got below exit code and below total process run time
2016-08-21T17:15:44.543Z INFO Synchronously wait child process to complete : hadoop jar /mnt/var/lib/hadoop/steps/s-CTIH7VZH... 2016-08-21T17:15:50.545Z INFO waitProcessCompletion ended with exit code 1 : hadoop jar /mnt/var/lib/hadoop/steps/s-CTIH7VZH... 2016-08-21T17:15:50.545Z INFO total process run time: 6 seconds 2016-08-21T17:15:50.736Z INFO Step created jobs: 2016-08-21T17:15:50.736Z WARN Step failed with exitCode 1 and took 6 seconds
Here is my syslog
2016-08-21 17:15:48,421 INFO com.amazon.ws.emr.hadoop.fs.EmrFileSystem (main): Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as filesystem implementation 2016-08-21 17:15:48,796 INFO amazon.emr.metrics.MetricsSaver (main): MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1471799702904 2016-08-21 17:15:48,796 INFO amazon.emr.metrics.MetricsSaver (main): Created MetricsSaver j-1CW6B85J7SWH6:i-01c12c2ecaeef2d96:RunJar:20223 period:60 /mnt/var/em/raw/i-01c12c2ecaeef2d96_20160821_RunJar_20223_raw.bin 2016-08-21 17:15:49,666 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem (main): Opening 's3://dmar-emr/scripts/post-install-config.sh' for reading 2016-08-21 17:15:49,727 INFO amazon.emr.metrics.MetricsSaver (main): Thread 1 created MetricsLockFreeSaver 1
I am always getting this error, 4th time now.
I am trying to setup Tessera environment on Amazon EMR. I have setup the AWS account. I installed the scripts by cloning the github respository. Then I ran tessera-emr.sh from my system. Cluster is created. When I tried to logon to RStudio Server from web browser, credentials did not work. I used USER=tessera-user, PASSWD=tessera On investigating, I found customJAR step failed in the cluster.
Details in customJAR:
status: Failed ID:s-1FIU62D4Y3X95 JAR location: s3://elasticmapreduce/libs/script-runner/script-runner.jar Main class: None Arguments: s3:///scripts/post-install-config.sh tessera-user tessera
Action on failure: Continue
Details from controller log files
2016-08-18T13:44:29.739Z INFO redirectOutput to /mnt/var/log/hadoop/steps/s-1FIU62D4Y3X95/stdout 2016-08-18T13:44:29.739Z INFO redirectError to /mnt/var/log/hadoop/steps/s-1FIU62D4Y3X95/stderr 2016-08-18T13:44:29.739Z INFO Working dir /mnt/var/lib/hadoop/steps/s-1FIU62D4Y3X95 2016-08-18T13:44:29.781Z INFO ProcessRunner started child process 19973 : hadoop 19973 2325 0 13:44 ? 00:00:00 bash /home/hadoop/bin/hadoop jar /mnt/var/lib/hadoop/steps/s-1FIU62D4Y3X95/script-runner.jar s3://dmar-emr/scripts/post-install-config.sh tessera-user tessera 2016-08-18T13:44:29.783Z INFO Synchronously wait child process to complete : hadoop jar /mnt/var/lib/hadoop/steps/s-1FIU62D4... 2016-08-18T13:44:35.785Z INFO waitProcessCompletion ended with exit code 1 : hadoop jar /mnt/var/lib/hadoop/steps/s-1FIU62D4... 2016-08-18T13:44:35.785Z INFO total process run time: 6 seconds 2016-08-18T13:44:35.984Z INFO Step created jobs: 2016-08-18T13:44:35.984Z WARN Step failed with exitCode 1 and took 6 seconds