RevolutionAnalytics / RHadoop

RHadoop
https://github.com/RevolutionAnalytics/RHadoop/wiki

Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, : hadoop streaming failed with error code 1 #235

Open RimaSahl opened 8 years ago

RimaSahl commented 8 years ago

Hi there, I have installed Hadoop 2.7.2 on Ubuntu 16.04, and I have also installed RStudio and RHadoop (rmr2, rhdfs, rhbase) on a single-node cluster. The RHadoop packages are installed in this directory: "/home/hduser/R/x86_64-pc-linux-gnu-library/3.2/". However, when I run a simple example, Hadoop streaming fails with an error. More detail is below. Can anyone please help me out?

out<-mapreduce(input = small.ints, map=function(k,v) keyval(v,v^2))
packageJobJar: [/tmp/hadoop-unjar3635253007512617329/] [] /tmp/streamjob2173897990252478106.jar tmpDir=null
16/06/18 17:13:03 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
16/06/18 17:13:04 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
16/06/18 17:13:05 INFO mapred.FileInputFormat: Total input paths to process : 1
16/06/18 17:13:05 INFO mapreduce.JobSubmitter: number of splits:2
16/06/18 17:13:05 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/06/18 17:13:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1466241060737_0001
16/06/18 17:13:06 INFO impl.YarnClientImpl: Submitted application application_1466241060737_0001
16/06/18 17:13:07 INFO mapreduce.Job: The url to track the job: http://amir-Inspiron-3521:8088/proxy/application_1466241060737_0001/
16/06/18 17:13:07 INFO mapreduce.Job: Running job: job_1466241060737_0001
16/06/18 17:13:12 INFO mapreduce.Job: Job job_1466241060737_0001 running in uber mode : false
16/06/18 17:13:12 INFO mapreduce.Job: map 0% reduce 0%
16/06/18 17:13:12 INFO mapreduce.Job: Job job_1466241060737_0001 failed with state FAILED due to: Application application_1466241060737_0001 failed 2 times due to AM Container for appattempt_1466241060737_0001_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://amir-Inspiron-3521:8088/cluster/app/application_1466241060737_0001Then, click on links to logs of each attempt.
Diagnostics: File file:/usr/local/hadoop/"/usr/local/hadoop_tmp"/nm-local-dir/usercache/hduser/appcache/application_1466241060737_0001/"/usr/local/hadoop_tmp"/nm-local-dir/usercache/hduser does not exist
Failing this attempt. Failing the application.
16/06/18 17:13:12 INFO mapreduce.Job: Counters: 0
16/06/18 17:13:12 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, : hadoop streaming failed with error code 1
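
One detail in the diagnostics above may be worth checking: the failing path contains literal double quotes around /usr/local/hadoop_tmp. That would be consistent with a directory property (for example hadoop.tmp.dir, from which the NodeManager local directories are derived by default) having been written with quotes inside its <value> element in one of the Hadoop XML configuration files. This is only a reading of the log, not a confirmed cause. A minimal sketch in base R for scanning for such values, assuming the configuration lives under $HADOOP_HOME/etc/hadoop (the usual Hadoop 2.x layout, not confirmed in this post); find_quoted_values is a hypothetical helper name:

```r
# Hypothetical helper: return the lines whose <value> element starts or
# ends with a quote character. Only catches values written on one line.
find_quoted_values <- function(lines) {
  pattern <- "<value>\\s*[\"']|[\"']\\s*</value>"
  lines[grepl(pattern, lines)]
}

# Assumed location of the Hadoop 2.x config files; adjust if yours differs.
conf_dir <- file.path(Sys.getenv("HADOOP_HOME", "/usr/local/hadoop"),
                      "etc", "hadoop")
conf_files <- list.files(conf_dir, pattern = "-site\\.xml$", full.names = TRUE)

# Print every suspicious line, grouped by file. Prints nothing if the
# directory does not exist or no quoted values are found.
for (f in conf_files) {
  hits <- find_quoted_values(readLines(f, warn = FALSE))
  if (length(hits) > 0) {
    cat(f, "\n")
    cat(paste0("  ", trimws(hits)), sep = "\n")
  }
}
```

If a line is reported, removing the quotes from that value and restarting the Hadoop daemons would be the thing to try before looking at the R side.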

I also get this warning message whenever I load the "rmr2" package:

library("rmr2", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.2")
Please review your hadoop settings. See help(hadoop.settings)
Warning message:
S3 methods ‘gorder.default’, ‘gorder.factor’, ‘gorder.data.frame’, ‘gorder.matrix’, ‘gorder.raw’ were declared in NAMESPACE but not found

And here are all the environment variables:

Sys.getenv()
DISPLAY                 :0
EDITOR                  vi
GIT_ASKPASS             rpostback-askpass
HADOOP_CMD              /usr/local/hadoop/bin/hadoop
HADOOP_HOME             /usr/local/hadoop
HADOOP_STREAMING        /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.2.jar
HOME                    /home/hduser
LANG                    en_US.UTF-8
LD_LIBRARY_PATH         /usr/lib/R/lib::/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/jre/lib/amd64/server:@JAVA_LD@
LN_S                    ln -s
LOGNAME                 hduser
MAKE                    make
PAGER                   /usr/bin/pager
PATH                    /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
R_BROWSER               xdg-open
R_BZIPCMD               /bin/bzip2
R_DOC_DIR               /usr/share/R/doc
R_GZIPCMD               /bin/gzip -n
R_HOME                  /usr/lib/R
R_INCLUDE_DIR           /usr/share/R/include
R_LIBS_SITE             /usr/local/lib/R/site-library:/usr/lib/R/site-library:/usr/lib/R/library
R_LIBS_USER             ~/R/x86_64-pc-linux-gnu-library/3.2
RMARKDOWN_MATHJAX_PATH  /usr/lib/rstudio-server/resources/mathjax-23
R_PAPERSIZE             letter
R_PAPERSIZE_USER        a4
R_PDFVIEWER             /usr/bin/xdg-open
R_PLATFORM              x86_64-pc-linux-gnu
R_PRINTCMD              /usr/bin/lpr
R_RD4PDF                times,inconsolata,hyper
R_SESSION_TMPDIR        /tmp/RtmpJ5Mpjt
R_SHARE_DIR             /usr/share/R/share
RS_RPOSTBACK_PATH       /usr/lib/rstudio-server/bin/rpostback
RSTUDIO                 1
RSTUDIO_HTTP_REFERER    http://127.0.0.1:8787/
RSTUDIO_PANDOC          /usr/lib/rstudio-server/bin/pandoc
RSTUDIO_SESSION_STREAM  hduser-d
RSTUDIO_USER_IDENTITY   hduser
R_SYSTEM_ABI            linux,gcc,gxx,gfortran,?
R_TEXI2DVICMD           /usr/bin/texi2dvi
R_UNZIPCMD              /usr/bin/unzip
R_ZIPCMD                /usr/bin/zip
SED                     /bin/sed
TAR                     /bin/tar
USER                    hduser