RevolutionAnalytics / rmr2

A package that allows R developers to use Hadoop MapReduce

Rscript - hadoop streaming failed with error code 1 #153

Closed ElianoMarques closed 9 years ago

ElianoMarques commented 9 years ago

Hi,

I have a 4-node Hadoop (v1.2.1) cluster on EC2, with R 3.1.2 and RStudio running. I have installed all the RHadoop packages as per many examples on the net.

I can run Hadoop and MapReduce jobs from Linux; for example, hadoop jar hadoop-examples-1.2.1.jar pi 10 100000 runs successfully.

I'm having an issue while running RHadoop, which is not new on the net; however, I have tried a lot of things and it still doesn't work. To be more specific, this is what I wrote:

The Renviron.site file has the following environment variables:

export HADOOP_PREFIX="/home/ubuntu/hadoop"
export HADOOP_CMD="/home/ubuntu/hadoop/bin/hadoop"
HADOOP_STREAMING="/home/ubuntu/hadoop/contrib/streaming/hadoop-streaming-1.2.1.jar"
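As an aside, a hedged sketch: the same two variables rmr2 actually consults can also be set from inside the R session before the packages are loaded, which sidesteps any doubt about whether Renviron.site is being picked up. The paths are the ones given above:

# Set the Hadoop locations for rmr2/rhdfs before loading them.
Sys.setenv(HADOOP_CMD = "/home/ubuntu/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING = "/home/ubuntu/hadoop/contrib/streaming/hadoop-streaming-1.2.1.jar")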

In RStudio (or R):

Loading all the RHadoop packages (note that I have correctly installed all dependencies and packages):

require(rhdfs)
require(ravro)
require(rmr2)
require(plyrmr)

What works fine:

hdfs.init()
hdfs.ls("/tmp")
bind.cols(mtcars, carb.per.cyl = carb/cyl)
small.ints <- to.dfs(keyval(1, 1:100))

What doesn't work:

bind.cols(input("/tmp/mtcars"), carb.per.cyl = carb/cyl)
hadoop streaming failed with error code 5

out <- mapreduce(input = small.ints, map = function(k, v) cbind(v, v^2))
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1

I investigated this issue quite a lot on the internet and ended up looking at the log files from the Hadoop JobTracker:

java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
    at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
    ... 22 more
Caused by: java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
    ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 24 more

Again, as for many others on the internet, it seems to be a problem with Rscript: "Cannot run program "Rscript": error=2, No such file or directory".

I looked at other examples where this happened and checked that Rscript was under /usr/bin as suggested in some topics, but still no luck. I don't have R installed on the secondary node or the slaves. Would this be the problem?
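A quick way to verify this is to check that Rscript resolves on the gateway and on every worker. A minimal sketch, assuming passwordless ssh and the hypothetical hostnames "slave1" and "slave2":

# Local check: an empty string means Rscript is not on the PATH.
Sys.which("Rscript")
# Remote checks on each worker (hostnames are placeholders).
for (host in c("slave1", "slave2")) {
  system2("ssh", c(host, "command -v Rscript || echo 'Rscript missing'"))
}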

Could you suggest alternative routes?

Thanks in advance, Eliano Marques

piccolbo commented 9 years ago

On Mon, Dec 22, 2014 at 10:19 AM, ElianoMarques notifications@github.com wrote:

I don't have R installed on secondary node or slaves. Would this be the problem?

Most certainly.

ElianoMarques commented 9 years ago

Hi, thanks for your reply. I have installed R on all nodes and the error indeed changed.

Currently when I run the example:

out <- mapreduce( input = small.ints, map = function(k, v) cbind(v, v^2))

I get the following error:

15/01/02 17:13:57 ERROR streaming.StreamJob: Job not successful. Error: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201412221800_0007_m_000000
15/01/02 17:13:57 INFO streaming.StreamJob: killJob...
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1

Looking at the logs I get this:

java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

Can you suggest anything?

Thanks in advance, Eliano

piccolbo commented 9 years ago

You have to follow the instructions in full.


ElianoMarques commented 9 years ago

Hi, thanks for your reply. I believe I have followed the instructions in full.

I read all the instructions again, checked that I had proceeded accordingly, and couldn't spot any issue.

Your help is appreciated, Eliano

ElianoMarques commented 9 years ago

Hi again,

I have explored the error a little bit further and realized the issue may be related to the fact that R can't load the packages that rmr2 requires.

Error in library(functional) : there is no package called ‘functional’
No traceback available
Error during wrapup

As suggested in some other posts, the library with the dependency packages has to be in a system directory and not in a standard R library. However, I have all the packages under /home/rlib (because I had read that post before).

In R, .libPaths() returns:

"/home/rlib" "/usr/lib/R/site-library" "/usr/lib/R/library"

What may be happening is that when rmr2 tries to load the dependency packages, it searches the other libraries first (which don't have the dependencies).
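One way to test this hypothesis is to compare the library path an interactive session searches with the one a fresh Rscript process searches. A minimal sketch:

# What the interactive session searches:
.libPaths()
# What a child Rscript process (as launched by streaming) searches:
system("Rscript -e '.libPaths()'")
# Does the missing dependency load under Rscript?
system("Rscript -e 'library(functional)'")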

I'll check whether this is the case and update here for future reference.

Eliano

piccolbo commented 9 years ago

You have Hadoop 1.2; are you using MR1 or MR2? Only MR1 will work with rmr2 on that Hadoop version. If that checks out, please get to the stderr of a failing task and paste it in your next message (of course, first check it for confidential information and such).


ElianoMarques commented 9 years ago

Hi,

Sorry for not replying earlier. I solved this issue a couple of days ago. The second error was purely related to the location of the libraries. After reading the documentation for this package, I thought we had to create a library in a new folder (not the default one) and give permission to all users. However (you probably already know this), the library we need to use is the one under /usr/lib/R/library. This is the library that Rscript uses, so if the rmr2 package and its dependencies are not in that location, it returns an error. It is the same principle as when you use crontab in Linux to run Rscript jobs.
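For anyone hitting the same wall, a minimal sketch of the fix described above: install rmr2 and its dependencies into the system library so that Rscript (and therefore the streaming tasks) can find them. Run R with sufficient privileges (e.g. sudo R); the dependency list and tarball filename below are illustrative and vary by rmr2 version:

# Install the usual rmr2 dependencies into the system library.
install.packages(
  c("functional", "Rcpp", "RJSONIO", "bitops", "digest",
    "stringr", "plyr", "reshape2", "caTools"),
  lib = "/usr/lib/R/library")
# rmr2 is not on CRAN; install the downloaded tarball into the same library.
install.packages("rmr2_3.3.0.tar.gz", repos = NULL, type = "source",
                 lib = "/usr/lib/R/library")  # filename is illustrative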

Thanks for your help, Eliano

piccolbo commented 9 years ago

Great, thanks for reporting back.

ecacarva commented 9 years ago

Guys, I have a VirtualBox machine from Hortonworks (version 2.1). I installed RStudio Server, and when executing my script I got this error:

Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1

I have all my R packages (including rmr2) in /usr/lib64/R/library.

I would like to understand the solution: did you keep the packages in this folder or move them to another?

piccolbo commented 9 years ago

You need to get to the stderr of the failing process. Error code 1 in the console unfortunately covers 99% of the problems.
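On a YARN cluster such as the Hortonworks sandbox above, the per-task stderr can usually be retrieved with the yarn CLI once the application id is known. A hedged sketch; the id is taken from the log pasted later in this thread, and log aggregation must be enabled on the cluster:

# Fetch aggregated task logs (stdout/stderr/syslog) for a finished application.
system("yarn logs -applicationId application_1425832539104_0004")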

ElianoMarques commented 9 years ago

Try this library: /usr/lib/R/library. This is probably not an issue with Hadoop itself but with R and the way Rscript reads the libraries. So, for example, check in R which is the main library, i.e. the one that Rscript will use when running automatic/scheduled scripts. I realised this when running R with crontab, where the script was not running because the packages were in the wrong lib. Hadoop will do something similar: it calls Rscript on the nodes and looks for the packages in one specific lib.

Let me know if this works.

Another suggestion is to look at Rscript itself and ensure it is in the location that MapReduce is looking for.

Eliano


ecacarva commented 9 years ago

Thank you guys for the help.

My library is in /usr/lib64/R/library. Sorry, I am relatively new to R; how do I check which is the main library?

I observed that the error is OutOfMemoryError: Java heap space. My error log is below:

15/03/08 11:10:21 INFO mapreduce.Job: Task Id : attempt_1425832539104_0004_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:334)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.typedbytes.TypedBytesInput.readRawBytes(TypedBytesInput.java:212)
    at org.apache.hadoop.typedbytes.TypedBytesInput.readRaw(TypedBytesInput.java:152)
    at org.apache.hadoop.streaming.io.TypedBytesOutputReader.readKeyValue(TypedBytesOutputReader.java:51)
    at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:376)

The identical stack trace repeats for the retried attempts:

15/03/08 11:10:21 INFO mapreduce.Job: Task Id : attempt_1425832539104_0004_m_000001_0, Status : FAILED
15/03/08 11:10:31 INFO mapreduce.Job: Task Id : attempt_1425832539104_0004_m_000000_1, Status : FAILED
15/03/08 11:10:33 INFO mapreduce.Job: Task Id : attempt_1425832539104_0004_m_000001_1, Status : FAILED
15/03/08 11:10:46 INFO mapreduce.Job: Task Id : attempt_1425832539104_0004_m_000000_2, Status : FAILED

Container killed by the ApplicationMaster. Container killed on request. Exit code is 143. Container exited with a non-zero exit code 143.

15/03/08 11:10:46 INFO mapreduce.Job: Task Id : attempt_1425832539104_0004_m_000001_2, Status : FAILED

15/03/08 11:10:57 INFO mapreduce.Job: map 100% reduce 100%
15/03/08 11:10:58 INFO mapreduce.Job: Job job_1425832539104_0004 failed with state FAILED due to: Task failed task_1425832539104_0004_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

15/03/08 11:10:58 INFO mapreduce.Job: Counters: 13
    Job Counters
        Failed map tasks=7
        Killed map tasks=1
        Launched map tasks=8
        Other local map tasks=6
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=81637
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=81637
        Total vcore-seconds taken by all map tasks=81637
        Total megabyte-seconds taken by all map tasks=20409250
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
15/03/08 11:10:58 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
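One possible direction for heap-space failures like these, offered as a hedged sketch rather than a confirmed fix: rmr2's mapreduce() accepts a backend.parameters argument that passes generic -D properties through to the streaming job, which can be used to raise task memory. The property names assume Hadoop 2.x/YARN, the values are illustrative, and my.input/my.map are placeholders for the actual job:

out <- mapreduce(
  input = my.input,                               # placeholder input
  map = my.map,                                   # placeholder map function
  backend.parameters = list(
    hadoop = list(
      D = "mapreduce.map.memory.mb=2048",         # container size (illustrative)
      D = "mapreduce.map.java.opts=-Xmx1638m")))  # JVM heap inside the container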

piccolbo commented 9 years ago

OK, so the problem is not, or not yet, the library path. Please open a separate issue.

madhvi-gupta commented 9 years ago

I have all the RHadoop packages, including rmr2 and the functional package, in /home/madhvi/R/x86_64-pc-linux-gnu-library/3.1, which is my default directory for R packages, but I am still getting the same error in RStudio:

Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, : hadoop streaming failed with error code 1

The stderr log file contains:

Error in library(functional) : there is no package called ‘functional’
No traceback available
Error during wrapup: Execution halted
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

ElianoMarques commented 9 years ago

Please check whether Rscript can load the packages by creating a small batch script on the master and on each node (see the sketch below) containing something like:

require(package_i)

where package_i stands for each package you're using in your RHadoop script. If Rscript reads from the same directory as the one used by R, your issue should be gone, unless there is an error in your map/reduce script.
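A minimal sketch of such a test script; the package list is illustrative, so substitute the packages your job actually uses:

#!/usr/bin/env Rscript
# test-libs.R -- run on the master and on every node: Rscript test-libs.R
pkgs <- c("rmr2", "functional", "rjson")  # illustrative package list
ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(ok)
if (!all(ok)) quit(status = 1)            # non-zero exit flags a missing package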

Hope this helps, Eliano
