RevolutionAnalytics / RHadoop

streaming.PipeMapRed: java.lang.OutOfMemoryError: Java heap space #210

Closed: lishubin closed this issue 9 years ago

lishubin commented 10 years ago

Can someone please help? I have spent more than 10 hours on this and made no progress.

I just want to try a very simple MapReduce program through RHadoop, like a <- mapreduce(input = small.ints, map = function(k, v) cbind(v, v^2))

However, I got the following errors:

14/06/15 02:05:27 INFO mapred.MapTask: numReduceTasks: 0
14/06/15 02:05:27 INFO streaming.PipeMapRed: PipeMapRed exec [/usr/bin/Rscript, --vanilla, ./rmr-streaming-map2a1d4ddff782]
14/06/15 02:05:27 INFO streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
14/06/15 02:05:27 INFO streaming.PipeMapRed: MRErrorThread done
14/06/15 02:05:27 WARN streaming.PipeMapRed: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.typedbytes.TypedBytesInput.readRawBytes(TypedBytesInput.java:212)
    at org.apache.hadoop.typedbytes.TypedBytesInput.readRaw(TypedBytesInput.java:152)
    at org.apache.hadoop.streaming.io.TypedBytesOutputReader.readKeyValue(TypedBytesOutputReader.java:51)
    at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:418)
14/06/15 02:05:27 INFO streaming.PipeMapRed: PipeMapRed failed!
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
14/06/15 02:05:27 INFO mapred.LocalJobRunner: Map task executor complete.
14/06/15 02:05:27 WARN mapred.LocalJobRunner: job_local1607022189_0001
java.lang.Exception: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
14/06/15 02:05:28 INFO streaming.StreamJob: map 0% reduce 0%
14/06/15 02:05:28 INFO streaming.StreamJob: Job running in-process (local Hadoop)
14/06/15 02:05:28 ERROR streaming.StreamJob: Job not successful. Error: NA
14/06/15 02:05:28 INFO streaming.StreamJob: killJob...
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, : hadoop streaming failed with error code 1

Thanks a lot for any input.

piccolbo commented 10 years ago

Please file in the rmr2 issue tracker. Thanks!
