RevolutionAnalytics / RHadoop

https://github.com/RevolutionAnalytics/RHadoop/wiki

Error: Java heap space - plyrmr #237

Open abdul-git opened 8 years ago

abdul-git commented 8 years ago

Hi all,

I am hoping to get some help. I have exhausted every resource I could find on the "Java heap space" error, which I get when running bind.cols on an input() data source with plyrmr.

My code

Sys.setenv(HADOOP_HOME='/opt/cloudera/parcels/CDH/lib/hadoop')
Sys.setenv(HADOOP_CMD='/usr/bin/hadoop')
Sys.setenv(HADOOP_STREAMING='/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-streaming.jar')

indir <- "/home/akhan/data_R/"

library(rhdfs)
library(plyrmr)
options(java.parameters="-Xmx4096m")
hdfs.init()

bind.cols(mtcars, carb.per.cyl = carb/cyl)
options(java.parameters="-Xmx4096m")
bind.cols(input("/tmp/mtcars"), carb.per.cyl = carb/cyl)

I have read all the posts I could find and searched around to figure out what may be causing this issue, with no luck. plyrmr is installed on all of the Hadoop cluster nodes.

16/06/22 18:27:55 INFO mapreduce.Job: Running job: job_1466530531134_0121
16/06/22 18:28:00 INFO mapreduce.Job: Job job_1466530531134_0121 running in uber mode : false
16/06/22 18:28:00 INFO mapreduce.Job: map 0% reduce 0%
16/06/22 18:28:03 INFO mapreduce.Job: Task Id : attempt_1466530531134_0121_m_000001_0, Status : FAILED
Error: Java heap space
16/06/22 18:28:03 INFO mapreduce.Job: Task Id : attempt_1466530531134_0121_m_000000_0, Status : FAILED
Error: Java heap space
16/06/22 18:28:07 INFO mapreduce.Job: Task Id : attempt_1466530531134_0121_m_000000_1, Status : FAILED
Error: Java heap space
16/06/22 18:28:07 INFO mapreduce.Job: Task Id : attempt_1466530531134_0121_m_000001_1, Status : FAILED
Error: Java heap space
16/06/22 18:28:10 INFO mapreduce.Job: Task Id : attempt_1466530531134_0121_m_000000_2, Status : FAILED
Error: Java heap space
16/06/22 18:28:10 INFO mapreduce.Job: Task Id : attempt_1466530531134_0121_m_000001_2, Status : FAILED
Error: Java heap space
16/06/22 18:28:15 INFO mapreduce.Job: map 100% reduce 100%
16/06/22 18:28:16 INFO mapreduce.Job: Job job_1466530531134_0121 failed with state FAILED due to: Task failed task_1466530531134_0121_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

16/06/22 18:28:16 INFO mapreduce.Job: Counters: 13
    Job Counters
        Failed map tasks=8
        Launched map tasks=8
        Other local map tasks=6
        Data-local map tasks=1
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=33624
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=16812
        Total vcore-seconds taken by all map tasks=16812
        Total megabyte-seconds taken by all map tasks=34430976
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
16/06/22 18:28:16 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1
Calls: ... mrexec -> as.big.data -> do.call -> <Anonymous> -> mr
Execution halted
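
For reference, here is a rough sketch of one thing that could be tried next. It is not a confirmed fix. options(java.parameters=...) only configures the JVM that rJava starts inside the local R session (rhdfs uses it), so it should not change the heap of the map task JVMs that are failing on the cluster. The first part derives the map container size from the counters in the log. The second part assumes that rmr2's rmr.options(backend.parameters = ...) is available in the installed rmr2 version and that the cluster honours the standard Hadoop 2 properties mapreduce.map.memory.mb and mapreduce.map.java.opts. The 2048 and -Xmx1536m values are illustrative placeholders, not recommendations.

```r
# Counters copied from the job log in this post.
mb_seconds    <- 34430976  # Total megabyte-seconds taken by all map tasks
vcore_seconds <- 16812     # Total vcore-seconds taken by all map tasks

# vcore-seconds equals the total map task time (16812), so each map task
# held one vcore and the ratio is the container size in MB.
container_mb <- mb_seconds / vcore_seconds
print(container_mb)  # 2048

# Assumption: rmr2 (the backend used by plyrmr) forwards these -D options
# to hadoop streaming. The heap value below is a placeholder chosen to stay
# under the container size; adjust it for the actual cluster.
library(rmr2)
rmr.options(backend.parameters = list(hadoop = list(
  D = "mapreduce.map.memory.mb=2048",
  D = "mapreduce.map.java.opts=-Xmx1536m")))
```

If the task heap is capped elsewhere (for example by a cluster-wide setting marked final), these per-job options would have no effect, and the YARN logs for one of the failed attempts would be the next place to look.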