RevolutionAnalytics / RHadoop

RHadoop
https://github.com/RevolutionAnalytics/RHadoop/wiki

mapreduce() issue #189

Closed carriezhang closed 11 years ago

carriezhang commented 11 years ago

I just installed all the required packages and tried to run the following code:

a.dfs=to.dfs(keyval(1,1:100))
13/05/26 13:50:41 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/05/26 13:50:41 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
13/05/26 13:50:41 INFO compress.CodecPool: Got brand-new compressor

mapreduce(input = a.dfs, map = function(k,v) keyval(v, NULL), reduce = function(k,vv) keyval(k, length(vv)))
packageJobJar: [/tmp/Rtmp1MOG1F/rmr-local-env1ce27a40ab8f, /tmp/Rtmp1MOG1F/rmr-global-env1ce2b421769, /tmp/Rtmp1MOG1F/rmr-streaming-map1ce25d2cb080, /tmp/Rtmp1MOG1F/rmr-streaming-reduce1ce27e4ab74f, /app/hadoop/tmp/hadoop-unjar3649673484716936164/] [] /tmp/streamjob5130145768891165183.jar tmpDir=null
13/05/26 14:00:39 INFO mapred.FileInputFormat: Total input paths to process : 1
13/05/26 14:00:40 INFO streaming.StreamJob: getLocalDirs(): [/app/hadoop/tmp/mapred/local]
13/05/26 14:00:40 INFO streaming.StreamJob: Running job: job_201305261306_0004
13/05/26 14:00:40 INFO streaming.StreamJob: To kill this job, run:
13/05/26 14:00:40 INFO streaming.StreamJob: /usr/lib/hadoop/hadoop/libexec/../bin/hadoop job -Dmapred.job.tracker=localhost:54311 -kill job_201305261306_0004
13/05/26 14:00:40 INFO streaming.StreamJob: Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201305261306_0004
13/05/26 14:00:41 INFO streaming.StreamJob: map 0% reduce 0%
13/05/26 14:01:20 INFO streaming.StreamJob: map 100% reduce 100%
13/05/26 14:01:20 INFO streaming.StreamJob: To kill this job, run:
13/05/26 14:01:20 INFO streaming.StreamJob: /usr/lib/hadoop/hadoop/libexec/../bin/hadoop job -Dmapred.job.tracker=localhost:54311 -kill job_201305261306_0004
13/05/26 14:01:20 INFO streaming.StreamJob: Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201305261306_0004
13/05/26 14:01:20 ERROR streaming.StreamJob: Job not successful. Error: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201305261306_0004_m_000000
13/05/26 14:01:20 INFO streaming.StreamJob: killJob...
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1
Deleted hdfs://localhost:54310/tmp/Rtmp1MOG1F/file1ce2497e2ca

It seems that there is something wrong with mapreduce, but I really do not understand where I went wrong.

piccolbo commented 11 years ago

Please file this in the appropriate issue tracker. This seems to be an rmr2 issue, so the right place is https://github.com/RevolutionAnalytics/rmr2/issues/ Thanks.

kholodilov commented 11 years ago

carriezhang, did you manage to fix this issue? I have the same problem. Could you please share what you found?