As suggested by piccolbo, I doubled the backend parameters:
#' Simple test of RHadoop.
example = function ()
{
    evenodd = function (v)
    {
        ret = if (0 == bitwAnd (as.integer (v), 1)) "even" else "odd"
        return (ret)
    }
    mapper = function (k, v)
    {
        keyval (unlist (lapply (v, evenodd)), v)
    }
    reducer = function (key, values)
    {
        keyval (key, length (values)) # sum (values))
    }
    rmr.options (
        backend.parameters = list (
            hadoop = list (
                D = "mapreduce.map.java.opts=-Xmx800M",
                D = "mapreduce.reduce.java.opts=-Xmx800M",
                D = "mapreduce.map.memory.mb=8192",
                D = "mapreduce.reduce.memory.mb=8192"
            )
        )
    )
    ints = to.dfs (1:100)
    #calc = mapreduce (input = ints, map = mapper)
    calc = mapreduce (input = ints, map = mapper, reduce = reducer)
    print (from.dfs (calc))
}
But the error is still the same:
> hdfs.init (); test::example ()
14/11/14 09:00:59 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
14/11/14 09:00:59 INFO compress.CodecPool: Got brand-new compressor [.deflate]
packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.4.0.2.1.5.0-695.jar] /tmp/streamjob8649825258584273018.jar tmpDir=null
14/11/14 09:01:02 INFO impl.TimelineClientImpl: Timeline service address: http://hrn.bkw-hdp.ch:8188/ws/v1/timeline/
14/11/14 09:01:02 INFO client.RMProxy: Connecting to ResourceManager at hrn.bkw-hdp.ch/10.10.0.12:8050
14/11/14 09:01:03 INFO impl.TimelineClientImpl: Timeline service address: http://hrn.bkw-hdp.ch:8188/ws/v1/timeline/
14/11/14 09:01:03 INFO client.RMProxy: Connecting to ResourceManager at hrn.bkw-hdp.ch/10.10.0.12:8050
14/11/14 09:01:03 INFO mapred.FileInputFormat: Total input paths to process : 1
14/11/14 09:01:04 INFO mapreduce.JobSubmitter: number of splits:2
14/11/14 09:01:04 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1415823260172_0004
14/11/14 09:01:04 INFO impl.YarnClientImpl: Submitted application application_1415823260172_0004
14/11/14 09:01:04 INFO mapreduce.Job: The url to track the job: http://hrn.bkw-hdp.ch:8088/proxy/application_1415823260172_0004/
14/11/14 09:01:04 INFO mapreduce.Job: Running job: job_1415823260172_0004
14/11/14 09:01:12 INFO mapreduce.Job: Job job_1415823260172_0004 running in uber mode : false
14/11/14 09:01:12 INFO mapreduce.Job: map 0% reduce 0%
14/11/14 09:01:17 INFO mapreduce.Job: Task Id : attempt_1415823260172_0004_m_000000_0, Status : FAILED
Error: Java heap space
14/11/14 09:01:22 INFO mapreduce.Job: Task Id : attempt_1415823260172_0004_m_000000_1, Status : FAILED
Error: Java heap space
14/11/14 09:01:27 INFO mapreduce.Job: Task Id : attempt_1415823260172_0004_m_000000_2, Status : FAILED
Error: Java heap space
14/11/14 09:01:34 INFO mapreduce.Job: map 100% reduce 100%
14/11/14 09:01:34 INFO mapreduce.Job: Job job_1415823260172_0004 failed with state FAILED due to: Task failed task_1415823260172_0004_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
14/11/14 09:01:35 INFO mapreduce.Job: Counters: 12
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=3
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=14241
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=14241
Total vcore-seconds taken by all map tasks=14241
Total megabyte-seconds taken by all map tasks=116662272
Map-Reduce Framework
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
14/11/14 09:01:35 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, : hadoop streaming failed with error code 1
For completeness, here is the log for the first of the four failed map tasks for the example program:
Log Type: syslog
Log Length: 3526
2014-11-14 09:01:15,374 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-11-14 09:01:15,409 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: Sink ganglia started
2014-11-14 09:01:15,489 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-11-14 09:01:15,489 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2014-11-14 09:01:15,502 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2014-11-14 09:01:15,503 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1415823260172_0004, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@7ad66ecc)
2014-11-14 09:01:15,595 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2014-11-14 09:01:16,025 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /grid/00/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/01/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/02/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/03/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/04/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/05/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/06/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/07/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/08/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004,/grid/09/hadoop/yarn/local/usercache/vcn_osd/appcache/application_1415823260172_0004
2014-11-14 09:01:16,587 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2014-11-14 09:01:17,061 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2014-11-14 09:01:17,346 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://hnn.bkw-hdp.ch:8020/tmp/file53c624829a08:273+274
2014-11-14 09:01:17,397 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2014-11-14 09:01:17,398 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.deflate]
2014-11-14 09:01:17,409 INFO [main] org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2014-11-14 09:01:17,416 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2014-11-14 09:01:17,619 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:963)
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:419)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Reverting to rmr2 version 3.1.0 (rmr2_3.1.0.tar.gz), found here: https://github.com/RevolutionAnalytics/rmr2/tree/master/build, produces the same error, as does version 3.0.0 (rmr2_3.0.0.tar.gz).
Installing the rmr2 package yields a number of suspicious messages:
> install.packages("rmr2_3.2.0.tar.gz", repos=NULL, source=TRUE)
Installing package into ‘/usr/lib64/R/library’
(as ‘lib’ is unspecified)
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:47:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:00 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
* installing *source* package ‘rmr2’ ...
** libs
g++ -m64 -I/usr/include/R -DNDEBUG -I/usr/local/include `/usr/lib64/R/bin/Rscript -e "Rcpp:::CxxFlags()"` -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c extras.cpp -o extras.o
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:48:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:04 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
extras.cpp: In function ‘SEXPREC* vsum(SEXPREC*)’:
extras.cpp:22: warning: comparison between signed and unsigned integer expressions
g++ -m64 -I/usr/include/R -DNDEBUG -I/usr/local/include `/usr/lib64/R/bin/Rscript -e "Rcpp:::CxxFlags()"` -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c hbase-to-df.cpp -o hbase-to-df.o
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:48:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:11 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
hbase-to-df.cpp: In function ‘SEXPREC* raw_list_to_character(SEXPREC*)’:
hbase-to-df.cpp:27: warning: comparison between signed and unsigned integer expressions
hbase-to-df.cpp: In function ‘SEXPREC* hbase_to_df(SEXPREC*, SEXPREC*)’:
hbase-to-df.cpp:56: warning: comparison between signed and unsigned integer expressions
hbase-to-df.cpp:60: warning: comparison between signed and unsigned integer expressions
hbase-to-df.cpp:64: warning: comparison between signed and unsigned integer expressions
g++ -m64 -I/usr/include/R -DNDEBUG -I/usr/local/include `/usr/lib64/R/bin/Rscript -e "Rcpp:::CxxFlags()"` -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c keyval.cpp -o keyval.o
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:48:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:19 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
keyval.cpp: In function ‘SEXPREC* sapply_rmr_length(SEXPREC*)’:
keyval.cpp:57: warning: comparison between signed and unsigned integer expressions
keyval.cpp: In function ‘SEXPREC* sapply_rmr_length_lossy_data_frame(SEXPREC*)’:
keyval.cpp:64: warning: comparison between signed and unsigned integer expressions
keyval.cpp: In function ‘SEXPREC* sapply_length_keyval(SEXPREC*)’:
keyval.cpp:79: warning: comparison between signed and unsigned integer expressions
keyval.cpp: In function ‘SEXPREC* sapply_null_keys(SEXPREC*)’:
keyval.cpp:86: warning: comparison between signed and unsigned integer expressions
keyval.cpp: In function ‘SEXPREC* sapply_is_list(SEXPREC*)’:
keyval.cpp:94: warning: comparison between signed and unsigned integer expressions
keyval.cpp: In function ‘SEXPREC* lapply_key_val(SEXPREC*, std::string)’:
keyval.cpp:101: warning: comparison between signed and unsigned integer expressions
keyval.cpp: In function ‘SEXPREC* are_factor(SEXPREC*)’:
keyval.cpp:115: warning: comparison between signed and unsigned integer expressions
keyval.cpp: In function ‘SEXPREC* are_data_frame(SEXPREC*)’:
keyval.cpp:129: warning: comparison between signed and unsigned integer expressions
keyval.cpp: In function ‘SEXPREC* are_matrix(SEXPREC*)’:
keyval.cpp:136: warning: comparison between signed and unsigned integer expressions
g++ -m64 -I/usr/include/R -DNDEBUG -I/usr/local/include `/usr/lib64/R/bin/Rscript -e "Rcpp:::CxxFlags()"` -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c t-list.cpp -o t-list.o
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:48:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:26 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
t-list.cpp: In function ‘SEXPREC* t_list(SEXPREC*)’:
t-list.cpp:27: warning: comparison between signed and unsigned integer expressions
t-list.cpp:29: warning: comparison between signed and unsigned integer expressions
t-list.cpp:31: warning: comparison between signed and unsigned integer expressions
g++ -m64 -I/usr/include/R -DNDEBUG -I/usr/local/include `/usr/lib64/R/bin/Rscript -e "Rcpp:::CxxFlags()"` -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c typed-bytes.cpp -o typed-bytes.o
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:48:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:33 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
typed-bytes.cpp: In function ‘T unserialize_numeric(const raw&, unsigned int&) [with T = double]’:
typed-bytes.cpp:136: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘std::vector<T, std::allocator<_Tp1> > unserialize_vector(const raw&, unsigned int&, int) [with T = std::basic_string<char, std::char_traits<char>, std::allocator<char> >]’:
typed-bytes.cpp:188: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘Rcpp::List unserialize_list(const raw&, unsigned int&)’:
typed-bytes.cpp:201: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘Rcpp::List unserialize_map(const raw&, unsigned int&)’:
typed-bytes.cpp:217: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘Rcpp::RObject unserialize(const raw&, unsigned int&, int)’:
typed-bytes.cpp:286: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘void serialize_noattr(const Rcpp::RObject&, raw&, bool)’:
typed-bytes.cpp:478: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp:482: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp:509: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp:515: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘std::vector<T, std::allocator<_Tp1> > unserialize_vector(const raw&, unsigned int&, int) [with T = char]’:
typed-bytes.cpp:191: instantiated from here
typed-bytes.cpp:180: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘std::vector<T, std::allocator<_Tp1> > unserialize_vector(const raw&, unsigned int&, int) [with T = unsigned char]’:
typed-bytes.cpp:240: instantiated from here
typed-bytes.cpp:180: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘std::vector<T, std::allocator<_Tp1> > unserialize_vector(const raw&, unsigned int&, int) [with T = bool]’:
typed-bytes.cpp:300: instantiated from here
typed-bytes.cpp:180: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘std::vector<T, std::allocator<_Tp1> > unserialize_vector(const raw&, unsigned int&, int) [with T = int]’:
typed-bytes.cpp:303: instantiated from here
typed-bytes.cpp:180: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘std::vector<T, std::allocator<_Tp1> > unserialize_vector(const raw&, unsigned int&, int) [with T = long int]’:
typed-bytes.cpp:321: instantiated from here
typed-bytes.cpp:180: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘std::vector<T, std::allocator<_Tp1> > unserialize_vector(const raw&, unsigned int&, int) [with T = float]’:
typed-bytes.cpp:324: instantiated from here
typed-bytes.cpp:180: warning: comparison between signed and unsigned integer expressions
typed-bytes.cpp: In function ‘std::vector<T, std::allocator<_Tp1> > unserialize_vector(const raw&, unsigned int&, int) [with T = double]’:
typed-bytes.cpp:327: instantiated from here
typed-bytes.cpp:180: warning: comparison between signed and unsigned integer expressions
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:48:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:44 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
g++ -m64 -shared -L/usr/local/lib64 -o rmr2.so extras.o hbase-to-df.o keyval.o t-list.o typed-bytes.o -L/usr/lib64/R/lib -lR
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:48:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:48 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
((which hbase && (mkdir -p ../inst; cd hbase-io; sh build_linux.sh; cp build/dist/* ../../inst)) || echo "can't build hbase IO classes, skipping" >&2)
/usr/bin/hbase
build_linux.sh: line 163: [: missing `]'
Using /usr/lib/hadoop-mapreduce as hadoop home
Using /usr/lib/hbase as hbase home
Copying libs into local build directory
Cannot find hbase jars in hbase home
cp: cannot stat `build/dist/*': No such file or directory
can't build hbase IO classes, skipping
installing to /usr/lib64/R/library/rmr2/libs
** R
** preparing package for lazy loading
Warning in library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called ‘quickcheck’
** help
*** installing help indices
converting help for package ‘rmr2’
finding HTML links ... done
bigdataobject html
dfs.empty html
equijoin html
fromdfstodfs html
keyval html
make.io.format html
mapreduce html
rmr-package html
rmr.options html
rmr.sample html
rmr.str html
scatter html
status html
tomaptoreduce html
vsum html
** building package indices
** testing if installed package can be loaded
HADOOP_CMD=/usr/bin/hadoop
Be sure to run hdfs.init()
14/11/14 11:48:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/14 11:48:53 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
* DONE (rmr2)
Making 'packages.html' ... done
>
Solved (or at least worked around). Based on https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/V6Td-XRQC_8, I adjusted the mapreduce.task.io.sort.mb value down from 1024 to 64, and this allowed the small example program to work correctly.
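For reference, here is a minimal sketch (an assumption, not verified beyond the run above) of how that workaround can be passed through rmr2's backend.parameters, using the same mechanism shown earlier in this thread. A plausible explanation for why it helps, consistent with the OutOfMemoryError thrown in MapTask$MapOutputBuffer.init above: the map-side sort buffer of mapreduce.task.io.sort.mb megabytes is allocated inside the map task JVM heap, so a 1024 MB buffer cannot fit in the 800 MB heap set by -Xmx800M.

library (rmr2)

# Keep the sort buffer well below the map task heap (-Xmx); the 64 MB value
# is the one that made the small example work here.
rmr.options (
    backend.parameters = list (
        hadoop = list (
            D = "mapreduce.map.java.opts=-Xmx800M",
            D = "mapreduce.reduce.java.opts=-Xmx800M",
            D = "mapreduce.task.io.sort.mb=64"
        )
    )
)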
Thanks for researching this and reporting back. Since you reproduced the problem against three different versions of rmr2, I am less inclined to think it's a problem with the way rmr2 sets some Hadoop properties. It seems that your mapreduce.task.io.sort.mb was set to 10X the default, and that may not be compatible with other settings. Hadoop MR has many settings that require an understanding of the internals and can interact with each other, which is not a good thing; there are even research projects about automatic configuration.
I tested on Hortonworks HDP 2.3. I can only run rmr2 version 3.1.0; with the later versions (3.2.0, 3.3.0 and 3.3.1) I get a Java heap space error, but only when a reducer is used. My mapreduce.task.io.sort.mb = 64.
System settings:
Sys.setenv(HADOOP_HOME="/usr/hdp/current/hadoop-client")                                    # Hadoop home path
Sys.setenv(HADOOP_CMD="/usr/hdp/2.3.2.0-2950/hadoop/bin/hadoop")                            # hadoop command path
Sys.setenv(HADOOP_STREAMING="/usr/hdp/2.3.2.0-2950/hadoop-mapreduce/hadoop-streaming.jar")  # streaming jar path
Sys.setenv(HADOOP_HEAPSIZE=2900)
The following are my rmr2 settings:
rmr.options.env = new.env(parent=emptyenv())
rmr.options.env$backend = "hadoop"
rmr.options.env$profile.nodes = "off"
rmr.options.env$hdfs.tempdir = "/tmp" # can't check it exists here
rmr.options.env$exclude.objects = NULL
rmr.options.env$backend.parameters =
list(
hadoop =
list(cmdenv="PATH=/usr/local/lib64/R/bin/",
D = "mapreduce.map.java.opts=-Xmx1024M",
D = "mapreduce.reduce.java.opts=-Xmx2048M",
D = "mapred.tasktracker.map.tasks.maximum",
D = "mapred.tasktracker.reduce.tasks.maximum",
D = "mapreduce.map.memory.mb = 5120",
D = "mapreduce.reduce.memory.mb = 5120",
D = "mapreduce.task.io.sort.mb =64",
D = "yarn.scheduler.minimum-allocation.mb = 1000",
D = "yarn.scheduler.maximum-allocation.mb = 2000"))
@derrickoswald: were you able to run rmr2 3.2.0 or later without the Java heap space error? I tried all the versions after 3.1.0, but my job fails at the reducer.
Thanks Shashi
rmr2 v3.3.1 works on my HDP 2.3 sandbox without any problem, both with the configuration you defined above and with my (much shorter) setup, which includes:
Sys.setenv("HADOOP_CMD"="/usr/bin/hadoop") Sys.setenv("HADOOP_STREAMING"="/usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-streaming-2.7.1.2.3.0.0-2557.jar")
rmr.options(backend.parameters = list( hadoop = list(D = "mapreduce.map.memory.mb=1024") ))
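If it helps, here is a quick smoke test that exercises both the map and the reduce side (a sketch mirroring the small even/odd example earlier in this thread, assuming rmr2 is loaded and HDFS is reachable):

library (rmr2)

# Count how many of the integers 1:100 are even and how many are odd.
ints = to.dfs (1:100)
out = mapreduce (
    input  = ints,
    map    = function (k, v) keyval (ifelse (v %% 2 == 0, "even", "odd"), v),
    reduce = function (k, vv) keyval (k, length (vv))
)
print (from.dfs (out))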
It might be helpful to post your code and/or the error messages, because the answers to the following questions matter:
There may be more questions to ask, but that's a good start.
Here is the code and the error.
If I use only the map function it works perfectly; it fails when I use a reducer, or map and reduce together.
rbingroups = rbinom(30, n = 50, prob = 0.5)
groups <- tapply(rbingroups, rbingroups, length)
groups = to.dfs(rbingroups)
WARNING: Use "yarn jar" to launch YARN applications.
15/11/25 13:32:40 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
15/11/25 13:32:40 INFO compress.CodecPool: Got brand-new compressor [.deflate]
rmr.options.env = new.env(parent=emptyenv())
rmr.options.env$backend = "hadoop"
rmr.options.env$profile.nodes = "off"
rmr.options.env$hdfs.tempdir = "/tmp" # can't check it exists here
rmr.options.env$exclude.objects = NULL
rmr.options.env$backend.parameters =
  list(
    hadoop =
      list(cmdenv="PATH=/usr/local/lib64/R/bin/",
           D = "mapreduce.map.java.opts=-Xmx1024M",
           D = "mapreduce.reduce.java.opts=-Xmx2048M",
           D = "mapred.job.queue.name=other",
           D = "mapred.tasktracker.map.tasks.maximum",
           D = "mapred.tasktracker.reduce.tasks.maximum",
           D = "mapreduce.map.memory.mb = 5120",
           D = "mapreduce.reduce.memory.mb = 5120",
           D = "mapreduce.task.io.sort.mb =64",
           D = "yarn.scheduler.minimum-allocation.mb = 1000",
           D = "yarn.scheduler.maximum-allocation.mb = 2000"))
rbingroups = rbinom(30, n = 50, prob = 0.5)
groups <- tapply(rbingroups, rbingroups, length)
groups = to.dfs(rbingroups)
WARNING: Use "yarn jar" to launch YARN applications.
15/11/25 13:36:59 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
15/11/25 13:36:59 INFO compress.CodecPool: Got brand-new compressor [.deflate]
out = mapreduce(input = groups, map = function(k,v) keyval(v, 1), reduce = function(k,vv) keyval(k, length(vv)))
WARNING: Use "yarn jar" to launch YARN applications.
packageJobJar: [] [/usr/hdp/2.3.2.0-2950/hadoop-mapreduce/hadoop-streaming-2.7.1.2.3.2.0-2950.jar] /tmp/streamjob6715154162749713919.jar tmpDir=null
15/11/25 13:37:18 INFO impl.TimelineClientImpl: Timeline service address: http://znlhacdq0003.amer.zurich.corp:8188/ws/v1/timeline/
15/11/25 13:37:18 INFO impl.TimelineClientImpl: Timeline service address: http://znlhacdq0003.amer.zurich.corp:8188/ws/v1/timeline/
15/11/25 13:37:18 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 74288 for uszllf7 on ha-hdfs:ZHDPDEV
15/11/25 13:37:18 INFO security.TokenCache: Got dt for hdfs://ZHDPDEV; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ZHDPDEV, Ident: (HDFS_DELEGATION_TOKEN token 74288 for uszllf7)
15/11/25 13:37:20 INFO mapred.FileInputFormat: Total input paths to process : 1
15/11/25 13:37:20 INFO mapreduce.JobSubmitter: number of splits:2
15/11/25 13:37:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1447621077216_4452
15/11/25 13:37:20 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ZHDPDEV, Ident: (HDFS_DELEGATION_TOKEN token 74288 for uszllf7)
15/11/25 13:37:21 INFO impl.YarnClientImpl: Submitted application application_1447621077216_4452
15/11/25 13:37:21 INFO mapreduce.Job: The url to track the job: http://znlhacdq0001.amer.zurich.corp:8088/proxy/application_1447621077216_4452/
15/11/25 13:37:21 INFO mapreduce.Job: Running job: job_1447621077216_4452
15/11/25 13:37:33 INFO mapreduce.Job: Job job_1447621077216_4452 running in uber mode : false
15/11/25 13:37:33 INFO mapreduce.Job: map 0% reduce 0%
15/11/25 13:37:43 INFO mapreduce.Job: Task Id : attempt_1447621077216_4452_m_000000_0, Status : FAILED
Error: Java heap space
15/11/25 13:37:43 INFO mapreduce.Job: Task Id : attempt_1447621077216_4452_m_000001_0, Status : FAILED
Error: Java heap space
15/11/25 13:37:52 INFO mapreduce.Job: Task Id : attempt_1447621077216_4452_m_000000_1, Status : FAILED
Error: Java heap space
15/11/25 13:37:53 INFO mapreduce.Job: Task Id : attempt_1447621077216_4452_m_000001_1, Status : FAILED
Error: Java heap space
15/11/25 13:37:57 INFO mapreduce.Job: Task Id : attempt_1447621077216_4452_m_000000_2, Status : FAILED
Error: Java heap space
15/11/25 13:37:58 INFO mapreduce.Job: Task Id : attempt_1447621077216_4452_m_000001_2, Status : FAILED
Error: Java heap space
15/11/25 13:38:03 INFO mapreduce.Job: map 100% reduce 100%
15/11/25 13:38:03 INFO mapreduce.Job: Job job_1447621077216_4452 failed with state FAILED due to: Task failed task_1447621077216_4452_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
15/11/25 13:38:03 INFO mapreduce.Job: Counters: 13
Job Counters
Failed map tasks=7
Killed map tasks=1
Launched map tasks=8
Other local map tasks=6
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=44129
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=44129
Total vcore-seconds taken by all map tasks=44129
Total megabyte-seconds taken by all map tasks=451880960
Map-Reduce Framework
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
15/11/25 13:38:03 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, : hadoop streaming failed with error code 1
15/11/25 13:38:12 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 3600 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://ZHDPDEV/tmp/file113f63ecb6cc' to trash at: hdfs://ZHDPDEV/user/uszllf7/.Trash/Current
15/11/25 13:38:19 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 3600 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://ZHDPDEV/tmp/file113f7614dedf' to trash at: hdfs://ZHDPDEV/user/uszllf7/.Trash/Current
Try with my setup (detailed earlier in this thread), and if that doesn't work, try re-installing. It's a mystery...
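If you do re-install, the same source-install command used earlier in this thread applies; the tarball name below (3.3.1) is an assumption, substitute whichever rmr2 build you downloaded:

# Install rmr2 from a source tarball downloaded from the RevolutionAnalytics build directory.
install.packages("rmr2_3.3.1.tar.gz", repos = NULL, type = "source")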
Originally reported here