Open brahmareddybattula opened 9 years ago
The variable NUM_MAPS is configured for running the workload, not for preparing the input data.
Thanks for the reply. Then how should I configure it to run my scenario?
Sorry, my last comment seems to be wrong now that I have looked at the code.
@brahmareddybattula Can you paste the printed message, like:
Submit MapReduce Job: /home/lv/intel/cluster/hadoop/hadoop-2.5.0-cdh5.3.2/bin/hadoop --config /home/lv/intel/cluster/hadoop/hadoop-2.5.0-cdh5.3.2/etc/hadoop jar /home/lv/intel/cluster/hadoop/hadoop-2.5.0-cdh5.3.2/share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.5.0-cdh5.3.2.jar randomtextwriter -D mapreduce.randomtextwriter.totalbytes=32000000000 -D mapreduce.job.maps=12 -D mapreduce.job.reduces=6 -D mapreduce.output.fileoutputformat.compress=false hdfs://lv-dev:54310/HiBench/Wordcount/Input
or check your report/wordcount/prepare/conf/wordcount.conf for NUM_MAPS, and report/wordcount/prepare/conf/sparkbench/sparkbench.conf for hibench.default.map.parallelism?
The number of mappers should follow the configuration you defined, and report/wordcount/prepare/conf/wordcount.conf will tell you the value of NUM_MAPS and where it was defined. Take mine as an example:
# Source: /home/lv/intel/HiBench/conf/99-user_defined_properties.conf
HADOOP_HOME=/home/lv/intel/cluster/hadoop/hadoop-2.5.0-cdh5.3.2
HDFS_MASTER=hdfs://lv-dev:54310
NUM_MAPS=12
NUM_REDS=6
SPARK_HOME=/home/lv/intel/cluster/spark/spark-1.3.0-bin-hadoop2.4
SPARK_MASTER=yarn-client
YARN_EXECUTOR_CORES=4
YARN_NUM_EXECUTORS=4
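To see where a setting such as NUM_MAPS comes from, a small helper can print the setting together with the nearest preceding "# Source:" line. This is only a sketch: show_setting is a hypothetical helper, not part of HiBench, and it assumes the generated file keeps the layout of the excerpt above (a "# Source:" comment followed by KEY=VALUE lines).

```shell
# show_setting is a hypothetical helper, not part of HiBench.
# Usage: show_setting NAME FILE
# Prints each NAME=... line, preceded by the most recent "# Source:" line.
show_setting() {
    awk -v key="$1" '
        /^# Source:/ { src = $0 }                # remember the latest source file
        index($0, key "=") == 1 {                # line starts with NAME=
            if (src != "") print src
            print $0
        }
    ' "$2"
}
```

For example, `show_setting NUM_MAPS report/wordcount/prepare/conf/wordcount.conf` would list the value and the file that defined it.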
Let me describe my scenario first.
I want to run wordcount on 350 GB with 1400 mappers, so I configured NUM_MAPS=1400 and DataSize=350 GB (in bytes), with a 256 MB block size.
But the prepare job runs with only 70 maps, as I have 7 nodes in the cluster.
This is because the randomtextwriter job uses 10 maps per host by default: int numMapsPerHost = conf.getInt("mapreduce.randomtextwriter.mapsperhost", 10);
For now I changed the prepare command as follows and went ahead:
$HADOOP_EXECUTABLE jar $HADOOP_EXAMPLES_JAR randomtextwriter \
    $COMPRESS_OPT \
    -D mapreduce.randomtextwriter.bytespermap=268435456 \
    -D mapreduce.randomtextwriter.mapsperhost=200 \
    $INPUT_HDFS
Can we fix this?
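The two values passed to randomtextwriter above can be derived from the scenario rather than hard-coded. The following is a rough sketch, not HiBench code: TOTAL_BYTES and NUM_HOSTS are illustrative variable names, 350 GB is taken as 350 * 1024^3 bytes, and it assumes that when no total byte count is passed the job writes mapsperhost * bytespermap * (number of nodes) bytes.

```shell
# Inputs taken from the scenario described in this thread.
TOTAL_BYTES=$((350 * 1024 * 1024 * 1024))   # 350 GB of input data
NUM_MAPS=1400                               # desired number of mappers
NUM_HOSTS=7                                 # nodes in the cluster (illustrative name)

# Bytes written by each map: 350 GB / 1400 = 256 MB.
BYTES_PER_MAP=$((TOTAL_BYTES / NUM_MAPS))

# Maps per host, rounded up so that NUM_HOSTS * MAPS_PER_HOST >= NUM_MAPS.
MAPS_PER_HOST=$(( (NUM_MAPS + NUM_HOSTS - 1) / NUM_HOSTS ))

echo "bytespermap=$BYTES_PER_MAP mapsperhost=$MAPS_PER_HOST"
# prints: bytespermap=268435456 mapsperhost=200
```

With these numbers, 7 hosts * 200 maps * 268435456 bytes gives back the full 350 GB, which matches the values used in the modified command.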