Open DonFlat opened 2 years ago
Is it possible to kill a node while a job is running? Is it possible to limit the memory for each node?
Find out how to run pi calculation against Spark and Hadoop.
Benchmark for Pi: how long does it take to reach a decimal digit?
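To get a feel for the "time per decimal digit" question before involving the cluster, here is a minimal local sketch. It assumes plain Monte Carlo sampling in a single Python process, so it only imitates the idea behind the Spark and Hadoop pi examples and is not either of those jobs. The function names `estimate_pi` and `correct_decimals` are made up for this illustration.

```python
import math
import random
import time


def estimate_pi(num_samples, seed=0):
    """Monte Carlo estimate: 4 * fraction of random points inside the unit quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


def correct_decimals(estimate):
    """Count how many leading decimal digits of `estimate` agree with math.pi."""
    est_int, est_frac = f"{estimate:.15f}".split(".")
    ref_int, ref_frac = f"{math.pi:.15f}".split(".")
    if est_int != ref_int:
        return 0
    count = 0
    for a, b in zip(est_frac, ref_frac):
        if a != b:
            break
        count += 1
    return count


if __name__ == "__main__":
    # Time each sample size and report how many decimals came out right.
    for n in (10**3, 10**4, 10**5, 10**6):
        start = time.perf_counter()
        pi_hat = estimate_pi(n)
        elapsed = time.perf_counter() - start
        print(f"{n:>8} samples: pi ~ {pi_hat:.6f}, "
              f"{correct_decimals(pi_hat)} correct decimals, {elapsed:.3f}s")
```

The Monte Carlo error shrinks roughly as 1/sqrt(n), so each additional reliable digit needs about 100 times more samples. The same scaling should be expected from a sampling-based cluster benchmark.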
Examples of running Hadoop example applications
Hadoop cluster set up
Hadoop MapReduce example
Hadoop configuration files manual
It seems that we only have word count as an example for both Spark and Hadoop?
Spark cluster mode:
Two files need to be updated with the latest node names:
modify worker
Default master node: node102, worker: node103
hdfs-site.xml contains the replication factor (the dfs.replication property)
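To check which replication factor a given hdfs-site.xml sets, a small sketch like the following could be used. The inline `SAMPLE_HDFS_SITE` content and its value of 1 are illustrative assumptions, not our actual cluster config, and `get_property` is a hypothetical helper. Hadoop falls back to a replication factor of 3 when dfs.replication is not set.

```python
import xml.etree.ElementTree as ET

# Illustrative file content only; the real file lives in the Hadoop config directory.
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
"""


def get_property(xml_text, name, default=None):
    """Return the <value> of the Hadoop <property> called `name`, or `default` if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return default


if __name__ == "__main__":
    # "3" is Hadoop's default replication factor when the property is missing.
    print("dfs.replication =", get_property(SAMPLE_HDFS_SITE, "dfs.replication", "3"))
```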
In HiBench, the following workloads have a Hadoop version:
ml/Kmeans, ml/bayes, websearch/pagerank, sql/aggregation, sql/join, sql/scan
micro/dfsioe, micro/sleep, micro/sort, micro/terasort, micro/wordcount
The hadoop submission command:
/var/scratch/ddps2206/hadoop-3.3.4/bin/hadoop \
  --config /var/scratch/ddps2206/hadoop-3.3.4/etc/hadoop \
  jar /var/scratch/ddps2206/HiBench/autogen/target/autogen-8.0-SNAPSHOT-jar-with-dependencies.jar \
  org.apache.mahout.clustering.kmeans.GenKMeansDataset \
  -D hadoop.job.history.user.location=hdfs://node108:9000/HiBench/Kmeans/Input/samples \
  -sampleDir hdfs://node108:9000/HiBench/Kmeans/Input/samples \
  -clusterDir hdfs://node108:9000/HiBench/Kmeans/Input/cluster \
  -numClusters 5 -numSamples 30000 -samplesPerFile 6000 -sampleDimension 3
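Since the node name changes between reservations, it may help to build this command from parameters instead of editing it by hand. The sketch below only assembles the argument list and prints it. The helper name `kmeans_datagen_cmd` and its keyword arguments are hypothetical, while the paths and flags are copied from the command above.

```python
import shlex

HADOOP_HOME = "/var/scratch/ddps2206/hadoop-3.3.4"
HIBENCH_HOME = "/var/scratch/ddps2206/HiBench"


def kmeans_datagen_cmd(namenode, num_clusters=5, num_samples=30000,
                       samples_per_file=6000, dimension=3):
    """Build the argv list for the K-means data generator; `namenode` is e.g. 'node108:9000'."""
    base = f"hdfs://{namenode}/HiBench/Kmeans/Input"
    return [
        f"{HADOOP_HOME}/bin/hadoop",
        "--config", f"{HADOOP_HOME}/etc/hadoop",
        "jar", f"{HIBENCH_HOME}/autogen/target/autogen-8.0-SNAPSHOT-jar-with-dependencies.jar",
        "org.apache.mahout.clustering.kmeans.GenKMeansDataset",
        "-D", f"hadoop.job.history.user.location={base}/samples",
        "-sampleDir", f"{base}/samples",
        "-clusterDir", f"{base}/cluster",
        "-numClusters", str(num_clusters),
        "-numSamples", str(num_samples),
        "-samplesPerFile", str(samples_per_file),
        "-sampleDimension", str(dimension),
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command line; pass the list to subprocess.run to execute it.
    print(shlex.join(kmeans_datagen_cmd("node108:9000")))
```

Keeping the command as a list avoids shell quoting issues if it is later handed to `subprocess.run`.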
/var/scratch/ddps2206/HiBench/conf/hibench.conf to adjust the size of the input:
tiny/ 1.4M
small/ 636M
large/ 4.0G
huge/ 18.5G
/var/scratch/ddps2206/HiBench/conf/spark.conf to adjust memory and cores; the default is 4G for both
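Switching the input size between runs could be scripted roughly as follows. This is a sketch that assumes HiBench's whitespace-separated `key value` conf format and the `hibench.scale.profile` key. The sample text is illustrative rather than a copy of our hibench.conf, and `set_conf_value` is a hypothetical helper.

```python
import re

# Illustrative excerpt; the real file is HiBench/conf/hibench.conf.
SAMPLE_CONF = """# Data scale profile, e.g. tiny, small, large, huge
hibench.scale.profile     tiny
hibench.default.map.parallelism   8
"""


def set_conf_value(conf_text, key, value):
    """Replace the value of `key` in 'key value' style conf text, appending the key if missing."""
    # Match the key at line start, followed by spaces/tabs, then the old value.
    pattern = re.compile(rf"^({re.escape(key)})([ \t]+)\S.*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{m.group(2)}{value}", conf_text
    )
    if count == 0:
        sep = "" if not conf_text or conf_text.endswith("\n") else "\n"
        new_text = conf_text + sep + f"{key}\t{value}\n"
    return new_text


if __name__ == "__main__":
    print(set_conf_value(SAMPLE_CONF, "hibench.scale.profile", "large"))
```

Comment lines are left untouched because they do not start with the key.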
hadoop fs -get /HiBench/Kmeans/Input/samples/*