Intel-bigdata / HiBench

HiBench is a big data benchmark suite.

README typo #676

Open xwu99 opened 3 years ago

xwu99 commented 3 years ago

Bayesian Classification (Bayes)

Naive Bayes is a simple multiclass classification algorithm with the assumption of independence between every pair of features. This workload is implemented in spark.mllib and uses the automatically generated documents whose words follow the zipfian distribution. The dict used for text generation is also from the default linux file /usr/share/dict/linux.words.ords.

Typo in the last sentence: /usr/share/dict/linux.words.ords => /usr/share/dict/linux.words
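For context, here is a minimal sketch of what "documents whose words follow the zipfian distribution" could look like in practice. It is not HiBench's actual data generator. The helper names (`load_words`, `zipf_cum_weights`, `generate_document`), the fallback word list, the 1000-word cap, and the exponent of 1.0 are all assumptions made for illustration. Only the dictionary path comes from the README.

```python
import random
from itertools import accumulate
from pathlib import Path

# Dictionary path as corrected in this issue.
DICT_PATH = Path("/usr/share/dict/linux.words")
# Hypothetical fallback so the sketch still runs where the file is absent.
FALLBACK_WORDS = ["spark", "hadoop", "bayes", "bench", "data", "node", "task", "word"]


def load_words(path=DICT_PATH, limit=1000):
    """Read up to `limit` words from the dictionary, or use the fallback list."""
    if path.is_file():
        with path.open(encoding="utf-8", errors="ignore") as fh:
            words = [line.strip() for line in fh if line.strip()]
        if words:
            return words[:limit]
    return list(FALLBACK_WORDS)


def zipf_cum_weights(n, exponent=1.0):
    """Cumulative Zipf weights: the word at rank r gets weight 1 / r**exponent."""
    return list(accumulate(1.0 / (rank ** exponent) for rank in range(1, n + 1)))


def generate_document(words, length, rng, exponent=1.0):
    """Sample `length` words; earlier entries in `words` are drawn more often."""
    cum = zipf_cum_weights(len(words), exponent)
    return rng.choices(words, cum_weights=cum, k=length)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs give the same text
    print(" ".join(generate_document(load_words(), 20, rng)))
```

With the wrong path from the README (`linux.words.ords`), a loader like this would quietly fall back to the tiny built-in list, which is one reason the path is worth fixing.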