jbenninghoff / cluster-validation

Scripts to validate that a cluster is ready for MapR Data Platform installation

Fixes overly-aggressive mapreduce parameters for teragen/terasort script #14

Closed dumoulma closed 7 years ago

dumoulma commented 8 years ago

Setting mapreduce.map.cpu.vcores and mapreduce.map.disk to zero caused runTeraGenSort.sh to fail on a 4-node NEC system (24-core Xeon, 192 GB RAM): nearly 200 mappers ran at once during teragen and flooded MFS so badly that it started throwing IOExceptions. One cluster eventually passed, but the other failed teragen with this issue.

I recommend commenting out these values and relying on the defaults. I have set both parameters to their default values so that they can be changed easily if need be.
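As a rough illustration of the "rely on the defaults unless overridden" idea, here is a minimal sketch. It is not the actual code in runTeraGenSort.sh: the variable names `MAP_VCORES` and `MAP_DISK` and the helper `build_mr_opts` are hypothetical, and the example values in the comments are placeholders rather than recommendations. The sketch only emits a `-D` option for a parameter that has been given a value, so an unset parameter falls back to whatever the cluster configuration provides.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: names below are illustrative, not taken from runTeraGenSort.sh.

# Leave a variable empty to fall back on the cluster's own default.
MAP_VCORES="${MAP_VCORES:-}"   # example override: MAP_VCORES=1
MAP_DISK="${MAP_DISK:-}"       # example override: MAP_DISK=0.5

# Build the -D options, emitting one only for each parameter that was set.
build_mr_opts() {
  local opts=()
  if [ -n "$MAP_VCORES" ]; then
    opts+=("-Dmapreduce.map.cpu.vcores=$MAP_VCORES")
  fi
  if [ -n "$MAP_DISK" ]; then
    opts+=("-Dmapreduce.map.disk=$MAP_DISK")
  fi
  echo "${opts[*]}"
}

# Dry run: show what would be passed to the teragen job instead of launching it.
echo "teragen options: $(build_mr_opts)"
```

With both variables empty the option string is empty and the job would run with the cluster defaults. Setting either variable in the environment before running the script overrides just that parameter.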