futuresystems / big-data-stack

Hadoop-based Big Data stack (hdfs, yarn, spark, etc)
Apache License 2.0

Provide better monitoring of job progression #59

Open badmutex opened 8 years ago

badmutex commented 8 years ago

Chriss Gessner's comments from emails:

Personally, I'd like to see more documentation on monitoring the Big Data stack, i.e. YARN and Spark: map and reduce tasks, executors, jobs, and stages. This works on port 8088, not 4040 as the default would suggest. Ganglia itself isn't that useful for me; I'd like to see the number of map tasks created, running, finished, etc.
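As a starting point for such documentation, here is a rough sketch of how job progress could be polled from the YARN ResourceManager REST API on port 8088. It assumes the ResourceManager is reachable over plain HTTP without authentication, and that MapReduce jobs expose their counters through the Application Master REST API behind the ResourceManager proxy. The host name `rm.example.org` and the helper names are hypothetical and not part of this repository.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_apps(payload):
    """Reduce a /ws/v1/cluster/apps response to the fields useful for monitoring."""
    # YARN returns {"apps": null} when no application matches the filter.
    apps = (payload.get("apps") or {}).get("app") or []
    return [
        {
            "id": app["id"],
            "name": app.get("name"),
            "state": app.get("state"),
            "progress": app.get("progress"),
            "runningContainers": app.get("runningContainers"),
        }
        for app in apps
    ]


def summarize_map_tasks(payload):
    """Reduce a /ws/v1/mapreduce/jobs response to per-job map task counts."""
    jobs = (payload.get("jobs") or {}).get("job") or []
    return [
        {
            "id": job["id"],
            "mapsTotal": job.get("mapsTotal"),
            "mapsRunning": job.get("mapsRunning"),
            "mapsCompleted": job.get("mapsCompleted"),
        }
        for job in jobs
    ]


def poll(rm_host, port=8088):
    """Print running applications and, where available, their map task counts."""
    base = f"http://{rm_host}:{port}"  # assumption: no TLS, no Kerberos
    for app in summarize_apps(fetch_json(f"{base}/ws/v1/cluster/apps?states=RUNNING")):
        print(app)
        try:
            # Only MapReduce application masters serve this endpoint.
            jobs = fetch_json(f"{base}/proxy/{app['id']}/ws/v1/mapreduce/jobs")
        except (OSError, ValueError):
            continue
        for job in summarize_map_tasks(jobs):
            print("   ", job)
```

Calling `poll("rm.example.org")` from a machine that can reach the ResourceManager would print one line per running application. Spark applications do not serve the MapReduce endpoint, so they are skipped in the inner loop.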

Some documentation on optimizing map/reduce executor tasks would also be nice, e.g. how many executors to run per worker node versus executor cores, given the OpenStack flavors in use.
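To make the executors-versus-cores trade-off concrete, here is a back-of-the-envelope sketch. It is one common rule of thumb rather than a recommendation from this project: reserve one core and 1 GB per node for the OS and Hadoop daemons, divide the remaining cores by `executor_cores`, and size the heap so that heap plus Spark's default YARN memory overhead (the larger of 384 MB or 10% of the heap) fits in each container. The flavor values and the reserved amounts are assumptions to be replaced with the flavors actually in use.

```python
from dataclasses import dataclass


@dataclass
class Flavor:
    """An OpenStack flavor; the values used below are illustrative only."""
    name: str
    vcpus: int
    ram_mb: int


def plan_executors(flavor, executor_cores, reserved_cores=1, reserved_mb=1024):
    """Estimate executors per worker node and the heap size for each one.

    Returns None when the flavor is too small for even one executor.
    """
    usable_cores = flavor.vcpus - reserved_cores
    usable_mb = flavor.ram_mb - reserved_mb
    if executor_cores <= 0 or usable_cores < executor_cores or usable_mb <= 0:
        return None

    executors = usable_cores // executor_cores
    container_mb = usable_mb // executors

    # A container must hold heap + max(384 MB, 10% of heap).
    # Taking the smaller of the two candidate heaps satisfies both cases.
    heap_mb = min(container_mb - 384, container_mb * 10 // 11)
    if heap_mb <= 0:
        return None

    return {"executors": executors, "container_mb": container_mb, "heap_mb": heap_mb}


if __name__ == "__main__":
    flavor = Flavor("m1.large", vcpus=4, ram_mb=8192)  # hypothetical flavor
    for cores in (1, 3):
        print(cores, plan_executors(flavor, executor_cores=cores))
```

The resulting `heap_mb` would map to `--executor-memory` and the core count to `--executor-cores` in `spark-submit`. Whether many small executors or a few large ones performs better depends on the workload, so these numbers are only a starting point for measurement.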