nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0

Add support for Ganglia #106

Closed by dm-tran 8 years ago

dm-tran commented 8 years ago

Thanks for creating Flintrock and making it open source. It's really fast.

My team and I are moving from spark-ec2 to Flintrock. One thing that is missing is support for Ganglia, which we use to monitor Spark clusters.

A user should be able to specify the Ganglia version and enable or disable its installation, e.g.:

services:
  spark:
    version: 1.6.0
  hdfs:
    version: 2.7.1
  ganglia:
    version: x.y.z

[...]

launch:
  num-slaves: 1
  # install-hdfs: True
  # install-spark: False
  install-ganglia: True

nchammas commented 8 years ago

Hi @dm-tran and thank you for the kind words. They are appreciated.

I'm a bit hesitant to add support for services that are not strictly Spark-related. But since you are coming from spark-ec2, I totally understand why you want Ganglia in Flintrock too.

One thing that Flintrock offers over spark-ec2 is the ability to copy files and run commands against the cluster in parallel. This should make it relatively easy to piece together your own post-launch scripts for setting up other things you need, like Ganglia.

For example:

flintrock launch dm-tran
flintrock run-command dm-tran 'sudo yum install -y ganglia'

Or alternately:

flintrock launch dm-tran
flintrock copy-file dm-tran ganglia-setup.sh
flintrock run-command dm-tran 'chmod u+x ganglia-setup.sh'
flintrock run-command dm-tran './ganglia-setup.sh'
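The steps above could be wrapped in a small post-launch script. What follows is only a sketch: the function names `run` and `launch_with_setup` and the `DRY_RUN` variable are hypothetical, and the flintrock invocations are copied as-is from the commands in this comment, so they may need adjusting for your Flintrock version (check `flintrock copy-file --help`). It assumes the script name is a plain filename with no spaces.

```shell
#!/usr/bin/env bash
# Hypothetical post-launch wrapper; not part of Flintrock itself.
set -euo pipefail

# Log the command (space-joined, for display only), then execute it
# unless DRY_RUN=1 is set.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Launch a cluster, then copy a setup script to it and run the script.
# The flintrock calls mirror the ones shown in this thread.
launch_with_setup() {
  local cluster="$1" script="$2"
  run flintrock launch "$cluster"
  run flintrock copy-file "$cluster" "$script"
  run flintrock run-command "$cluster" "chmod u+x $script"
  run flintrock run-command "$cluster" "./$script"
}
```

After sourcing the file, `DRY_RUN=1 launch_with_setup dm-tran ganglia-setup.sh` prints the four commands without touching AWS, which is a cheap way to check the sequence before launching anything.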

Would an approach like that meet your needs?

Another thing you can do is build your own custom AMIs, based on the default Amazon Linux AMIs, with Ganglia pre-installed, and use those to launch your Flintrock clusters.

I am open to considering "promoting" certain services into first-class Flintrock features in the future, but I think the bar will be pretty high. If something proves to be extremely valuable in conjunction with Spark and is widely used by Flintrock users, I would consider it.

dm-tran commented 8 years ago

Thanks @nchammas for the answer. An approach using copy-file and run-command (and optionally custom AMIs) would indeed meet our needs. We'll try that.

I'm closing the issue.

dineshdharme commented 6 years ago

I would love to have Ganglia support as well, since it is so important for understanding what the cluster is doing and figuring out where a problem lies.