paz-sh / paz

An open-source, in-house service platform with a PaaS-like workflow, built on Docker, CoreOS, Etcd and Fleet. This repository houses the documentation and installation scripts.
http://paz.sh

Implement an out-of-the-box monitoring solution #26

Open lukebond opened 9 years ago

lukebond commented 9 years ago

Let's discuss what will become the out-of-the-box monitoring solution for Paz.

Currently we're using cAdvisor from the Kubernetes project. This is a good solution, but used in isolation it is limited: it doesn't store historical data or make it searchable.

Heapster is the evolution of this project and builds upon cAdvisor to provide what is effectively a cluster-aware, searchable cAdvisor. At first glance it appears to be a good solution.

I'm open to anything else; this is not my area of expertise.

Discuss?

rosskukulinski commented 9 years ago

@lukebond another option is to use collectd with the built-in cgroups monitoring and pump that data to statsd/graphite/influxdb. I'm playing with that now and should be able to give better feedback
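For anyone unfamiliar with what "pumping data to statsd" involves, here is a minimal sketch, not taken from any Paz code. It formats a gauge in the plain-text statsd line format (`name:value|g`) and sends it over UDP. The metric name `paz.web.cpu_percent`, the host and the port 8125 (the conventional statsd default) are assumptions for illustration only; collectd would normally do this through its own output plugins rather than hand-written code.

```python
import socket


def format_gauge(name, value):
    """Build a statsd gauge line, e.g. 'paz.web.cpu_percent:12.5|g'."""
    return f"{name}:{value}|g"


def send_gauge(name, value, host="127.0.0.1", port=8125):
    """Send one gauge to a statsd daemon over UDP (fire-and-forget).

    host and port are assumed values; 8125 is the usual statsd default.
    """
    payload = format_gauge(name, value).encode("ascii")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric name, for illustration only.
    send_gauge("paz.web.cpu_percent", 12.5)
```

Because the transport is UDP, nothing is acknowledged, so whatever collects the container stats still has to be written or configured separately.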

twilson63 commented 9 years ago

@lukebond it is early days, but I like statsd/influxdb/grafana a lot. influxdb was easy to install and easy to query.

rosskukulinski commented 9 years ago

@twilson63 we just switched back to graphite from influxdb due to frustrations with querying... but I do think graphite is dead, influxdb just needs to catch up.

sublimino commented 9 years ago

+1 for Heapster. As per the link from the readme, https://github.com/GoogleCloudPlatform/heapster/blob/master/docs/influxdb.md, it can target an Influx/Grafana setup.

lukebond commented 9 years ago

sounds like influxdb is agreed upon.

i like the way cAdvisor/heapster just automagically grabs data from Docker. not sure if statsd and collectd do that or if it will be more manual? if so my preference would go to heapster.

re graphite/grafana, is it an either/or thing, or both?

i'd like to settle on a decision here.

sublimino commented 9 years ago

Heapster is natively attuned to Docker's API; collectd requires a plugin or similar.

Running Heapster on CoreOS: Heapster communicates with the local fleet server to get cluster information. It expects cAdvisor to be running on all the nodes.

Guide here - it has images for cAdvisor, influx, heapster and grafana, which saves maintaining them in this project.

lukebond commented 9 years ago

for clarity, we're going with Heapster, InfluxDB and Grafana at this point