google / cadvisor

Analyzes resource usage and performance characteristics of running containers.

Statsd/Graphite Support #724

Closed rosskukulinski closed 7 years ago

rosskukulinski commented 9 years ago

After @rjnagal pinged me on twitter, I figured I should open an issue here.

We're looking for a statsd (or pure graphite) metrics output like you've done with influxdb. While we love the direction influxdb is headed, we couldn't get it to scale as well as we have with statsd & graphite.

We would take a crack at implementing this, but we're not Go people, and are a bit pre-occupied shipping features to our users these days.

I don't know if this is something that's on your roadmap or not -- if it is, we'd happily help debug/test this feature.

rjnagal commented 9 years ago

Thanks for filing the issue, @rosskukulinski

Do you collect any container data in statsd today? What's the preferred metric format for statsd?

cadvisor.. (we can remove cadvisor from the name if that's not useful).

We don't know much about statsd. Are there any restrictions on character sets that can go in the metric name? Is "cadvisor./.cpu_usage" a valid name?

rosskukulinski commented 9 years ago

We aren't sending any container data to statsd currently. We briefly played with some collectd plugins that supposedly get cgroup data and pipe it to graphite directly -- but were not successful in a quick test.

Some background on statsd:

Statsd is actually a pretty simple Node.js metric aggregation daemon. It listens for UDP or TCP messages of the format my.metric:1|c, which would increment a counter called 'my.metric' by 1. Other types exist as well. I think most of the data coming out of cAdvisor would correspond to the 'gauge' metric type.
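
As a rough illustration of that wire format, here is a minimal Python sketch that builds counter and gauge lines and sends one over UDP. The function names are made up for this example, and the default host and port (127.0.0.1:8125) are assumptions about a locally running statsd, not something cAdvisor defines.

```python
import socket


def format_metric(name: str, value, metric_type: str = "g") -> bytes:
    """Build one statsd line of the form <name>:<value>|<type>.

    "c" is a counter and "g" is a gauge, the type most cAdvisor
    data would probably map to.
    """
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget a single statsd line over UDP (host/port are assumed)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# The counter example from the text, and a gauge with an illustrative name.
counter_line = format_metric("my.metric", 1, "c")
gauge_line = format_metric("container.memory_usage", 61747200)
```

Calling send_metric(gauge_line) would then push the gauge to whatever statsd instance listens on the assumed address.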

Statsd receives and aggregates metrics before sending them to a storage backend like graphite. Graphite has been around for a long time - it provides metric storage and graph rendering and supports clustering natively. Statsd does have a number of 3rd party backends as well.

I do not believe you can use slashes in the metric name as Whisper (graphite's underlying storage mechanism) uses the filesystem. Quick googling has confirmed this. We replace the slashes in all of our HTTP metrics with underscores before sending to statsd.
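
A minimal sketch of that kind of cleanup, in Python. The function name and the sample path are hypothetical; the only rule taken from the above is "replace slashes with underscores".

```python
def sanitize_metric_name(raw: str, replacement: str = "_") -> str:
    """Replace every slash so the name is safe for Whisper's file-based storage."""
    return raw.replace("/", replacement)


# Hypothetical HTTP route used as a metric name segment.
safe_name = sanitize_metric_name("/api/users/list")
```

A leading slash becomes a leading underscore here; stripping it first is an equally reasonable choice if shorter names are preferred.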

jmaitrehenry commented 9 years ago

Hi, I just finished a statsd storage driver for cadvisor. If you'd like to try it and send me feedback on how I did it, you can test it by running my Docker image jmaitrehenry/cadvisor or by checking out PR #798.

This is how I run it:

docker run \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --publish=8080:8080 \
  --detach=true \
  --name=cadvisor \
  jmaitrehenry/cadvisor \
  -storage_driver=statsd \
  -storage_driver_host=192.168.59.3:8125 \
  -storage_driver_db=docker_node_001

storage_driver_db is used as a prefix for all stats; for example, I use it to prefix stats with the Docker node hostname.

This is a sample of the gauges received by statsd:

  gauges: 
   { [...]
     'docker_001.lonely_yonath.memory_working_set': 47902720,
     'docker_001.lonely_yonath.rx_bytes': 8915403,
     'docker_001.lonely_yonath.rx_errors': 0,
     'docker_001.lonely_yonath.tx_bytes': 1473345,
     'docker_001.lonely_yonath.tx_errors': 0,
     'docker_001.lonely_yonath.cpu_cumulative_usage': 15310631766,
     'docker_001.lonely_yonath.memory_usage': 61747200,
     'docker_001.lonely_yonath.-dev-sda1.fs_limit': 19507089408,
     'docker_001.lonely_yonath.-dev-sda1.fs_usage': 12288,
     'docker_001.lonely_yonath.fs_summary.fs_limit': 19507089408,
     'docker_001.lonely_yonath.fs_summary.fs_usage': 12288,
     'docker_001.cadvisor.tx_bytes': 7236437,
     'docker_001.cadvisor.tx_errors': 0,
     'docker_001.cadvisor.cpu_cumulative_usage': 106884766265,
     'docker_001.cadvisor.memory_usage': 26087424,
     'docker_001.cadvisor.memory_working_set': 26087424,
     'docker_001.cadvisor.rx_bytes': 5218,
     'docker_001.cadvisor.rx_errors': 0,
     'docker_001.cadvisor.-dev-sda1.fs_limit': 19507089408,
     'docker_001.cadvisor.-dev-sda1.fs_usage': 28672,
     'docker_001.cadvisor.fs_summary.fs_limit': 19507089408,
     'docker_001.cadvisor.fs_summary.fs_usage': 28672,
     [...]
 },
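
Reading the sample above, each key appears to follow prefix.container.metric, with an extra device segment for the filesystem stats. The Python sketch below splits a key on that assumption. The GaugeKey type and split_gauge_key function are made up for illustration and are not part of the PR, and the split would break if the prefix itself contained dots.

```python
from typing import NamedTuple, Optional


class GaugeKey(NamedTuple):
    prefix: str
    container: str
    device: Optional[str]  # only present for filesystem stats
    metric: str


def split_gauge_key(key: str) -> GaugeKey:
    """Split a dotted gauge key into its parts (assumes 3 or 4 segments)."""
    parts = key.split(".")
    if len(parts) == 3:
        prefix, container, metric = parts
        return GaugeKey(prefix, container, None, metric)
    if len(parts) == 4:
        prefix, container, device, metric = parts
        return GaugeKey(prefix, container, device, metric)
    raise ValueError(f"unexpected gauge key shape: {key!r}")


# Two keys taken from the sample output.
network_key = split_gauge_key("docker_001.cadvisor.rx_bytes")
filesystem_key = split_gauge_key("docker_001.cadvisor.-dev-sda1.fs_usage")
```
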
rosskukulinski commented 9 years ago

Amazing! I'll try to find some time and take this for a spin tomorrow

schneidexe commented 9 years ago

Works great! The metric paths are still a bit messy when forwarding them from statsd to Graphite with Whisper storage, but definitely useful. It might be worth aligning with #474...

jmaitrehenry commented 9 years ago

@schneidexe Something like prefix.container_id.container_name.metric? I can update my PR to add the container_id to the path.

If you want more metrics, I can add them in a new PR once this one is merged without problems :)

rjnagal commented 9 years ago

Statsd support is in.

Remaining items:

stevezau commented 8 years ago

Any updates on this? I'd like to forward via Graphite. It sounds like it might be possible now, but I can't find any docs on it.

dashpole commented 7 years ago

closing via #739. For questions about future storage, see #1458.