intelsdi-x / snap-plugin-collector-mesos

Collects Apache Mesos cluster metrics
http://snap-telemetry.io/
Apache License 2.0

Use Mesos protobuf for building the metric types for an agent's "/monitor/statistics" endpoint #12

Closed ghost closed 8 years ago

ghost commented 8 years ago

This PR is an implementation of the idea I proposed in #11. It's the first of two parts required to collect metrics from live executors on a Mesos agent.

Although the commit messages should explain what's happening in a fair amount of detail, there are a lot of changes in this PR, so I want to break it down a bit:

ghost commented 8 years ago

Relevant tests are updated and passing, and I've tested this using the Vagrantfile included in this repo.

Namespace follows the format: /intel/mesos/agent/executor/<framework_id>/<executor_id>/...
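
As a rough illustration of that namespace layout, the sketch below joins the fixed prefix, the two IDs, and the metric path into one string. The helper name `executorNamespace` and the placeholder IDs are hypothetical and are not taken from the plugin's code.

```go
package main

import (
	"fmt"
	"strings"
)

// executorNamespace builds a metric namespace of the form
// /intel/mesos/agent/executor/<framework_id>/<executor_id>/<metric...>.
// The name and signature are hypothetical, not the plugin's actual API.
func executorNamespace(frameworkID, executorID string, metric ...string) string {
	parts := []string{"intel", "mesos", "agent", "executor", frameworkID, executorID}
	parts = append(parts, metric...)
	return "/" + strings.Join(parts, "/")
}

func main() {
	// The IDs below are made-up placeholders.
	fmt.Println(executorNamespace("framework-0000", "executor-1", "statistics", "cpus_limit"))
	// "*" in the ID positions gives the wildcard form shown by `snapctl metric list`.
	fmt.Println(executorNamespace("*", "*", "statistics", "perf", "branches"))
}
```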

Load the plugin:

$ snapctl plugin load build/rootfs/snap-plugin-collector-mesos 
Plugin loaded
Name: mesos
Version: 1
Type: collector
Signed: false
Loaded Time: Mon, 09 May 2016 00:47:44 UTC

Get a count of the available agent metrics:

$ snapctl metric list | grep agent | wc -l
142

Get a sample of the available executor metrics:

$ snapctl metric list | grep executor | head -n 5
/intel/mesos/agent/executor/*/*/statistics/cpus_limit                 1
/intel/mesos/agent/executor/*/*/statistics/cpus_nr_periods            1
/intel/mesos/agent/executor/*/*/statistics/cpus_nr_throttled          1
/intel/mesos/agent/executor/*/*/statistics/cpus_system_time_secs      1
/intel/mesos/agent/executor/*/*/statistics/cpus_throttled_time_secs   1

Get a sample of the available executor perf metrics:

$ snapctl metric list | grep perf | head -n 5
/intel/mesos/agent/executor/*/*/statistics/perf/alignment_faults     1
/intel/mesos/agent/executor/*/*/statistics/perf/branch_load_misses   1
/intel/mesos/agent/executor/*/*/statistics/perf/branch_loads         1
/intel/mesos/agent/executor/*/*/statistics/perf/branch_misses        1
/intel/mesos/agent/executor/*/*/statistics/perf/branches             1

marcin-krolik commented 8 years ago

I really like the idea of introducing protobuf for handling available metrics. It's super flexible and we don't need to worry about maintaining it internally! :+1:

marcin-krolik commented 8 years ago

I think we could also use protobuf to extract the units for each metric and provide them to Snap's plugin.MetricType. It would also be nice to provide a description for each metric via plugin.MetricType.Description_. I took a look at the mesos_pb2.go file, especially the godoc comments on type ResourceStatistics struct, and it would be great to extract them somehow and provide them as metric descriptions. So far I don't have a good idea of how to achieve that apart from parsing the file ...
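
One possible shape for the "parsing the file" approach is sketched below using Go's standard go/parser and go/ast packages, which can return the doc comment attached to each struct field. The `sampleSrc` string, its field names, and its comments are invented stand-ins rather than the real generated code, and `fieldDocs` is a hypothetical helper. It also assumes the generated file attaches comments directly above each field, which would need checking against the actual file.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// sampleSrc stands in for generated protobuf code. The fields and comments
// are illustrative only and are not copied from the Mesos sources.
const sampleSrc = `package mesos

type ResourceStatistics struct {
	// Total CPU time spent in user mode.
	CpusUserTimeSecs *float64
	// Total CPU time spent in kernel mode.
	CpusSystemTimeSecs *float64
}
`

// fieldDocs parses src and returns a map of field name to doc comment for
// the struct type called structName. Fields without a comment map to "".
func fieldDocs(src, structName string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	docs := map[string]string{}
	ast.Inspect(file, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok || ts.Name.Name != structName {
			return true // keep walking until the target type is found
		}
		st, ok := ts.Type.(*ast.StructType)
		if !ok {
			return false
		}
		for _, f := range st.Fields.List {
			for _, name := range f.Names {
				// Doc.Text() is safe to call when Doc is nil.
				docs[name.Name] = strings.TrimSpace(f.Doc.Text())
			}
		}
		return false
	})
	return docs, nil
}

func main() {
	docs, err := fieldDocs(sampleSrc, "ResourceStatistics")
	if err != nil {
		panic(err)
	}
	fmt.Println(docs["CpusUserTimeSecs"])
	fmt.Println(docs["CpusSystemTimeSecs"])
}
```

The extracted strings could then be copied into each metric's description field, at the cost of shipping or re-parsing the generated source at build time.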

ghost commented 8 years ago

That would definitely be helpful, but I don't see where in the protobuf we can consistently get units (you can get types, but units aren't specifically called out). The metric names usually do a pretty good job of including units in the name (secs, microsecs, etc.), but I don't have an answer yet on how we could consistently extract them. Just thinking out loud here... perhaps the naming is consistent (secs, microsecs, etc.) and we could extract the units from the metric name itself (which is already separated by _)?
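
A minimal sketch of that name-based idea, assuming the unit is always the last `_`-separated token of the metric name. The `unitSuffixes` table and `unitFromName` helper are hypothetical: "secs" and "microsecs" come from the discussion above, "bytes" is an added assumption, and names without a recognised suffix simply report no unit.

```go
package main

import (
	"fmt"
	"strings"
)

// unitSuffixes maps a trailing name token to a unit label. The table is an
// assumption for illustration; "bytes" in particular is not from this thread.
var unitSuffixes = map[string]string{
	"secs":      "seconds",
	"microsecs": "microseconds",
	"bytes":     "bytes",
}

// unitFromName returns the unit implied by the last "_"-separated token of
// name, and false when that token is not a known unit suffix.
func unitFromName(name string) (string, bool) {
	tokens := strings.Split(name, "_")
	unit, ok := unitSuffixes[tokens[len(tokens)-1]]
	return unit, ok
}

func main() {
	for _, name := range []string{"cpus_system_time_secs", "cpus_nr_throttled"} {
		if unit, ok := unitFromName(name); ok {
			fmt.Printf("%s: %s\n", name, unit)
		} else {
			fmt.Printf("%s: no unit suffix\n", name)
		}
	}
}
```

This only works as far as the naming stays consistent, so counters such as cpus_nr_throttled would still need a default or a manually maintained unit.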

ghost commented 8 years ago

@marcin-krolik PR updated to use snap-plugin-utilities/ns. Thanks again!