intelsdi-x / snap-plugin-collector-mesos

Collects Apache Mesos cluster metrics
http://snap-telemetry.io/
Apache License 2.0

[idea] Investigate pushing metrics (via a Mesos plugin) instead of pulling (via HTTP) #27

Open ghost opened 8 years ago

ghost commented 8 years ago

Currently, the plugin polls various Mesos APIs via HTTP to collect metrics. While this is probably fine, it could prove problematic: requests could stack up if they don't complete in time, and there have been some performance issues on very active clusters in the past (see MESOS-2353). While things are much better now that MESOS-2353 is resolved, I don't have data on how performant this will be at a very large scale or on very active clusters. Considering the amount of data the HTTP APIs will need to generate for us to satisfy #25 and #26, we might run into some issues with this approach.

I've been kicking around the idea of writing a Mesos plugin that would be able to collect just the metrics we care about from within Mesos itself on a set interval and then push them to the Snap plugin, which could be listening on a local UDP or Unix socket.
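To make the push idea a bit more concrete, here is a rough sketch of what the listening side could look like. It is written in Python purely for illustration and is not taken from the Snap plugin. The wire format (one JSON object per UDP datagram, mapping metric names to numbers), the function names, and the metric name used in the demo are all assumptions; nothing here reflects an existing Mesos module or Snap API.

```python
import json
import socket


def open_listener(host="127.0.0.1", port=0):
    """Bind a UDP socket on loopback; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(2.0)  # avoid blocking forever if nothing is pushed
    return sock


def parse_datagram(payload):
    """Decode one datagram.

    Hypothetical wire format: a JSON object of metric name -> number.
    """
    metrics = json.loads(payload.decode("utf-8"))
    if not isinstance(metrics, dict):
        raise ValueError("expected a JSON object of metric name -> value")
    return {str(name): float(value) for name, value in metrics.items()}


def receive_once(sock, max_size=65535):
    """Wait for a single pushed datagram and return its parsed metrics."""
    payload, _sender = sock.recvfrom(max_size)
    return parse_datagram(payload)


def push(metrics, address):
    """Stand-in for the Mesos-side sender: one datagram per interval."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(json.dumps(metrics).encode("utf-8"), address)


if __name__ == "__main__":
    listener = open_listener()
    # "example/tasks_running" is a made-up metric name for the demo.
    push({"example/tasks_running": 3}, listener.getsockname())
    print(receive_once(listener))
    listener.close()
```

With this shape the collector never issues a request, so nothing can queue up behind a slow HTTP response. The trade-off is that UDP datagrams can be dropped silently and are size-limited, which is one reason a Unix socket might be the better fit for larger payloads.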

Before going down this path, it'd be great to have some performance numbers with the current polling mechanism and determine how much of an improvement we might see here. Some immediate benefits I can see include:

More information about Mesos modules:

lynxbat commented 8 years ago

We have a rewrite for an Event Spec coming that would support this natively: https://github.com/intelsdi-x/snap/issues/136