intelsdi-x / snap

The open telemetry framework
http://snap-telemetry.io
Apache License 2.0
1.8k stars 294 forks source link

Configuration items in GetMetricTypes #936

Open marcin-krolik opened 8 years ago

marcin-krolik commented 8 years ago

Currently we have such situation that if Snap has been started without specific configuration provided in global config, it can prevent collector plugins which rely on configuration items in GetMetricTypes(plugin.ConfigType) from being loaded. This behavior is killing Snap philosophy that plugins can be loaded at anytime without the need of restarting snapd. Main reason for including configuration items like endpoint address, host IPs, credentials etc is the plugin author desire to build the list of available metric from single call to the metric source. For this particular reason plugin.ConfigType was introduced as a parameter to GetMetricTypes(plugin.ConfigType) function and I think it was a mistake. As a plugin user I would like to have possibility to load plugins always, and get the list of metric types regardless metric source availability. Configuration items can be easily provided in task manifest. Possible problems with connectivity, lack of the resource, credentials, metric etc should be resolved at CollectMetric(...). In my opinion GetMetricTypes(plugin.ConfigType) should be used only to build metric catalog statically (structs, composite objects etc) utilizing dynamic metric mechanism whenever needed. We already have and used utilities to build namespaces from composite objects.

I suggest we should do two step modification:

  1. Introduce new guideline or best practice for plugin authors to avoid using configuration in GetMetricTypes(plugin.ConfigType) and reimplement existing collector plugins which has dependency in GetMetricTypes(plugin.ConfigType) on configuration items based on statically generated metric list and dynamic metrics
  2. Modify collector plugin interface and remove configuration parameter from GetMetricTypes(plugin.ConfigType) method. This is a breaking change.

What are your thoughts on that?

elemoine commented 8 years ago

Makes sense to me! I like this direction. Could you please provide an example of a plugin that requires configuration values in GetMetricTypes (and that should be changed eventually)? Thanks.

tiffanyfay commented 8 years ago

I definitely agree. Seems related to: https://github.com/intelsdi-x/snap/issues/799

marcin-krolik commented 8 years ago

@elemoine: any OpenStack collector :) is currently dependant on configuration in GetMetricTypes. Take a look at Cinder, Glance or Nova. There are other examples also...

elemoine commented 8 years ago

Ok thanks. I have to admit that I hadn't checked!

ghost commented 8 years ago

Although I agree with this idea in principle, it isn't immediately clear to me how a plugin author would work around this. For example: for the Mesos collector plugin, I can simply look at the ResourceStatistics struct recursively to build metric types for a Mesos agent, which is in line with what you're suggesting here.

However, that would populate the Snap metric catalog with all possible metric types, when in reality the Mesos administrator may have not enabled one or more features (e.g. cgroups/perf_event, network/port_mapping, perf_events, etc). Therefore, there's a need to query the Mesos agent to determine which features (flags) are enabled, and then remove any metrics from the catalog that don't actually exist. Otherwise, the Snap user might find him/herself in an impossible situation of attempting to collect metrics that Snap says exist, but in reality don't exist on the Mesos agent.

Further, I don't think it's a great user experience to simply use dynamic metrics here, because in this case Mesos provides hundreds of metrics about running executors that would be really helpful for the Snap user to see via snapctl metric list. Assuming I understand the proposal here correctly, that would bury all the metrics under */*/* instead of something like */*/net_rx_bytes, which would require the user to context switch from snapctl and start reading documentation, walking API endpoints, etc.

ghost commented 8 years ago

Having said that, I do agree that it's a bad pattern for the user to reload the snap daemon when loading a plugin. Perhaps instead of modifying the Snap global config for individual plugins, it would be possible to introduce a plugin-local config that could be dynamically loaded into the daemon? Something like:

$ snapctl plugin load build/snap-plugin-collector-mysvc --config mysvc-config.json