poxet / Influx-Capacitor

Influx-Capacitor collects metrics from Windows machines using Performance Counters. Data is sent to InfluxDB to be viewable in Grafana.
http://influx-capacitor.com
MIT License

influx-capacitor

It is installed as a Windows service named Influx-Capacitor. There is also a console application called Influx-Capacitor.Console that can be used to configure the service, check its status, and run commands manually.

Data is collected and sent to an InfluxDB database of your choice. Grafana is a great way of looking at the data.

Install using Chocolatey: https://chocolatey.org/packages/Influx-Capacitor

Visit the main site for more information: http://influx-capacitor.com/

Performance Counters

By default, configurations for a few Performance Counters are provided. Counter setups are stored in XML files, either in the same folder as the executables or in the ProgramData folder (e.g. C:\ProgramData\Thargelion\Influx-Capacitor). The files must have the .xml extension.

You can configure any Performance Counter available to be monitored. When you have added or changed a configuration file you need to restart the service for it to take effect. You can test and run manually using the console application.

Configuration

<Influx-Capacitor>
  <CounterGroups>
    <CounterGroup Name="[YourCounterGroupName]" SecondsInterval="[SecondsInterval]" RefreshInstanceInterval="[RefreshInstanceInterval]" CollectorEngineType="[CollectorEngineType]">
      <Counter>
        <MachineName>[MachineName]</MachineName>
        <CategoryName>[CategoryName]</CategoryName>
        <CounterName>[CounterName]</CounterName>
        <InstanceName Alias="[Alias]">[InstanceName]</InstanceName>
        <FieldName>[FieldName]</FieldName>
        <Limits Max="[Max]" />
      </Counter>
    </CounterGroup>
  </CounterGroups>
</Influx-Capacitor>

If you want to get the name of the counters right, simply open perfmon and find the counter that you want there. The names to put in the config files are exactly the same as the ones in perfmon.
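As a concrete illustration, the template above might be filled in as follows for total CPU usage. This is a sketch rather than a shipped configuration: the group name perfmon.processor and the 10-second interval are arbitrary choices, and the remaining attributes and elements from the template (RefreshInstanceInterval, CollectorEngineType, MachineName, FieldName, Limits) are left out on the assumption that they are optional, as in the shorter examples elsewhere in this document. The category, counter, and instance names are the ones perfmon shows for this counter on an English-language Windows installation.

```xml
<Influx-Capacitor>
  <CounterGroups>
    <!-- Group name and interval are illustrative values, not defaults -->
    <CounterGroup Name="perfmon.processor" SecondsInterval="10">
      <Counter>
        <!-- Names copied exactly as they appear in perfmon -->
        <CategoryName>Processor</CategoryName>
        <CounterName>% Processor Time</CounterName>
        <InstanceName>_Total</InstanceName>
      </Counter>
    </CounterGroup>
  </CounterGroups>
</Influx-Capacitor>
```

After saving a file like this, restart the service or run the console application to check that the counter is read.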

Advanced: filters

When you start using instance names, there are cases where you cannot specify the correct name in advance, for example because the instance name is dynamic and includes a process ID.

In such cases, you can use InstanceFilters at the CounterGroup level to transform instance names using regular expressions.

An example configuration with filters:

<Influx-Capacitor>
  <CounterGroups>
    <CounterGroup Name="[YourCounterGroupName]" SecondsInterval="[SecondsInterval]" RefreshInstanceInterval="[RefreshInstanceInterval]" CollectorEngineType="[CollectorEngineType]">
      <Counter>
        <MachineName>[MachineName]</MachineName>
        <CategoryName>[CategoryName]</CategoryName>
        <CounterName>[CounterName]</CounterName>
        <InstanceName Alias="[Alias]">[InstanceName]</InstanceName>
        <FieldName>[FieldName]</FieldName>
        <Limits Max="[Max]" />
      </Counter>
      ...
      <InstanceFilters>
        <Filter Pattern="[Pattern]" />
        <Filter Pattern="[ReplacementPattern]" Replacement="[Replacement]" />
      </InstanceFilters>
    </CounterGroup>
  </CounterGroups>
</Influx-Capacitor>

A filter with only a Pattern excludes counters whose instance name does not match. A filter with a Replacement only changes the name, without excluding any instance. All filters are executed sequentially on each instance name, so each filter sees the name as left by the previous one.

You can use the InstanceName element to choose counters for simple cases, then apply an advanced filter.

Here are some examples.

<InstanceFilters>
  <!-- this filter will only include counters where ".NET" is present in the instance name -->
  <Filter Pattern="\.NET" />
  <!-- this filter will remove all pids from instance names for IIS apps. "1879_.NET v4.5" => ".NET v4.5" -->
  <Filter Pattern="^\d+_(.*)$" Replacement="$1" />
  <!-- this filter will replace ".NET v4.5" by "net45" -->
  <Filter Pattern="\.NET v4\.5" Replacement="net45" />
</InstanceFilters>

Advanced: custom providers

By default, Influx-Capacitor includes one standard provider, which collects values from system performance counters, as presented above.

An additional mechanism allows external contributors to add their own providers. A provider is an assembly that references Tharga.Influx-Capacitor.Collector and exposes a public class implementing Tharga.InfluxCapacitor.Collector.Interface.IPerformanceCounterProvider. You can then register the custom provider in your configuration files with a Provider element.

<Influx-Capacitor>
  <Providers>
    <Provider Name="[ProviderName]" Type="[ProviderType]" />
  </Providers>
</Influx-Capacitor>

Provider-specific settings can be included in the configuration. See each provider's documentation for details about these settings.

To use a custom provider, specify its unique name in the Provider attribute of the CounterGroup.

<Influx-Capacitor>
  <CounterGroups>
    <CounterGroup Name="..." Provider="[ProviderName]">
      ...
    </CounterGroup>
  </CounterGroups>
</Influx-Capacitor>

Application configuration

There are some application settings that can be configured. The configuration can be placed in any XML config file in the ProgramData folder. By default it is stored in application.xml.

<Influx-Capacitor>
  <Application>
    <FlushSecondsInterval>10</FlushSecondsInterval>
    <Metadata>true</Metadata>
    <MaxQueueSize>20000</MaxQueueSize>
  </Application>
</Influx-Capacitor>

Database connection settings

The settings are typically stored in the file database.xml, located in the ProgramData folder (e.g. C:\ProgramData\Thargelion\Influx-Capacitor). The settings can be placed in any other XML configuration file, but then you will not be able to manage them using the management console. You can change settings directly in the file and restart the service, or you can use the command "setup change" in the console application, and the service will be restarted for you. It is also possible to have multiple database targets: add another Database element in the config file and restart the service. When using multiple targets, the console application cannot be used to change the configuration.

Several different types of databases are supported, and each is configured differently. Set the Type attribute on the Database element to select which database provider to use. Type defaults to InfluxDB and Enabled defaults to true.
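As a sketch of a multiple-target setup, the configuration below declares two Database elements. The element shapes and values are copied from the per-type examples in this section. Placing both elements in the same file, and leaving Type off the first one to rely on the default, are assumptions drawn from the description above rather than a tested configuration. The second target is disabled to illustrate the Enabled attribute.

```xml
<Influx-Capacitor>
  <!-- Type is omitted, so this target is assumed to default to InfluxDB -->
  <Database>
    <Url>http://localhost:8086</Url>
    <Username>MyUser</Username>
    <Password>qwerty</Password>
    <Name>InfluxDbName</Name>
    <RequestTimeoutMs>15000</RequestTimeoutMs>
  </Database>
  <!-- Second target: present in the file but switched off -->
  <Database Type="Kafka" Enabled="false">
    <Url>http://server1;http://server2</Url>
  </Database>
</Influx-Capacitor>
```

With more than one Database element in place, changes have to be made by editing the file and restarting the service, since the console application cannot manage multiple targets.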

Supported types are

InfluxDB

<Influx-Capacitor>
  <Database Type="InfluxDB" Enabled="true">
    <Url>http://localhost:8086</Url>
    <Username>MyUser</Username>
    <Password>qwerty</Password>
    <Name>InfluxDbName</Name>
    <RequestTimeoutMs>15000</RequestTimeoutMs>
  </Database>
</Influx-Capacitor>

Kafka

This is not actually a database; this type sends data to Kafka (http://kafka.apache.org/). The message is formatted for InfluxDB version 0.9.x.

<Influx-Capacitor>
  <Database Type="Kafka">
    <Url>http://server1;http://server2</Url>
  </Database>
</Influx-Capacitor>

Null

This type is for development only. It collects points and sends them nowhere.

<Influx-Capacitor>
  <Database Type="null" />
</Influx-Capacitor>

Acc

This type is for development only. It collects and accumulates points but never sends them anywhere.

<Influx-Capacitor>
  <Database Type="acc" />
</Influx-Capacitor>

Tags

You can add constant tags at the global, counter group, and counter level. This can be done in any of the configuration files. The name of the tag has to be unique.

Global tags that will be added to all points sent to the database can be added like this.

<Influx-Capacitor>
  <Tag>
    <Name>[TagName]</Name>
    <Value>[TagValue]</Value>
  </Tag>
</Influx-Capacitor>

It is also possible to add tags for a specific counter group, like this.

<Influx-Capacitor>
  <CounterGroups>
    <CounterGroup Name="[YourCounterGroupName]" SecondsInterval="[SecondsInterval]">
      <Counter>
        <CategoryName>[CategoryName]</CategoryName>
        <CounterName>[CounterName]</CounterName>
        <InstanceName>[InstanceName]</InstanceName>
      </Counter>
      <Tag>
        <Name>[TagName]</Name>
        <Value>[TagValue]</Value>
      </Tag>
    </CounterGroup>
  </CounterGroups>
</Influx-Capacitor>

Tags for a specific counter are added like this.

<Influx-Capacitor>
  <CounterGroups>
    <CounterGroup Name="[YourCounterGroupName]" SecondsInterval="[SecondsInterval]">
      <Counter>
        <CategoryName>[CategoryName]</CategoryName>
        <CounterName>[CounterName]</CounterName>
        <InstanceName>[InstanceName]</InstanceName>
        <Tag>
          <Name>[TagName]</Name>
          <Value>[TagValue]</Value>
        </Tag>
      </Counter>
    </CounterGroup>
  </CounterGroups>
</Influx-Capacitor>

Running the console application

The console version is named Tharga.Influx-Capacitor.Console.exe and is provided with the installation. The program can be started with command parameters, or you can type the commands you want inside the program.

Config

Service

Counter

Versions

The currently supported versions of InfluxDB are 0.9.x to 0.12.x.

Metadata

By default, metadata is sent from Influx-Capacitor to InfluxDB. (There is an Application setting that can turn this off if you do not want it.) The data is registered as a measurement named Influx-Capacitor-Metadata.
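Assuming the switch referred to here is the Metadata element shown under Application configuration, turning metadata off would look like the sketch below. The other Application elements are left out and would presumably keep their defaults.

```xml
<Influx-Capacitor>
  <Application>
    <!-- Assumed switch: false should stop the Influx-Capacitor-Metadata measurement from being sent -->
    <Metadata>false</Metadata>
  </Application>
</Influx-Capacitor>
```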

The main things that can be monitored are the collection of data and the status of the queue. Use the counter tag in the WHERE clause to select the metadata you want to analyze.

Tags that appear for all metadata measurements

queueCount

For measurements where counter = queueCount

Use this measurement to monitor the queue: how data piles up and how it is sent to the server. If you are sending data to more than one server, the metadata for all servers is visible on each server. This makes it easy to see if the queue is growing because data cannot be sent to one of the servers.

Tags

Values

readCount

For measurements where counter = readCount

Use this measurement to monitor the collection of data: how many values are read and how long it takes.

Tags

Values

readTime

For measurements where counter = readTime

The collecting of data involves several steps. Here you can monitor the time it takes for each step.

Tags

Values

configuration

For measurements where counter = configuration

Tags

action - The action that performed the configuration test (config_auto, config_database, config_change)

Values

value - The value 1

Point format and measurements schema

By default, InfluxDB points are created with a "counter" tag, a "category" tag, and a single field named "value". You can also use an alias for instances and assign specific tags to each counter.

<CounterGroup Name="perfmon.memory" SecondsInterval="5">
    <Counter>
        <CategoryName>Memory</CategoryName>
        <CounterName>Available Bytes</CounterName>
    </Counter>
    <Counter>
        <CategoryName>Memory</CategoryName>
        <CounterName>Committed Bytes</CounterName>
    </Counter>
</CounterGroup>

Data sent with this configuration results in the following schema in InfluxDB:

> select * from perfmon.memory

time            category  counter          instance  hostname     value
20150203002300  Memory    Available Bytes  _Total    SERVER1   15766000
20150203002300  Memory    Committed Bytes  _Total    SERVER1     460000
20150203004300  Memory    Available Bytes  _Total    SERVER1   15966000
20150203004300  Memory    Committed Bytes  _Total    SERVER1     440000

This schema has the advantage of being very flexible and powerful, but the disadvantage of consuming more memory (see the official documentation, "when do I need more RAM"). If you do not need counter-specific tags, or have simpler requirements, you can compact the points and save memory by using the FieldName config element:

<CounterGroup Name="perfmon.memory" SecondsInterval="5">
    <Counter>
        <CategoryName>Memory</CategoryName>
        <CounterName>Available Bytes</CounterName>
        <FieldName>free_bytes</FieldName>
    </Counter>
    <Counter>
        <CategoryName>Memory</CategoryName>
        <CounterName>Committed Bytes</CounterName>
        <FieldName>committed_bytes</FieldName>
    </Counter>
</CounterGroup>

This gives the following result in InfluxDB:

> select * from perfmon.memory

time              hostname  free_bytes  committed_bytes
2015020311002300  SERVER1     15766000           460000
2015020311004300  SERVER1     15966000           440000

This compact mode suffers from some limitations you have to be aware of:

Nuget

There is a NuGet package that contains the core functions. This package can be used within your C# code.

Besides the different targets included in Influx-Capacitor (InfluxDB and Kafka), the package has some main features that can be useful.

Queue

Using this feature, you can place measurements on a queue to be sent as a batch. There is a resend feature that sends the queued measurements as soon as a connection to the server is available.

Managed functions (measure function)

You can wrap your function in a measurement executor that measures how long the function takes. It also listens for exceptions and sends a 'Success' attribute to the database. This feature follows the execute-around pattern.

Simple void action

var measure = new Measure(new Queue(new InfluxDbSenderAgent(new InfluxDbAgent("http://localhost:8086", "MyDatabase", "root", "MyPassword"))));
measure.Execute(() =>
{
    //TODO: Do some stuff that you want to measure here.
    //Exceptions here will be captured, logged and re-thrown.
});

Also works with async functions

var measure = new Measure(new Queue(new InfluxDbSenderAgent(new InfluxDbAgent("http://localhost:8086", "MyDatabase", "root", "MyPassword"))));
var someResult = await measure.ExecuteAsync("MyMeasurement", () =>
{
    //TODO: Do some stuff that you want to measure here.
    //Exceptions here will be captured, logged and re-thrown.
    //The response will be logged.

    return Task.Run(() =>
    {
        return "Foo";
    });
});

Providing extra measurements

var measure = new Measure(new Queue(new InfluxDbSenderAgent(new InfluxDbAgent("http://localhost:8086", "MyDatabase", "root", "MyPassword"))));
measure.Execute((measurement) =>
{
    measurement.Tags.Add("A", 1);
});