Explain metric registration in documentation

lzap commented 7 years ago

Hello,

I am building an adapter (bridge) that reads statsd protocol data and writes it to PCP using your library, but I don't understand how metrics survive a restart of the PCP daemon. statsd is a very dynamic environment where clients simply send metrics, whereas in PCP all metrics must be registered at initialization.
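To make the mismatch concrete, here is a minimal sketch of the statsd side, assuming the common "name:value|type" line format and ignoring sample rates. The `sample` type, `parseLine`, and the `seen` map are hypothetical names for illustration only; they are not part of the speed library. The point is that a previously unseen name can show up at any time, and that is the moment a new PCP metric would have to be registered.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// sample is one parsed statsd line of the form "name:value|type".
type sample struct {
	Name  string
	Value float64
	Kind  string // "c" counter, "g" gauge, "ms" timer
}

// parseLine parses a single statsd line. Sample rates ("|@0.1") and
// any further fields are ignored in this sketch.
func parseLine(line string) (sample, error) {
	parts := strings.SplitN(line, ":", 2)
	if len(parts) != 2 || parts[0] == "" {
		return sample{}, errors.New("missing name or ':' separator")
	}
	fields := strings.Split(parts[1], "|")
	if len(fields) < 2 {
		return sample{}, errors.New("missing '|type' suffix")
	}
	v, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return sample{}, fmt.Errorf("bad value %q: %v", fields[0], err)
	}
	return sample{Name: parts[0], Value: v, Kind: fields[1]}, nil
}

func main() {
	// seen is hypothetical bookkeeping of names already registered with PCP.
	seen := map[string]bool{}
	lines := []string{"db.models.created:1|c", "db.models.created:3|c", "queue.depth:42|g"}
	for _, line := range lines {
		s, err := parseLine(line)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		if !seen[s.Name] {
			seen[s.Name] = true
			fmt.Println("new metric, needs registration:", s.Name)
		}
		fmt.Printf("%s = %v (%s)\n", s.Name, s.Value, s.Kind)
	}
}
```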

I tried to register metrics dynamically by stopping the client first, but it did not work well (I ran into issues when trying to stop an already stopped client, maybe just a race condition). Can you confirm that it should be possible to register a new metric on an already started client (stopping it first, of course)? The documentation only mentions that the client must be stopped, so this could work. Will this approach work with archiving and long-term monitoring?

Thanks

suyash commented 7 years ago

Hi

PCP detects metrics through memory-mapped files. The default behavior is to persist these files. If you want the memory-mapped files to be deleted (i.e. the metrics are lost when the client stops), set EraseFileOnStop to true (godoc: https://godoc.org/github.com/performancecopilot/speed#pkg-variables).

| I tried to register metrics dynamically stopping the client first but it did not work well (I was running into issues trying to stop already stopped client - maybe just a race condition)

You cannot stop an already stopped client. However, it is possible to stop a client, register a new metric and restart it. I tried the following snippet and everything seems to work fine; please reply whether it satisfies your use case, or suggest ways to modify the library so that it does. Notice that it sets EraseFileOnStop to true only at the very end, so all previously registered metrics are still there after each restart, and the file is deleted only on the final stop.

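package main

import (
    "log"
    "math/rand"
    "strconv"
    "sync"
    "time"

    "github.com/performancecopilot/speed"
)

var (
    c speed.Client

    // mu serializes the stop/register/start sequence so that two
    // goroutines never try to stop an already stopped client.
    mu sync.Mutex
)

func main() {
    speed.EnableLogging(true)

    var err error

    c, err = speed.NewPCPClient("test")
    if err != nil {
        log.Fatal("Sorry, No Client: ", err)
    }

    c.MustStart()
    defer c.MustStop()

    go runner("a.b.c")
    go runner("d.e.f")
    go runner("g.h.i")

    time.Sleep(10 * time.Second)

    // Hold the lock from here on so that no runner is in the middle of a
    // restart when the deferred MustStop runs; the process exits right after.
    mu.Lock()

    // Set only at the very end: earlier restarts keep the file,
    // and the final stop deletes it.
    speed.EraseFileOnStop = true
}

// runner registers a new counter every 0-2 seconds, restarting the
// client around each registration.
func runner(metricPrefix string) {
    for x := 0; ; x++ {
        sleepTime := time.Duration(rand.Intn(2000)) * time.Millisecond
        time.Sleep(sleepTime)

        metricName := metricPrefix + "." + strconv.Itoa(x)

        mu.Lock()
        c.MustStop()
        c.MustRegisterString(metricName, 0, speed.Int64Type, speed.CounterSemantics, speed.OneUnit)
        c.MustStart()
        mu.Unlock()
    }
}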

| Will this approach work with archiving and long-term monitoring?

The client must be stopped to register a new metric because of where values are read from and written to. Adding a metric changes the locations of the different blocks inside the MMV file, which requires reinitializing the client. Hence a metric can only be registered while the client is stopped. Another idea is to have multiple clients, each with a different name and writing its own metrics. In fact, that is the idea behind the client type, as opposed to a single global registry of metrics.
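The block-location point can be illustrated with a toy model. The sketch below assumes a simplified file laid out as header, then metric descriptors, then values, with made-up fixed sizes; it is NOT the real MMV format and the constants are hypothetical. It only shows why a writer that cached an offset for three metrics would write to the wrong place once a fourth is added.

```go
package main

import "fmt"

// Hypothetical fixed sizes, chosen only for illustration.
// They are NOT the real MMV block sizes.
const (
	headerSize = 40
	metricSize = 104
	valueSize  = 32
)

// layout returns the byte offsets of the metrics block and the values
// block for a file holding n metrics, laid out as header | metrics | values.
func layout(n int) (metricsOff, valuesOff int) {
	metricsOff = headerSize
	valuesOff = metricsOff + n*metricSize
	return metricsOff, valuesOff
}

// valueOffset returns where the value of metric i lives when the file
// holds n metrics in total.
func valueOffset(i, n int) int {
	_, valuesOff := layout(n)
	return valuesOff + i*valueSize
}

func main() {
	before := valueOffset(0, 3)
	after := valueOffset(0, 4)
	fmt.Println("value 0 with 3 metrics at byte", before) // 352
	fmt.Println("value 0 with 4 metrics at byte", after)  // 456
	fmt.Println("cached offset is stale by", after-before, "bytes")
}
```

In this toy layout every existing value moves by one descriptor size when a metric is added, which is why the writer has to be reinitialized rather than patched in place.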

lzap commented 7 years ago

Thank you. I will file a documentation PR tomorrow adding a sentence about this. I will try your example tomorrow as well. I just hope this will not slow my app down too much. I could perhaps keep a cache of all metrics in a file for faster pre-registration during startup.
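The cache idea could look roughly like the sketch below, which uses only the Go standard library. The file name, the one-name-per-line format, and the helper names (`defaultCachePath`, `saveNames`, `loadNames`) are assumptions made for illustration. On startup the bridge would load the names and register them all before starting the client once, so stop/register/start cycles are only needed for names it has never seen.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// defaultCachePath returns a hypothetical location for the name cache.
func defaultCachePath() string {
	return filepath.Join(os.TempDir(), "statsd-bridge-metrics.cache")
}

// saveNames writes one metric name per line, sorted for stable output.
func saveNames(path string, names map[string]bool) error {
	list := make([]string, 0, len(names))
	for n := range names {
		list = append(list, n)
	}
	sort.Strings(list)
	return os.WriteFile(path, []byte(strings.Join(list, "\n")+"\n"), 0o644)
}

// loadNames reads the cache. A missing file means no metrics are known yet.
func loadNames(path string) (map[string]bool, error) {
	names := map[string]bool{}
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return names, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if n := strings.TrimSpace(sc.Text()); n != "" {
			names[n] = true
		}
	}
	return names, sc.Err()
}

func main() {
	path := defaultCachePath()
	names, err := loadNames(path)
	if err != nil {
		fmt.Println("cannot read cache:", err)
		return
	}
	fmt.Println("metrics to pre-register at startup:", len(names))

	// A name seen for the first time at runtime is added and persisted.
	names["db.models.created"] = true
	if err := saveNames(path, names); err != nil {
		fmt.Println("cannot write cache:", err)
	}
}
```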

I am just starting with PCP, so can you confirm one more thing for me? If I dynamically register metrics like that and PCP is archiving the data, will it work? I mean, will the metrics appear in archives as well? Will all the PCP tools, like the chart app, work with this approach?

This is excellent work; the library is well designed and clean. And I love the helper metrics like counter and gauge, which map nicely to statsd. Thanks.

natoscott commented 7 years ago

| I am just starting with PCP, can you confirm me one more concern.
| If I dynamically register metrics like that and PCP will be archiving the data, will it work?

@lzap yes - in modern versions of PCP (pcp-3.11.2 or later), just "chkconfig pmlogger on" and the pmlogconf(1) rule for mmv.* metrics will log at the default frequency.

| I mean will the metrics appear in archives as well?

(yep)

| Will all the PCP tools like chart app work with this approach?

Yes, client tools like pmchart, pmval, pmdumptext, pminfo, pmie, ... will be able to access the MMV metrics from your application, just like any kernel or other metric.

lzap commented 7 years ago

Thanks a lot. Are counters reset to their initial value when the agent samples them? I am going to measure the number of database model instances created, which can be millions per minute in the worst case. I would rather see the number of instances created per minute or per second. Should I do the math myself, or can PCP calculate this for me so I can just keep throwing numbers in? I am in bed currently and just trying to figure this out on my tablet browsing this repo...

natoscott commented 7 years ago

The PCP client tools will do the math for you, provided the metrics are correctly classified (in your example above, those metrics should be of "counter" type). pmchart, pmval & friends look at the metric metadata and report appropriately.
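For reference, "the math" for a counter is a rate conversion between two consecutive samples. The sketch below shows that arithmetic in isolation; it is not PCP code, and the `sample` type and `ratePerSecond` function are hypothetical names. The application only ever increments the counter, and the reader derives the per-second rate from the difference.

```go
package main

import "fmt"

// sample is one observation of a monotonically increasing counter.
type sample struct {
	AtSeconds float64 // observation time in seconds
	Value     uint64  // counter value at that time
}

// ratePerSecond converts two counter samples into a per-second rate.
// ok is false if the interval is empty or the counter went backwards
// (for example after the application restarted).
func ratePerSecond(prev, cur sample) (rate float64, ok bool) {
	dt := cur.AtSeconds - prev.AtSeconds
	if dt <= 0 || cur.Value < prev.Value {
		return 0, false
	}
	return float64(cur.Value-prev.Value) / dt, true
}

func main() {
	prev := sample{AtSeconds: 0, Value: 1000000}
	cur := sample{AtSeconds: 10, Value: 1250000}
	if r, ok := ratePerSecond(prev, cur); ok {
		fmt.Printf("%.0f instances/s\n", r) // 25000 instances/s
	}
}
```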

lzap commented 7 years ago

I pushed my initial version, which works but has many rough edges, to: https://github.com/lzap/pcp-mmvstatsd

Would love to see your comments.

performancecopilot / speed

Explain metric registration in documentation #35