performancecopilot / speed

A Go implementation of the PCP instrumentation API
MIT License

metrics: use a global InstanceDomain for PCPHistogram, fixes #54 #55

suyash closed this 6 years ago

lzap commented 6 years ago

Guys, I tested this and it works, but for some reason I see every instance listed twice:

# pminfo -dfmtT mmv.fm_rails_http_request_total_duration.subnets_controller.index

mmv.fm_rails_http_request_total_duration.subnets_controller.index PMID: 70.2217.448 One-line Help: Error: One-line or help text is not available
    Data Type: double  InDom: 70.4017825 0x11bd4ea1
    Semantics: instant  Units: millisec
Full Help: Error: One-line or help text is not available
    inst [-677190887 or "max"] value 215
    inst [-1629607596 or "mean"] value 84.66666666666667
    inst [231780542 or "variance"] value 3447.222222222221
    inst [375944228 or "standard_deviation"] value 58.71304984602845
    inst [-913357481 or "min"] value 47
    inst [-913357481 or "min"] value 47
    inst [-677190887 or "max"] value 215
    inst [-1629607596 or "mean"] value 84.66666666666667
    inst [231780542 or "variance"] value 3447.222222222221
    inst [375944228 or "standard_deviation"] value 58.71304984602845
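
A quick way to confirm which instances repeat in output like the above is to count instance IDs. The following is only a sketch: `duplicateInstances` and `instLine` are hypothetical names, not part of PCP or speed, and the regular expression assumes the `inst [ID or "name"] value V` line shape shown in this listing.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// instLine matches pminfo -f lines such as:
//     inst [-913357481 or "min"] value 47
// (line shape assumed from the listing in this thread).
var instLine = regexp.MustCompile(`inst \[(-?\d+) or "([^"]+)"\]`)

// duplicateInstances returns the names of instances whose ID appears
// more than once in the given output, in order of first repeat.
func duplicateInstances(output string) []string {
	seen := make(map[string]int)
	var dups []string
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		m := instLine.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // not an instance line
		}
		seen[m[1]]++
		if seen[m[1]] == 2 {
			dups = append(dups, m[2])
		}
	}
	return dups
}

func main() {
	out := `    inst [-913357481 or "min"] value 47
    inst [-677190887 or "max"] value 215
    inst [-913357481 or "min"] value 47`
	fmt.Println(duplicateInstances(out)) // prints [min]
}
```

Run against the full listing above, this would report all five instances as repeated.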

I grabbed both MMV files and mmv dump from my server and published them here:

http://people.redhat.com/~lzapleta/temp/mmv_twice_problem/

For the record:

Performance Co-Pilot configuration on foreman.nat.lan:

 platform: Linux foreman.nat.lan 3.10.0-862.3.2.el7.x86_64 #1 SMP Mon May 21 23:36:36 UTC 2018 x86_64
 hardware: 2 cpus, 1 disk, 1 node, 8615MB RAM
 timezone: EDT+4
 services: pmcd pmwebd
     pmcd: Version 3.12.2-1, 8 agents
     pmda: root pmcd proc xfs linux apache mmv jbd2

suyash commented 6 years ago

@lzap It seems like you have two histograms recording similar values.

Consider the following program:

package main

import (
    "fmt"
    "log"
    "math/rand"
    "time"

    "github.com/performancecopilot/speed"
)

func main() {
    max := int64(100)

    c, err := speed.NewPCPClient("histogram_test")
    if err != nil {
        log.Fatal("Could not create client, error: ", err)
    }

    m1, err := speed.NewPCPHistogram("hist1", 0, max, 5, speed.OneUnit, "a sample histogram")
    if err != nil {
        log.Fatal("Could not create histogram, error: ", err)
    }

    m2, err := speed.NewPCPHistogram("hist2", 0, max, 5, speed.OneUnit, "a sample histogram")
    if err != nil {
        log.Fatal("Could not create histogram, error: ", err)
    }

    c.MustRegister(m1)
    c.MustRegister(m2)

    c.MustStart()
    defer c.MustStop()

    for i := 0; i < 60; i++ {
        v := rand.Int63n(max)

        fmt.Println("recording", v)
        m1.MustRecord(v)
        m2.MustRecord(v << 1)

        time.Sleep(time.Second)
    }
}

This program has two histograms; one records twice the value of the other.

Here is the MMV dump it generates:

TOC[0], offset: 40, indoms offset: 120 (1 entries)
    [4015777/120] 5 instances, starting at offset 152
        (no shorttext)
        (no longtext)

TOC[1], offset: 56, instances offset: 152 (5 entries)
    [4015777/152] instance = [-913357481/min]
    [4015777/232] instance = [-677190887/max]
    [4015777/312] instance = [-1629607596/mean]
    [4015777/392] instance = [231780542/variance]
    [4015777/472] instance = [375944228/standard_deviation]

TOC[2], offset: 72, metric offset: 552 (2 entries)
    [589/552] hist2
        type=DoubleType (0x5), sem=Semantics(3) (0x3), pad=0x0
        units=count
        indom=4015777
        shorttext=a sample histogram
        (no longtext)
    [404/656] hist1
        type=DoubleType (0x5), sem=Semantics(3) (0x3), pad=0x0
        units=count
        indom=4015777
        shorttext=a sample histogram
        (no longtext)

TOC[3], offset: 88, values offset: 760 (10 entries)
    [589/760] hist2[-1629607596 or "mean"] = 91.71428571428571
    [589/792] hist2[231780542 or "variance"] = 2226.775510204082
    [589/824] hist2[375944228 or "standard_deviation"] = 47.188722277723116
    [589/856] hist2[-913357481 or "min"] = 20
    [589/888] hist2[-677190887 or "max"] = 174
    [404/920] hist1[375944228 or "standard_deviation"] = 23.594361138861558
    [404/952] hist1[-913357481 or "min"] = 10
    [404/984] hist1[-677190887 or "max"] = 87
    [404/1016] hist1[-1629607596 or "mean"] = 45.857142857142854
    [404/1048] hist1[231780542 or "variance"] = 556.6938775510205

TOC[4], offset: 104, strings offset: 1080 (2 entries)
    [1080] a sample histogram
    [1336] a sample histogram

Note that the indom block contains only one indom and the instances block only 5 instances. The values block, however, has 10 values, because each histogram's values are still recorded separately.

Before this CL, two histograms would have produced 2 indoms in the indom block and 10 instances in the instances block; that is the main issue this change addresses. The duplication you are seeing might also be an issue with pminfo. Pinging @lberk @natoscott
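
To make the before/after counts concrete, here is a small model of the block sizes. It is only a sketch under stated assumptions and is not code from speed: `blockCounts` and `layout` are hypothetical names, and the value count before this CL is assumed to be the same 5 values per histogram as after it.

```go
package main

import "fmt"

// histogramInstances lists the five instance names seen in the dump.
var histogramInstances = []string{"min", "max", "mean", "variance", "standard_deviation"}

// layout holds the number of entries in the indom, instances, and
// values blocks of an MMV file (hypothetical type, for illustration).
type layout struct {
	Indoms, Instances, Values int
}

// blockCounts models the block sizes for n histograms. With a shared
// (global) indom there is one indom and one set of instances; without
// it, every histogram brings its own indom and its own instances.
// In both cases each histogram keeps its own values.
func blockCounts(n int, sharedIndom bool) layout {
	if n <= 0 {
		return layout{}
	}
	per := len(histogramInstances)
	l := layout{Values: n * per}
	if sharedIndom {
		l.Indoms, l.Instances = 1, per
	} else {
		l.Indoms, l.Instances = n, n*per
	}
	return l
}

func main() {
	fmt.Printf("%+v\n", blockCounts(2, false)) // {Indoms:2 Instances:10 Values:10}
	fmt.Printf("%+v\n", blockCounts(2, true))  // {Indoms:1 Instances:5 Values:10}
}
```

The second line matches the dump above: 1 indom, 5 instances, 10 values.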