influxdata / kapacitor

Open source framework for processing, monitoring, and alerting on time series data
MIT License
2.32k stars 493 forks source link

question about how to calculate percent throught specific tag #963

Open violet-day opened 8 years ago

violet-day commented 8 years ago

I have a measurement like this:

time                            city_id come_from      desc count
2016-10-09T10:43:36.5502871Z    "1"     "4"            "11" 1
2016-10-09T10:43:39.833652904Z  "1"     "4"            "16" 1
2016-10-09T10:44:37.864675526Z  "1"     "4"            "2"  1

In this measurement,I'd like to calculate each desc's percent in total. Below is the tick

var inner = batch
    |query('select sum("count") as count from "hestia"."default"."eleme_order_invalid_description"')
        .every(1d)
        .period(1d)
        .groupBy(time(1d), 'desc')
        .align()
        .fill(0)

var total = batch
    |query('select sum("count") as count from "hestia"."default"."eleme_order_invalid_description"')
        .every(1d)
        .period(1d)
        .groupBy(time(1d))
        .align()
        .fill(0)
    |log()
        .level('debug')
        .prefix('total')

total
    |join(inner)
        .tolerance(1d)
        .as('total', 'inner')
        .fill(0.0)
    |log()
        .level('debug')
        .prefix('after join')
    |eval(lambda: "inner.count" / "total.count")
        .as('rate')
    |log()
        .level('debug')
        .prefix('after evel')
    |influxDBOut()
        .database('hestia')
        .retentionPolicy('default')
        .measurement('downsample_marco_invalid_order_description')

Strange thing is the total not match the inner though the same time .Here is the logs:

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=13","tags":{"desc":"13"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":39,"total.count":0},"tags":{"desc":"13"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=14","tags":{"desc":"14"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":162,"total.count":0},"tags":{"desc":"14"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=16","tags":{"desc":"16"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":278,"total.count":0},"tags":{"desc":"16"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=2","tags":{"desc":"2"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":2,"total.count":0},"tags":{"desc":"2"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=7","tags":{"desc":"7"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":1,"total.count":0},"tags":{"desc":"7"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=8","tags":{"desc":"8"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":3,"total.count":0},"tags":{"desc":"8"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=0","tags":{"desc":"0"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":73,"total.count":0},"tags":{"desc":"0"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=1","tags":{"desc":"1"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":1,"total.count":0},"tags":{"desc":"1"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=6","tags":{"desc":"6"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":1,"total.count":0},"tags":{"desc":"6"}}]}

[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","group":"desc=4","tags":{"desc":"4"},"points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":1,"total.count":0},"tags":{"desc":"4"}}]}

//the total , not match the inners
[downsample_marco_invalid_order_description:log6] 2016/10/10 18:51:20 D! after join {"name":"eleme_order_invalid_description","tmax":"2016-10-09T00:00:00Z","points":[{"time":"2016-10-09T00:00:00Z","fields":{"inner.count":0,"total.count":561},"tags":null}]}

But if I just add come_from within groupBy in total and inner,it works.

Is there any way to calculate percent the desc tag?

Kapacitor version is Kapacitor 1.0.2 (git: master 1011dba109bf3d83366c87873ec285c7f9140d34)

nathanielc commented 8 years ago

This is what the on property on the join node is for. But is currently has a bug where you need to use at least two tags to get it to work.

This TICkscript should work for you:

var inner = batch
    |query('select sum("count") as count from "hestia"."default"."eleme_order_invalid_description"')
        .every(1d)
        .period(1d)
        .groupBy('come_from', 'desc')
        .align()
        .fill(0)

var total = batch
    |query('select sum("count") as count from "hestia"."default"."eleme_order_invalid_description"')
        .every(1d)
        .period(1d)
        .groupBy('come_from')
        .align()
        .fill(0)
    |log()
        .level('debug')
        .prefix('total')

total
    |join(inner)
        .tolerance(1d)
        .as('total', 'inner')
        .fill(0.0)
        .on('come_from')
    |log()
        .level('debug')
        .prefix('after join')
    |eval(lambda: "inner.count" / "total.count")
        .as('rate')
    |log()
        .level('debug')
        .prefix('after evel')
    |influxDBOut()
        .database('hestia')
        .retentionPolicy('default')
        .measurement('downsample_marco_invalid_order_description')

The join node only joins points that have the same group unless you specify the on property to tell it how to join across groups. In this case you are telling it to join points tagged with come_from,desc to points tagged with come_from on the come_from tag. As a result the points tagged with only the come_from tag are duplicated for each of the matching come_from,desc points. Thus allowing you to calculate percentage of totals.

Also note that I removed the time(1d) from the group by since it isn't necessary since your period and every durations are also 1d.