influxdata / kapacitor


Kapacitor goes OutOfMemory at regular intervals #1994

Open rohit3682 opened 6 years ago

rohit3682 commented 6 years ago

We have written a task that raises an alert when the count is not greater than zero over a one-hour window, evaluated every hour. The Kapacitor service goes OutOfMemory at regular intervals. Details follow.

Machine EC2 (m4.xlarge)

Task (has been anonymized)

dbrp "db"."90day"

var duration = 1h
var crit = 0

var data = stream
    |from()
        .measurement('line_count')
        .groupBy('d_number','r_number')
    |default()
        .tag('type', 'old')
    |where(lambda: "type" == 'device')
    |window()
        .period(duration)
        .every(duration)
    |sum('count')
        .as('total_count')
    |eval(lambda: int("total_count"))
        .as('count_int')

var alert = data
  |alert()
    .id('D_Outage/{{ index .Tags "r_number" }}/{{index .Tags "d_number" }}')
    .message('D {{ index .Tags "d_number"}} on LCP {{ index .Tags "r_number" }} is {{.Level}}')
    .crit(lambda: "count_int" == crit)
    .critReset(lambda: "count_int" > crit)

alert
  .post('https://server.com/alert/kapacitor')

Heap profile (pprof)

(pprof) top 10
Showing nodes accounting for 6645.43MB, 89.51% of 7424.30MB total
Dropped 227 nodes (cum <= 37.12MB)
Showing top 10 nodes out of 120
      flat  flat%   sum%        cum   cum%
 2524.86MB 34.01% 34.01%  2524.86MB 34.01%  bytes.makeSlice
 1230.25MB 16.57% 50.58%  1352.26MB 18.21%  github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).unmarshalBinary
  885.74MB 11.93% 62.51%   885.74MB 11.93%  github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parsePoint
  630.71MB  8.50% 71.00%   630.71MB  8.50%  github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.Tags.Map
  489.19MB  6.59% 77.59%   489.19MB  6.59%  github.com/influxdata/kapacitor/tick/stateful.NewFunctions
  478.18MB  6.44% 84.03%   502.18MB  6.76%  github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parseTags
     122MB  1.64% 85.68%      122MB  1.64%  github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).StringValue
   95.96MB  1.29% 86.97%    95.96MB  1.29%  encoding/json.(*Decoder).refill
   95.52MB  1.29% 88.26%    95.52MB  1.29%  github.com/influxdata/kapacitor/edge.(*pointMessage).GroupInfo
   93.01MB  1.25% 89.51%    93.01MB  1.25%  github.com/influxdata/kapacitor/edge.(*pointMessage).ShallowCopy
(pprof) list makeSlice
Total: 7.25GB
ROUTINE ======================== bytes.makeSlice in /usr/local/go/src/bytes/buffer.go
    2.47GB     2.47GB (flat, cum) 34.01% of Total
         .          .    225:   defer func() {
         .          .    226:       if recover() != nil {
         .          .    227:           panic(ErrTooLarge)
         .          .    228:       }
         .          .    229:   }()
    2.47GB     2.47GB    230:   return make([]byte, n)
         .          .    231:}
         .          .    232:
         .          .    233:// WriteTo writes data to w until the buffer is drained or an error occurs.
         .          .    234:// The return value n is the number of bytes written; it always fits into an
         .          .    235:// int, but it is int64 to match the io.WriterTo interface. Any error
(pprof)

Kapacitor Version

[root@ip-xx-xx-xx-xx ec2-user]# kapacitor version
Kapacitor OSS 1.5.0 (git: HEAD 4f10efc41b4dcac070495cf95ba2c41cfcc2aa3a)

Output of kapacitor show

DOT:
digraph dvc_out_alert {
graph [throughput="21.00 points/s"];

stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="940758"];

from1 [avg_exec_time_ns="31.199µs" errors="0" working_cardinality="0" ];
from1 -> default2 [processed="939757"];

default2 [avg_exec_time_ns="4.913µs" errors="0" fields_defaulted="0" tags_defaulted="0" working_cardinality="0" ];
default2 -> where3 [processed="938756"];

where3 [avg_exec_time_ns="27.573µs" errors="0" working_cardinality="50939" ];
where3 -> window4 [processed="849893"];

window4 [avg_exec_time_ns="5.107µs" errors="0" working_cardinality="49689" ];
window4 -> sum5 [processed="152363"];

sum5 [avg_exec_time_ns="10.059µs" errors="0" working_cardinality="48742" ];
sum5 -> eval6 [processed="151362"];

eval6 [avg_exec_time_ns="8.573µs" errors="0" working_cardinality="48729" ];
eval6 -> alert7 [processed="150361"];

alert7 [alerts_inhibited="0" alerts_triggered="21060" avg_exec_time_ns="729.940286ms" crits_triggered="17034" errors="0" infos_triggered="0" oks_triggered="4026" warns_triggered="0" working_cardinality="48713" ];
}
rohit3682 commented 6 years ago

[screenshot: 2018-07-05 4:25:56 PM]

[screenshot: 2018-07-05 4:57:56 PM]

faskiri commented 6 years ago

+1 for pprof. Can you please point to instructions on how you got that to work? Is this from /v1/debug/pprof? I debugged a similar issue (which turned out to be a bad join) using much more painful methods.

faskiri commented 6 years ago

Btw,

  |window()
        .period(duration)
        .every(duration)

means the data is held for one hour, so you can expect some buildup. Btw, I used a batch query with a shift/offset to avoid that deliberate memory build-up when I needed this kind of look-back query (rough sketch below, after the linked issue). Also note a side effect: after a restart the system will be silent for an hour while it rebuilds that hour of state.

Refer: https://github.com/influxdata/kapacitor/issues/1992
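For reference, a rough (untested) sketch of that batch approach, reusing the database, measurement, tag, and field names from the task above; the 5m offset is just an example of shifting the query window into the past:

dbrp "db"."90day"

var duration = 1h
var crit = 0

var data = batch
    |query('''
        SELECT sum("count") AS total_count
        FROM "db"."90day"."line_count"
        WHERE "type" = 'device'
    ''')
        .period(duration)
        .every(duration)
        .offset(5m)
        .groupBy('d_number', 'r_number')

data
    |eval(lambda: int("total_count"))
        .as('count_int')
    |alert()
        .id('D_Outage/{{ index .Tags "r_number" }}/{{ index .Tags "d_number" }}')
        .message('D {{ index .Tags "d_number" }} on LCP {{ index .Tags "r_number" }} is {{ .Level }}')
        .crit(lambda: "count_int" == crit)
        .critReset(lambda: "count_int" > crit)
        .post('https://server.com/alert/kapacitor')

This pulls one aggregated point per group per hour instead of buffering every raw point inside Kapacitor.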

rohit3682 commented 6 years ago

For pprof, set pprof-enabled to true in kapacitor.conf, fetch the profile via the debug endpoints described at https://docs.influxdata.com/kapacitor/v1.5/working/api/#miscellaneous, and inspect the file with go tool pprof.
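Roughly, assuming the default bind-address of :9092 (adjust host/port for your setup):

# kapacitor.conf, [http] section
[http]
  pprof-enabled = true

# restart kapacitord, then pull a heap profile and open it with the Go toolchain
curl -o heap.pprof "http://localhost:9092/kapacitor/v1/debug/pprof/heap"
go tool pprof heap.pprof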

rohit3682 commented 6 years ago

We cannot use batch, as it increases the load on an already overloaded InfluxDB server.

faskiri commented 6 years ago

I wonder if you found a solution. Also, a stream query should be significantly more expensive than a batch one. Did you do profiling that shows batch is more expensive for you? Do you mind sharing the result? Note that streaming requires InfluxDB to push all the points to Kapacitor. The first tuning I did was to stop Kapacitor from subscribing to all points, which caused a decent drop in InfluxDB CPU usage.
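For what it's worth, that subscription tuning lives in the [[influxdb]] section of kapacitor.conf. Something roughly like this (a sketch; the URL and the commented-out exclusion are placeholders, check the sample config for your version) limits the subscription to the one database/retention policy the task reads instead of everything:

[[influxdb]]
  enabled = true
  urls = ["http://localhost:8086"]
  # Subscribe only to what the tasks actually need; if this table is empty,
  # Kapacitor subscribes to every database and retention policy.
  [influxdb.subscriptions]
    db = ["90day"]
  # Noisy databases/retention policies can also be excluded explicitly.
  [influxdb.excluded-subscriptions]
    # some_other_db = ["autogen"]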

pdvyas commented 6 years ago

As @faskiri pointed out, the window node stores points for an hour and emits them all at once. You are only computing a sum, which doesn't actually require storing all the points; it can be calculated on the fly. You could try writing a custom UDF ('windowSum'?) that keeps a running sum of a field for each group and emits it once the duration elapses (the moving-average UDF example is a good starting point). I am in the same boat, where simple aggregations like last, count, and avg over a window consume a lot of memory. I tried implementing such a UDF this evening but noticed that Kapacitor still kept leaking memory (a bit more slowly), as also reported in https://github.com/influxdata/kapacitor/issues/2051