pwillie / prometheus-es-adapter

Prometheus remote storage adapter for Elasticsearch
Apache License 2.0

Memory leak #22

Open sevagh opened 5 years ago

sevagh commented 5 years ago

Hello.

I just upgraded to the latest version of this adapter (with Prometheus 2.6 pollers), running with the following ES settings:

ES_WORKERS=4
ES_BATCH_MAX_DOCS=-1
ES_BATCH_MAX_SIZE=-1

The adapter has a memory leak. I added the pprof debug endpoint (a sketch of the wiring follows the profiles below). Here's the pprof/heap -top output:

Showing nodes accounting for 154.44MB, 98.21% of 157.26MB total
Dropped 48 nodes (cum <= 0.79MB)
      flat  flat%   sum%        cum   cum%
   52.01MB 33.07% 33.07%    80.02MB 50.88%  github.com/pwillie/prometheus-es-adapter/pkg/elasticsearch.(*WriteService).Write
   45.51MB 28.94% 62.01%    45.51MB 28.94%  github.com/pwillie/prometheus-es-adapter/vendor/gopkg.in/olivere/elastic%2ev6.(*BulkIndexRequest).Source
   28.01MB 17.81% 79.82%    28.01MB 17.81%  github.com/pwillie/prometheus-es-adapter/vendor/gopkg.in/olivere/elastic%2ev6.NewBulkIndexRequest (inline)
   18.50MB 11.76% 91.59%    18.50MB 11.76%  github.com/pwillie/prometheus-es-adapter/vendor/github.com/prometheus/prometheus/prompb.(*Label).Unmarshal

Another one:

File: prometheus-es-adapter
Type: inuse_space
Time: Feb 20, 2019 at 4:14am (PST)
Showing nodes accounting for 357.28MB, 99.08% of 360.59MB total
Dropped 57 nodes (cum <= 1.80MB)
      flat  flat%   sum%        cum   cum%
  110.53MB 30.65% 30.65%   168.04MB 46.60%  github.com/pwillie/prometheus-es-adapter/pkg/elasticsearch.(*WriteService).Write
   92.51MB 25.66% 56.31%    93.01MB 25.80%  github.com/pwillie/prometheus-es-adapter/vendor/gopkg.in/olivere/elastic%2ev6.(*BulkIndexRequest).Source
   57.51MB 15.95% 72.26%    57.51MB 15.95%  github.com/pwillie/prometheus-es-adapter/vendor/gopkg.in/olivere/elastic%2ev6.NewBulkIndexRequest (inline)
   39.50MB 10.95% 83.21%    39.50MB 10.95%  github.com/pwillie/prometheus-es-adapter/vendor/github.com/prometheus/prometheus/prompb.(*Label).Unmarshal
   25.30MB  7.02% 90.23%    25.30MB  7.02%  bytes.(*Buffer).String (inline)
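
Both profiles were captured through the pprof debug endpoint mentioned above. For reference, a minimal sketch of wiring such an endpoint in with the standard library's net/http/pprof package; the :9000 listen address is only an assumption matching the go tool pprof commands further down:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// Serve the profiling endpoints on a side listener; the real adapter
	// would run its own work instead of blocking on select{}.
	go func() {
		log.Println(http.ListenAndServe(":9000", nil))
	}()
	select {}
}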

It's mostly from this method: https://github.com/pwillie/prometheus-es-adapter/blob/master/pkg/elasticsearch/write.go#L69

Moreover, inuse_space doesn't seem to show the source of the leak; I think it's a GC problem (though I'm not sure how much further I can debug it). When I use pprof -alloc_space, the bulk of the leak becomes evident:

Dropped 240 nodes (cum <= 0.48GB)
Showing top 10 nodes out of 71
      flat  flat%   sum%        cum   cum%
   25.93GB 26.98% 26.98%    44.49GB 46.29%  github.com/pwillie/prometheus-es-adapter/vendor/gopkg.in/olivere/elastic%2ev6.(*BulkIndexRequest).Source
   11.08GB 11.53% 38.51%    11.08GB 11.53%  bytes.makeSlice
    5.94GB  6.18% 44.69%     9.33GB  9.70%  github.com/pwillie/prometheus-es-adapter/pkg/elasticsearch.(*WriteService).Write
    4.20GB  4.37% 49.06%    18.56GB 19.31%  encoding/json.Marshal
    4.08GB  4.25% 53.31%     6.39GB  6.65%  github.com/pwillie/prometheus-es-adapter/vendor/github.com/prometheus/prometheus/prompb.(*TimeSeries).Unmarshal

Running with pprof list Write:

ROUTINE ======================== github.com/pwillie/prometheus-es-adapter/pkg/elasticsearch.(*WriteService).Write in /go/src/github.com/pwillie/prometheus-es-adapter/pkg/elasticsearch/write.go
    6.52GB    10.23GB (flat, cum)  9.73% of Total
         .          .     66:}
         .          .     67:
         .          .     68:// Write will enqueue Prometheus sample data to be batch written to Elasticsearch
         .          .     69:func (svc *WriteService) Write(req []*prompb.TimeSeries) {
         .          .     70:   for _, ts := range req {
  884.04MB   884.04MB     71:           metric := make(model.Metric, len(ts.Labels))
         .          .     72:           for _, l := range ts.Labels {
    5.11GB     5.11GB     73:                   metric[model.LabelName(l.Name)] = model.LabelValue(l.Value)
         .          .     74:           }
         .          .     75:           for _, s := range ts.Samples {
         .          .     76:                   v := float64(s.Value)
         .          .     77:                   if math.IsNaN(v) || math.IsInf(v, 0) {
         .    41.51MB     78:                           svc.logger.Debug(fmt.Sprintf("invalid value %+v, skipping sample %+v", v, s))
         .          .     79:                           continue
         .          .     80:                   }
         .          .     81:                   sample := prometheusSample{
         .          .     82:                           metric,
         .          .     83:                           v,
         .          .     84:                           s.Timestamp,
         .          .     85:                   }
         .          .     86:                   r := elastic.
         .     3.67GB     87:                           NewBulkIndexRequest().
         .          .     88:                           Index(svc.config.Alias).
         .          .     89:                           Type(sampleType).

Here's the Prometheus protobuf unmarshaling code:

ROUTINE ======================== github.com/pwillie/prometheus-es-adapter/vendor/github.com/prometheus/prometheus/prompb.(*WriteRequest).Unmarshal in /go/src/github.com/pwillie/prometheus-es-adapter/vendor/github.com/prometheus/prometheus/prompb/remote.pb.go
    1.25GB     8.24GB (flat, cum)  7.83% of Total
         .          .    493:                   }
         .          .    494:                   postIndex := iNdEx + msglen
         .          .    495:                   if postIndex > l {
         .          .    496:                           return io.ErrUnexpectedEOF
         .          .    497:                   }
    1.25GB     1.25GB    498:                   m.Timeseries = append(m.Timeseries, &TimeSeries{})
         .     6.99GB    499:                   if err := m.Timeseries[len(m.Timeseries)-1].Unmarshal(dAtA[iNdEx:postIndex]); err != nil {
         .          .    500:                           return err
         .          .    501:                   }
         .          .    502:                   iNdEx = postIndex
         .          .    503:           default:
         .          
sevagh commented 5 years ago

Looks like after the metric map is constructed and embedded in the BulkIndexRequest sample, it's never freed/released?
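
That would be consistent with how the bulk queue works: each queued request keeps its document (and therefore the shared metric map) reachable until a flush commits the batch, and with the count and size triggers disabled nothing ever lets go of it. A hypothetical illustration, not the adapter's actual code:

package example

import (
	"github.com/prometheus/common/model"
	elastic "gopkg.in/olivere/elastic.v6"
)

// enqueue shows the lifetime in question: the BulkIndexRequest built here
// holds a reference to doc (and the metric map inside it) until the bulk
// processor commits the batch it belongs to.
func enqueue(bulk *elastic.BulkProcessor, metric model.Metric, value float64, ts int64) {
	doc := map[string]interface{}{
		"metric":    metric, // placeholder document shape
		"value":     value,
		"timestamp": ts,
	}
	r := elastic.NewBulkIndexRequest().
		Index("prometheus"). // placeholder for the configured alias
		Type("sample").
		Doc(doc)
	bulk.Add(r) // queued in memory; only a flush releases the reference
}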

sevagh commented 5 years ago

Inuse space good:

shanssian:prometheus-es-adapter $ go tool pprof -inuse_space -top http://bleep:9000/debug/pprof/heap | head -n10
Fetching profile over HTTP from http://bleep:9000/debug/pprof/heap
Saved profile in /home/shanssian/pprof/pprof.prometheus-es-adapter.alloc_objects.alloc_space.inuse_objects.inuse_space.039.pb.gz
File: prometheus-es-adapter
Type: inuse_space
Time: Feb 20, 2019 at 4:40am (PST)
Showing nodes accounting for 65.88MB, 100% of 65.88MB total
      flat  flat%   sum%        cum   cum%
   25.01MB 37.96% 37.96%    32.51MB 49.34%  github.com/pwillie/prometheus-es-adapter/pkg/elasticsearch.(*WriteService).Write
      20MB 30.36% 68.32%       20MB 30.36%  github.com/pwillie/prometheus-es-adapter/vendor/gopkg.in/olivere/elastic%2ev6.(*BulkIndexRequest).Source
    7.50MB 11.39% 79.71%     7.50MB 11.39%  github.com/pwillie/prometheus-es-adapter/vendor/gopkg.in/olivere/elastic%2ev6.NewBulkIndexRequest (inline)
       7MB 10.63% 90.33%        7MB 10.63%  github.com/pwillie/prometheus-es-adapter/vendor/github.com/prometheus/prometheus/prompb.(*Label).Unmarshal
    2.40MB  3.64% 93.97%     2.40MB  3.64%  bytes.(*Buffer).String (inline)

Alloc_space bad:

File: prometheus-es-adapter
Type: alloc_space
Time: Feb 20, 2019 at 4:40am (PST)
Showing nodes accounting for 167229.67MB, 98.79% of 169269.66MB total
Dropped 264 nodes (cum <= 846.35MB)
      flat  flat%   sum%        cum   cum%
45555.63MB 26.91% 26.91% 78240.27MB 46.22%  github.com/pwillie/prometheus-es-adapter/vendor/gopkg.in/olivere/elastic%2ev6.(*BulkIndexRequest).Source
19662.38MB 11.62% 38.53% 19662.38MB 11.62%  bytes.makeSlice
10465.85MB  6.18% 44.71% 16375.52MB  9.67%  github.com/pwillie/prometheus-es-adapter/pkg/elasticsearch.(*WriteService).Write
 7419.20MB  4.38% 49.10% 32684.64MB 19.31%  encoding/json.Marshal

Png dump of alloc_space: profile006

sevagh commented 5 years ago

Inuse_space for good measure (since I may be misunderstanding alloc_space): profile007. Also, runtime.MemStats:

# runtime.MemStats
# Alloc = 163905888
# TotalAlloc = 197824212744
# Sys = 1208749816
# Lookups = 0
# Mallocs = 2158845849
# Frees = 2157028671
# HeapAlloc = 163905888
# HeapSys = 1136525312
# HeapIdle = 778420224
# HeapInuse = 358105088
# HeapReleased = 0
# HeapObjects = 1817178
# Stack = 4325376 / 4325376
# MSpan = 8442536 / 15089664
# MCache = 55296 / 65536
# BuckHashSys = 1904911
# GCSys = 45088768
# OtherSys = 5750249
# NextGC = 188449376
# LastGC = 1550666769451819398
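
Reading those numbers: Mallocs and Frees differ by only ~1.8M objects (matching HeapObjects), and most of the heap is HeapIdle with HeapReleased = 0, i.e. the runtime is holding on to freed memory rather than returning it to the OS, which makes the process RSS look worse than the live heap. A small sketch (plain runtime API, not adapter code) of watching these fields over time:

package main

import (
	"log"
	"runtime"
	"time"
)

func main() {
	for range time.Tick(30 * time.Second) {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		// HeapInuse is what live objects occupy; HeapIdle minus HeapReleased
		// is memory the runtime keeps but has not given back to the OS.
		log.Printf("inuse=%dMB idle=%dMB released=%dMB objects=%d",
			m.HeapInuse>>20, m.HeapIdle>>20, m.HeapReleased>>20, m.HeapObjects)
	}
}
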
sevagh commented 5 years ago

I'm still monitoring the situation, but it looks like adding ES_BATCH_MAX_AGE=30 prevents the issue. Perhaps it's related to when the GC runs?
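
For what it's worth, ES_BATCH_MAX_AGE presumably maps to the bulk processor's FlushInterval in olivere/elastic, and with the count and size triggers disabled (-1) a time-based flush is the only thing that ever commits, and thereby releases, the queued requests. A minimal sketch of that configuration; the endpoint and worker count are placeholders:

package main

import (
	"context"
	"log"
	"time"

	elastic "gopkg.in/olivere/elastic.v6"
)

func main() {
	client, err := elastic.NewClient(elastic.SetURL("http://localhost:9200")) // placeholder endpoint
	if err != nil {
		log.Fatal(err)
	}

	// BulkActions(-1) and BulkSize(-1) turn off the count/size flush triggers,
	// so FlushInterval is what periodically drains the in-memory queue.
	bulk, err := client.BulkProcessor().
		Workers(4).
		BulkActions(-1).
		BulkSize(-1).
		FlushInterval(30 * time.Second). // what ES_BATCH_MAX_AGE=30 would correspond to
		Do(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	defer bulk.Close()
}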

sevagh commented 5 years ago

I've been comparing the code against the previously deployed version of this adapter (which still works fine). Since your wrapper around the BulkWriter is largely the same, just moved around to different files, the only change I can see is this one: https://github.com/pwillie/prometheus-es-adapter/commit/616b927fcb3522db5ca517ad42465deee042a44d#diff-836546cc53507f6b2d581088903b1785 (the Elastic client change). I'll file an issue there.

pwillie commented 5 years ago

Thanks for the detailed report. Not sure when I will get to this but please keep me posted with any further developments.

Kanshiroron commented 5 years ago

Do you guys have any update on this? We've been using this adapter and it regularly gets evicted from Kubernetes because it consumes too much RAM (10GB+ in under 24h)... Thanks for your support!

sevagh commented 5 years ago

I've actually switched to using metricbeat and the Prometheus module. You can use metricbeat to scrape the same targets as Prometheus, rather than go through the remote_* adapters.

Kanshiroron commented 5 years ago

Thanks @sevagh for the workaround; sadly it won't work for us, as the Prometheus module doesn't seem to be compatible with AWS/Kubernetes service discovery...

pwillie commented 5 years ago

@Kanshiroron, I haven't had a chance to dig into this one yet sorry.

Kanshiroron commented 5 years ago

Hey @pwillie, I am currently testing a workaround; I'll keep you posted with the results. Thanks for following up.

Kanshiroron commented 5 years ago

Hey @pwillie, after tweaking the adapter configuration I no longer see the issue. It has been running for 10 days now without any problems. I changed the default settings to the following:

  ES_WORKERS: "20"
  ES_BATCH_MAX_AGE: "30" # in seconds
  ES_BATCH_MAX_DOCS: "5000"
  ES_BATCH_MAX_SIZE: "16384" # in bytes
  ES_INDEX_SHARDS: "1"
  ES_INDEX_REPLICAS: "2"

I suspect the number of shards and replicas was causing the issue (we used to have 5 index shards and 3 replicas).

pwillie commented 5 years ago

Thanks for the update and great to hear of your experience.

Kanshiroron commented 5 years ago

You're welcome. What I think was happening is that the es-adapter was receiving more metrics than it was able to push to Elasticsearch. It would be nice to have an internal check to make sure the number of metrics in the buffer doesn't keep increasing.
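
If the adapter exposed its *elastic.BulkProcessor with stats enabled, that check could be built on the client's own counters. A rough sketch, assuming the processor was created with .Stats(true); watchBacklog is a hypothetical helper, not existing adapter code:

package example

import (
	"log"
	"time"

	elastic "gopkg.in/olivere/elastic.v6"
)

// watchBacklog periodically logs how many requests are still queued in the
// bulk processor's workers; a steadily growing number would mean the adapter
// receives samples faster than it can push them to Elasticsearch.
func watchBacklog(bulk *elastic.BulkProcessor) {
	for range time.Tick(time.Minute) {
		stats := bulk.Stats() // only populated when Stats(true) was set
		var queued int64
		for _, w := range stats.Workers {
			queued += w.Queued
		}
		log.Printf("bulk queue: queued=%d committed=%d failed=%d",
			queued, stats.Committed, stats.Failed)
	}
}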