logstash-plugins / logstash-filter-metrics

Apache License 2.0
15 stars 30 forks source link

Possible resource leak #11

Open ayashjorden opened 9 years ago

ayashjorden commented 9 years ago

Hi, I've isolated metrics plugin to cause logstash to crash with OutOfMemoryError.

configuration example:

filter { if "metric" not in [tags] { metrics { meter => [ "meter_%{type}", "meter_total" ] add_tag => "metric"

  # flush and clear at the same interval, so counts
  flush_interval => 5
  clear_interval => 5
  # discard rates, because we clear every time
  rates => []
}
if [filter_start] {
    ruby {
        code => "event['filter_duration'] = Time.now.gmtime - event['filter_start']"
        remove_field => [ 'filter_start' ]
    }
}

} else { mutate { replace => [ "type" , "metric" ] } }

}

the bug can be reproduced on both 1.4.2 and 1.5.0.rc2 versions. Only after I removed metrics filter configuration, Logstash got back to normal.

Thanks, Yarden

paetor commented 9 years ago

Even if I replicate the most basic metrics example logstash crashes within 5 events.

ssevertson commented 9 years ago

I concur - the metrics plugin, while constructing new events, has a significant memory leak. I added a few logging statements to metrics.rb, just to observe what was happening:

  def flush_rates(event, name, metric)
      @logger.info("***FLUSH_RATES BEFORE COUNT***", :event => event)
      event["#{name}.count"] = metric.count
      @logger.info("***FLUSH_RATES BEFORE ONEMINUTE***", :event => event)
      event["#{name}.rate_1m"] = metric.one_minute_rate if @rates.include? 1
      @logger.info("***FLUSH_RATES BEFORE FIVEMINUTE***", :event => event)
      event["#{name}.rate_5m"] = metric.five_minute_rate if @rates.include? 5
      @logger.info("***FLUSH_RATES BEFORE FIFTEENMINUTE***", :event => event)
      event["#{name}.rate_15m"] = metric.fifteen_minute_rate if @rates.include? 15
  end

The resulting output was:

{:timestamp=>"2015-09-25T20:49:46.823000-0500", :message=>"***FLUSH_RATES BEFORE COUNT***", :event=>#<LogStash::Event:0x58463e9f @accessors=#<LogStash::Util::Accessors:0x6bb965ff @store={"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com"}, @lut={"message"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com"}, "message"]}>, @data={"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com"}, @cancelled=false>, :level=>:info, :file=>"logstash/filters/metrics.rb", :line=>"223"}
{:timestamp=>"2015-09-25T20:49:46.824000-0500", :message=>"***FLUSH_RATES BEFORE ONEMINUTE***", :event=>#<LogStash::Event:0x58463e9f @accessors=#<LogStash::Util::Accessors:0x6bb965ff @store={"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4}, @lut={"message"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4}, "message"], "web-beta-2.example.com : httpd-editor.count"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4}, "web-beta-2.example.com : httpd-editor.count"]}>, @data={"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4}, @cancelled=false>, :level=>:info, :file=>"logstash/filters/metrics.rb", :line=>"225"}
{:timestamp=>"2015-09-25T20:49:46.829000-0500", :message=>"***FLUSH_RATES BEFORE FIVEMINUTE***", :event=>#<LogStash::Event:0x58463e9f @accessors=#<LogStash::Util::Accessors:0x6bb965ff @store={"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11}, @lut={"message"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11}, "message"], "web-beta-2.example.com : httpd-editor.count"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11}, "web-beta-2.example.com : httpd-editor.count"], "web-beta-2.example.com : httpd-editor.rate_1m"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11}, "web-beta-2.example.com : httpd-editor.rate_1m"]}>, @data={"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11}, @cancelled=false>, :level=>:info, :file=>"logstash/filters/metrics.rb", :line=>"227"}
{:timestamp=>"2015-09-25T20:49:46.829000-0500", :message=>"***FLUSH_RATES BEFORE FIFTEENMINUTE***", :event=>#<LogStash::Event:0x58463e9f @accessors=#<LogStash::Util::Accessors:0x6bb965ff @store={"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11, "web-beta-2.example.com : httpd-editor.rate_5m"=>0.006367955078965164}, @lut={"message"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11, "web-beta-2.example.com : httpd-editor.rate_5m"=>0.006367955078965164}, "message"], "web-beta-2.example.com : httpd-editor.count"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11, "web-beta-2.example.com : httpd-editor.rate_5m"=>0.006367955078965164}, "web-beta-2.example.com : httpd-editor.count"], "web-beta-2.example.com : httpd-editor.rate_1m"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11, "web-beta-2.example.com : httpd-editor.rate_5m"=>0.006367955078965164}, "web-beta-2.example.com : httpd-editor.rate_1m"], "web-beta-2.example.com : httpd-editor.rate_5m"=>[{"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11, "web-beta-2.example.com : httpd-editor.rate_5m"=>0.006367955078965164}, "web-beta-2.example.com : httpd-editor.rate_5m"]}>, @data={"@version"=>"1", "@timestamp"=>"2015-09-26T01:49:46.810Z", "message"=>"logstash.example.com", "web-beta-2.example.com : httpd-editor.count"=>4, "web-beta-2.example.com : httpd-editor.rate_1m"=>2.5564659205884422e-11, "web-beta-2.example.com : httpd-editor.rate_5m"=>0.006367955078965164}, @cancelled=false>, :level=>:info, :file=>"logstash/filters/metrics.rb", :line=>"229"}

Every time a property was added to the "event" instance, the printed object was ~double the length. It's pretty clear that the issue is in accessors.rb::Accessors::lookup(accessor) - it's the only place where @lut is used that would be relevant to an event.

Is there another way of adding properties to an event that doesn't trigger this issue? Just wondering if there's a quick work-around for this plug-in, until the main project is fixed?

Thanks! --Scott