elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash

A look into metric collection overhead in logstash 5.4.x #7396

Open jsvd opened 7 years ago

jsvd commented 7 years ago

TL;DR

While most metrics in logstash-core are recorded once per batch, input plugins don't produce events in batches, so WrappedWriteClient#<< ends up calling WrappedWriteClient#record_metric for every single event, introducing an almost 40% performance drop in the tests below.
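
For context, a condensed sketch of the call path being described (simplified, with some hypothetical names; not the actual logstash-core source): every event an input pushes goes through WrappedWriteClient#<<, which wraps the single queue push in record_metric, so counters and the timing clock run once per event instead of once per batch.

# Simplified sketch of the per-event path described above (hypothetical, abbreviated).
class WrappedWriteClient
  def <<(event)
    # record_metric wraps every individual event an input pushes into the queue...
    record_metric(1) { @write_client.push(event) }
  end

  private

  def record_metric(size = 1)
    # ...incrementing counters and timing the push on each call.
    @events_metrics.increment(:in, size)
    clock = @events_metrics.time(:queue_push_duration_in_millis)
    result = yield
    clock.stop
    result
  end
end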

Purpose

Study the impact of metric collection on events-per-second throughput in the latest stable Logstash version.

Test method

I set out to compare the event throughput of a generator -> elasticsearch pipeline. For this I used Logstash 5.4 with the latest elasticsearch output. To remove the influence of the network layer, I also modified the elasticsearch output to skip the bulk request and always return a successful response.

Test setup:

Logstash

Downloaded logstash 5.4.0 and updated the logstash-output-elasticsearch to 7.2.1

bin/logstash-plugin install --version "7.2.1" logstash-output-elasticsearch

Mocking network call in elasticsearch output
diff --git a/lib/logstash/outputs/elasticsearch/http_client.rb b/lib/logstash/outputs/elasticsearch/http_client.rb
index daca851..714c4b1 100644
--- a/lib/logstash/outputs/elasticsearch/http_client.rb
+++ b/lib/logstash/outputs/elasticsearch/http_client.rb
@@ -135,12 +135,14 @@ module LogStash; module Outputs; class ElasticSearch;
     def bulk_send(body_stream)
       params = http_compression ? {:headers => {"Content-Encoding" => "gzip"}} : {}
       # Discard the URL 
-      _, response = @pool.post(@bulk_path, params, body_stream.string)
+      #_, response = @pool.post(@bulk_path, params, body_stream.string)
+      body_stream.string
       if !body_stream.closed?
         body_stream.truncate(0)
         body_stream.seek(0)
       end
-      LogStash::Json.load(response.body)
+      #LogStash::Json.load(response.body)
+      {"took" => 30, "errors" => false, "items" => [] }
     end

     def get(path)
elasticsearch

Even though the _bulk requests are mocked, the other requests (template install, health checks) are not, so Elasticsearch must be running on port 9200.
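
Before running, a minimal reachability check for the local node the test assumes (a sketch, assuming the default port):

require 'net/http'

# Quick check that a local Elasticsearch node is answering on the default port.
response = Net::HTTP.get_response(URI('http://localhost:9200/'))
abort("Elasticsearch is not reachable on :9200") unless response.is_a?(Net::HTTPSuccess)
puts "Elasticsearch is up"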

Logstash Pipeline Configuration

input {
  # 10 million messages
  generator { message => '64.242.88.10 - - [07/Mar/2004:16:05:49 -0800] "GET /twiki/bin/edit/Main/Double_bounce_sender?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12846' count => 10000000 }
}
output {
  # es output with bulk network call mocked out. see .patch files
  elasticsearch {
    hosts => "localhost"
    index => "logstash-%{+YYYY.MM.dd.HH}"
    manage_template => false
  }
}

Test executions

a) With metric collection enabled by setting metric.collect: true in the logstash.yml

/tmp/tests/logstash-5.4.0_es_7x % bin/logstash -f ../cfg -w 4 -b 125
Sending Logstash's logs to /tmp/tests/logstash-5.4.0_es_7x/logs which is now configured via log4j2.properties
[2017-06-02T09:18:16,748][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2017-06-02T09:18:16,753][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2017-06-02T09:18:16,889][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<URI::HTTP:0x12607257 URL:http://localhost:9200/>}
[2017-06-02T09:18:16,891][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-06-02T09:18:16,945][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-06-02T09:18:16,956][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#<URI::Generic:0x28599dd URL://localhost>]}
[2017-06-02T09:18:16,966][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-06-02T09:18:16,969][INFO ][logstash.pipeline        ] Pipeline main started
[2017-06-02T09:18:17,046][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-06-02T09:24:04,712][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}
bin/logstash -f ../cfg -w 4 -b 125  643.02s user 66.43s system 196% cpu 6:01.08 total

b) With metric collection disabled by setting metric.collect: false in the logstash.yml

 /tmp/tests/logstash-5.4.0_es_7x % bin/logstash -f ../cfg -w 4 -b 125
Sending Logstash's logs to /tmp/tests/logstash-5.4.0_es_7x/logs which is now configured via log4j2.properties
[2017-06-02T09:01:50,573][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2017-06-02T09:01:50,579][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2017-06-02T09:01:50,720][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<URI::HTTP:0x2982556b URL:http://localhost:9200/>}
[2017-06-02T09:01:50,722][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-06-02T09:01:50,787][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-06-02T09:01:50,806][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#<URI::Generic:0x19c89011 URL://localhost>]}
[2017-06-02T09:01:50,808][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-06-02T09:01:50,833][INFO ][logstash.pipeline        ] Pipeline main started
[2017-06-02T09:01:50,934][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-06-02T09:04:50,222][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}
bin/logstash -f ../cfg -w 4 -b 125  447.07s user 60.67s system 264% cpu 3:12.27 total

Result summary

| logstash version | parameters | metric collection | time between pipeline start and end (s) | total runtime wallclock (s) | ~ eps |
| --- | --- | --- | --- | --- | --- |
| 5.4.0 w/ es 7.2.1 | -w 4 -b 125 | true | 347.743 | 357.16 | 28757 |
| 5.4.0 w/ es 7.2.1 | -w 4 -b 125 | false | 179.389 | 192.27 | 55744 |
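
For reference, the "~ eps" column appears to be the 10 million generated events divided by the time between pipeline start and end; a quick sketch of that arithmetic using the values from the table:

# Approximate events per second: generator event count over pipeline active time.
events = 10_000_000.0
puts "metrics enabled:  ~#{(events / 347.743).round} eps"   # ~28757
puts "metrics disabled: ~#{(events / 179.389).round} eps"   # ~55745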
jsvd commented 7 years ago

I investigated further and found that the slowdown is triggered by (though not necessarily caused by) WrappedWriteClient#record_metric.

I applied the patch below, which removes all metric calls from record_metric:

--- logstash-core/lib/logstash/instrument/wrapped_write_client.rb
+++ logstash-core/lib/logstash/instrument/wrapped_write_client.rb
@@ -29,21 +29,7 @@

     private
     def record_metric(size = 1)
-      @events_metrics.increment(:in, size)
-      @pipeline_metrics.increment(:in, size)
-      @plugin_events_metrics.increment(:out, size)
-
-      clock = @events_metrics.time(:queue_push_duration_in_millis)
-
       result = yield
-
-      # Reuse the same values for all the endpoints to make sure we don't have skew in times.
-      execution_time = clock.stop
-
-      @pipeline_metrics.report_time(:queue_push_duration_in_millis, execution_time)
-      @plugin_events_metrics.report_time(:queue_push_duration_in_millis, execution_time)
-
-      result
     end

     def define_initial_metrics_values

Removing per-event metrics reduces the runtime of the test above from around 340 seconds to 187 seconds. Each metric call I remove shaves about 20 seconds off the total run time.

jsvd commented 7 years ago

Another example: grok also tracks metrics for each event, specifically which grok patterns matched, which also slows performance down significantly. In the test below, a COMMONAPACHELOG grok pattern is applied 5 million times in Logstash 5.4.1, with these results:

  1. metric collection enabled: 4 minutes 18 seconds
  2. metric collection disabled: 2 minutes 39 seconds
  3. metric collection enabled: 4 minutes 26 seconds
 /tmp/logstash-5.4.1 % cat /tmp/grok.cfg && echo "metric.collect: true" > config/logstash.yml && bin/logstash -f /tmp/grok.cfg && echo "metric.collect: false" > config/logstash.yml && bin/logstash -f /tmp/grok.cfg && echo "metric.collect: true" > config/logstash.yml && bin/logstash -f /tmp/grok.cfg
input {
  generator { message => '64.242.88.10 - - [07/Mar/2004:16:05:49 -0800] "GET /twiki/bin/edit/Main/Double_bounce_sender?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12846' count => 5000000 }
}
filter {
  grok { match => { "message" => "%{COMMONAPACHELOG}" } }
}
output { }
[2017-06-09T20:57:40,450][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-06-09T20:57:40,459][INFO ][logstash.pipeline        ] Pipeline main started
[2017-06-09T20:57:40,540][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-06-09T21:01:48,017][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}
bin/logstash -f /tmp/grok.cfg  764.52s user 41.59s system 311% cpu 4:18.60 total
[2017-06-09T21:01:58,264][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-06-09T21:01:58,270][INFO ][logstash.pipeline        ] Pipeline main started
[2017-06-09T21:01:58,366][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-06-09T21:04:27,997][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}
bin/logstash -f /tmp/grok.cfg  535.83s user 35.72s system 357% cpu 2:39.97 total
[2017-06-09T21:04:38,281][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-06-09T21:04:38,288][INFO ][logstash.pipeline        ] Pipeline main started
[2017-06-09T21:04:38,369][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-06-09T21:08:54,057][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}
bin/logstash -f /tmp/grok.cfg  774.91s user 41.25s system 306% cpu 4:26.07 total
jsvd commented 7 years ago

I've done a few more benchmarks to understand the overall impact of producing metrics in a tight loop. If a tight loop that only sets a gauge can do 150k operations per second, inserting that gauge into a single-threaded event-producing loop caps the events per second at that rate.

Code used to test:

require 'logstash-core'
require 'logstash/instrument/collector'
require 'logstash/instrument/metric'
require 'benchmark/ips'

$collector = LogStash::Instrument::Collector.new
$metric = LogStash::Instrument::Metric.new($collector)

def change_one_metric
  $metric.gauge([:metrics], :m1, rand(10))
end

def change_two_metrics
  $metric.gauge([:metrics], :m2, rand(10))
  $metric.gauge([:metrics], :m3, rand(10))
end

def change_three_metrics
  $metric.gauge([:metrics], :m4, rand(10))
  $metric.gauge([:metrics], :m5, rand(10))
  $metric.gauge([:metrics], :m6, rand(10))
end

def change_four_metrics
  $metric.gauge([:metrics], :m7, rand(10))
  $metric.gauge([:metrics], :m8, rand(10))
  $metric.gauge([:metrics], :m9, rand(10))
  $metric.gauge([:metrics], :m10, rand(10))
end

Benchmark.ips do |x|
  x.config(:time => 15, :warmup => 10)

  x.report("change_one_metric", "change_one_metric")
  x.report("change_two_metrics", "change_two_metrics")
  x.report("change_three_metrics", "change_three_metrics")
  x.report("change_four_metrics", "change_four_metrics")

  x.compare!
end

Results:

% bin/ruby bench.rb
Warming up --------------------------------------
   change_one_metric    11.134k i/100ms
  change_two_metrics     6.304k i/100ms
change_three_metrics     4.476k i/100ms
 change_four_metrics     3.604k i/100ms
Calculating -------------------------------------
   change_one_metric    156.890k (± 5.4%) i/s -      2.349M
  change_two_metrics     79.067k (± 5.4%) i/s -      1.185M
change_three_metrics     53.636k (± 6.4%) i/s -    801.204k
 change_four_metrics     41.046k (± 4.6%) i/s -    616.284k

Comparison:
   change_one_metric:   156889.6 i/s
  change_two_metrics:    79067.4 i/s - 1.98x  slower
change_three_metrics:    53635.9 i/s - 2.93x  slower
 change_four_metrics:    41046.1 i/s - 3.82x  slower
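
A quick check on the comparison above (a rough sketch, values copied from the i/s figures): the implied cost per gauge call stays roughly constant at ~6 µs, so the overhead scales about linearly with the number of gauges set per iteration.

# Implied cost per gauge call: 1 / (iterations per second * gauges per iteration).
{ 1 => 156_890, 2 => 79_067, 3 => 53_636, 4 => 41_046 }.each do |gauges, ips|
  printf("%d gauge(s): ~%.1f us per gauge call\n", gauges, 1e6 / (ips * gauges))
end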
ph commented 7 years ago

@jsvd Can you redo the same test with a namespaced metric? This will remove the cost associated with creating the array.

ph commented 7 years ago

@jsvd The biggest problem in the flow is probably this: https://github.com/elastic/logstash/blob/master/logstash-core/lib/logstash/instrument/metric_store.rb#L55

We clone + append; clone is an expensive operation.
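
For illustration, a minimal sketch of the clone + append pattern being referred to (hypothetical code, not the actual MetricStore source): building the key path this way allocates a fresh array on every metric update.

# Hypothetical illustration of the pattern: each update clones the namespace
# array and appends the key, so every metric write pays for a new array.
def key_path(namespaces, key)
  path = namespaces.clone  # allocation on every call
  path << key
  path
end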

jsvd commented 7 years ago

namespaced_bench.rb:

require 'logstash-core'
require 'logstash/instrument/collector'
require 'logstash/instrument/metric'
require 'logstash/instrument/namespaced_metric'
require 'benchmark/ips'

$collector = LogStash::Instrument::Collector.new
$metric = LogStash::Instrument::Metric.new($collector)
$namespaced_metric = $metric.namespace([:metrics])

def change_one_metric
  $namespaced_metric.gauge(:m1, rand(10))
end

def change_two_metrics
  $namespaced_metric.gauge(:m2, rand(10))
  $namespaced_metric.gauge(:m3, rand(10))
end

def change_three_metrics
  $namespaced_metric.gauge(:m4, rand(10))
  $namespaced_metric.gauge(:m5, rand(10))
  $namespaced_metric.gauge(:m6, rand(10))
end

def change_four_metrics
  $namespaced_metric.gauge(:m7, rand(10))
  $namespaced_metric.gauge(:m8, rand(10))
  $namespaced_metric.gauge(:m9, rand(10))
  $namespaced_metric.gauge(:m10, rand(10))
end

Benchmark.ips do |x|
  x.config(:time => 15, :warmup => 10)

  x.report("change_one_metric", "change_one_metric")
  x.report("change_two_metrics", "change_two_metrics")
  x.report("change_three_metrics", "change_three_metrics")
  x.report("change_four_metrics", "change_four_metrics")

  x.compare!
end

Results:

Warming up --------------------------------------
   change_one_metric    12.663k i/100ms
  change_two_metrics     7.296k i/100ms
change_three_metrics     5.075k i/100ms
 change_four_metrics     3.918k i/100ms
Calculating -------------------------------------
   change_one_metric    165.870k (± 5.5%) i/s -      2.482M
  change_two_metrics     85.226k (± 5.0%) i/s -      1.277M
change_three_metrics     57.256k (± 5.0%) i/s -    857.675k
 change_four_metrics     43.208k (± 5.4%) i/s -    646.470k

Comparison:
   change_one_metric:   165870.4 i/s
  change_two_metrics:    85225.8 i/s - 1.95x  slower
change_three_metrics:    57256.5 i/s - 2.90x  slower
 change_four_metrics:    43207.7 i/s - 3.84x  slower
ph commented 7 years ago

Can we redo these tests with the rewrite of the metric types in pure Java (https://github.com/elastic/logstash/pull/7274)? Looking at the profiling trace, we create a lot of instances in concurrent-ruby for the atomic reference.

ph commented 7 years ago

@jsvd can you add your profiling export to this issue?

jsvd commented 7 years ago

profile script:

require 'logstash-core'
require 'logstash/instrument/collector'
require 'logstash/instrument/metric'
require 'logstash/instrument/namespaced_metric'

class Test
  def initialize
    collector = LogStash::Instrument::Collector.new
    metric = LogStash::Instrument::Metric.new(collector)
    @namespaced_metric = metric.namespace([:metrics])
  end

  def change_one_metric
    @namespaced_metric.gauge(:m7, rand(10))
  end
end

$obj = Test.new

10.times { $obj.change_one_metric } # ensure metrics are created

require 'jruby/profiler'

profile_data = JRuby::Profiler.profile do
  i = 0
  while i < 5_000_000
    $obj.change_one_metric
    i += 1
  end
end

#printer = JRuby::Profiler::GraphProfilePrinter.new(profile_data)
printer = JRuby::Profiler::FlatProfilePrinter.new(profile_data)
ps = java.io.PrintStream.new(STDOUT.to_outputstream)
printer.printHeader(ps)
printer.printProfile(ps)
printer.printFooter(ps)

result:

% JRUBY_OPTS="--profile.api" bin/ruby profile.rb
main profile results:
Total time: 77.20

     total        self    children       calls  method
----------------------------------------------------------------
     77.20        2.93       74.27     5000000  Test#change_one_metric
     72.75        2.76       69.99     5000000  LogStash::Instrument::NamespacedMetric#gauge
     69.49        5.04       64.45     5000000  LogStash::Instrument::Metric#gauge
     59.89        5.64       54.26     5000000  LogStash::Instrument::Collector#push
     48.11        8.74       39.37     5000000  LogStash::Instrument::MetricStore#fetch_or_store
     23.63        3.18       20.46     5000000  LogStash::Instrument::MetricType.create
     19.75        2.27       17.49    10000000  Class#new
     17.49        3.57       13.92     5000000  LogStash::Instrument::MetricType::Gauge#initialize
     13.31        3.25       10.06     5000000  Concurrent::Collection::JRubyMapBackend#put_if_absent
     10.53        3.44        7.09     5000000  Concurrent::MutexAtomicReference#initialize
      7.65        6.93        0.72     5000000  Array#hash
      6.15        2.17        3.98     5000000  LogStash::Instrument::MetricType::Gauge#execute
      4.54        3.41        1.13    10000000  Concurrent::Synchronization::JRubyLockableObject#synchronize
      3.98        2.56        1.42     5000000  Concurrent::MutexAtomicReference#set
      3.97        2.81        1.16     5000000  Concurrent::Synchronization::Object#initialize
      3.75        2.68        1.07     5000000  LogStash::Instrument::Metric.validate_key!
      2.41        2.41        0.00     5000000  Array#==
      1.52        1.52        0.00     5000000  Kernel.rand
      1.17        1.17        0.00     5000000  LogStash::Instrument::MetricType::Base#initialize
      1.13        1.13        0.00     5000000  Array#<<
      1.13        1.13        0.00     5000000  Concurrent::MutexAtomicReference#ns_initialize
      0.83        0.83        0.00     5000000  Kernel.dup
      0.72        0.72        0.00    10000000  Symbol#hash
      0.71        0.71        0.00     5000000  Hash#[]
      0.64        0.64        0.00     5000000  Symbol#empty?
      0.61        0.61        0.00     5000000  Concurrent::Synchronization::Object#initialize_volatile_with_cas
      0.55        0.55        0.00     5000000  Concurrent::Synchronization::JRubyObject#initialize
      0.50        0.50        0.00     5000000  LogStash::Instrument::NamespacedMetric#namespace_name
      0.46        0.46        0.00     5000000  Kernel.block_given?
      0.43        0.43        0.00     5000000  Kernel.nil?
      0.42        0.42        0.00     5000000  Kernel.class
      0.39        0.39        0.00     5000000  LogStash::Instrument::Metric#collector
jakelandis commented 7 years ago

@jsvd - Thanks for the test scripts. I ran the most recent one posted against a few different branches. My computer must be faster than yours 😕, since these total times will be CPU bound.

v5.4.3 Tag w/ JRuby 1.7

Total time: 30.39

     total        self    children       calls  method
----------------------------------------------------------------
     30.39        1.20       29.19     5000000  Test#change_one_metric
     28.45        1.12       27.33     5000000  LogStash::Instrument::NamespacedMetric#gauge
     26.95        1.88       25.07     5000000  LogStash::Instrument::Metric#gauge
     22.84        2.13       20.72     5000000  LogStash::Instrument::Collector#push
     18.39        3.03       15.36     5000000  LogStash::Instrument::MetricStore#fetch_or_store
      9.91        1.15        8.76     5000000  LogStash::Instrument::MetricType.create
      8.39        0.72        7.66    10000000  Class#new
      7.66        1.41        6.26     5000000  LogStash::Instrument::MetricType::Gauge#initialize
      5.00        1.46        3.53     5000000  Concurrent::MutexAtomicReference#initialize
      4.40        1.39        3.01     5000000  Concurrent::Collection::JRubyMapBackend#put_if_absent
      2.33        0.76        1.57     5000000  LogStash::Instrument::MetricType::Gauge#execute
      2.09        1.32        0.77     5000000  Concurrent::Synchronization::Object#initialize
      2.03        1.52        0.51     5000000  Array#hash
      1.99        1.47        0.53    10000000  Concurrent::Synchronization::JRubyLockableObject#synchronize
      1.64        1.16        0.48     5000000  LogStash::Instrument::Metric.validate_key!
      1.57        1.02        0.55     5000000  Concurrent::MutexAtomicReference#set
      0.97        0.97        0.00     5000000  Array#==
      0.74        0.74        0.00     5000000  Kernel.rand
      0.53        0.53        0.00     5000000  Concurrent::MutexAtomicReference#ns_initialize
      0.52        0.52        0.00     5000000  Array#<<
      0.51        0.51        0.00    10000000  Symbol#hash
      0.49        0.49        0.00     5000000  LogStash::Instrument::MetricType::Base#initialize
      0.39        0.39        0.00     5000000  Concurrent::Synchronization::Object#initialize_volatile_with_cas
      0.38        0.38        0.00     5000000  LogStash::Instrument::NamespacedMetric#namespace_name
      0.38        0.38        0.00     5000000  Concurrent::Synchronization::JRubyObject#initialize
      0.37        0.37        0.00     5000000  Hash#[]
      0.35        0.35        0.00     5000000  LogStash::Instrument::Metric#collector
      0.31        0.31        0.00     5000000  Kernel.dup
      0.26        0.26        0.00     5000000  Symbol#empty?
      0.24        0.24        0.00     5000000  Kernel.class
      0.23        0.23        0.00     5000000  Kernel.block_given?
      0.22        0.22        0.00     5000000  Kernel.nil?

6.x w/JRuby 9x, Before https://github.com/elastic/logstash/pull/7564 (10007948)

Total time: 30.97

     total        self    children       calls  method
----------------------------------------------------------------
     30.97        1.23       29.74     5000000  Test#change_one_metric
     29.06        1.18       27.88     5000000  LogStash::Instrument::NamespacedMetric#gauge
     27.52        2.00       25.53     5000000  LogStash::Instrument::Metric#gauge
     23.16        2.24       20.92     5000000  LogStash::Instrument::Collector#push
     18.50        2.88       15.62     5000000  LogStash::Instrument::MetricStore#fetch_or_store
     10.16        1.24        8.92     5000000  LogStash::Instrument::MetricType.create
      8.53        0.79        7.74    10000000  Class#new
      7.74        1.36        6.38     5000000  LogStash::Instrument::MetricType::Gauge#initialize
      5.14        1.49        3.64     5000000  Concurrent::MutexAtomicReference#initialize
      4.40        1.39        3.01     5000000  Concurrent::Collection::JRubyMapBackend#put_if_absent
      2.43        0.89        1.54     5000000  LogStash::Instrument::MetricType::Gauge#execute
      2.11        1.31        0.80     5000000  Concurrent::Synchronization::Object#initialize
      2.07        1.40        0.67    10000000  Concurrent::Synchronization::JRubyLockableObject#synchronize
      2.03        1.53        0.50     5000000  Array#hash
      1.82        1.28        0.53     5000000  LogStash::Instrument::Metric.validate_key!
      1.54        1.00        0.53     5000000  Concurrent::MutexAtomicReference#set
      0.98        0.98        0.00     5000000  Array#==
      0.68        0.68        0.00     5000000  Kernel.rand
      0.67        0.67        0.00     5000000  Concurrent::MutexAtomicReference#ns_initialize
      0.54        0.54        0.00     5000000  Array#<<
      0.50        0.50        0.00    10000000  Symbol#hash
      0.45        0.45        0.00     5000000  LogStash::Instrument::MetricType::Base#initialize
      0.42        0.42        0.00     5000000  Concurrent::Synchronization::Object#initialize_volatile_with_cas
      0.38        0.38        0.00     5000000  Hash#[]
      0.37        0.37        0.00     5000000  Concurrent::Synchronization::JRubyObject#initialize
      0.36        0.36        0.00     5000000  LogStash::Instrument::NamespacedMetric#namespace_name
      0.33        0.33        0.00     5000000  LogStash::Instrument::Metric#collector
      0.30        0.30        0.00     5000000  Kernel.dup
      0.27        0.27        0.00     5000000  Kernel.nil?
      0.26        0.26        0.00     5000000  Symbol#empty?
      0.23        0.23        0.00     5000000  Kernel.block_given?
      0.22        0.22        0.00     5000000  Kernel.class

6.x w/JRuby 9x, After (87c3e1178):

Total time: 20.40

     total        self    children       calls  method
----------------------------------------------------------------
     20.40        1.06       19.34     5000000  Test#change_one_metric
     18.59        1.13       17.46     5000000  LogStash::Instrument::NamespacedMetric#gauge
     17.11        1.80       15.32     5000000  LogStash::Instrument::Metric#gauge
     13.13        2.01       11.12     5000000  LogStash::Instrument::Collector#push
      8.89        2.07        6.82     5000000  LogStash::Instrument::MetricStore#fetch_or_store
      5.73        0.58        5.15    10000000  Concurrent::Map#[]
      4.23        1.26        2.97     5000000  Concurrent::Collection::JRubyMapBackend#[]
      2.23        0.71        1.51     5000000  LogStash::Instrument::MetricType::Gauge#execute
      1.93        1.46        0.47     5000000  Array#hash
      1.65        1.18        0.47     5000000  LogStash::Instrument::Metric.validate_key!
      1.51        0.97        0.54     5000000  Concurrent::MutexAtomicReference#set
      1.04        1.04        0.00     5000000  Array#==
      0.75        0.75        0.00     5000000  Kernel.rand
      0.57        0.57        0.00     5000000  Array#<<
      0.54        0.54        0.00     5000000  Concurrent::Synchronization::JRubyLockableObject#synchronize
      0.47        0.47        0.00    10000000  Symbol#hash
      0.44        0.44        0.00    10000000  Kernel.nil?
      0.35        0.35        0.00     5000000  LogStash::Instrument::NamespacedMetric#namespace_name
      0.31        0.31        0.00     5000000  LogStash::Instrument::Metric#collector
      0.30        0.30        0.00     5000000  Kernel.dup
      0.25        0.25        0.00     5000000  Symbol#empty?
      0.22        0.22        0.00     5000000  Kernel.class

The roughly 10-second speedup seems to be primarily due to the change in "LogStash::Instrument::MetricType.create", since we no longer create unneeded objects. In more realistic tests, I was only able to achieve a 7% increase in speed from that bug fix.
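
A minimal sketch of the kind of change being described (hypothetical, not the actual patch): look the metric up in the concurrent map first and only construct a new instance on a miss, instead of building a candidate object on every call just to discard it in put_if_absent.

require 'concurrent'

METRICS = Concurrent::Map.new

def fetch_or_store(key)
  # Fast path: no allocation when the metric already exists.
  existing = METRICS[key]
  return existing if existing

  # Slow path: allocate once; put_if_absent resolves races between threads.
  METRICS.put_if_absent(key, yield) || METRICS[key]
end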

6.x w/JRuby 9x, with the PR at https://github.com/elastic/logstash/pull/7595 (moving to Java-backed counter and gauge objects, but with the store still in Ruby): no difference in performance

Total time: 20.27

     total        self    children       calls  method
----------------------------------------------------------------
     20.27        1.12       19.15     5000000  Test#change_one_metric
     18.52        1.05       17.47     5000000  LogStash::Instrument::NamespacedMetric#gauge
     17.15        1.86       15.29     5000000  LogStash::Instrument::Metric#gauge
     13.14        2.14       11.00     5000000  LogStash::Instrument::Collector#push
      8.94        2.14        6.80     5000000  LogStash::Instrument::MetricStore#fetch_or_store
      5.72        0.59        5.13    10000000  Concurrent::Map#[]
      4.22        1.31        2.91     5000000  Concurrent::Collection::JRubyMapBackend#[]
      2.06        0.86        1.19     5000000  LogStash::Instrument::MetricType::Gauge#execute
      1.98        1.47        0.51     5000000  Array#hash
      1.58        1.07        0.51     5000000  LogStash::Instrument::Metric.validate_key!
      1.19        0.75        0.44     5000000  Kernel.send
      0.93        0.93        0.00     5000000  Array#==
      0.62        0.62        0.00     5000000  Kernel.rand
      0.54        0.54        0.00     5000000  Array#<<
      0.51        0.51        0.00    10000000  Symbol#hash
      0.47        0.47        0.00    10000000  Kernel.nil?
      0.44        0.44        0.00     5000000  Java::OrgLogstashInstrumentMetricsGauge::LazyDelegatingGauge#set
      0.32        0.32        0.00     5000000  LogStash::Instrument::Metric#collector
      0.32        0.32        0.00     5000000  LogStash::Instrument::NamespacedMetric#namespace_name
      0.30        0.30        0.00     5000000  Kernel.dup
      0.28        0.28        0.00     5000000  Symbol#empty?
      0.24        0.24        0.00     5000000  Kernel.class
original-brownbear commented 7 years ago

@jsvd @jakelandis @andrewvc see #7772; this seems to be the main culprit here.

jakelandis commented 7 years ago

@original-brownbear - nice find! I am working on related metrics stuff, and will prioritize #7772.

jakelandis commented 7 years ago

Only a marginal performance increase (~9%) with this test from the change in #7772. Other tests have seen as much as a 15% increase in throughput from #7772.

Total time: 18.61

     total        self    children       calls  method
----------------------------------------------------------------
     18.61        1.13       17.47     5000000  Test#change_one_metric
     16.86        1.07       15.78     5000000  LogStash::Instrument::NamespacedMetric#gauge
     15.45        1.83       13.62     5000000  LogStash::Instrument::Metric#gauge
     11.49        2.06        9.43     5000000  LogStash::Instrument::Collector#push
      7.40        2.50        4.90     5000000  LogStash::Instrument::MetricStore#fetch_or_store
      2.03        0.89        1.14     5000000  LogStash::Instrument::MetricType::Gauge#execute
      1.94        1.46        0.48     5000000  Array#hash
      1.85        0.57        1.27    10000000  Concurrent::Map#[]
      1.59        1.10        0.49     5000000  LogStash::Instrument::Metric.validate_key!
      1.14        0.68        0.46     5000000  Kernel.send
      0.62        0.62        0.00     5000000  Kernel.rand
      0.59        0.59        0.00     5000000  Array#<<
      0.48        0.48        0.00    10000000  Symbol#hash
      0.46        0.46        0.00     5000000  Java::OrgLogstashInstrumentMetricsGauge::LazyDelegatingGauge#set
      0.45        0.45        0.00    10000000  Kernel.nil?
      0.33        0.33        0.00     5000000  LogStash::Instrument::NamespacedMetric#namespace_name
      0.32        0.32        0.00     5000000  Concurrent::Collection::JRubyMapBackend#[]
      0.31        0.31        0.00     5000000  LogStash::Instrument::Metric#collector
      0.30        0.30        0.00     5000000  Kernel.dup
      0.26        0.26        0.00     5000000  Symbol#empty?
      0.24        0.24        0.00     5000000  Kernel.class
original-brownbear commented 7 years ago

@jakelandis @suyograo I think we can close this one, right? All dealt with for now (until the Java implementation comes in).