logstash-plugins / logstash-codec-multiline

Apache License 2.0

Memory Leak issue #28

Closed gmoskovicz closed 8 years ago

gmoskovicz commented 8 years ago

I am using a simple configuration:

input {
    file {
        type => "one_type"
        start_position => "end"
        path => [
            "/Users/Gabriel/Documents/ElasticSearch/Test/logstash-2.1.2/logs/*"
        ]
        sincedb_path => "/Users/Gabriel/Documents/ElasticSearch/Test/.sincedb"
        codec => multiline {
            auto_flush_interval => 120
            max_lines => 100
            max_bytes => "10 MiB"
            pattern => "NVRAZBYCX"
            what => "next"
            negate => true
        }
    }
}

filter {
    drop {}
}

I am generating random log lines with a simple command that appends new lines to a log file inside the logs folder.
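The exact command is not given in this thread. A minimal Python sketch of a comparable generator could look like the following; the function names, the 80-character line length, and the target file name are assumptions made for illustration, not the original command.

```python
import random
import string
import time
from pathlib import Path


def random_line(length: int = 80) -> str:
    """Return one line of random letters and digits (no trailing newline)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choices(alphabet, k=length))


def append_lines(log_path: Path, count: int, delay: float = 0.0) -> None:
    """Append `count` random lines to `log_path`, flushing after each line."""
    # Create the parent folder if it does not exist yet.
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        for _ in range(count):
            handle.write(random_line() + "\n")
            # Flush so a tailing reader sees each line as soon as it is written.
            handle.flush()
            if delay:
                time.sleep(delay)
```

For example, `append_lines(Path("logs/test.log"), count=1000, delay=0.01)` would append 1000 lines to a hypothetical `test.log`; the path would need to point at a file matched by the `path` glob in the configuration above.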

Memory consumption keeps climbing without stopping, and it looks like the GC is unable to collect much garbage:

screen shot 2016-02-08 at 15 07 46

I generated a heap dump from the logstash instance and got the following:

HTML report, to be opened in a browser (remove the .txt extension; GitHub prevents people from uploading .html files):

Objects-by-class.html.txt

Image:

screen shot 2016-02-08 at 15 30 44

gmoskovicz commented 8 years ago

The same test without auto_flush_interval shows better results:

screen shot 2016-02-08 at 15 49 42 1

guyboertje commented 8 years ago

I can confirm that a library we use, concurrent_ruby, is not fully removing cancelled timer tasks.

I am working on two solutions for two different scenarios.
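To illustrate why cancelled timer tasks that are not removed can hold on to memory, here is a toy Python model. It is not concurrent_ruby's code, and the actual mechanism in the library is not described in this thread. The sketch assumes a scheduler whose cancel only flags a task and leaves the entry queued, and it assumes the flush timer is cancelled and rescheduled on every incoming line; `ToyTimerQueue` and `simulate` are hypothetical names.

```python
import heapq
import itertools


class ToyTimerQueue:
    """Illustrative scheduler: cancel() only flags a task, it does not remove it."""

    def __init__(self) -> None:
        self._heap = []                    # entries: (due_time, sequence, task)
        self._counter = itertools.count()  # tie-breaker so tasks are never compared

    def schedule(self, due_time: float) -> dict:
        task = {"cancelled": False}
        heapq.heappush(self._heap, (due_time, next(self._counter), task))
        return task

    def cancel(self, task: dict) -> None:
        task["cancelled"] = True           # assumption: the entry stays queued

    def run_due(self, now: float) -> int:
        """Pop every entry due by `now`; return how many were still live."""
        fired = 0
        while self._heap and self._heap[0][0] <= now:
            _, _, task = heapq.heappop(self._heap)
            if not task["cancelled"]:
                fired += 1
        return fired

    def pending(self) -> int:
        return len(self._heap)


def simulate(lines: int, interval: float, line_gap: float) -> int:
    """Reset the flush timer on every line; return the queue size afterwards."""
    queue = ToyTimerQueue()
    current = None
    for i in range(lines):
        now = i * line_gap
        queue.run_due(now)
        if current is not None:
            queue.cancel(current)          # previous timer is flagged, not removed
        current = queue.schedule(now + interval)
    return queue.pending()
```

In this model, `simulate(1000, 120.0, 0.01)` leaves 1000 entries queued although only one timer is still live, so retained objects grow with the line rate multiplied by the interval. A real leak may behave differently, for instance by never releasing the entries at all.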

ph commented 8 years ago

I can confirm that PR #29 fixes this issue. I used @gmoskovicz's configuration and a script that generates log lines. I ran the script for 2 hours and monitored the memory usage. GC behaved correctly, with no leak.

gmoskovicz commented 8 years ago

Thanks for taking care of this @ph @guyboertje