logstash-plugins / logstash-filter-grok

Grok plugin to parse unstructured (log) data into something structured.
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
Apache License 2.0
122 stars 97 forks

fix memory leak on JRuby 1.x #136

Closed (jakelandis closed 6 years ago)

jakelandis commented 6 years ago

It seems that on JRuby 1.x (tested with jruby-1.7.27), coding the outer loop as a Java 8 lambda-based forEach causes JRuby to leak some internal state. Specifically, NonBlockingHashMapLong$CHM dominates memory. See the referenced bug for a reproduction case. This commit changes the outer loop to a standard for loop to avoid the bug.

Fixes #135

jakelandis commented 6 years ago

@yaauie - thanks!

I updated the loop to use the keySet; I agree it reads much better.

I also updated the commit message (and this PR's first comment) to help clarify. To be precise, the construct being replaced is Java 8's lambda-based forEach, and the replacement is a standard for loop.
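
For anyone following along, the two loop styles being discussed can be sketched in plain Java as below. This is only an illustration: the class `LoopStyles`, the sample field-to-patterns map, and the method names are hypothetical and are not taken from the plugin's code. Run on a regular JVM, both loops behave identically; the leak described in this PR was only observed when the lambda-based form was driven from JRuby 1.x, so this sketch does not reproduce it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LoopStyles {
    // Hypothetical stand-in for a field -> patterns table (not the plugin's real data).
    public static Map<String, List<String>> samplePatterns() {
        Map<String, List<String>> patterns = new LinkedHashMap<>();
        patterns.put("message", Arrays.asList("%{IP:client}", "%{WORD:verb}"));
        patterns.put("path", Arrays.asList("%{URIPATH:uri}"));
        return patterns;
    }

    // Outer loop written as a Java 8 lambda-based forEach (the style being moved away from).
    public static List<String> visitWithLambda(Map<String, List<String>> patterns) {
        List<String> visited = new ArrayList<>();
        patterns.forEach((field, fieldPatterns) -> {
            for (String pattern : fieldPatterns) {
                visited.add(field + "=" + pattern);
            }
        });
        return visited;
    }

    // Outer loop written as a standard for loop over the keySet.
    public static List<String> visitWithForLoop(Map<String, List<String>> patterns) {
        List<String> visited = new ArrayList<>();
        for (String field : patterns.keySet()) {
            for (String pattern : patterns.get(field)) {
                visited.add(field + "=" + pattern);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> patterns = samplePatterns();
        // Both styles visit the same (field, pattern) pairs in the same order.
        System.out.println(visitWithLambda(patterns).equals(visitWithForLoop(patterns)));
    }
}
```

The for-loop version does one extra `get` per key, which is a small price if it sidesteps the JRuby 1.x issue.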

jakelandis commented 6 years ago

I ran a test overnight with this code and the 5.6 branch. It was the same test that helped to identify this issue: 48 workers with a generator input and multiple groks. Prior to this change, there was a noticeable memory issue about 5 hours in, and after about 12 hours Logstash was so unresponsive that YourKit wouldn't even stay attached.

Now:

[image: memory usage graph]

The big drop is a manual GC, after which memory came back to the same level it was at the start.

jakelandis commented 6 years ago

@yaauie - thanks. Odd indeed. Since this only presents itself on JRuby 1.x (which I believe is EOL), I didn't push too hard to understand what is going on under the covers.

elasticsearch-bot commented 6 years ago

Jake Landis merged this into the following branches!

Branch Commits
master eaaac0e85edc597d0abb918abe9f1e8cca5f9545