logstash-plugins / logstash-filter-grok

Grok plugin to parse unstructured (log) data into something structured.
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
Apache License 2.0

Pipeline crashes with undefined method `each' for nil:NilClass error in event filter method #191

Open erhudy opened 1 week ago

erhudy commented 1 week ago

Logstash information:

  1. Logstash version: 7.17.3
  2. Installation source: Elastic container image
  3. Being run: via Docker
  4. Plugin source: included in the container image (Grok 4.4.1)
  5. JVM version: whatever is in the container image (11?)
  6. OS: RHEL 7.7

Description of the problem including expected versus actual behavior: Our Logstash pipelines sometimes crash with a particular error in the Grok plugin:

{
    "level": "ERROR",
    "loggerName": "logstash.javapipeline",
    "timeMillis": 1719522558635,
    "thread": "[beats-plain]>worker11",
    "logEvent": {
        "message": "Pipeline worker error, the pipeline will be stopped",
        "pipeline_id": "beats-plain",
        "error": "(NoMethodError) undefined method `each' for nil:NilClass",
        "exception": {
            "metaClass": {
                "metaClass": {
                    "exception": "Java::OrgJrubyExceptions::NoMethodError",
                    "backtrace": [
                        "usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_filter_minus_grok_minus_4_dot_4_dot_1.lib.logstash.filters.grok.filter(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.1/lib/logstash/filters/grok.rb:300)",
                        "usr.share.logstash.logstash_minus_core.lib.logstash.filters.base.do_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:159)",
                        "usr.share.logstash.logstash_minus_core.lib.logstash.filters.base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:178)",
                        "org.jruby.RubyArray.each(org/jruby/RubyArray.java:1821)",
                        "usr.share.logstash.logstash_minus_core.lib.logstash.filters.base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:175)",
                        "org.logstash.config.ir.compiler.AbstractFilterDelegatorExt.multi_filter(org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:134)",
                        "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start_workers(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:299)"
                    ],
                    "thread": "#<Thread:0x4cc678c3@/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:53 sleep>"
                }
            }
        }
    }
}

I suspect a race condition of some sort: some of our Logstash instances run for weeks or longer without hitting this problem, while others, including relatively low-traffic ones, crash and require a Logstash restart. The crash does not seem tied to traffic volume, since it happens on both low- and high-traffic instances. When it does happen, we get a couple of these tracebacks on different threads in Logstash, the pipeline grinds to a halt, we get an alert that it is no longer passing events, and we have to restart Logstash.
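To make the suspicion concrete, here is a minimal Ruby sketch of the general kind of race that can produce `undefined method 'each' for nil`. It is an illustration only and is not taken from grok.rb: the `patterns` hash, its `"message"` key, and the two threads are all hypothetical, and the queues exist only to force the unlucky interleaving deterministically.

```ruby
require "thread"

# HYPOTHETICAL shared state; not the plugin's real data structure.
patterns = { "message" => ["%{GREEDYDATA:msg}"] }

entry_removed = Queue.new # writer -> reader: "the entry is gone right now"
reader_done   = Queue.new # reader -> writer: "I have finished looking"

# One thread temporarily removes an entry and later restores it.
writer = Thread.new do
  saved = patterns.delete("message")
  entry_removed << true
  reader_done.pop
  patterns["message"] = saved
end

# Another thread iterates the entry during that window.
reader_error = nil
reader = Thread.new do
  entry_removed.pop
  begin
    patterns["message"].each { |pattern| pattern } # lookup returns nil here
  rescue NoMethodError => e
    reader_error = e
  ensure
    reader_done << true
  end
end

[writer, reader].each(&:join)
puts reader_error.class

# A defensive read tolerates the missing entry: Array(nil) is [].
# This hides the symptom; it does not remove the underlying race.
visited = 0
Array(patterns["no_such_field"]).each { visited += 1 }
```

In a real pipeline the window would be narrow and hit only occasionally, which would be consistent with instances running for weeks between crashes. Whether anything like this actually happens at grok.rb:300 is an assumption that still needs to be confirmed against the plugin source.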

Steps to reproduce: Unfortunately there is no known reproducer for the bug; it just happens randomly.

Please include a minimal but complete recreation of the problem, including (e.g.) pipeline definition(s), settings, locale, etc. The easier you make it for us to reproduce, the more likely that somebody will take the time to look at it:

If I had one I would put it here.