elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash

[BUG] starting database-managed geoip filter after db refresh causes pipeline crash #16221

Closed · yaauie closed this 2 months ago

yaauie commented 2 months ago

Logstash information:

Please include the following information:

  1. Logstash version:
    • 8.12.x
    • 8.13.x
    • 8.14.0-8.14.1
  2. Logstash installation source: ANY
  3. How is Logstash being run: ANY

Plugins installed: (bin/logstash-plugin list --verbose)

logstash-filter-geoip

JVM (e.g. java -version):

  1. JVM version: ANY
  2. JVM installation source: N/A
  3. Value of the LS_JAVA_HOME environment variable if set: N/A

OS version (uname -a if on a Unix-like system): ANY

Description of the problem including expected versus actual behavior:

When a geoip filter is initialized after the geoip database management has updated its databases, the newly-initialized filter can receive a database path that no longer exists on disk.

The GeoIP filter crashes when it is given a path that does not exist on disk.

_[NOTE: only the geoip filter is affected; the elastic_integration filter, which also uses Logstash's Geoip Database Management feature, is NOT affected by this issue.]_

Steps to reproduce:

Please include a minimal but complete recreation of the problem, including (e.g.) pipeline definition(s), settings, locale, etc. The easier you make it for us to reproduce it, the more likely it is that somebody will take the time to look at it.

  1. start a pipeline with a geoip filter that is configured to use Logstash's geoip database management (e.g., no database_path directive, with xpack.geoip.downloader.enabled: true in logstash.yml); see the illustrative configuration sketch after this list
  2. wait for the database to be updated (may take days)
  3. trigger a reload of the pipeline (e.g., by changing the id of a plugin)
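For reference, a minimal setup exercising step 1 might look like the following sketch; the input, field names, and plugin id are illustrative and not taken from the report:

```
# logstash.yml (sketch)
xpack.geoip.downloader.enabled: true   # use Logstash's geoip database management
config.reload.automatic: true          # lets the id change in step 3 trigger a reload
```

```
# pipeline.conf (sketch): geoip filter with no database_path directive,
# so it obtains its database from the database manager
input {
  generator {
    count   => 1
    message => '{"client":{"ip":"8.8.8.8"}}'
  }
}
filter {
  json  { source => "message" }
  geoip {
    source => "[client][ip]"
    target => "client"
    id     => "geoip-v1"   # change this id later to trigger the pipeline reload in step 3
  }
}
output {
  stdout { codec => rubydebug }
}
```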

Provide logs (if relevant):

The database manager updates its database and correctly notifies the plugin instances in all currently-running pipelines before deleting the stale databases:

[2024-06-14T13:01:26,372][INFO ][logstash.geoipdatabasemanagement.manager] managed geoip database has been updated on disk {:database_type=>"ASN", :database_path=>"/var/lib/logstash/geoip_database_management/1718370071/GeoLite2-ASN.mmdb"}
[2024-06-14T13:01:29,223][INFO ][logstash.geoipdatabasemanagement.manager] managed geoip database has been updated on disk {:database_type=>"City", :database_path=>"/var/lib/logstash/geoip_database_management/1718370071/GeoLite2-City.mmdb"}
[2024-06-14T13:01:26,372][INFO ][logstash.filters.geoip.databasemanager] geoip filter plugin will use database /var/lib/logstash/geoip_database_management/1718370071/GeoLite2-City.mmdb {:database_type=>"City", :pipeline_ids=>["main"]}
[2024-06-14T13:01:29,280][INFO ][logstash.geoipdatabasemanagement.manager] Stale database directory `/var/lib/logstash/geoip_database_management/1718283370` has been deleted

Reload of the pipeline causes a new instance of the plugin to pick up the stale path and crash:

[2024-06-14T13:23:30,772][INFO ][logstash.filters.geoip   ][main] Using geoip database {:path=>"/var/lib/logstash/geoip_database_management/1718283370/GeoLite2-City.mmdb"}

...

[2024-06-14T13:23:31,755][ERROR][logstash.javapipeline    ][main] Pipeline error {:pipeline_id=>"main", :exception=>#<Java::JavaLang::IllegalArgumentException: The database provided was not found in the path>, :backtrace=>["org.logstash.filters.geoip.GeoIPFilter.<init>(org/logstash/filters/geoip/GeoIPFilter.java:92)", "jdk.internal.reflect.GeneratedConstructorAccessor109.newInstance(jdk/internal/reflect/GeneratedConstructorAccessor109)", "jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(jdk/internal/reflect/DelegatingConstructorAccessorImpl.java:45)", "java.lang.reflect.Constructor.newInstanceWithCaller(java/lang/reflect/Constructor.java:499)", "java.lang.reflect.Constructor.newInstance(java/lang/reflect/Constructor.java:480)", "org.jruby.javasupport.JavaConstructor.newInstanceDirect(org/jruby/javasupport/JavaConstructor.java:133)", "org.jruby.RubyClass.new(org/jruby/RubyClass.java:931)", "org.jruby.RubyClass$INVOKER$i$newInstance.call(org/jruby/RubyClass$INVOKER$i$newInstance.gen)", "usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_filter_minus_geoip_minus_7_dot_2_dot_13_minus_java.lib.logstash.filters.geoip.setup_filter(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-filter-geoip-7.2.13-java/lib/logstash/filters/geoip.rb:152)", "usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_filter_minus_geoip_minus_7_dot_2_dot_13_minus_java.lib.logstash.filters.geoip.register(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-filter-geoip-7.2.13-java/lib/logstash/filters/geoip.rb:109)", "org.jruby.RubyClass.finvoke(org/jruby/RubyClass.java:580)", "org.jruby.RubyBasicObject.callMethod(org/jruby/RubyBasicObject.java:349)", "org.logstash.config.ir.compiler.FilterDelegatorExt.doRegister(org/logstash/config/ir/compiler/FilterDelegatorExt.java:89)", "org.logstash.config.ir.compiler.AbstractFilterDelegatorExt.register(org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:75)", "org.logstash.config.ir.compiler.AbstractFilterDelegatorExt$INVOKER$i$0$0$register.call(org/logstash/config/ir/compiler/AbstractFilterDelegatorExt$INVOKER$i$0$0$register.gen)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.register_plugins(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:237)", "org.jruby.RubyArray.each(org/jruby/RubyArray.java:1989)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.register_plugins(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:236)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.maybe_setup_out_plugins(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:611)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start_workers(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:249)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.run(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:194)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:146)", "org.jruby.RubyProc.call(org/jruby/RubyProc.java:352)", "java.lang.Thread.run(java/lang/Thread.java:840)"], "pipeline.sources"=>["central pipeline management"], :thread=>"#<Thread:0x47ce68f2 /usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[2024-06-14T13:23:31,758][INFO ][logstash.javapipeline    ][main] Pipeline terminated {"pipeline.id"=>"main"}
yaauie commented 2 months ago

The fix for this is set to ship in the next patch release (8.14.2) and in the next minor (8.15.0).

Workaround: provide your own Geoip database(s)

If all instances of the geoip filter are instantiated with database_path pointing to a self-managed mmdb database file on disk, they will not use the Geoip Database Manager and will not hit this bug.
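A sketch of such a configuration, assuming the GeoLite2 City database has already been downloaded to a local path (the path below is illustrative; the plugin's documented option for pointing at a self-managed file is `database`):

```
filter {
  geoip {
    source   => "[client][ip]"
    # Self-managed database file on disk: with this set, the plugin does not use
    # the Geoip Database Manager and never receives a managed (and possibly stale) path.
    database => "/etc/logstash/geoip/GeoLite2-City.mmdb"
  }
}
```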

Workaround: check for updates less frequently

The licensing of the geoip databases from MaxMind requires that updates be consumed within 30 days. The database manager polls for updates every xpack.geoip.downloader.poll.interval (default: 24h [24 hours]), and either marks the current database as still valid or updates it if an update is available. Checking daily ensures that we have many chances to get updates, even if some of them fail due to network glitches or transient failures.

By setting this to a higher value (like 7d [7 days]), we can reduce the frequency with which we hit this bug, but we also increase the risk of falling out of sync. If the poll interval is set too high, failure to check for updates for 30 consecutive days can result in either:

[NOTE: the suffix of time-value settings is very important: h for hours, d for days]
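For example, the corresponding logstash.yml fragment could look like this sketch (7d is the example value from above, not a recommendation):

```
# logstash.yml (sketch)
xpack.geoip.downloader.enabled: true
# Poll for managed geoip database updates weekly instead of the default 24h.
# Mind the suffix: h means hours, d means days.
xpack.geoip.downloader.poll.interval: 7d
```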