logstash-plugins / logstash-filter-translate

Translate filter for Logstash
Apache License 2.0

The incoming YAML document exceeds the limit: 3145728 code points #96

Closed SnDsound closed 1 year ago

SnDsound commented 1 year ago

Hi folks,

Description of the problem including expected versus actual behavior: After the change introduced in version 8.6.1 (Updated snakeyaml to 1.33 #14848), my Logstash stopped working. That change introduces a limit of 3 MB (3145728 code points) on YAML files because of a CVE. I'm using the translate filter plugin with large YAML files as input. In version 8.6.0 everything works, because there is no file size limit. In version 8.6.2 the pipeline fails to load.

Logstash information:

  1. Logstash version: 8.6.2
  2. Logstash installation source: Docker
  3. How is Logstash being run: Docker

Plugins installed: (bin/logstash-plugin list --verbose) logstash-filter-translate (3.4.0)

Steps to reproduce:

Use a large YAML file (25 MB in my case) with the translate plugin:

translate {
    id => "filter_translate_123456"
    source => "something.ip"
    target => "something.name"
    exact => "true"
    refresh_interval => 0
    refresh_behaviour => "replace"
    dictionary_path => '/usr/share/logstash/files/largefile.yaml'
}
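To check ahead of time whether a dictionary would trip the limit, the following is a minimal sketch in plain Ruby. It assumes the limit is the 3145728 code points quoted in the error message and that a document fails only when it is strictly larger than that; the helper names `code_point_count` and `exceeds_code_point_limit?` are hypothetical and not part of Logstash or the plugin.

```ruby
# Limit quoted in the error message: 3 * 1024 * 1024 code points.
# Assumption: the parser rejects documents strictly larger than this.
CODE_POINT_LIMIT = 3_145_728

# Count Unicode code points (not bytes) in UTF-8 text.
# String#length counts characters, which for UTF-8 are code points.
def code_point_count(text)
  text.dup.force_encoding(Encoding::UTF_8).length
end

# True when the text is larger than the assumed limit.
def exceeds_code_point_limit?(text, limit = CODE_POINT_LIMIT)
  code_point_count(text) > limit
end
```

For a file on disk, something like `exceeds_code_point_limit?(File.read(path, encoding: "UTF-8"))` would apply the same check. Note that the count is in code points, so a file with many multi-byte characters can be larger than 3 MB on disk and still pass.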

Provide logs (if relevant):

[2023-02-24T14:58:10,950][ERROR][logstash.javapipeline    ][pipeline_translate] Pipeline error {:pipeline_id=>"pipeline_translate", :exception=>#<LogStash::Filters::Dictionary::DictionaryFileError: Translate: The incoming YAML document exceeds the limit: 3145728 code points. when loading dictionary file at /usr/share/logstash/files/largefile.yaml>, :backtrace=>["org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(ScannerImpl.java:342)", "org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(ScannerImpl.java:263)", "org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingKey.produce(ParserImpl.java:662)", "org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:185)", "org.yaml.snakeyaml.parser.ParserImpl.getEvent(ParserImpl.java:195)", "org.jruby.ext.psych.PsychParser.parse(PsychParser.java:210)", "org.jruby.ext.psych.PsychParser$INVOKER$i$parse.call(PsychParser$INVOKER$i$parse.gen)", "usr.share.logstash.vendor.jruby.lib.ruby.stdlib.psych.RUBY$method$parse_stream$0(/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/psych.rb:460)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.yaml_file.RUBY$method$read_file_into_dictionary$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/yaml_file.rb:19)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.file.RUBY$method$merge_dictionary$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/file.rb:84)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:152)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:148)", "org.jruby.RubyMethod.call(RubyMethod.java:116)", 
"usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.file.RUBY$method$load_dictionary$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/file.rb:56)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.file.RUBY$method$initialize$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/file.rb:50)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.dictionary.file.RUBY$method$create$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/dictionary/file.rb:14)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_6_dot_0.gems.logstash_minus_filter_minus_translate_minus_3_dot_4_dot_0.lib.logstash.filters.translate.RUBY$method$register$0(/usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-filter-translate-3.4.0/lib/logstash/filters/translate.rb:184)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:152)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:148)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.RubyClass.finvoke(RubyClass.java:572)", "org.jruby.runtime.Helpers.invoke(Helpers.java:649)", "org.jruby.RubyBasicObject.callMethod(RubyBasicObject.java:348)", "org.logstash.config.ir.compiler.FilterDelegatorExt.doRegister(FilterDelegatorExt.java:88)", "org.logstash.config.ir.compiler.AbstractFilterDelegatorExt.register(AbstractFilterDelegatorExt.java:75)", 
"usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$block$register_plugins$1(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:234)", "org.jruby.runtime.CompiledIRBlockBody.yieldDirect(CompiledIRBlockBody.java:151)", "org.jruby.runtime.BlockBody.yield(BlockBody.java:106)", "org.jruby.runtime.Block.yield(Block.java:188)", "org.jruby.RubyArray.each(RubyArray.java:1865)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$register_plugins$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:233)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:165)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:185)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:278)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$maybe_setup_out_plugins$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:601)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$maybe_setup_out_plugins$0$__VARARGS__(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:598)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:248)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:255)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$start_workers$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:246)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$start_workers$0$__VARARGS__(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:242)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)", 
"org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:248)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:255)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$run$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:191)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$run$0$__VARARGS__(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:186)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:248)", "org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:255)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$block$start$1(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:143)", "org.jruby.runtime.CompiledIRBlockBody.callDirect(CompiledIRBlockBody.java:141)", "org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:64)", "org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:58)", "org.jruby.runtime.Block.call(Block.java:143)", "org.jruby.RubyProc.call(RubyProc.java:309)", "org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:107)", "java.base/java.lang.Thread.run(Thread.java:833)"], "pipeline.sources"=>["/usr/share/logstash/pipeline/syslog.conf"], :thread=>"#<Thread:0x4a3de69b run>"}
[2023-02-24T14:58:10,950][INFO ][logstash.javapipeline    ][pipeline_translate] Pipeline terminated {"pipeline.id"=>"pipeline_translate"}
[2023-02-24T14:58:10,958][ERROR][logstash.agent           ] Failed to execute action {:id=>:pipeline_translate, :action_type=>LogStash::ConvergeResult::FailedAction, :message=>"Could not execute action: PipelineAction::Create<pipeline_translate>, action_result: false", :backtrace=>nil}

MikeKemmerer commented 1 year ago

Thank you for identifying this. We use the translate filter with large yaml files, and this is a dealbreaker for upgrading.

nicpenning commented 1 year ago

We are stuck on 8.3.2 due to this as well. Please investigate!

nicpenning commented 1 year ago

We migrated this particular pipeline to an Elasticsearch ingest node and used an enrich pipeline instead. We could then upgrade from 8.3.2 to 8.7.0 without any major side effects. Hope this helps others out there!

kaisecheng commented 1 year ago

This is an issue related to JRuby's Psych, and it got a workaround in JRuby 9.4.1.0.

The fix for this plugin will need to update its Psych usage.
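
Until such a fix lands, one possible interim approach, not confirmed anywhere in this thread, is to convert the dictionary to JSON outside Logstash, since the translate filter also accepts JSON and CSV dictionary files (selected by file extension). The sketch below assumes a flat key/value dictionary and is meant to be run with a standard Ruby interpreter; the helper name `yaml_dictionary_to_json` is hypothetical, and whether the JSON path handles a 25 MB file without hitting some other limit is an assumption to verify.

```ruby
require "yaml"
require "json"

# Convert the text of a flat YAML dictionary into JSON text.
# Hypothetical helper for illustration; not part of the plugin.
def yaml_dictionary_to_json(yaml_text)
  dictionary = YAML.safe_load(yaml_text)
  # A translate dictionary is expected to be a mapping of key => value.
  raise ArgumentError, "expected a YAML mapping" unless dictionary.is_a?(Hash)
  JSON.generate(dictionary)
end
```

The result could be written next to the original with a `.json` extension and referenced from `dictionary_path` in place of the YAML file.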