fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0
12.82k stars 1.34k forks source link

Allow plugin/parser_multiline to drop lines containing invalid byte sequence in UTF-8 #4007

Open digital88 opened 1 year ago

digital88 commented 1 year ago

Is your feature request related to a problem? Please describe.

I recently found out that sometimes (very rarely) my app logs are getting corrupted. One line in app log may contain invalid UTF-8 byte sequence. In this case fluentd (I'm using td-agent) is flooding it's own log file with repeated error message:

2023-01-16 11:10:31 +0300 [error]: #0 [irm3] invalid byte sequence in UTF-8
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/parser_multiline.rb:107:in `match?'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/parser_multiline.rb:107:in `firstline?'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:664:in `block in parse_multilines'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:663:in `each'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:663:in `parse_multilines'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:607:in `call'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:607:in `receive_lines'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:1104:in `block in handle_notify'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:1140:in `with_io'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:1064:in `handle_notify'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:1006:in `block in on_notify'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:1006:in `synchronize'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:1006:in `on_notify'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:817:in `on_notify'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:447:in `construct_watcher'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:452:in `block in start_watchers'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:451:in `each_value'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:451:in `start_watchers'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:384:in `refresh_watchers'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/plugin/in_tail.rb:254:in `start'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/root_agent.rb:203:in `block in start'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/root_agent.rb:192:in `block (2 levels) in lifecycle'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/root_agent.rb:191:in `each'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/root_agent.rb:191:in `block in lifecycle'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/root_agent.rb:178:in `each'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/root_agent.rb:178:in `lifecycle'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/root_agent.rb:202:in `start'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/engine.rb:248:in `start'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/engine.rb:147:in `run'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/supervisor.rb:783:in `block in run_worker'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/supervisor.rb:1056:in `main_process'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/supervisor.rb:774:in `run_worker'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/lib/fluent/command/fluentd.rb:370:in `<top (required)>'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in `require'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in `require'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.15.3/bin/fluentd:15:in `<top (required)>'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/bin/fluentd:23:in `load'
  2023-01-16 11:10:31 +0300 [error]: #0 C:/opt/td-agent/bin/fluentd:23:in `<main>'

This results in log file of fluentd growing to very large size (several 10s of GB) over weekend.

Describe the solution you'd like

Either write this error once and skip (drop) offending log line, or retry log line reading in controlled maner (for example with config option, once in N minutes) to prevent fluentd own log growing.

Describe alternatives you've considered

Alternative could be config option to limit fluentd log file size. Also this existing config options ignore_repeated_log_interval ignore_same_log_interval help mitigate problem a little, but log size still continues to grow (with much lesser speed).

Additional context

No response

Athishpranav2003 commented 1 month ago

https://github.com/kataring/fluent-plugin-string-scrub @digital88 will this actually solve this problem?

Athishpranav2003 commented 1 month ago

@kenhys there is a existing filter to capture invalid byte sequence as i mentioned in the thread, is it needed seperately to handle this issue internally?

digital88 commented 1 month ago

https://github.com/kataring/fluent-plugin-string-scrub @digital88 will this actually solve this problem?

Could be the solution.

I actually somehow solved problems with corrupted log entries. I think I finally found "right" fluentd configuration options for my use case. Current config seems to work without causing any troubles.

Athishpranav2003 commented 1 month ago

i see I am not sure if its advised to handle the invalid byte seq case internally. let @kenhys reply

kenhys commented 4 weeks ago

How about using the following options?