fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0

time_key default needlessly overridden in json, regexp, and ltsv parsers #3269

Open wrossmann opened 3 years ago

wrossmann commented 3 years ago

Describe the bug

The global default for time_key is nil, indicating that the event time should not be derived from the message at all, but the json, regexp, and ltsv parsers override this to "time". Furthermore, the fluentd config syntax does not permit time_key to be set back to nil, as specifying nil, null, or "" results in the literal string representations "nil", "null", and "" being used. Specifying "#{use_nil}" results in an error that a value is required.

While time_key appears to have been required in the original implementations of these parsers, it has not been required for some time now. The result is an edge case: a message containing a field called time whose value is not parseable as a time causes an exception, and the message is dropped.

It is possible to work around this issue by setting time_key to a value that you do not expect to occur as a key in your messages, but this is a kludge at best.
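To make the failure mode and the workaround concrete, here is a minimal Ruby sketch. It is not Fluentd's code: ToyJsonParser, its ParserError, and the key name "__unused_time_key__" are all hypothetical, and the real parsers do more (time formats, keep_time_key, and so on). It only mimics the relevant behaviour: a time_key that defaults to "time" and a check that the value is a string or a number.

```ruby
require "json"

# Hypothetical stand-in for a Fluentd parser plugin; NOT the real implementation.
class ToyJsonParser
  class ParserError < StandardError; end

  # time_key defaults to "time", mirroring the override described above.
  def initialize(time_key: "time")
    @time_key = time_key
  end

  # Returns [event_time, record]. In this toy, event_time is nil when the
  # record has no field named time_key.
  def parse(text)
    record = JSON.parse(text)
    return [nil, record] unless @time_key && record.key?(@time_key)

    value = record.delete(@time_key)
    unless value.is_a?(String) || value.is_a?(Numeric)
      raise ParserError, "value must be a string or a number: #{value.inspect}(#{value.class})"
    end
    [value, record]
  end
end

message = '{"time": {"begin": 1611709938020}}'

# Default time_key: the nested "time" object is rejected, so the event is lost.
begin
  ToyJsonParser.new.parse(message)
rescue ToyJsonParser::ParserError => e
  puts "dropped: #{e.message}"
end

# Workaround: point time_key at a name that is not expected in any message.
_time, record = ToyJsonParser.new(time_key: "__unused_time_key__").parse(message)
puts record.inspect
```

With the workaround the "time" field stays in the record untouched, but only for as long as no message ever contains the placeholder key.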

It's also worth noting that the docs do mention that time_key is overridden by these parsers: https://docs.fluentd.org/configuration/parse-section#parse-parameters

To Reproduce

  1. Define any input with a json parser.
  2. Submit a message such as {"time": {"begin": 1611709938020}}

Expected behavior

The message should be decoded as a dict with a key named "time" and its arbitrary value.

Environment

Configuration

<source>
  # ...
  <parse>
    @type json
  </parse>
</source>

Error Log

#<Fluent::Plugin::Parser::ParserError: value must be a string or a number: {"client_sent"=>1611709938020}(Hash)> @ /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.11.5/lib/fluent/plugin/parser.rb:196:in `rescue in parse_time'

cosmo0920 commented 3 years ago

Wouldn't time_key use_nil work for you?

<parse>
  @type json
  time_key use_nil
</parse>

Sample data:

{"time": {"begin": 1611709938020}}
{"time": {"begin": 1611710938020}}

Read logs:

2021-03-10 15:06:52.923680000 +0900 log.issue3269.log: {"time":{"begin":1611709938020}}
2021-03-10 15:06:52.923701000 +0900 log.issue3269.log: {"time":{"begin":1611710938020}}

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days

ashie commented 3 years ago

Wouldn't time_key use_nil work for you?

Although it will work as a workaround in most cases, it isn't the right solution. In this case the string use_nil is used as time_key, so if an incoming JSON record includes a key named use_nil, its value will be used as the time!
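A small Ruby sketch of that pitfall, under the assumption that an unquoted use_nil is treated as an ordinary string. extract_time is a hypothetical helper written for illustration, not a Fluentd function:

```ruby
require "json"

# Hypothetical helper, not Fluentd code: take the event time from the field
# named by time_key, if the record has such a field.
def extract_time(record, time_key)
  return [nil, record] unless record.key?(time_key)

  rest = record.reject { |key, _| key == time_key }
  [record[time_key], rest]
end

# `time_key use_nil` (unquoted) is assumed here to be just the string "use_nil".
time_key = "use_nil"

# No "use_nil" field: nothing is extracted and "time" stays in the record.
t1, r1 = extract_time(JSON.parse('{"time": {"begin": 1611709938020}}'), time_key)
puts "time=#{t1.inspect} record=#{r1.inspect}"

# A record that happens to contain "use_nil": its value is taken as the time.
t2, r2 = extract_time(JSON.parse('{"use_nil": 1611709938, "msg": "hello"}'), time_key)
puts "time=#{t2.inspect} record=#{r2.inspect}"
```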

Specifying "#{use_nil}" results in an error that a value is required.

Yes, "#{use_nil}" is the only way to set nil. But fluentd doesn't permit setting nil, even via "#{use_nil}", when a plugin sets a non-nil value as the default for an option.

https://github.com/fluent/fluentd/blob/c62dc312eedb08bcd124c58042125e98385e1d6e/lib/fluent/config/section.rb#L186-L191
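
The rule described above can be sketched roughly as follows. This is only an illustration of the behaviour as stated in this comment, not a copy of the linked section.rb lines; resolve_param, the defaults hash, and the error wording are hypothetical.

```ruby
# Hypothetical error class for this sketch only.
class ConfigError < StandardError; end

# Hypothetical check: a configured value may resolve to nil only when the
# plugin's own default for that parameter is nil as well.
def resolve_param(name, value, defaults)
  return value unless value.nil?

  if defaults.key?(name) && defaults[name].nil?
    nil
  else
    # Illustrative wording, not Fluentd's actual message.
    raise ConfigError, "'#{name}' requires a value"
  end
end

# Global default (nil): an explicit nil is accepted.
p resolve_param(:time_key, nil, { time_key: nil })

# A parser whose default is "time": an explicit nil is rejected.
begin
  resolve_param(:time_key, nil, { time_key: "time" })
rescue ConfigError => e
  puts "config error: #{e.message}"
end
```

Under this rule the json, regexp, and ltsv parsers can never get time_key back to nil from configuration, because their default is "time".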