fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0

remove_key_name_field true when key_name is message causes ArgumentError error="message does not exist" #4575

Closed · nicklit closed this issue 1 month ago

nicklit commented 1 month ago

Describe the bug

When parsing a JSON log stored in the message field into a new field, removing the message field with remove_key_name_field works the first time the sample runs, then fails on every subsequent run.

To Reproduce

Copy the configuration file below and restart td-agent.

Expected behavior

No error: the message field should be removed and only the data field present on every run of the sample input, e.g. 2024-07-31 16:12:44.030559702 -0400 log: {"data":{"field":"value"}}

Your Environment

- Fluentd version: 1.17.0 (I attempted to update this using td-agent-gem)
- Package version (# td-agent --version): td-agent 4.4.1 fluentd 1.15.2 (c32842297ed2c306f1b841a8f6e55bdd0f1cb27f)
- Operating system: Ubuntu
- Kernel version: Linux 5.15.0-1065-aws #71~20.04.1-Ubuntu

Your Configuration

<source>
    @type sample
    tag log
    sample {"message": "{\"field\":\"value\"}"}
</source>
<filter log>
    @type parser
    key_name message
    reserve_data true
    remove_key_name_field true
    hash_value_field data
    <parse>
        @type json
    </parse>
</filter>
<match log>
  @type stdout
</match>

Your Error Log

2024-07-31 16:12:41 -0400 [info]: init supervisor logger path="/var/log/td-agent/td-agent.log" rotate_age=nil rotate_size=nil
2024-07-31 16:12:41 -0400 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-beats' version '1.1.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-calyptia-monitoring' version '0.1.3'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-elasticsearch' version '5.2.3'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-explode_filter' version '0.2.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-filter_empty_keys' version '0.0.4'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.1.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-grok-parser' version '2.6.2'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-kafka' version '0.18.1'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-kinesis-aggregation' version '0.4.1'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-kv-parser' version '1.0.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-metrics-cmetrics' version '0.1.2'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-mutate_filter' version '1.0.7'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-opensearch' version '1.0.7'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-prometheus' version '2.0.3'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-prometheus_pushgateway' version '0.1.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-record-modifier' version '2.2.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-record-modifier' version '2.1.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-rename-key' version '0.4.1'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-s3' version '1.7.1'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-sd-dns' version '0.1.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-statsd-output' version '1.4.2'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-td' version '1.2.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-utmpx' version '0.5.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluent-plugin-webhdfs' version '1.5.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluentd' version '1.17.0'
2024-07-31 16:12:42 -0400 [info]: gem 'fluentd' version '1.15.2'
2024-07-31 16:12:42 -0400 [info]: gem 'fluentd' version '0.14.25'
2024-07-31 16:12:42 -0400 [warn]: both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2024-07-31 16:12:42 -0400 [warn]: both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2024-07-31 16:12:42 -0400 [info]: using configuration file: <ROOT>
  <source>
    @type sample
    tag "log"
    sample {"message":"{\"field\":\"value\"}"}
  </source>
  <filter log>
    @type parser
    key_name "message"
    reserve_data true
    remove_key_name_field true
    hash_value_field "data"
    <parse>
      @type "json"
    </parse>
  </filter>
  <match log>
    @type stdout
  </match>
</ROOT>
2024-07-31 16:12:42 -0400 [info]: starting fluentd-1.17.0 pid=1879243 ruby="2.7.6"
2024-07-31 16:12:42 -0400 [info]: spawn command to main:  cmdline=["/opt/td-agent/bin/ruby", "-Eascii-8bit:ascii-8bit", "/opt/td-agent/bin/fluentd", "--log", "/var/log/td-agent/td-agent.log", "--daemon", "/var/run/td-agent/td-agent.pid", "--under-supervisor"]
2024-07-31 16:12:42 -0400 [info]: #0 init worker0 logger path="/var/log/td-agent/td-agent.log" rotate_age=nil rotate_size=nil
2024-07-31 16:12:42 -0400 [info]: adding filter pattern="log" type="parser"
2024-07-31 16:12:42 -0400 [info]: adding match pattern="log" type="stdout"
2024-07-31 16:12:43 -0400 [info]: adding source type="sample"
2024-07-31 16:12:43 -0400 [warn]: #0 both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2024-07-31 16:12:43 -0400 [warn]: #0 both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2024-07-31 16:12:43 -0400 [info]: #0 starting fluentd worker pid=1879252 ppid=1879249 worker=0
2024-07-31 16:12:43 -0400 [info]: #0 fluentd worker is now running worker=0
2024-07-31 16:12:44.030559702 -0400 log: {"data":{"field":"value"}}
2024-07-31 16:12:45 -0400 [warn]: #0 dump an error event: error_class=ArgumentError error="message does not exist" location=nil tag="log" time=2024-07-31 16:12:45.033706463 -0400 record={}
2024-07-31 16:12:45.033706463 -0400 log: {"data":{}}
2024-07-31 16:12:46 -0400 [warn]: #0 dump an error event: error_class=ArgumentError error="message does not exist" location=nil tag="log" time=2024-07-31 16:12:46.035347126 -0400 record={}
2024-07-31 16:12:46.035347126 -0400 log: {"data":{}}
2024-07-31 16:12:47 -0400 [warn]: #0 dump an error event: error_class=ArgumentError error="message does not exist" location=nil tag="log" time=2024-07-31 16:12:47.037108884 -0400 record={}
2024-07-31 16:12:47.037108884 -0400 log: {"data":{}}
2024-07-31 16:12:48 -0400 [warn]: #0 dump an error event: error_class=ArgumentError error="message does not exist" location=nil tag="log" time=2024-07-31 16:12:48.041690410 -0400 record={}
2024-07-31 16:12:48.041690410 -0400 log: {"data":{}}

Additional context

I didn't see anywhere in the Fluentd docs that the message field has to exist. The other ticket I found relating to this error looked specific to an Elasticsearch output plugin.

kenhys commented 1 month ago

How about specifying auto_increment_key?

<source>
    @type sample
    tag log
    sample {"message": "{\"field\":\"value\"}"}
    auto_increment_key key
</source>
<filter log>
    @type parser
    key_name message
    reserve_data true
    remove_key_name_field true
    hash_value_field data
    <parse>
        @type json
    </parse>
</filter>
<match log>
  @type stdout
</match>

It seems that the sample input record is reused, and that reuse causes such an error.
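The reuse mechanism kenhys describes can be sketched in plain Ruby. This is an illustrative model only, not actual Fluentd source; the filter function and record layout are invented to mirror the config above:

```ruby
# Plain-Ruby sketch of the suspected failure mode (not Fluentd source):
# in_sample hands the SAME record hash to the pipeline on every tick,
# while the parser filter's remove_key_name_field deletes key_name from
# the record it was given. After the first pass, the shared hash no
# longer contains "message", so every later pass raises ArgumentError.
require "json"

sample_record = { "message" => '{"field":"value"}' }

def filter(record, key_name)
  raise ArgumentError, "#{key_name} does not exist" unless record.key?(key_name)
  raw = record.delete(key_name)           # remove_key_name_field true (mutates!)
  record.merge("data" => JSON.parse(raw)) # hash_value_field data
end

p filter(sample_record, "message")        # first run: {"data"=>{"field"=>"value"}}
begin
  filter(sample_record, "message")        # second run: "message" was deleted above
rescue ArgumentError => e
  puts e.message                          # "message does not exist"
end
```

The key point is the in-place delete: the first pass succeeds but permanently removes the key from the one hash that in_sample keeps handing out, which matches the log above (first emit succeeds, every later emit warns and produces record={}).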

kenhys commented 1 month ago

TODO: maybe the in_sample implementation should be improved.
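One shape such an improvement could take (purely a sketch of the idea, not the real in_sample code; SampleSource is an invented class name): emit a deep copy of the stored sample on each tick, so downstream mutation cannot leak back into the source's template record.

```ruby
# Hypothetical sketch of the in_sample-side fix: hand out a fresh deep
# copy of the stored sample on every emit, so a filter that mutates the
# record cannot corrupt the template kept by the source.
class SampleSource
  def initialize(sample)
    @sample = sample
  end

  # Current-style behavior: the same hash object every time.
  def emit_shared
    @sample
  end

  # Safer behavior: a deep copy per emit (Marshal round-trip).
  def emit_copy
    Marshal.load(Marshal.dump(@sample))
  end
end

src = SampleSource.new({ "message" => '{"field":"value"}' })
src.emit_copy.delete("message")   # downstream mutation...
src.emit_copy.key?("message")     # ...does not touch the template
```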

daipom commented 1 month ago

@nicklit Thanks for your report! I have confirmed that the pair of in_sample and filter_parser causes this issue.

I'd like to confirm one thing. Are you having similar problems with other input plugins? (Is in_sample plugin in this issue just an example?)

daipom commented 1 month ago

We can consider this a problem on filter_parser's side: the responsibility for changes to Fluentd's events should lie with the plugin that makes the changes.

Actually, filter_record_transformer copies the incoming events by itself.
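The same defensive-copy idea, applied on the filter's side, might look like this. This is a hypothetical sketch under the assumptions above; safe_filter is an invented name, not filter_parser's actual code:

```ruby
# Hypothetical filter-side fix: dup the record before mutating it, the
# way filter_record_transformer copies incoming events, so the caller's
# hash (here, the one in_sample keeps reusing) is never modified in place.
require "json"

def safe_filter(record, key_name)
  raise ArgumentError, "#{key_name} does not exist" unless record.key?(key_name)
  copy = record.dup                 # shallow dup is enough here: we only
  raw = copy.delete(key_name)       # remove a top-level key
  copy.merge("data" => JSON.parse(raw))
end

shared = { "message" => '{"field":"value"}' }
safe_filter(shared, "message")
safe_filter(shared, "message")      # still succeeds: shared is untouched
```

A shallow dup suffices because only a top-level key is removed; a filter that mutated nested values would need a deep copy instead.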

nicklit commented 1 month ago

I'm not seeing the problem with other input plugins, just sample. My other input plugins (in_syslog) typically emit one-and-done events, whereas sample seems to reuse the same modified record rather than creating a new one on each run, as I'd expect. It was frustrating and confusing, while debugging another issue I was having with in_syslog, to see the constant error stream (I learned the hard way that it was limited to 2048 characters by default). Good idea on the auto_increment_key workaround @kenhys; I do feel like it's still a bug, though.