fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0

log: add caller plugin-id about emitting error events #4632

Open daipom opened 2 months ago

daipom commented 2 months ago

Which issue(s) this PR fixes:

What this PR does / why we need it: Add caller plugin-id to warning logs about emitting error events.

It would be helpful to know which plugin emitted the error event.

We need to be careful about compatibility. This signature change would not break compatibility.
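One common way to change a signature without breaking existing callers is to add an optional keyword argument. The following is a minimal sketch of that idea only; `dump_error_event` and its parameters are hypothetical names chosen for illustration and are not the methods touched by this PR.

```ruby
# Hypothetical helper (not Fluentd's API): builds a warning message for an
# error event. `plugin_id:` is optional, so callers that were written
# before the parameter existed keep working unchanged.
def dump_error_event(tag, record, error, plugin_id: nil)
  fields = [
    "error_class=#{error.class}",
    "error=#{error.message.inspect}",
    "tag=#{tag.inspect}",
  ]
  # Only add the new field when the caller supplied it.
  fields << "plugin_id=#{plugin_id.inspect}" if plugin_id
  fields << "record=#{record.inspect}"
  "dump an error event: " + fields.join(" ")
end
```

An old-style call such as `dump_error_event(tag, record, error)` produces the same message as before, while a new-style call with `plugin_id: "id_not_json"` adds the extra field.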

However, I'm concerned that caller_plugin_id may have a race condition, although I have not confirmed it. It looks to me as though the id could be that of another plugin running concurrently... This is not an issue introduced by this fix; it is an issue in the existing implementation.

Docs Changes: Not needed.

Release Note: Add plugin-id to warning logs about emitting error events.

Example: From #4567.

<source>
  @type sample
  @id id_sample_not_json
  sample {"data":"this is not json}"}
  tag sample.json
</source>

<filter sample.json>
  @type parser
  @id id_not_json
  key_name data
  <parse>
    @type json
  </parse>
</filter>

<match sample.**>
  @type stdout
</match>

2024-09-11 15:26:26 +0900 [info]: #0 fluentd worker is now running worker=0
2024-09-11 15:26:27 +0900 [warn]: #0 dump an error event:
 error_class=Fluent::Plugin::Parser::ParserError
 error="pattern not matched with data 'this is not json}'"
 location=nil tag="sample.json"
 time=2024-09-11 15:26:27.040657722 +0900
 plugin_id="id_not_json" ### <= This info is added by this PR ###
 record={"data"=>"this is not json}"}
...
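To see why this configuration yields an error event: the sample record itself is valid JSON, but the value of its `data` key is not. Here is a minimal sketch using Ruby's standard `json` library. It is not Fluentd's parser plugin, which may use a different JSON backend and reports the failure with its own message, and `parses_as_json?` is a hypothetical helper.

```ruby
require "json"

# Hypothetical helper: true if the text is parseable as JSON.
def parses_as_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

# The record produced by the sample source in the example config.
outer = '{"data":"this is not json}"}'

# The value the parser filter is asked to parse (key_name data).
inner = JSON.parse(outer)["data"]

puts parses_as_json?(outer) # the record as a whole is valid JSON
puts parses_as_json?(inner) # the "data" value is not
```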
daipom commented 2 months ago

However, I'm concerned that caller_plugin_id may have a race condition, although I have not confirmed it. It looks to me as though the id could be that of another plugin running concurrently...

https://github.com/fluent/fluentd/blob/51b860b1be2eb59076d706fda12d55657a614c8f/lib/fluent/plugin_helper/event_emitter.rb#L27-L36
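The kind of race being suspected here can be sketched as follows. This is an illustration under assumptions, not the linked Fluentd code: `SharedRouter`, `SketchPlugin`, and `interleaved_report` are hypothetical names, and the sketch assumes that several plugins write their id to one shared router object before using it. The queues force one specific interleaving so the outcome is reproducible.

```ruby
require "thread"

# Hypothetical stand-in for a router object shared by several plugins.
class SharedRouter
  attr_accessor :caller_plugin_id

  def emit_error_event(tag)
    # Reads whichever id was written most recently, by any plugin.
    "tag=#{tag} plugin_id=#{caller_plugin_id}"
  end
end

# Hypothetical plugin: records its own id on the shared router before use.
class SketchPlugin
  def initialize(id, router)
    @id = id
    @router = router
  end

  def router
    @router.caller_plugin_id = @id
    @router
  end
end

# Forces the interleaving: plugin_a fetches the router, plugin_b fetches
# it next, and only then does plugin_a emit its error event.
def interleaved_report
  shared = SharedRouter.new
  plugin_a = SketchPlugin.new("plugin_a", shared)
  plugin_b = SketchPlugin.new("plugin_b", shared)

  a_has_router = Queue.new
  b_has_router = Queue.new
  report = nil

  worker = Thread.new do
    r = plugin_a.router          # plugin_a writes its id
    a_has_router << :ready
    b_has_router.pop             # wait until plugin_b has overwritten it
    report = r.emit_error_event("sample.json")
  end

  a_has_router.pop
  plugin_b.router                # plugin_b overwrites the shared id
  b_has_router << :go
  worker.join
  report
end
```

In this forced ordering, the event emitted by plugin_a is attributed to plugin_b. Whether this interleaving can actually occur in Fluentd depends on how routers are shared between plugins and threads, which this sketch does not establish.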

kenhys commented 2 months ago

Since error.backtrace does not have enough information in this context, the current approach seems reasonable. (There is no need to force third-party plugins to change.)

github-actions[bot] commented 1 month ago

This PR has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this PR will be closed in 7 days