fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

[out_syslog] support nested keys in syslog output plugin #2168

Open segwin opened 4 years ago

segwin commented 4 years ago

Is your feature request related to a problem? Please describe.

Our organisation requires processing records with nested data. We also need to stream records to a syslog server, which is newly supported in the master branch thanks to PR #1601 being merged. However, this feature does not support using nested keys (e.g. "log"=>"syslog"=>"severity"=>"code" when using ECS). The current syntax is as follows:

[OUTPUT]
    Name syslog
    <...>
    Syslog_Severity_key my_severity_key

This does not allow us to handle a record structured as follows (ECS 1.5, very simplified):

{
  "log": {
    "syslog": {
      "severity": {
        "name": "warning",
        "code": "4"
      }
    }
  },
  "message": "my very important message"
}

Describe the solution you'd like

Ideally, the syslog output plugin should support nested keys using the current interface. This could be done by adding a new plugin parameter (e.g. NestedKeySeparator) that, if provided, would tell the plugin to split every key string into a sequence of nested keys. Internally, the native format for keys would then become an array, which the plugin walks one element at a time to descend into the nested record.

For example, we could parse the JSON message above using the following config:

[OUTPUT]
    Name syslog
    <...>
    Syslog_Severity_key log.syslog.severity.code
    NestedKeySeparator .
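
To make the proposal concrete, here is a minimal Python sketch of the lookup described above. It is only an illustration: `split_key` and `lookup` are hypothetical names, and this is not how the plugin (which is written in C) is or would be implemented. It assumes that an unset separator means the key is treated as a single top-level name, as it is today.

```python
from typing import Any, List, Optional, Sequence


def split_key(key: str, separator: Optional[str]) -> List[str]:
    """Turn a configured key string into a path of nested keys.

    Assumption: with no separator configured, the key stays a single
    top-level name, which matches the current behaviour.
    """
    if not separator:
        return [key]
    return key.split(separator)


def lookup(record: dict, path: Sequence[str]) -> Any:
    """Walk the record one path element at a time; None if any step is missing."""
    current: Any = record
    for part in path:
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# The simplified ECS record from this issue.
record = {
    "log": {"syslog": {"severity": {"name": "warning", "code": "4"}}},
    "message": "my very important message",
}

path = split_key("log.syslog.severity.code", ".")
severity = lookup(record, path)
print(path)      # ['log', 'syslog', 'severity', 'code']
print(severity)  # 4
```

Without a separator the same key string is looked up as one top-level name and is not found, which is the limitation this issue describes.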

Describe alternatives you've considered

We have examined the possibility of adding a filter step that lifts the records before they are sent with out_syslog. However, this has the following drawbacks:

  1. Additional filter step introduced in the processing pipeline
  2. Causes the record to no longer respect our schema, which interferes with other outputs in the same pipeline

Point (1) is negligible for our use cases. Point (2), however, is a major drawback: if we want to route the same processed event to, e.g., Elasticsearch and syslog, we would either need to break our Elasticsearch schema in order to use top-level keys for syslog, or duplicate our pipeline for non-syslog outputs.
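
The schema problem in point (2) can be illustrated with a rough Python stand-in for the lifting step. `lift_all` is a hypothetical helper written for this example, not the behaviour of any particular Fluent Bit filter; it simply moves every nested value to a top-level key.

```python
from typing import Any, Dict, List


def lift_all(record: Dict[str, Any], separator: str = ".") -> Dict[str, Any]:
    """Return a flattened copy of the record with every leaf at the top level."""
    flat: Dict[str, Any] = {}

    def walk(prefix: List[str], value: Any) -> None:
        if isinstance(value, dict):
            for key, child in value.items():
                walk(prefix + [key], child)
        else:
            flat[separator.join(prefix)] = value

    walk([], record)
    return flat


record = {
    "log": {"syslog": {"severity": {"name": "warning", "code": "4"}}},
    "message": "my very important message",
}

lifted = lift_all(record)

# A top-level key is now available for Syslog_Severity_key ...
print(lifted["log.syslog.severity.code"])  # 4
# ... but the nested "log" object that the other outputs expect is gone.
print("log" in lifted)                     # False
```

Every output that shares the pipeline after this step receives the flattened record, which is why the same event can no longer be sent to Elasticsearch in its original structure.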

Additional context

Our use case is to use Fluent Bit to read a single input (e.g. system journal), process the events into a structured format, and send those processed events off to Elasticsearch, syslog and potentially other Fluentd/Fluent Bit instances. The current behaviour will prevent us from relying on the official syslog output plugin and potentially force us into maintaining our own output plugin, which is obviously undesirable when 99.9% of the features we need are present in the current implementation.

I would appreciate any input on the solution proposed in this issue. For what it's worth, we are willing to create a pull request with this feature if the community is open to it.

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

acastong commented 3 years ago

Commenting to keep the issue from being closed. PR #1601 is ready to be merged to address this issue; please consider reviewing and merging it.

RicardoAAD commented 1 year ago

PR was merged

https://github.com/fluent/fluent-bit/commit/36cfed705f97b7e125607702a765f4df0438de59

Please check

acastong commented 1 year ago

Looks like I mentioned the wrong PR in my previous message; the one that implements the change is #2516.