I'm also interested in this feature. Thanks.
PS: I think this can be done with nest operations btw
Is your feature request related to a problem? Please describe.

Currently, when using the JSON parser, if the original log source is a JSON map string, the parser takes its structure and converts it directly into the internal binary representation.
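For example, a record like this (keys and timestamp are made up for illustration):

```json
{"key1": 12345, "key2": "abc", "time": "2018-01-01T00:00:00Z"}
```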
will be processed to:
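```json
[1514764800.000000, {"key1": 12345, "key2": "abc"}]
```

(the timestamp is lifted out of the map, assuming the parser's `Time_Key` points at `time`, and the remaining keys land at the root of the record)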
This same behaviour is repeated for all other parsers. For example, a Regular Expression parser with named capture groups parses a raw line into named fields, as sketched below.
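A minimal sketch (regex, field names, and input line are illustrative):

```
[PARSER]
    Name   demo_regex
    Format regex
    Regex  ^(?<host>[^ ]+) (?<method>[^ ]+) (?<path>[^ ]+)$
```

will parse

```
192.168.0.1 GET /index.html
```

to

```json
{"host": "192.168.0.1", "method": "GET", "path": "/index.html"}
```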
The difference between Regular Expression parsing and JSON parsing is that we are able to name the fields in our regex, giving us full control over how they are extracted. With custom names on Regular Expression fields, we can then use other filters such as Nest, which allows us to nest any fields by the names we specified, or to nest all fields that start with a prefix, e.g. `nest_` (see the sketch below). With the JSON parser this isn't possible; the fields are just extracted onto the root data structure.
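For reference, a Nest filter along those lines (Match pattern and key names illustrative):

```
[FILTER]
    Name          nest
    Match         *
    Operation     nest
    Wildcard      nest_*
    Nest_under    payload
    Remove_prefix nest_
```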
Describe the solution you'd like
Specifically, the JSON parser should have the option either to parse not directly onto the root data structure but into a field within it, or to prefix the extracted root keys so they can be identified later in the data pipeline for processing.
For example, a proposed `Parent_Key` option:
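A sketch of the proposed option and an input record (`Parent_Key` is the name this proposal suggests, not an existing option; syntax illustrative):

```
[PARSER]
    Name       json_nested
    Format     json
    Parent_Key log_data
```

```json
{"key1": 12345, "key2": "abc"}
```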
will be processed to:
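```json
{"log_data": {"key1": 12345, "key2": "abc"}}
```

(keeping the extracted keys out of the root data structure)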
Alternatively, a proposed `Extracted_key_prefix` option:
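Again a sketch, with the option name from this proposal (syntax illustrative), applied to the same input:

```
[PARSER]
    Name                 json_prefixed
    Format               json
    Extracted_key_prefix extracted_
```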
will be processed to:
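```json
{"extracted_key1": 12345, "extracted_key2": "abc"}
```

(every extracted key now carries the prefix, so later filters such as Nest can match them with a wildcard)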
An alternative approach would be to implement this on the Parser filter and on inputs that use parsers, which would then take effect for the JSON parser and all other parsers alike. This would make JSON parsing predictable when processing logs from many or unknown sources.
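For instance, such an option could sit alongside the existing Parser filter keys (`Key_Name` and `Parser` are existing options; `Parent_Key` is the hypothetical one from this proposal):

```
[FILTER]
    Name       parser
    Match      *
    Key_Name   log
    Parser     docker
    # hypothetical option from this proposal
    Parent_Key log_data
```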
Describe alternatives you've considered
The only real alternative to this problem for JSON parsing is to:

- rename all existing root keys with a prefix, e.g. `prefix_` (so `log` becomes `prefix_log`), before parsing to JSON
- after parsing, any key starting with `prefix_` can be identified as not being from the parsed JSON

We can then use a Lua filter to process the logs to conform to the correct format we are looking for, without the chance of overriding keys that may be set on the root data structure. There are no other 'native' ways to do this without ending up with duplicate keys.
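A minimal sketch of such a Lua filter, assuming the `prefix_` convention above and the `event`/`fields` shape described under Additional context (function and key names are illustrative):

```lua
-- Un-prefixed keys came from the parsed JSON and are moved under "event";
-- "prefix_" keys are Fluent Bit metadata, so the prefix is stripped and
-- they stay at the root, avoiding any collisions with parsed keys.
function restructure(tag, timestamp, record)
    local new_record = { event = {} }
    for key, value in pairs(record) do
        local stripped = string.match(key, "^prefix_(.+)$")
        if stripped then
            new_record[stripped] = value
        else
            new_record.event[key] = value
        end
    end
    -- return code 1 tells Fluent Bit the record was modified
    return 1, timestamp, new_record
end
```

This would be registered with a `[FILTER] Name lua` section whose `script` and `call` options point at this function.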
Additional context
The Splunk_HEC forwarder does a poor job of formatting logs correctly: metadata added by Fluent Bit should sit only in the `fields` metadata, but every field ends up in the event, including host, source, sourcetype, and index. This causes additional log size and storage/processing costs. To better control what is sent, we utilise the HTTP collector, which gives us more control over exactly what we are sending. For additional context, the Splunk HEC event endpoint will only accept logs formatted in the following structure (note: most of these fields are optional):
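Roughly the documented HEC event shape (values are illustrative):

```json
{
  "time": 1426279439,
  "host": "localhost",
  "source": "datasource",
  "sourcetype": "txt",
  "index": "main",
  "event": { "message": "Something happened" },
  "fields": { "region": "us-west-1" }
}
```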
We are using Fluent Bit to set log context via the `fields` key. All `log` data should be processed so that it ends up under the `event` key.
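In other words, the record we ultimately want to send looks something like this (context values invented for illustration):

```json
{
  "event":  { "log": "done", "stream": "stdout" },
  "fields": { "cluster": "prod", "namespace": "default" }
}
```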