logstash-plugins / logstash-filter-grok

Grok plugin to parse unstructured (log) data into something structured.
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
Apache License 2.0

"Prefix" functionality for grok #146

Closed SolomonShorser-OICR closed 4 years ago

SolomonShorser-OICR commented 5 years ago

Feature Request

The kv plugin has a "prefix" option, which prepends a string to all extracted keys:

https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html#plugins-filters-kv-prefix
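For reference, a minimal sketch of the kv option mentioned above. The `source` field name and the `arg_` prefix are illustrative assumptions, not values from this thread:

```
filter {
    kv {
        # Field to parse for key=value pairs (hypothetical field name)
        source => "query_string"
        # Every extracted key gets this string prepended,
        # so a key "q" would be stored as "arg_q"
        prefix => "arg_"
    }
}
```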

I think it would be useful if a similar prefix function existed for grok, to prefix the fields that are created by grok.

I am using grok to process/extract fields from messages that are calls to web services. Some of the services have similar arguments/parameters, but the differences are significant enough that I'd like to be able to keep them separate in elastic, so I can report on their usage separately.

A simple example:

/ServiceA/download/file/${IDENTIFIER}/svg?q=1234
/ServiceB/token/${TOKEN}/download/${IDENTIFIER}/jpeg/file?q=1234

The patterns (simplified) look something like this:

\/ServiceA\/download\/file\/(?<ServiceA_identifier>[^/]+)\/(?<ServiceA_imageType>[a-zA-Z]+).*
\/ServiceB\/token\/(?<ServiceB_token>[^\/]+)\/download\/(?<ServiceB_identifier>[^/]+)\/(?<ServiceB_imageType>[a-zA-Z]+)\/file\?(?<args>.*)
...
# There are many more, I won't bore you with them all

I have one set of grok patterns for ServiceA and another set for ServiceB. We are interested in which identifiers get sent to ServiceA and ServiceB, so I end up with very long named capture groups such as ServiceA_identifier, ServiceB_identifier, ServiceC_identifier, and so on, which makes the patterns long and ugly. It would be nice if I could do this:

    grok {
        match => ["service_request", "%{SERVICEPATTERN}"]
        patterns_dir => ["/usr/share/logstash/pipeline/patterns/"]
        prefix => "ServiceA_"
    }

so that I could simplify my regular expressions and make them shorter and easier to read:

\/ServiceA\/download\/file\/(?<identifier>[^/]+)\/(?<imageType>[a-zA-Z]+).*
\/ServiceB\/token\/(?<token>[^\/]+)\/download\/(?<identifier>[^/]+)\/(?<imageType>[a-zA-Z]+)\/file\?(?<args>.*)

It would also let me re-use the same patterns with different prefixes.

sliddjur commented 4 years ago

Great idea. I would also like an enhancement to this plugin that adds a target option, just like kv has.

kares commented 4 years ago

The plugin already shipped the target => ... feature in version 4.3.0: https://github.com/logstash-plugins/logstash-filter-grok/pull/156
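A minimal sketch of how `target` might be applied to the use case from the original request, assuming grok 4.3.0 or later. The target name `ServiceA` is an illustrative assumption; the field name, pattern name, and patterns directory are taken from the request above:

```
filter {
    grok {
        match => { "service_request" => "%{SERVICEPATTERN}" }
        patterns_dir => ["/usr/share/logstash/pipeline/patterns/"]
        # Assumed behavior: captures are placed under this namespace,
        # e.g. [ServiceA][identifier] and [ServiceA][imageType]
        target => "ServiceA"
    }
}
```

Note that this nests the captured fields under `ServiceA` rather than prepending a string to each field name, so it is close to, but not the same as, the requested `prefix` behavior.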

SolomonShorser-OICR commented 4 years ago

Thanks! When does 4.3.0 come out?

sliddjur commented 4 years ago

@SolomonShorser-OICR you can just run ./bin/logstash-plugin update logstash-filter-grok to upgrade to 4.3.0. That worked for me, and I am on Logstash 7.6.1.