elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash

Implement exclusive grok #4471

Open nellicus opened 8 years ago

nellicus commented 8 years ago

In a scenario like the below:

grok {
    match => { "message" => ["%{SSH_AUTH_1}","%{SSH_AUTH_2}"] }
    patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
    add_tag => [ "auth_success" ]
}

grok {
    match => { "message" => ["%{SSH_AUTH_3}","%{SSH_AUTH_4}"] }
    patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
    add_tag => [ "auth_failure" ]
}

if "auth_success" in [tags] or "auth_failure" in [tags] {
    mutate {
        remove_tag => [ "_grokparsefailure" ]
    }
}

The goal is to run an event through a list of grok filters and tag (categorize) the event. Today this approach suffers because there is no elegant, user-friendly way to exit the grok filters as soon as one of the patterns has matched. Every grok filter is executed even if a match has already occurred, so "_grokparsefailure" is inevitably added to the event even though a match was made.

It'd be great to explore possibilities for making this less painful, e.g. through an exclusive-grok (just to give the idea):

exclusive-grok {
    # on the first matching grok block, complete that block's execution,
    # then jump to the end of the exclusive-grok block
    grok {
        match => { "message" => ["%{SSH_AUTH_1}","%{SSH_AUTH_2}"] }
        patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
        add_tag => [ "auth_success" ]
    }

    grok {
        match => { "message" => ["%{SSH_AUTH_3}","%{SSH_AUTH_4}"] }
        patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
        add_tag => [ "auth_failure" ]
    }
}

untergeek commented 8 years ago

I suppose this functionality is doable now, but only if each grok line is encapsulated in conditionals, and a "success" tag of some kind is added afterward, e.g.

if [@metadata][grok] != "success" {
  grok {
    # ... rule 1
    add_field => { "[@metadata][grok]" => "success" }
    # only added on successful grok parsing
  }
}
if [@metadata][grok] != "success" {
  grok {
    # ... rule 2
    add_field => { "[@metadata][grok]" => "success" }
    # only added on successful grok parsing
  }
}

... and so forth. Perhaps rather than "exclusive-grok," one potential solution would be to ignore events with a given metadata tag or field value. I'm not sure how others will feel about this, since we moved away from this sort of "match on tag" in favor of conditionals like the ones above, but I can see the value in shortening the file, particularly for situations with lots of grok filters.

Reasons to not change things include the fact that some grok filters build on previous grok efforts. For example, a syslog grok rule might parse out the syslog message field, but you want to parse and identify ssh auth failures separately once they are in that field. In such a case the ability to ignore events with a given metadata field, or only act on events with a given metadata field could prove helpful. Before we added the metadata field, this would not have been a recommended approach, but perhaps now that we can use non-propagating fields for tagging it may be worth reconsidering.
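
As a rough sketch of that layered case (not a tested config): the first grok splits the syslog envelope from the payload and marks the event in @metadata, and the second grok runs only on marked events. The field names "syslog_message" and "[@metadata][stage]" are made up for illustration, and it assumes the SSH_AUTH_3 / SSH_AUTH_4 patterns from the original post are written to match the payload alone.

```
filter {
  # Stage 1: split the syslog envelope from the payload.
  grok {
    match => { "message" => "%{SYSLOGBASE} %{GREEDYDATA:syslog_message}" }
    # add_field is only applied when this grok matches
    add_field => { "[@metadata][stage]" => "syslog" }
  }

  # Stage 2: only look for ssh auth failures in events that stage 1 parsed.
  if [@metadata][stage] == "syslog" {
    grok {
      patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
      match => { "syslog_message" => ["%{SSH_AUTH_3}","%{SSH_AUTH_4}"] }
      add_tag => [ "auth_failure" ]
      # a syslog line that is not an ssh failure is not a parse failure
      tag_on_failure => []
    }
  }
}
```

Because @metadata fields are not sent to outputs, the stage marker does not have to be cleaned up afterwards.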

gmoskovicz commented 8 years ago

I like @nellicus's idea. I have been struggling with this kind of situation, and I ended up with another solution, which is OK but not elegant or user-friendly at all:

grok{
    match => { "message" => ["%{SSH_AUTH_1}","%{SSH_AUTH_2}"] }
    patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
    add_tag => [ "auth_success" ]
}   

if "_grokparsefailure" in [tags]{

    mutate {
        remove_tag => [ "_grokparsefailure" ] 
    }
    grok {
        match => { "message" => ["%{SSH_AUTH_3}","%{SSH_AUTH_4}"] }
        patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
        add_tag => [ "auth_failure" ]
    }
}

if "_grokparsefailure" in [tags]{

    mutate {
        remove_tag => [ "_grokparsefailure" ] 
    }
    grok {
        match => { "message" => ["%{SSH_AUTH_3}","%{SSH_AUTH_4}"] }
        patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
        add_tag => [ "auth_failure" ]
    }
}

if "_grokparsefailure" in [tags]{

    mutate {
        remove_tag => [ "_grokparsefailure" ] 
    }
    grok {
        match => { "message" => ["%{SSH_AUTH_3}","%{SSH_AUTH_4}"] }
        patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
        add_tag => [ "auth_failure" ]
    }
}

You just add a condition after each block to check whether the previous grok failed. If it did, you remove the failure tag and try the next grok. It is a bit more expensive, but it is very safe. An exclusive-grok can help, since you can add tags for each grok and then use these tags to execute another filter; however, you will still need a lot of conditions and if blocks after it. It would be nice to also include something so that, once one grok in the exclusive grok succeeds, you just execute something else. So it's kind of a GoTo action inside the exclusive groks.

I believe that this will help many users and will be much more elegant.

nellicus commented 8 years ago

cc @jsvd if you have any thoughts on this

wiibaa commented 8 years ago

Maybe the combination of the existing break_on_match and the match_and_tag proposal from https://logstash.jira.com/browse/LOGSTASH-1641 would provide a less verbose solution
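
For reference, a minimal sketch of what break_on_match gives you today, reusing the patterns from the original post (untested, and assuming all four patterns live in the same patterns_dir):

```
grok {
    patterns_dir => "/opt/elk/PRODSEC/logstash/config/patterns"
    match => { "message" => ["%{SSH_AUTH_1}","%{SSH_AUTH_2}","%{SSH_AUTH_3}","%{SSH_AUTH_4}"] }
    # the default: stop at the first pattern that matches
    break_on_match => true
}
```

With a single grok, "_grokparsefailure" is only added when none of the patterns match. What is still missing is a per-pattern tag: an add_tag here would be applied whichever pattern matched, so "auth_success" and "auth_failure" cannot be told apart, and that is presumably the gap a match_and_tag option would close.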

aleph-zero commented 6 years ago

It seems to me that we want to add a switch statement to the language.

switch ("message") {
    case:
        grok {
            match => { "message" => ["<some pattern>"] }
            add_tag => [ "<some tag>" ]
            break
        }
    default:
        // whatever
}

jordansissel commented 6 years ago

I try not to change the Logstash DSL because all roads appear to lead to inventing a new programming language, and I don't want to go down that road.

We could extend the grok filter to allow you to specify more clearly a tuple of (pattern, tag) where the tag is applied when the pattern is successful.

That said, do you have an example list of (pattern, tag) pairs? Maybe we could solve this another way?

aleph-zero commented 6 years ago

@jordansissel I'll fill you in on another ticket.