logstash-plugins / logstash-filter-grok

Grok plugin to parse unstructured (log) data into something structured.
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
Apache License 2.0

Incoherent behavior of field references with overwrite #173

Open lucabelluccini opened 2 years ago

Logstash information:

  1. Logstash version (e.g. bin/logstash --version): 7.x

Description of the problem including expected versus actual behavior:

The following filters should be equivalent, but they behave differently.

grok {
  overwrite => [ "[b]" ]
  match => { "[a]" => "%{DATA:b}" } # NOT OK: b becomes an array
}
grok {
  overwrite => [ "[b]" ]
  match => { "[a]" => "%{DATA:[b]}" } # THIS WORKS OK
}
grok {
  overwrite => [ "b" ]
  match => { "[a]" => "%{DATA:[b]}" } # NOT OK: b becomes an array
}
grok {
  overwrite => [ "b" ]
  match => { "[a]" => "%{DATA:b}" } # THIS WORKS OK
}

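As a temporary workaround, use the same notation (either b or [b]) in both the overwrite setting and the capture name of the match pattern.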

Steps to reproduce:

Pipeline:

input {
  generator {
    codec => "json"
    lines => [ '{ "a": "A", "b":"B" }']
    count => 1
  }
}

output { stdout { codec => rubydebug } }
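
For convenience, the fragments above can be assembled into a single pipeline file. The sketch below assumes one of the grok filters from the examples that follow (here the overwrite "[b]" / %{DATA:b} combination) is placed in a filter block between the input and the output; the filter block placement and the choice of variant are assumptions for illustration, not part of the original report.

```
input {
  generator {
    codec => "json"
    lines => [ '{ "a": "A", "b":"B" }' ]
    count => 1
  }
}

filter {
  # Swap in any of the grok variants from the examples to compare results.
  grok {
    overwrite => "[b]"
    match => { "[a]" => "%{DATA:b}" }
  }
}

output { stdout { codec => rubydebug } }
```

Saved to a file (for example a hypothetical repro.conf), this could be run with bin/logstash -f repro.conf and the printed event compared against the results listed under each example.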

Example I

  grok {
    overwrite => [ "b" ]
    match => { "[a]" => "%{GREEDYDATA:b}" }
  }

Result (OK):

{
      "@version" => "1",
    "@timestamp" => 2021-10-19T14:41:17.927Z,
          "host" => "Lucas-MacBook-Pro.local",
             "b" => "A",
      "sequence" => 0,
             "a" => "A"
}

Example II

  grok {
    overwrite => "[b]"
    match => { "[a]" => "%{DATA:b}" }
  }

Produces the following (NOT OK: b is turned into an array instead of being overwritten):

{
             "b" => [
        [0] "B",
        [1] "A"
    ],
      "sequence" => 0,
      "@version" => "1",
    "@timestamp" => 2021-10-19T14:43:08.937Z,
          "host" => "Lucas-MacBook-Pro.local",
             "a" => "A"
}

Example III

  grok {
    overwrite => "b"
    match => { "[a]" => "%{DATA:[b]}" }
  }

Produces the following (NOT OK: b is turned into an array instead of being overwritten):

{
      "@version" => "1",
             "a" => "A",
          "host" => "Lucas-MacBook-Pro.local",
             "b" => [
        [0] "B",
        [1] "A"
    ],
    "@timestamp" => 2021-10-19T14:44:16.308Z,
      "sequence" => 0
}

Example IV

  grok {
    overwrite => "[b]"
    match => { "[a]" => "%{DATA:[b]}" }
  }

Produces the following (OK):

{
      "sequence" => 0,
             "b" => "A",
    "@timestamp" => 2021-10-19T14:45:21.302Z,
             "a" => "A",
      "@version" => "1",
          "host" => "Lucas-MacBook-Pro.local"
}