logstash-plugins / logstash-input-s3

Metadata missing from last event when using multiline codec #153

Closed mllacek closed 2 years ago

mllacek commented 6 years ago

Opening this issue as a result of this forum post: https://discuss.elastic.co/t/multiline-plugin-metadata-missing-from-last-line/136725

To summarize: when the multiline codec combines multiple lines into one event, the metadata is missing from the last event of the file. From the discussion, it looks like [@metadata][s3][key] is not set on the event that is emitted when the codec is flushed (line 220).

My configuration:

input {
    s3 {
        bucket => "bucket_name"
        region => "us-east-2"
        codec => multiline {
            pattern => "^(%{DATESTAMP})"
            negate => "true"
            what => "previous"
        }
    }
}
filter { mutate { add_field => { "file_name" => "%{[@metadata][s3][key]}"}} }
output{ stdout { codec => rubydebug } }

Sample input file:

06-19-2018 15:25:35.7046|ERROR
    more info...
06-19-2018 15:25:35.7046|DEBUG
    more info...
06-19-2018 15:25:35.7046|INFO
    more info...

Logstash output:

{
    "@timestamp" => 2018-06-20T14:41:09.998Z,
       "message" => "06-19-2018 15:25:35.7046|ERROR\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "sampleLog.txt"
}
{
    "@timestamp" => 2018-06-20T14:41:09.998Z,
       "message" => "06-19-2018 15:25:35.7046|DEBUG\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "sampleLog.txt"
}
{
    "@timestamp" => 2018-06-20T14:41:09.999Z,
       "message" => "06-19-2018 15:25:35.7046|INFO\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "%{[@metadata][s3][key]}"
}

Frikitrok commented 5 years ago

I have the same problem, and it overshadows everything else the multiline codec offers. None of the tracebacks I log carry metadata, which makes them useless.

jinliangXX commented 5 years ago

With the s3 input plugin, when the multiline codec is used (and actually takes effect), the metadata is missing and add_field has no effect (add_tag seems to have no effect either). My current version is 6.7.0.

I will try version 7.0 to see whether the multiline codec still causes the metadata to go missing.

I have tried version 7.0.0, and the metadata is still missing.

My final plan is to use the aggregate filter plugin, but that requires setting the Logstash filter workers to 1 (the -w 1 flag).

duaraghav8 commented 5 years ago

I'm facing the same issue. If I use the multiline codec with this plugin, type, add_field & tags - all are ineffective.

What's the best workaround? My constraints are that I HAVE to set any of the above fields in the input itself so I can use it to conditionally output the events to separate destinations.

I can't even use the id in conditionals, since it doesn't stick to events.

duaraghav8 commented 5 years ago

UPDATE: I fixed it in my fork, added a PR

TheVastyDeep commented 5 years ago

It's not just the decorate that is missing when the codec is flushed. All of the metadata handling is skipped. Basically, all of the stuff done in the '@codec.decode(line) do |event|' loop also has to be done in the '@codec.flush do |event|' block.
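A rough sketch of that idea, in plain Ruby rather than the plugin's actual code: BufferingCodec, finalize_event and process_lines are hypothetical names, a Hash stands in for a Logstash event, and the only metadata set is the S3 key. The point is only that the decode loop and the flush block call the same helper, so the last record gets the same handling as the others.

```ruby
# Hypothetical stand-in for a buffering codec such as multiline.
# This is NOT the real Logstash codec API, just enough to show the shape.
class BufferingCodec
  def initialize(start_pattern)
    @start_pattern = start_pattern
    @buffer = []
  end

  # Yields a completed event only when a line starting a new record arrives.
  def decode(line)
    if line =~ @start_pattern && !@buffer.empty?
      yield({ "message" => @buffer.join("\n") })
      @buffer = []
    end
    @buffer << line
  end

  # Yields whatever is still buffered once the file has been read.
  def flush
    return if @buffer.empty?
    yield({ "message" => @buffer.join("\n") })
    @buffer = []
  end
end

# Shared per-event handling (assumed here to be just the S3 key),
# kept in one place so decode and flush cannot drift apart.
def finalize_event(event, key)
  event["@metadata"] = { "s3" => { "key" => key } }
  event
end

def process_lines(lines, codec, key)
  events = []
  lines.each do |line|
    codec.decode(line) { |event| events << finalize_event(event, key) }
  end
  # The last record only surfaces here, so it needs the same handling.
  codec.flush { |event| events << finalize_event(event, key) }
  events
end

lines = [
  "06-19-2018 15:25:35.7046|ERROR", "    more info...",
  "06-19-2018 15:25:35.7046|DEBUG", "    more info...",
  "06-19-2018 15:25:35.7046|INFO",  "    more info..."
]
codec = BufferingCodec.new(/^\d{2}-\d{2}-\d{4}/)
events = process_lines(lines, codec, "sampleLog.txt")
events.each do |event|
  first_line = event["message"].lines.first.chomp
  puts first_line + " -> " + event["@metadata"]["s3"]["key"]
end
```

In the real plugin the helper would presumably also have to cover decorate and any other per-event steps from the decode loop; which steps those are is not spelled out in this thread.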

offerbaruch commented 4 years ago

Hi,

Just hit this as well... Just wondering if there are any plans to fix this anytime soon.

Thanks!

stiller-leser commented 3 years ago

@yaauie Given how many PRs there are for this (#218, #190, #173; see also https://github.com/elastic/logstash/issues/9686), could you consider merging one of them?