magnusbaeck / logstash-filter-verifier

Apache License 2.0

Multiline yaml strings treated as separate events #190

Open mutt13y opened 7 months ago

mutt13y commented 7 months ago

It does not seem possible to inject multiline log data with the default codec:

  - input:
      - |-
        2023-11-15 07:20:00.000000 | CRITICAL | TEST_UNIT |  message in a bottle | foo=bar
        backtrace
        stuff
    expected:
      - "@timestamp": "2023-11-15T07:20:00.000Z"
        message: message in a bottle
        event:
          original: 2023-11-15 07:20:00.000000 | CRITICAL | TEST_UNIT | message in a bottle | foo=bar\nbacktrace\nstuff

With this input data, both 1.6.2 and 2.0.0 beta 2 fail with the error "Expected 1 event got 3 instead".

It is possible to work around this by using the json_lines codec and writing the newlines as \\n.
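
As a minimal sketch of that workaround, Python's json.dumps can produce the single-line form that json_lines expects. The helper name to_json_lines_input is made up for illustration and is not part of logstash-filter-verifier. A second json.dumps gives the doubly escaped form (\\n) needed when the line is itself embedded as a string in a JSON test case file:

```python
import json


def to_json_lines_input(raw):
    """Encode one multi-line log record as a single json_lines input line.

    Hypothetical helper for illustration; not part of logstash-filter-verifier.
    """
    # json.dumps writes each embedded newline as a backslash followed by "n",
    # so the resulting JSON document contains no raw line break.
    return json.dumps({"message": raw})


record = (
    "2023-11-15 07:20:00.000000 | CRITICAL | TEST_UNIT |  message in a bottle | foo=bar\n"
    "backtrace\n"
    "stuff"
)

line = to_json_lines_input(record)  # one physical line, newlines written as \n
embedded = json.dumps(line)         # the same line as a JSON string literal, newlines written as \\n

print(line)
print(embedded)
```

Decoding the line again with json.loads returns the original three-line message, which is what the filter under test should then see as a single event.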

jgough commented 7 months ago

What do you mean specifically? The default codec splits its input on \n, so your multiline input becomes 3 separate events. This is expected Logstash behaviour.
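
As a rough Python illustration of that splitting (a simplification, not Logstash's actual implementation; split_into_events is a made-up name), the YAML block scalar from the test case arrives as one string with two embedded newlines, and a line-oriented codec turns it into three events:

```python
def split_into_events(raw):
    """Mimic a line-oriented codec: every newline terminates one event."""
    return [{"message": part} for part in raw.split("\n")]


# What the YAML "|-" block in the test case evaluates to: lines joined by \n,
# with no trailing newline.
test_input = (
    "2023-11-15 07:20:00.000000 | CRITICAL | TEST_UNIT |  message in a bottle | foo=bar\n"
    "backtrace\n"
    "stuff"
)

events = split_into_events(test_input)
print(len(events))  # 3
for event in events:
    print(event["message"])
```

This matches the reported error: one event was expected, but three were produced.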

For Logstash not to split on \n, you'll need to use the multiline codec on your input plugin. You'll need something like this:

        codec => multiline {
            charset => "UTF-8"
            pattern => "^\d\d\d\d-\d\d-\d\d"
            # Lines that do NOT start with a date are appended to the previous line
            negate => true
            what => "previous"
            auto_flush_interval => 10
        }

mutt13y commented 7 months ago

When I say codec, I mean the codec specified in the filter-verifier config file.

I have multiline set in Beats, but not in the Logstash beats input plugin. On the production system, this causes matching multiline events to be sent into Logstash as a single event.

With the filter-verifier, I can simulate this by setting codec: json_lines in the filter-verifier config. I am not able to simulate this using a YAML file.

So perhaps the question is how I can replicate this:

{
  "codec": "json_lines",

  "testcases": [
    {
    "input": [
        "{\"message\": \"2020-09-23 07:20:00.000000 | CRIT |  Something broke | foo=bar \\r\\nextra stuff\\r\\nextra extra stuf\"}"
        ],
...

using a YAML input?