magnusbaeck / logstash-filter-verifier


Logstash 7 support #68

Closed matejzero closed 5 years ago

matejzero commented 5 years ago

I tried running my tests with Logstash 7 and they don't work "all the way".

Most of the tests pass, but for some of them it seems like LFV takes one input and compares it against the expected output of a different test case.

In one of my test files this happens to the very first test case, while, weirdly enough, tests 11-13 from the same file work OK.

It's the same with some other test files: some of the tests work and some don't.

Example:

    {
      "input": [
        "Jun  1 01:02:03 testhost radiusd: Login OK: [user@example.com] (from client client.example.com port 12 cli D0-B2-12-B4-F7-BB via TLS tunnel)"
      ],
      "expected": [
        {
          "@timestamp": "2019-06-01T01:02:03.000Z",
          "timestamp_utc": "2019-06-01T01:02:03.000Z",
          "timestamp": "Jun  1 01:02:03",
          "host": "testhost",
          "program": "radiusd",
          "type": "radiusd",
          "app": "radiusd",
          "message": "Login OK: [user@example.com] (from client client.example.com port 12 cli D0-B2-12-B4-F7-BB via TLS tunnel)",
          "status": "ok",
          "client": "client.example.com",
          "mac": "D0-B2-12-B4-F7-BB",
          "username": "user@example.com",
          "realm": "example.com",
          "user": "user",
          "tags": [
            "_grok_syslog_prefilter_success",
            "save_to_elasticsearch",
            "_grok_radiusd_match"
          ]
        }
      ]
    }

but its actual result is compared against the expected output of this test case:

    {
      "input": [
        "Jun  1 01:02:03 testhost radiusd: rlm_radutmp: rlm_eap: SSL error error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca"
      ],
      "expected": [
        {
          "@timestamp": "2019-06-01T01:02:03.000Z",
          "timestamp_utc": "2019-06-01T01:02:03.000Z",
          "timestamp": "Jun  1 01:02:03",
          "host": "testhost",
          "program": "radiusd",
          "type": "radiusd",
          "app": "radiusd",
          "message": "rlm_radutmp: rlm_eap: SSL error error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca",
          "module": "radutmp",
          "error": "rlm_eap: SSL error error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca",
          "status": "other",
          "tags": [
            "_grok_syslog_prefilter_success",
            "save_to_elasticsearch",
            "_grok_radiusd_match",
            "save_to_logfile",
            "save_to_elasticsearch"
          ]
        }
      ]
    }

Output from LFV:

    Comparing message 1 of 13 from radius.json...
    --- /tmp/872842369/radius.json/1/expected   2019-04-16 07:24:46.256301746 +0200
    +++ /tmp/872842369/radius.json/1/actual 2019-04-16 07:24:46.256301746 +0200
    @@ -1,21 +1,20 @@
     {
       "@timestamp": "2019-06-01T01:02:03.000Z",
       "app": "radiusd",
    -  "client": "client.example.com",
    +  "error": "rlm_eap: SSL error error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca",
       "host": "testhost",
    -  "mac": "D0-B2-12-B4-F7-BB",
    -  "message": "Login OK: [user@example.com] (from client client.example.com port 12 cli D0-B2-12-B4-F7-BB via TLS tunnel)",
    +  "message": "rlm_radutmp: rlm_eap: SSL error error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca",
    +  "module": "radutmp",
       "program": "radiusd",
    -  "realm": "example.com",
    -  "status": "ok",
    +  "status": "other",
       "tags": [
         "_grok_syslog_prefilter_success",
         "save_to_elasticsearch",
    -    "_grok_radiusd_match"
    +    "_grok_radiusd_match",
    +    "save_to_logfile",
    +    "save_to_elasticsearch"
       ],
       "timestamp": "Jun  1 01:02:03",
       "timestamp_utc": "2019-06-01T01:02:03.000Z",
    -  "type": "radiusd",
    -  "user": "user",
    -  "username": "user@example.com"
    +  "type": "radiusd"
     }

Any ideas?

matejzero commented 5 years ago

After checking some other runs, it looks like it fails randomly: in another test file, the output of the 1st test case was compared against the expected results of the 4th.

It looks like input and output ordering is not guaranteed. I took a quick look at the breaking changes for Logstash 7 but didn't find anything that would break the ordering.
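
For what it's worth, Logstash runs its filter stage on multiple worker threads by default (one per CPU core, with a batch size of 125), so events can leave the pipeline in a different order than they arrived. One way to test that hypothesis would be to force single-worker, single-event processing; a sketch of the relevant settings (these are the standard logstash.yml keys and their CLI shorthands, nothing LFV-specific):

    # logstash.yml: force strictly sequential event processing
    pipeline.workers: 1      # CLI shorthand: -w 1
    pipeline.batch.size: 1   # CLI shorthand: -b 1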

jamiegwatkin commented 5 years ago

I can confirm I'm experiencing the same weirdness with https://artifacts.elastic.co/downloads/logstash/logstash-7.0.0.deb and logstash-filter-verifier_1.5.0_linux_amd64.tar.gz.

logstash-filter-verifier seems to get confused when a file contains multiple test cases: I'm seeing multiple failures where an "input" line I've defined is compared against the wrong "expected" block, causing really weird diffs.

Here's a redacted example:

Output of /logstash-filter-verifier test_data/production_ecs.json roles/logstash/templates/main.conf

    Comparing message 3 of 5 from production_ecs.json ($REDACTED Service)...
    --- /tmp/481031650/production_ecs.json/3/expected   2019-04-16 15:22:23.635197530 +0000
    +++ /tmp/481031650/production_ecs.json/3/actual 2019-04-16 15:22:23.635197530 +0000
    @@ -1,16 +1,13 @@
     {
    -  "bytes_in": 0,
    -  "bytes_out": 5,
    -  "id": "34687115637880113099407548222622387368389000080156131328",
    -  "latency": 5236,
    -  "latency_human": "5.236µs",
    -  "log_stream": "$CUSTOM_SERVICE_NAME_1/$CUSTOM_SERVICE_NAME_1/1b44419d-f352-4c35-90bc-c569c12697d9",
    -  "message": "{\"time\":\"2019-04-16T14:14:20.007377065Z\",\"id\":\"\",\"remote_ip\":\"127.0.0.1\",\"host\":\"$REDACTED.$REDACTED.com\",\"method\":\"GET\",\"uri\":\"/ping\",\"status\":200, \"latency\":5236,\"latency_human\":\"5.236µs\",\"bytes_in\":0,\"bytes_out\":5}",
    -  "method": "GET",
    -  "remote_ip": "127.0.0.1",
    -  "status": 200,
    -  "time": "2019-04-16T14:14:20.007377065Z",
    -  "timestamp": 1555424060007,
    -  "type": "cloudwatch-logs-production",
    -  "uri": "/ping"
    +  "component": "$REDACTED_REDACTED",
    +  "environment": "production",
    +  "id": "34687131043903918600317933922169995268568875280802250778",
    +  "log_stream": "$CUSTOM_SERVICE_NAME_2/$CUSTOM_SERVICE_NAME_2/0d146096-0128-4224-ba95-e2f4975490db",
    +  "message": "D, [2019-04-16T14:25:50.837412 #11] DEBUG -- : [6cbaefc1-9888-4209-9430-a6eaefd259c6]   \u001b[1m\u001b[35m (0.4ms)\u001b[0m  \u001b[1m\u001b[35mSET SESSION \"$REDACTED.whodunnit\"=''\u001b[0m",
    +  "tags": [
    +    "_jsonparsefailure"
    +  ],
    +  "task_id": "0d146096-0128-4224-ba95-e2f4975490db",
    +  "timestamp": 1555424750837,
    +  "type": "cloudwatch-logs-production"
     }

As you can see, the expected keys are completely different; logstash-filter-verifier is comparing the actual event against the expected keys from a different test case.

I'm not encountering this issue when I only define a single test case per JSON file.
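
One workaround, then, is to split each case into its own file. A minimal sketch of what such a single-case file could look like, reusing the structure of the radius.json example above (the input line and expected fields are illustrative, not from my real tests):

    {
      "input": [
        "Jun  1 01:02:03 testhost radiusd: Login OK: [user@example.com] (from client client.example.com port 12 cli D0-B2-12-B4-F7-BB via TLS tunnel)"
      ],
      "expected": [
        {
          "host": "testhost",
          "program": "radiusd",
          "status": "ok"
        }
      ]
    }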

langoureaux-s commented 5 years ago

Hi,

I have the same issue on Logstash 7.2.0. It seems that Logstash no longer guarantees event order when processing, so the events end up in the wrong order in the temporary file that contains the output.

You can mitigate this issue by adding the following arguments when you run logstash-filter-verifier: --logstash-arg=-b --logstash-arg 1
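
For reference, a complete invocation with that mitigation might look like this (a sketch; the test file and config paths are borrowed from the earlier comments, and -b is Logstash's shorthand for pipeline.batch.size):

    logstash-filter-verifier --logstash-arg=-b --logstash-arg 1 \
        test_data/production_ecs.json roles/logstash/templates/main.conf

With a batch size of 1, each event is processed and flushed individually, which appears to be enough to keep LFV's output file in input order.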

magnusbaeck commented 5 years ago

Thanks @langoureaux-s! I wasn't able to reproduce the problem (though I didn't try that hard either), but I've published a 1.5.1 release that adds -b 1 to the default Logstash command line. I hope this addresses the problem for everyone.