magnusbaeck / logstash-filter-verifier

Apache License 2.0
191 stars 27 forks source link

The [host] is always removed in every case, making it impossible to test filters that modify it #152

Open nicolasreich opened 2 years ago

nicolasreich commented 2 years ago

The host field is always removed, I believe there, which makes it impossible to test filters that modify this field.

Example config:

input {
  stdin {
    id => json_lines
    codec => json
  }
}

filter {
  mutate { 
    rename => { "[host][before]" => "[host][after]" }
    id => "mutate"
  }
}

output {
  tcp {
    id => "stdout"
    codec => "json_lines"
  }
} 

Example test:

---
ignore:
  - "@timestamp"
input_plugin: "json_lines"
testcases:
  - input:
      - >
        {
          "host": {"before": "hello"},
          "field1": "value1",
          "field2": "value2"
        }
    expected:
      - host:
          after: "hello"
        field1: value1
        field2: value2

Example output:

bash-4.2$ ./logstash-filter-verifier daemon run --pipeline ../logstash/config/single_rule.yml --testcase-dir testcases/mutate_test/mutate3.test.yaml 
☐ Comparing message 1 of 1 from mutate3.test.yaml:
--- /tmp/533008966/mutate3.test.yaml/1/expected 2021-10-13 12:12:28.567867195 +0200
+++ /tmp/533008966/mutate3.test.yaml/1/actual   2021-10-13 12:12:28.567867195 +0200
@@ -1,7 +1,4 @@
 {
   "field1": "value1",
-  "field2": "value2",
-  "host": {
-    "after": "hello"
-  }
+  "field2": "value2"
 }

Summary: ☐ All tests: 0/1
         ☐ mutate3.test.yaml: 0/1
logstash-filter-verifier: error: failed test cases
jgough commented 2 years ago

I believe host is something that is added by the generator input plugin that LFV uses to inject the events. https://github.com/logstash-plugins/logstash-input-generator/blob/b3fd6064aabb046054784de94287a322a945eb21/lib/logstash/inputs/generator.rb#L69

Looks like the problem here is that the JSON input codec needs to be injected into the input plugin and there is no way of telling if the host field came from the input codec or the generator. I think LFV1 did not remove this field (it was frustrating having to ignore it on all tests when host was not used). Not sure what changes could be made here short of not using the generator plugin. A custom LFV input plugin maybe??

I suspect the same would apply to a field called sequence

breml commented 2 years ago

It is correct, that currently the field host is always removed, if this field is set via input and codec (https://github.com/magnusbaeck/logstash-filter-verifier/blob/master/internal/daemon/session/files.go#L48). It is for sure debatable, if this is the correct thing to do. In my experience, this field was never of value in my test cases, because as @jgough mentioned, this field gets populated by the Logstash input plugin with the hostname of the system where the tests are executed. So in my experience one normally put this field in the ignore fields anyway.

That being said, it is possible to set the host field via fields in the test case file.

nicolasreich commented 2 years ago

Yeah I think setting it via fields is perfectly fine once https://github.com/magnusbaeck/logstash-filter-verifier/issues/155 is fixed, maybe it could be mentioned in the doc to make it clearer though.

breml commented 2 years ago

@nicolasreich It would be great, if you could provide a PR for the updated documentation. With the change in PR #156 the following fields are removed by default:

All these fields can be added again using fields in the configuration.