magnusbaeck / logstash-filter-verifier

Apache License 2.0

Ignoring sub-field #47

Closed (leosunmo closed 4 years ago)

leosunmo commented 7 years ago

I've been trying to figure out how to ignore sub-fields (if that's the correct word for it).

The easiest way to show what I mean is probably to paste the testcase and diff output.

First the diff:

{                                                  {
  "dup_message": "This is a test message",         "dup_message": "This is a test message",
  "event": {                                       "event": {
    "format": ""                                     "format": ""
  },                                               },
  "logstash": {                                    "logstash": {
    "index_time": "2017-07-03T02:14:44+00:00",   |     "index_time": "2017-07-03T03:57:04+00:00",
    "indexer": "local-logstash",                     "indexer": "local-logstash",
    "processing": {                                  "processing": {
      "duration": 0.029000043869018555,          |       "duration": 0.009999990463256836,
      "lag": 7.522000074386597                   |       "lag": 6.716000080108643
    },                                               },
    "size": {                                        "size": {
      "approx": 317                                    "approx": 317
    }                                                }
  },                                               },
  "message": "This is a test message"              "message": "This is a test message"
}                                                  }

As you can see, I need to ignore at least logstash.processing.duration, logstash.processing.lag, and logstash.index_time.

What I've tried is this:

{
  "codec": "json",
  "ignore": ["host", "@timestamp", "logstash.index_time", "logstash.processing.duration", "logstash.processing.lag"],
  "input": [
    "{\"message\": \"This is a test message\"}"
  ],
  "expected": [
    {
      "dup_message": "This is a test message",
      "event": {
        "format": ""
      },
      "logstash": {
        "index_time": "2017-07-03T02:14:44+00:00",
        "indexer": "local-logstash",
        "processing": {
          "duration": 0.029000043869018555,
          "lag": 7.522000074386597
        },
        "size": {
          "approx": 317
        }
      },
      "message": "This is a test message"
    }
  ]
}

That didn't seem to work. If I put "logstash" in the ignore array, the whole "logstash" map is ignored. That could serve as a workaround, but I'd prefer to keep fields like logstash.size, since it's a pretty good check to have.

Let me know if I've missed something obvious here. In the meantime I'll try to read the code and figure out if it's possible.

Thanks in advance

magnusbaeck commented 7 years ago

This is unfortunately a limitation in the current implementation of the ignore functionality, i.e. only top-level fields can be ignored. This should be addressed by allowing subfields to be ignored via the Logstash syntax ("[field][subfield]"). Until this has been done I can document the limitation in README.md.
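To make the proposed behaviour concrete, here is a minimal Python sketch of how a "[field][subfield]" reference could be resolved against a nested event. It is not the tool's actual code (logstash-filter-verifier is a separate program); the helper names parse_field_ref and remove_ignored are hypothetical, and a name without brackets is assumed to mean a top-level key, matching the limitation described above.

```python
import copy
import re

# Hypothetical helpers for illustration only; not taken from the tool.
_FIELD_REF = re.compile(r"\[([^\[\]]+)\]")


def parse_field_ref(ref):
    """Split "[a][b]" into ["a", "b"]; a bare name like "host" becomes ["host"]."""
    if ref.startswith("["):
        parts = _FIELD_REF.findall(ref)
        # Reject malformed references such as "[a][b" or "[a]x[b]".
        if "".join("[%s]" % p for p in parts) != ref:
            raise ValueError("malformed field reference: %r" % ref)
        return parts
    return [ref]


def remove_ignored(event, ignore):
    """Return a copy of event with every field named in ignore removed."""
    result = copy.deepcopy(event)
    for ref in ignore:
        path = parse_field_ref(ref)
        node = result
        # Walk down to the parent of the field that should be deleted.
        for key in path[:-1]:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict):
            node.pop(path[-1], None)  # a missing field is silently skipped
    return result


# Event shaped like the actual output in the diff above.
event = {
    "logstash": {
        "index_time": "2017-07-03T03:57:04+00:00",
        "indexer": "local-logstash",
        "processing": {"duration": 0.009999990463256836, "lag": 6.716000080108643},
        "size": {"approx": 317},
    },
    "message": "This is a test message",
}
ignore = [
    "host",
    "[logstash][index_time]",
    "[logstash][processing][duration]",
    "[logstash][processing][lag]",
]
cleaned = remove_ignored(event, ignore)
```

In this sketch the stable fields (logstash.indexer, logstash.size) survive, so they can still be checked. A dotted name such as "logstash.index_time" is looked up as a literal top-level key and therefore removes nothing, which is consistent with the attempt above having no effect.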

leosunmo commented 7 years ago

Thanks for the quick reply, Magnus. I'll try to work with it the way it is for now, and maybe take a look at adding the functionality myself.

magnusbaeck commented 4 years ago

Fixed in PR #70.