dmachard / go-dnscollector

Ingesting, pipelining, and enhancing your DNS logs with usage indicators, security analysis, and additional metadata.
MIT License

Transforms not being applied correctly on matching statements - detect invalid config #522

Closed by johnhtodd 9 months ago

johnhtodd commented 10 months ago

This is using the pipeline_mode branch.

Describe the bug

Transformations do not seem to be working in some circumstances. Either that, or my regexp is poor for the matching criteria and there is a bug in the "drop-unmatched" logic. Either way, something is wrong. :-)

Expected behavior

I expect the IP address anonymization to be applied, GeoIP data to be attached, and tag data to be attached to the packet. If my regexp is incorrectly formatted, I would expect no data at all to show up in the blocks.log file since I am applying "drop-unmatched" as the policy.

Additional context

Here is the config file snippet. I have other ports active in this configuration file doing other processing, and the "prom" prometheus stanza is excluded.

  - name: block-messages
    dnstap:
      listen-ip: 0.0.0.0
      listen-port: 59311
      chan-buffer-size: 131070
    transforms:
      normalize:
        qname-lowercase: true
    routes: [ blocks-geotag, prom ]

  - name: blocks-geotag
    dnsmessage:
      matching:
        include:
          dnstap.operation: "CLIENT_QUERY"
      policy: "drop-unmatched"
      transforms:
        user-privacy:
          anonymize-ip: true
        geoip:
          mmdb-country-file: "/root/maxmind/GeoIP2-Country_20231117/GeoIP2-Country.mmdb"
          mmdb-city-file: "/root/maxmind/GeoIP2-City_20231117/GeoIP2-City.mmdb"
          mmdb-asn-file: "/root/maxmind/GeoLite2-ASN_20231117/GeoLite2-ASN.mmdb"
        atags:
          tags: [ "TXT:blocks-geotag-hit" ]
    routes: [ blocks-out ]

  - name: blocks-out
    logfile:
      file-path: "/tmp/blocks.log"
      max-size: 10000
      max-files: 10
      mode: json

Here is an example of the data in the /tmp/blocks.log file (after parsing via jq). The tags and GeoIP data are missing, and the client's full IP address is still present:

{
  "network": {
    "family": "IPv4",
    "protocol": "DOT",
    "query-ip": "10.44.19.232",
    "query-port": "40562",
    "response-ip": "9.9.9.9",
    "response-port": "853",
    "ip-defragmented": false,
    "tcp-reassembled": false
  },
  "dns": {
    "length": 41,
    "opcode": 0,
    "rcode": "NOERROR",
    "qname": "imaginarydomainname.net",
    "qtype": "A",
    "flags": {
      "qr": false,
      "tc": false,
      "aa": false,
      "ra": false,
      "ad": false
    },
    "resource-records": {
      "an": [],
      "ns": [],
      "ar": []
    },
    "malformed-packet": false
  },
  "edns": {
    "udp-size": 4096,
    "rcode": 0,
    "version": 0,
    "dnssec-ok": 0,
    "options": []
  },
  "dnstap": {
    "operation": "CLIENT_QUERY",
    "identity": "res310.pdx",
    "version": "dnsdist 0.0.0.HEAD.g",
    "timestamp-rfc3339ns": "2023-12-21T19:36:10.179766563Z",
    "latency": "0.000000",
    "extra": "cdb:{sources={\"threatprovider1.mainfeed\"}}"
  }
}

dmachard commented 9 months ago

I think your problem is not related to the pipeline_mode branch.

If your copy/paste is correct, I see an indentation issue in your configuration. The transforms section in your stanza blocks-geotag should be at the same level as dnsmessage.
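
For illustration, the blocks-geotag stanza with only that change applied might look like the sketch below. The keys and values are copied from the configuration in the report; the one difference is that transforms is moved out one level so that it sits beside dnsmessage. The placement of the other keys (matching, policy, routes) is left exactly as reported and has not been checked against the DNScollector documentation.

```yaml
  # Sketch of the corrected stanza: "transforms" is a sibling of
  # "dnsmessage" (same indentation), not nested inside it.
  - name: blocks-geotag
    dnsmessage:
      matching:
        include:
          dnstap.operation: "CLIENT_QUERY"
      policy: "drop-unmatched"
    transforms:
      user-privacy:
        anonymize-ip: true
      geoip:
        mmdb-country-file: "/root/maxmind/GeoIP2-Country_20231117/GeoIP2-Country.mmdb"
        mmdb-city-file: "/root/maxmind/GeoIP2-City_20231117/GeoIP2-City.mmdb"
        mmdb-asn-file: "/root/maxmind/GeoLite2-ASN_20231117/GeoLite2-ASN.mmdb"
      atags:
        tags: [ "TXT:blocks-geotag-hit" ]
    routes: [ blocks-out ]
```

In the reported configuration, transforms was indented under dnsmessage, so it was read as an unknown key of that collector and silently ignored, which matches the missing anonymization, GeoIP, and tag data in the output.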

DNScollector does not raise an error when unknown keys are present in YAML config file. I will look into improving that.

johnhtodd commented 9 months ago

You're right. I'll try to fix that. I am not really a big fan of spaces driving configuration, but I understand that's the way it is so I'll just grumble and look more closely at where my cursor is on the screen. Having error outputs would be useful as it could flag those things, yes - that is already a problem that has cost me quite a bit of time and I'm not even in a production environment. Another method to allow checking would be to have a "pre-flight" parser that checked the configs (Prometheus does this, as an example, with "promtool check config") and at least validated them for correctness.

johnhtodd commented 9 months ago

Yes, false alarm - this works fine after indentation repair.

dmachard commented 9 months ago

The -test-config argument is available to do a dry run. I see that the documentation needs to be updated as well. This option has been added by @pieterlexis-tomtom

$ go run . -test-config
INFO: 2023/12/23 14:53:35.522232 main - config OK!

PR #523 in progress to improve this part.

dmachard commented 9 months ago

A new function has been added to detect invalid configuration.

dmachard commented 9 months ago

completed via #523