dmachard / go-dnscollector

Ingesting, pipelining, and enhancing your DNS logs with usage indicators, security analysis, and additional metadata.
MIT License

GeoIP database seems not working with Unbound DNSTAP input #158

Closed helixzz closed 1 year ago

helixzz commented 1 year ago

Hi everyone.

I'm using the latest release, 0.25.0-beta1 (from GitHub releases), along with Unbound 1.13.1 (from Ubuntu APT), which acts as a forwarder and cache for our team. I use fluentd as the output pipe, do some field manipulation there (that's another story), and eventually put all DNS logs into an Elasticsearch instance.

In short, the problem occurred when I tried to add GeoIP statistics to our DNS logs. I'm using mmdb files from the GeoLite repository, and the startup logs of go-dns-collector show that the GeoIP databases seem to load correctly. However, I can't find any recognized ASNs, country codes, or cities in the output stream, whether I use the text output or the fluentd output.

Here is my go-dns-collector configuration.

multiplexer:
  collectors:
    - name: tap
      dnstap:
        listen-ip: 0.0.0.0
        listen-port: 10053
        tls-support: false
        cache-support: true
      transforms:
        geoip:
          mmdb-country-file: "GeoLite2-Country.mmdb"
          mmdb-city-file: "GeoLite2-City.mmdb"
          mmdb-asn-file: "GeoLite2-ASN.mmdb"

  loggers:
    - name: console
      stdout:
        mode: text
        text-format: "timestamp-rfc3339ns identity country city as-number operation rcode queryip queryport family protocol length qname qtype answer aa latency"
    - name: fluentd
      fluentd:
        transport: tcp
        remote-address: 127.0.0.1
        remote-port: 24224
        retry-interval: 5
        tag: "dns.collector"
        tls-support: false
        tls-insecure: false
  routes:
    - from: [ tap ]
      to: [ fluentd, console ]

Here's my Unbound configuration related to DNSTAP:

dnstap:
    dnstap-enable: yes
    dnstap-ip: 10.201.10.33@10053
    dnstap-tls: no
    dnstap-send-identity: yes
    dnstap-send-version: yes
    dnstap-log-resolver-query-messages: yes
    dnstap-log-resolver-response-messages: yes
    dnstap-log-client-query-messages: yes
    dnstap-log-client-response-messages: yes
    dnstap-log-forwarder-query-messages: yes
    dnstap-log-forwarder-query-messages: yes

Here are the startup logs of go-dns-collector.

INFO: 2022/11/10 16:34:03.561112 main - version 0.25.0-beta1
INFO: 2022/11/10 16:34:03.561169 main - starting dns-collector...
INFO: 2022/11/10 16:34:03.561171 main - loading loggers...
INFO: 2022/11/10 16:34:03.561290 [console] logger stdout - enabled
INFO: 2022/11/10 16:34:03.561424 [fluentd] logger to fluentd - enabled
INFO: 2022/11/10 16:34:03.561459 main - loading collectors...
INFO: 2022/11/10 16:34:03.561571 [tap] dnstap collector - enabled
INFO: 2022/11/10 16:34:03.561583 main - routing: collector[tap] send to logger[fluentd]
INFO: 2022/11/10 16:34:03.561984 main - running all collectors and loggers...
INFO: 2022/11/10 16:34:03.562034 [tap] dnstap collector - starting collector...
INFO: 2022/11/10 16:34:03.562169 [tap] dnstap collector - running in background...
INFO: 2022/11/10 16:34:03.562237 [console] logger to stdout - running in background...
INFO: 2022/11/10 16:34:03.562435 [fluentd] logger to fluentd - running in background...
INFO: 2022/11/10 16:34:03.562493 [tap] dnstap collector - is listening on [::]:10053
INFO: 2022/11/10 16:34:03.562506 [fluentd] logger to fluentd - connecting to 127.0.0.1:24224
INFO: 2022/11/10 16:34:03.562810 [fluentd] logger to fluentd - connected
INFO: 2022/11/10 16:34:03.566863 [tap] dnstap collector - <MASKED>:55960 - new connection
INFO: 2022/11/10 16:34:03.566897 [tap] dnstap processor - initialization...
INFO: 2022/11/10 16:34:03.567236 [tap] dnstap processor - dns cached enabled: true
INFO: 2022/11/10 16:34:03.567440 [tap] dnstap collector - receiver framestream initialized
INFO: 2022/11/10 16:34:03.567533 Subprocessor GeoIP - country database loaded (919324 records)
INFO: 2022/11/10 16:34:03.567607 Subprocessor GeoIP - city database loaded (5026343 records)
INFO: 2022/11/10 16:34:03.567697 Subprocessor GeoIP - asn database loaded (939652 records)
INFO: 2022/11/10 16:34:03.567722 [tap] dnstap processor - geoip is enabled
INFO: 2022/11/10 16:34:03.567998 [tap] dnstap processor - running... waiting incoming dns message

And here is some text output of actual DNSTAP messages being parsed.

2022-11-10T08:48:51.799571Z sh-dns-02 - - - FORWARDER_QUERY NOERROR - - INET UDP 49b gslb-pek01.zhihu.com A 0.000000
2022-11-10T08:48:51.801018Z sh-dns-02 - - - FORWARDER_QUERY NOERROR - - INET UDP 57b e4094fc1d98c915a.ksyunad.com A 0.000000
2022-11-10T08:48:51.802444Z sh-dns-02   0 CLIENT_RESPONSE NOERROR 10.201.18.17 57490 INET UDP 219b captcha.zhihu.com A 0.008948
2022-11-10T08:48:51.813236Z sh-dns-02   0 CLIENT_QUERY NOERROR 10.201.33.11 56326 INET UDP 32b mesu.apple.com HTTPS 0.000000
2022-11-10T08:48:51.813579Z sh-dns-02   0 CLIENT_RESPONSE NOERROR 10.201.33.11 56326 INET UDP 190b mesu.apple.com HTTPS 0.000343
2022-11-10T08:48:51.942343Z sh-dns-02   0 CLIENT_QUERY NOERROR 10.201.18.17 49624 INET UDP 32b da.dun.163.com A 0.000000
2022-11-10T08:48:51.942724Z sh-dns-02 - - - FORWARDER_QUERY NOERROR - - INET UDP 43b da.dun.163.com A 0.000000
2022-11-10T08:48:51.944953Z sh-dns-02   0 CLIENT_RESPONSE NOERROR 10.201.18.17 49624 INET UDP 48b da.dun.163.com A 0.002610

Interestingly, I noticed that neither the "AA" nor the "answer" field is displayed in the output. So which field does the dnstap collector use to resolve GeoIP information? Is this a problem with Unbound?

dmachard commented 1 year ago

Thanks for reporting this. It should work; I will check whether I can reproduce it on my side.

dmachard commented 1 year ago

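After some checks, this is the expected behavior given your config. The GeoIP lookup cannot be performed because:

  • the source IP is missing in FORWARDER_QUERY messages (Unbound does not send this information)
  • for CLIENT_QUERY, the source IP is an address in a private network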

helixzz commented 1 year ago

After some checks, this is the expected behavior given your config. The GeoIP lookup cannot be performed because:

  • the source IP is missing in FORWARDER_QUERY messages (Unbound does not send this information)
  • for CLIENT_QUERY, the source IP is an address in a private network
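
The two reasons quoted above can be illustrated with a short Python sketch. It is not how go-dns-collector implements its lookup; the helper name `geoip_eligible` is hypothetical, and the sample values are the `queryip` fields taken from the text output earlier in this thread. The assumption is simply that a GeoIP database can only describe publicly routable addresses.

```python
import ipaddress


def geoip_eligible(query_ip: str) -> bool:
    """Return True if query_ip is a publicly routable address.

    Hypothetical helper for illustration only.
    """
    try:
        ip = ipaddress.ip_address(query_ip)
    except ValueError:
        # "-" is what the text output prints when the dnstap message
        # carries no source IP (the FORWARDER_QUERY lines above).
        return False
    # RFC 1918 ranges such as 10.0.0.0/8 are not global, so a GeoIP
    # database has no country, city, or ASN to return for them.
    return ip.is_global


# queryip values copied from the log lines in this thread
for value in ["-", "10.201.18.17", "10.201.33.11"]:
    print(value, geoip_eligible(value))
```

All three values come back as not eligible, which matches the empty country, city, and ASN fields in the output above.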

Thank you very much! The log output I pasted above may not be a good example. My initial purpose was not only to add GeoIP information for client IPs but also (more importantly) for the targets of the queries (e.g. the answered IPs in A/AAAA records). This is for statistics on where the domains (sites) that users access are located. Is this possible?

dmachard commented 1 year ago

Yes, it's possible. You can add the resolved IPs in JSON mode (the resource-records item) or in text mode with the answer directive.

https://github.com/dmachard/go-dns-collector/blob/main/example-config/use-case-3.yml
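
For readers who take the JSON route, a downstream consumer could pull the answered addresses out of each message roughly as sketched below. The field layout used here (`dns` → `resource-records` → `an`, with `rdatatype` and `rdata` keys) is an assumption based on the `resource-records` item mentioned above and should be checked against the actual JSON emitted by your version. The sample record is made up and uses a documentation-range address.

```python
import ipaddress
import json

# Made-up sample message; the key names are an assumed layout.
SAMPLE = """
{
  "dns": {
    "qname": "www.example.com",
    "resource-records": {
      "an": [
        {"name": "www.example.com", "rdatatype": "CNAME", "ttl": 60,
         "rdata": "cdn.example.net"},
        {"name": "cdn.example.net", "rdatatype": "A", "ttl": 60,
         "rdata": "203.0.113.10"}
      ]
    }
  }
}
"""


def answer_ips(message: dict) -> list[str]:
    """Collect the IP addresses found in A/AAAA answer records."""
    records = message.get("dns", {}).get("resource-records", {}).get("an") or []
    ips = []
    for rr in records:
        if rr.get("rdatatype") not in ("A", "AAAA"):
            continue  # skip CNAME and other non-address records
        try:
            ips.append(str(ipaddress.ip_address(rr["rdata"])))
        except (KeyError, ValueError):
            continue  # missing or malformed rdata
    return ips


print(answer_ips(json.loads(SAMPLE)))
```

The resulting list is what would then be fed to a GeoIP lookup in the log pipeline (for example in fluentd or Elasticsearch) to get per-destination statistics.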