dmachard / go-dnscollector

Ingesting, pipelining, and enhancing your DNS logs with usage indicators, security analysis, and additional metadata.
MIT License

Elasticsearch configuration doesn't seem to include authentication #775

Open PenelopeFudd opened 3 months ago

PenelopeFudd commented 3 months ago

**Is your feature request related to a problem? Please describe.**
The elasticsearch logger doesn't let me specify a username + password. Our pipeline is all ready to send its data into Elasticsearch, but it can't authenticate. ☹️

**Describe the solution you'd like**
Could you modify:

**Describe alternatives you've considered**

**Additional context**
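For reference, Go's standard library already has what's needed for HTTP Basic auth, so the change could be small. Below is a hypothetical sketch (the URL, function name, and credentials are placeholders, not the project's actual code or config keys):

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newBulkRequest sketches what an authenticated bulk request could look
// like. Hypothetical helper: names and URL are illustrative only.
func newBulkRequest(server, login, pwd string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest("POST", server+"/_bulk", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-ndjson")
	// net/http has Basic auth built in: this sets the header
	// "Authorization: Basic base64(login:pwd)".
	req.SetBasicAuth(login, pwd)
	return req, nil
}

func main() {
	req, err := newBulkRequest("https://elasticsearch.example.org/dnscollector", "elastic", "secret", []byte("{}\n"))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get("Authorization"))
}
```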

PenelopeFudd commented 3 months ago

I've taken a stab at adding authentication, but now I'm getting 400 (bad request) and 413 (payload too large) errors.

```shell
$ curl -sS -u username:password 'https://elasticsearch.infra/_cluster/settings?include_defaults=true&filter_path=defaults.http.max_content_length'
{"defaults":{"http":{"max_content_length":"100mb"}}}
```

The config.yml file says `bulk-size: 1048576 # 1MB`, so size shouldn't be an issue.

I've added debugging statements to view what's being sent to Elasticsearch:

```json
{ "create" : {}}
{"dns.flags.aa":false,"dns.flags.ad":false,"dns.flags.cd":false,"dns.flags.qr":false,"dns.flags.ra":false,"d.....
```

From this StackOverflow post, it looks like each action line should be `{ "index" : {} }`, but I don't know what's normal yet.

Any suggestions?

dmachard commented 3 months ago

Could you share your elastic and dnscollector config files, please?

PenelopeFudd commented 3 months ago

Here's my dnscollector config.yml file, after I modified the source to support Basic authentication:


```yaml
################################################
# global configuration
# more details: https://github.com/dmachard/go-dnscollector/blob/main/docs/configuration.md#global
################################################
global:
  trace:
    verbose: true
  server-identity: "dns-collector"
  pid-file: ""
  # text-format: "timestamp-rfc3339ns identity operation rcode queryip queryport family protocol length-unit qname qtype latency"
  text-format: "timestamp-rfc3339ns identity operation rcode queryip queryport family protocol length-unit qname qtype edns-csubnet latency answer"
  text-format-delimiter: " "
  text-format-boundary: "\""
  text-jinja: ""
  worker:
    interval-monitor: 10
    buffer-size: 4096
  telemetry:
    enabled: true
    web-path: "/metrics"
    web-listen: ":9165"
    prometheus-prefix: "dnscollector_exporter"
    tls-support: false
    tls-cert-file: ""
    tls-key-file: ""
    client-ca-file: ""
    basic-auth-enable: false
    basic-auth-login: admin
    basic-auth-pwd: omitted

################################################
# Pipelining configuration
# more details: https://github.com/dmachard/go-dnscollector/blob/main/docs/running_mode.md#pipelining
# workers: https://github.com/dmachard/go-dnscollector/blob/main/docs/workers.md
# transformers: https://github.com/dmachard/go-dnscollector/blob/main/docs/transformers.md
################################################
pipelines:
  - name: powerdns
    powerdns:
      listen-ip: 127.0.0.1
      listen-port: 6001
      tls-support: false
      tls-min-version: 1.2
      cert-file: ""
      key-file: ""
      reset-conn: true
      chan-buffer-size: 0
      add-dns-payload: true
    routing-policy:
      forward: [ console ]
      dropped: [ ]

  - name: tap
    dnstap:
      listen-ip: 127.0.0.1
      listen-port: 6000
    transforms:
      normalize:
        qname-lowercase: true
        qname-replace-nonprintable: true
    routing-policy:
      forward: [ elastic ]
      dropped: [ ]

  - name: console
    stdout:
      mode: json

  - name: elastic
    elasticsearch:
      server: "https://elasticsearch.example.org/"
      index: ""
      chan-buffer-size: 0
      bulk-size: 1048576 # 1MB
      flush-interval: 10 # in seconds
      compression: none
      bulk-channel-size: 10
      basic-auth-enable: true
      basic-auth-login: elastic
      basic-auth-pwd: omitted
```

Unfortunately I don't have access to the Elasticsearch config file, it's run by another team.

I did add debugging statements that printed the body of an Elasticsearch request, which looks somewhat like this:


```json
{ "create" : {}}
{"dns.flags.aa":false,"dns.flags.ad":false,"dns.flags.cd":false,"dns.flags.qr":false,"dns.flags.ra":false,"dns.flags.rd":true,"dns.flags.tc":false,"dns.id":0,"dns.length":128,"dns.malformed-packet":false,"dns.opcode":0,"dns.qclass":"IN","dns.qname":"v1.pv-txt.pool.dns.example.com","dns.qtype":"TXT","dns.questions-count":1,"dns.rcode":"NOERROR","dns.resource-records.an":"-","dns.resource-records.ar":"-","dns.resource-records.ns":"-","dnstap.extra":"-","dnstap.identity":"dnsdist_server","dnstap.latency":0,"dnstap.operation":"CLIENT_QUERY","dnstap.peer-name":"localhost","dnstap.policy-action":"NXDOMAIN","dnstap.policy-match":"QNAME","dnstap.policy-rule":"-","dnstap.policy-type":"-","dnstap.policy-value":"-","dnstap.query-zone":"-","dnstap.timestamp-rfc3339ns":"2024-07-25T00:57:51.2575189Z","dnstap.version":"dnsdist 1.9.6","edns.dnssec-ok":0,"edns.options.0.code":8,"edns.options.0.data":"159.250.13.0/24","edns.options.0.name":"CSUBNET","edns.options.1.code":12,"edns.options.1.data":"-","edns.options.1.name":"PADDING","edns.rcode":0,"edns.udp-size":4096,"edns.version":0,"network.family":"IPv4","network.ip-defragmented":false,"network.protocol":"DOH","network.query-ip":"10.167.0.248","network.query-port":"42927","network.response-ip":"10.0.22.133","network.response-port":"443","network.tcp-reassembled":false}
```

Wondering if the `"create"` should really be `"index"`, and whether `index: ""` should have a value.

PenelopeFudd commented 3 months ago

Had a bit of a breakthrough!

I added this module to the code, got a real curl command out of it, and running that was far more informative!

This input record failed:

```json
{"dns.flags.aa":true,"dns.flags.ad":false,"dns.flags.cd":false,"dns.flags.qr":true,"dns.flags.ra":false,"dns.flags.rd":true,"dns.flags.tc":false,"dns.id":0,"dns.length":394,"dns.malformed-packet":false,"dns.opcode":0,"dns.qclass":"IN","dns.qname":"v1.pv-txt.pool.dns.example.com","dns.qtype":"TXT","dns.questions-count":1,"dns.rcode":"NOERROR","dns.resource-records.an.0.class":"IN","dns.resource-records.an.0.name":"v1.pv-txt.pool.dns.example.com","dns.resource-records.an.0.rdata":"{\"version\": \"v1.0\", \"selection\": [{\"popId\": \"xyzzy\"}, ","dns.resource-records.an.0.rdatatype":"TXT","dns.resource-records.an.0.ttl":12,"dns.resource-records.ar":"-","dns.resource-records.ns":"-","dnstap.extra":"cached","dnstap.identity":"dnsdist_server","dnstap.latency":0,"dnstap.operation":"CLIENT_RESPONSE","dnstap.peer-name":"localhost","dnstap.policy-action":"NXDOMAIN","dnstap.policy-match":"QNAME","dnstap.policy-rule":"-","dnstap.policy-type":"-","dnstap.policy-value":"-","dnstap.query-zone":"-","dnstap.timestamp-rfc3339ns":"2024-07-25T20:50:31.485687332Z","dnstap.version":"dnsdist 1.9.6","edns.dnssec-ok":0,"edns.options.0.code":8,"edns.options.0.data":"19.50.13.0/24","edns.options.0.name":"CSUBNET","edns.rcode":0,"edns.udp-size":1232,"edns.version":0,"network.family":"IPv4","network.ip-defragmented":false,"network.protocol":"DOH","network.query-ip":"10.67.0.248","network.query-port":"48421","network.response-ip":"10.20.22.133","network.response-port":"443","network.tcp-reassembled":false}
```

Ran a test-case minimization program and came up with this:


```shell
$ curl -sS --fail-with-body \
    -X POST https://elasticsearch.infra/dnscollector/_bulk \
    -H 'Authorization: Basic xxxxxx:yyyyyy' \
    -H 'Content-Type: application/x-ndjson' \
    -d '{ "create" : {}}'$'\n''{"dns.resource-records.an.0.class":"IN"}'$'\n' \
    | jq .
{
  "errors": true,
  "took": 6,
  "items": [
    {
      "create": {
        "_index": "dnscollector",
        "_id": "8qR87JABur2Qow5SzDbK",
        "status": 400,
        "error": {
          "type": "document_parsing_exception",
          "reason": "[1:36] failed to parse field [dns.resource-records.an] of type [text] in document with id '8qR87JABur2Qow5SzDbK'. Preview of field's value: '{0={class=IN}}'",
          "caused_by": {
            "type": "illegal_state_exception",
            "reason": "Can't get text on a START_OBJECT at 1:2"
          }
        }
      }
    }
  ]
}
```

Any idea what's wrong with it? Do we need to tweak the Elasticsearch configuration?

Thanks

PenelopeFudd commented 3 months ago

It turns out that if there are dots in the field names, Elasticsearch (ES) 8.13.4 interprets them as subobjects.

If you try to give the same name to both an object and a string, ES complains.

The ES naming conventions say this (among other things):

> If a field name matches the namespace used for nested fields, add `.value` to the field name. For example, instead of:
>
> ```
> workers
> workers.busy
> workers.idle
> ```
>
> Use:
>
> ```
> workers.value
> workers.busy
> workers.idle
> ```
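That matches the failing record above: query events log `dns.resource-records.an` as the string `"-"`, while response events log `dns.resource-records.an.0.class` etc., so the same path ends up mapped as both text and object. One way to apply the `.value` convention mechanically is to rename any key that is also a prefix of another dotted key. A hypothetical sketch of the idea (within a single document; a real fix would apply the rename consistently across all events):

```go
package main

import (
	"fmt"
	"strings"
)

// renameConflicts appends ".value" to any key that is also the dotted
// prefix of another key, so Elasticsearch never sees the same path as
// both a scalar and an object. Illustrative only, not dnscollector code.
func renameConflicts(doc map[string]string) map[string]string {
	out := make(map[string]string, len(doc))
	for k, v := range doc {
		conflict := false
		for other := range doc {
			if other != k && strings.HasPrefix(other, k+".") {
				conflict = true
				break
			}
		}
		if conflict {
			out[k+".value"] = v // follow the ES naming convention
		} else {
			out[k] = v
		}
	}
	return out
}

func main() {
	doc := map[string]string{
		"dns.resource-records.an":         "-",
		"dns.resource-records.an.0.class": "IN",
	}
	fmt.Println(renameConflicts(doc)["dns.resource-records.an.value"])
}
```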

PenelopeFudd commented 3 months ago

Another breakthrough:

It appears that the other team is running nginx in front of ES, with its default size limit in place, which is 1MB. By setting `bulk-size: 1000000 # 1MB` in config.yml (just under the limit, leaving room for overhead), I stopped getting those 413 Payload Too Large errors.
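If whoever runs the proxy can change it, the nginx limit in question is presumably `client_max_body_size`, which defaults to 1m. A hypothetical snippet (upstream name and location are placeholders):

```nginx
# Hypothetical nginx location block in front of Elasticsearch.
# client_max_body_size defaults to 1m; raising it (or keeping the
# collector's bulk-size safely below it) avoids 413s on _bulk.
location / {
    client_max_body_size 10m;
    proxy_pass http://elasticsearch_upstream;
}
```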

Now the only error I'm getting is:

```
ERROR: 2024/07/25 22:23:06.092128 worker - [elastic] elasticsearch - Send buffer is full, bulk dropped
ERROR: 2024/07/25 22:23:06.110239 worker - [elastic] elasticsearch - Send buffer is full, bulk dropped
ERROR: 2024/07/25 22:23:06.128498 worker - [elastic] elasticsearch - Send buffer is full, bulk dropped
```

But that's probably because I'm using my standard load testing script to exercise this on a puny 2-cpu virtual machine. Chaos engineering in practice. 😄
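That error message is consistent with a common Go pattern: a bounded channel plus a non-blocking send, so a slow Elasticsearch never stalls the pipeline and full bulks are dropped instead. A sketch of the idea (not necessarily the collector's actual implementation):

```go
package main

import "fmt"

// trySend attempts a non-blocking send on a bounded channel and reports
// whether the bulk was accepted or dropped. Illustrative sketch only.
func trySend(ch chan []byte, bulk []byte) bool {
	select {
	case ch <- bulk:
		return true
	default: // channel full: drop the bulk rather than block the pipeline
		return false
	}
}

func main() {
	ch := make(chan []byte, 2) // cf. a small bulk-channel-size
	dropped := 0
	for i := 0; i < 3; i++ {
		if !trySend(ch, []byte("bulk")) {
			dropped++
		}
	}
	fmt.Println("dropped:", dropped)
}
```

Under sustained overload the cure is a larger `bulk-channel-size`, a faster sink, or simply less load, which matches the load-testing explanation above.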

dmachard commented 3 months ago

Regarding authentication, could you submit a pull request to add support? It could be useful for others.