xelite closed this issue 8 months ago
I deliberately didn't use the OS because the usual google dns, clownflare, etc. don't work. And I didn't want to support these queries.
As far as I remember, Go uses an internal resolver and only falls back to the system resolver when specifically asked to. Though this could have changed.
Does not using your system's resolver automatically present some kind of challenge? I can try to brainstorm with you.
Otherwise: Do you wanna do some R&D or a pr? I'd try to help.
To also respond to the question about using multiple resolvers:
I am not opposed to it, though I am curious, is this for error handling in case one is down, or would you expect the checks to resolve against all supplied servers?
I have multiple resolvers configured in case one is down. Currently I pass only one to --config.dns-resolver. I expect the checks to resolve against any of the supplied servers.
@xelite sorry, to clarify: when you supply e.g. 2 resolvers, do you expect any of them in the response (/metrics), or both?
I guess both would imply another label. I am also not 100% sure if there's an efficient way to see if a resolver is working; I think this is why I opted for something like unbound. Or maybe dnsmasq would be more appropriate as a pure forwarder. It seems like adding multiple DNS servers into the mix makes lots of things very complicated.
Any thoughts?
@xelite sorry, to clarify: when you supply e.g. 2 resolvers, do you expect any of them in the response (/metrics), or both?
I expect any of them in the response. The exporter should use the next resolver if the first one fails.
I guess both would imply another label. I am also not 100% sure if there's an efficient way to see if a resolver is working; I think this is why I opted for something like unbound. Or maybe dnsmasq would be more appropriate as a pure forwarder. It seems like adding multiple DNS servers into the mix makes lots of things very complicated.
Any thoughts?
Yes, dnsmasq can handle this case, but it's not a pretty solution. I have custom resolvers configured in /etc/resolv.conf and I want to use them when I don't pass --config.dns-resolver. That's the best solution IMO. When --config.dns-resolver is passed, its value should be used instead.
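For reference, the "use the next resolver if the first one fails" behaviour requested above can be sketched in a few lines of Go. This is only an illustration, not code from dnsbl_exporter: `lookupFunc`, `lookupWithFallback`, `errDown`, and the server addresses are hypothetical names made up for this example, and the actual DNS query is passed in as a function so the sketch does not assume any particular DNS client.

```go
package main

import (
	"errors"
	"fmt"
)

// errDown is a placeholder error used only by the demo below.
var errDown = errors.New("resolver down")

// lookupFunc queries one DNS server (e.g. "192.0.2.1:53") for host.
// Hypothetical seam: plug in whatever DNS client is actually used.
type lookupFunc func(server, host string) ([]string, error)

// lookupWithFallback tries the servers in order and returns the first
// successful answer together with the server that produced it.
func lookupWithFallback(servers []string, host string, lookup lookupFunc) ([]string, string, error) {
	if len(servers) == 0 {
		return nil, "", errors.New("no resolvers configured")
	}
	var lastErr error
	for _, s := range servers {
		answers, err := lookup(s, host)
		if err == nil {
			return answers, s, nil
		}
		// Remember the failure and move on to the next server.
		lastErr = fmt.Errorf("resolver %s: %w", s, err)
	}
	return nil, "", fmt.Errorf("all %d resolvers failed, last error: %w", len(servers), lastErr)
}

func main() {
	// Fake lookup: the first server always fails, the second answers.
	fake := func(server, host string) ([]string, error) {
		if server == "192.0.2.1:53" {
			return nil, errDown
		}
		return []string{"127.0.0.2"}, nil
	}
	answers, used, err := lookupWithFallback(
		[]string{"192.0.2.1:53", "192.0.2.2:53"}, "example.org", fake)
	fmt.Println(answers, used, err)
}
```

Note that this only covers the "any of them" case. Every failed server still costs a full timeout before the next one is tried, which is the sluggishness concern raised in this thread.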
@xelite do you feel like prototyping something that somehow verifies if a DNS server is working? Otherwise, I am not sure if I want to spend the time on this right now.
Unfortunately not. I don't know anything about Go programming.
@xelite I wrote some code to fetch a nameserver from /etc/resolv.conf (see #203), but I am not sure if this is going to be super helpful. But can you have a look and let me know your thoughts?
Btw, do you manage /etc/resolv.conf yourself, or via systemd-resolved?
Also, did you test how your system behaves when the first nameserver in that file fails to respond? From what I know, it's still gonna be sluggish as the system will always try the first one. You need a solid combo of timeout and attempts, or rotate, which is something I don't want to replicate in Go.
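To make the idea concrete, here is a rough sketch of picking a resolver from /etc/resolv.conf when no flag value is given. It is not the code from #203; `parseResolvConf` and `pickResolver` are hypothetical names, the sketch assumes the resolver is expected in host:port form, and it deliberately ignores the `options` line (timeout, attempts, rotate) rather than replicating that behaviour.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"strings"
)

// parseResolvConf returns the addresses from "nameserver" lines, in file
// order. Comment lines (# or ;) and other directives are skipped.
func parseResolvConf(content string) []string {
	var servers []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[0] == "nameserver" {
			servers = append(servers, fields[1])
		}
	}
	return servers
}

// pickResolver prefers the explicit flag value and otherwise falls back
// to the first nameserver found in the file at path (assumed port: 53).
func pickResolver(flagValue, path string) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return "", fmt.Errorf("reading %s: %w", path, err)
	}
	servers := parseResolvConf(string(data))
	if len(servers) == 0 {
		return "", fmt.Errorf("no nameserver entries in %s", path)
	}
	return net.JoinHostPort(servers[0], "53"), nil
}

func main() {
	// An empty flag value means: use whatever the OS has configured.
	resolver, err := pickResolver("", "/etc/resolv.conf")
	fmt.Println(resolver, err)
}
```

Taking only the first entry keeps the exporter simple, at the cost of having no failover of its own. A local forwarder such as dnsmasq or unbound would still be needed for that.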
@till thank you. I'll test this and give you feedback.
I am using dnsbl_exporter in several projects. Those projects have different system resolver configurations. Projects on GCP use the default cloud resolvers in /etc/resolv.conf. On-premise projects use a local dnsmasq (nameserver 127.0.0.1 in /etc/resolv.conf), and dnsmasq sends requests to different DNS servers depending on the domain.
And yes... When the resolver in GCP fails, no domain will be resolved. But I don't assume a cloud DNS failure. Even if that happens, a broken exporter won't be the biggest problem. :) In the case of dnsmasq, requests are always sent to all DNS servers, so this solution is more robust.
@till I've done the tests. It works as expected. Thank you.
I can type any value into the --config.dns-resolver parameter. I think the exporter should (maybe optionally) return a metric with the state of domain resolve problems, e.g. luzilla_rbls_domain_resolve_problems{hostname="nonexistent.gmx.net"} 1. It would keep us informed about nonexistent domains in the configuration or about resolver problems. Currently I can see something like this in the log:
...and the /metrics endpoint looks like this:
root@l-kjakowcz:/go/dnsbl_exporter# curl 0:9211/metrics
# HELP luzilla_rbls_duration The scrape's duration (in seconds)
# TYPE luzilla_rbls_duration gauge
luzilla_rbls_duration 0.554354604
# HELP luzilla_rbls_listed The number of listings in RBLs (this is bad)
# TYPE luzilla_rbls_listed gauge
luzilla_rbls_listed{rbl="ix.dnsbl.manitu.net"} 0
luzilla_rbls_listed{rbl="zen.spamhaus.org"} 0
# HELP luzilla_rbls_targets The number of targets that are being probed (configured via targets.ini or ?target=)
# TYPE luzilla_rbls_targets gauge
luzilla_rbls_targets 7
# HELP luzilla_rbls_used The number of RBLs to check IPs against (configured via rbls.ini)
# TYPE luzilla_rbls_used gauge
luzilla_rbls_used 2
# HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
# TYPE promhttp_metric_handler_errors_total counter
promhttp_metric_handler_errors_total{cause="encoding"} 0
promhttp_metric_handler_errors_total{cause="gathering"} 0
...but that is a separate issue. In summary, the fix with --config.dns-resolver works great. Thank you again.
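As a rough illustration of the resolve-problem metric suggested above, the sketch below turns a lookup error into a sample line in the Prometheus text format. Nothing here exists in dnsbl_exporter: `classifyLookupError`, `metricLine`, and `errNXDomain` are hypothetical, the metric name is simply the one proposed in this comment, and the error classification assumes lookups go through Go's standard `net` package, which may not match what the exporter does internally.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// errNXDomain is a hand-built sample error used only by the demo below.
var errNXDomain error = &net.DNSError{
	Err:        "no such host",
	Name:       "nonexistent.gmx.net",
	IsNotFound: true,
}

// classifyLookupError maps a lookup error to a coarse reason that could
// be logged next to the metric. An empty string means "no problem".
func classifyLookupError(err error) string {
	if err == nil {
		return ""
	}
	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) {
		switch {
		case dnsErr.IsNotFound:
			return "not_found" // the domain does not exist
		case dnsErr.IsTimeout:
			return "timeout" // the resolver did not answer in time
		}
	}
	return "other"
}

// metricLine renders one sample: 1 if resolving hostname failed, else 0.
// %q is good enough for plain hostnames; it is not a full label escaper.
func metricLine(hostname string, err error) string {
	value := 0
	if err != nil {
		value = 1
	}
	return fmt.Sprintf("luzilla_rbls_domain_resolve_problems{hostname=%q} %d", hostname, value)
}

func main() {
	fmt.Println(classifyLookupError(errNXDomain))
	fmt.Println(metricLine("nonexistent.gmx.net", errNXDomain))
}
```

In a real exporter the sample would be registered through the Prometheus client library instead of being formatted by hand; the string formatting here only shows the intended output shape.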
@xelite thanks for letting me know, feel free to create a new ticket for the other thing.
@xelite btw, can you share what you use the exporter for? Curious to know!
dnsbl_exporter is installed on production SMTP servers. Prometheus scrapes the /metrics endpoint and evaluates this alert rule:
alerts:
  "groups":
  - "name": "dnsbl-exporter"
    "rules":
    - "alert": "DnsblRblListed"
      "expr": "luzilla_rbls_ips_blacklisted > 0"
      "for": "15m"
      "labels":
        "severity": "critical"
      "annotations":
        "description": "Domain {{ $labels.hostname }} listed at {{ $labels.rbl }}"
        "summary": "Domain listed at RBL"
        "runbook_url": "https://jira......com/confluence/display/......../DnsblRBLListed+runbook"
@xelite If you want to send your example rule as a PR, would be much appreciated :)
@till I expanded the readme file with sample alerts, but I don't have access to push my branch.
@xelite you need to:
You can also open the README in the browser and click edit; GitHub will create the fork for you.
I would like to use multiple resolvers or even better - use defaults from OS when --config.dns-resolver is not provided.