Closed scottmuc closed 4 months ago
This is pretty cool so far, but I will look to see if extended statistics are worth it.
9000 queries while I slept is a bit disturbing. Shows how much background activity is going on just on the LAN!
Seeing the number of DHCP leases change over time might be interesting to see how dynamic my LAN is. I am going to guess it's only `sam`, `frodo`, and `sauron` that slip between being online and offline.
I can see DNS metrics here too. Anything under `dnsmasq_misses` will get forwarded to `unbound`. The other metrics can help me tune `dnsmasq` once I have more time to record data.
Found a useful resource for an unbound dashboard. The dashboard I found originally isn't supported anymore and is based on a different exporter implementation.
I think this is a small enough context to attempt building my own dashboard. It's about time I learned how to make grafana dashboards.
While reading that repo, I became convinced that enabling the extended statistics will be useful for finding out which domains are requested most often.
When `dnsmasq` or `unbound` restart, the related counter metrics reset to zero. It turns out I didn't quite understand how I should be using counter metrics.
Using the `increase` function (e.g. `increase(dnsmasq_hits{job="dnsmasq"}[$__range])`), I can now specify the timespan in grafana and see the numbers match.
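To convince myself why `increase` copes with restarts, here is a minimal Python sketch of the idea as I understand it: any drop in a counter is treated as a reset to zero, so the value seen after the drop counts as new activity. The function name and sample values are made up for illustration, and this ignores the extrapolation Prometheus applies at the edges of the range, so it is not a reimplementation of the real function.

```python
def counter_increase(samples):
    """Sum the growth of a counter across samples, tolerating resets.

    `samples` is a list of raw counter readings in time order
    (hypothetical values, not real scrape data).
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            # Normal case: the counter only grew.
            total += curr - prev
        else:
            # The counter went backwards, so assume the process restarted
            # from zero; everything counted since then is new.
            total += curr
    return total


# A restart happens between the readings 220 and 5.
readings = [100, 150, 220, 5, 40]
print(counter_increase(readings))  # 50 + 70 + 5 + 35 = 160.0
```

Reading the raw counter directly would have shown the value falling from 220 to 5, which is why the panel numbers did not line up before.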
I'm not quite sure how to interpret the Cache insertions and evictions data, but this post does try and explain it.
I'm also not sure what hit rate I'm aiming for and whether or not this is a product of my cache configuration (left at default 150).
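For reference, the hit rate I have in mind is simply hits divided by all lookups over the selected range. A small sketch, assuming the two inputs are the increases of `dnsmasq_hits` and `dnsmasq_misses` over the same window (the numbers below are invented, not taken from my dashboard):

```python
def hit_rate(hits, misses):
    """Fraction of lookups answered from the cache, or None with no traffic."""
    total = hits + misses
    if total == 0:
        # No lookups in the window, so a ratio would be meaningless.
        return None
    return hits / total


# Hypothetical window: 300 cache hits and 700 forwarded queries.
print(hit_rate(300, 700))  # 0.3
```

Comparing this ratio before and after changing the cache size should show whether the default is what limits it.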
There's a richer set of metrics with `unbound`. It's interesting to see that `unbound` has to do an order of magnitude more queries because it has to perform the recursion algorithm.
I'm also curious why so many IPv6 resolutions are being performed. They seem to appear in spikes. Also, TIL about HTTPS records, which are defined in RFC 9460 (Nov 2023).
Definitely happy to call `dnsmasq` done:
Calling `unbound` done for now too:
This was a fun exercise and got me to better understand the tools to create a dashboard. I've previously used off-the-shelf dashboards and never got too much into the guts of setting up the different types of visualizations. At the moment the dashboard is all configured by hand via the grafana UI. I'll need to at least save the JSON exports if I want to store them for safekeeping in case the USB stick the grafana DB is stored on dies.
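One way to automate the JSON export could be the grafana HTTP API, which serves a dashboard at `/api/dashboards/uid/<uid>`. The sketch below is untested against my setup; the base URL, dashboard UID, token, and output path are all placeholders, and it assumes a service account token with read access.

```python
import json
import urllib.request


def build_request(base_url, uid, token):
    """Build the GET request for one dashboard (all arguments are placeholders)."""
    url = f"{base_url.rstrip('/')}/api/dashboards/uid/{uid}"
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Bearer {token}")
    return req


def export_dashboard(base_url, uid, token, out_path):
    """Fetch a dashboard and write its JSON model to out_path."""
    with urllib.request.urlopen(build_request(base_url, uid, token)) as resp:
        payload = json.load(resp)
    # The response wraps the model in a "dashboard" key next to "meta".
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(payload["dashboard"], fh, indent=2, sort_keys=True)


# Example call with hypothetical values:
# export_dashboard("http://localhost:3000", "my-dns-uid", "TOKEN", "dns.json")
```

Writing the file with sorted keys should keep diffs small if the exports end up in a git repository.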
Already, this has illuminated some details of the DNS traffic on my network.
I have a sneaking suspicion that my RIPE Atlas probe is responsible for many of these requests, but how many?
I'd like to have a bit more visibility into the performance of my DNS setup. I'd use this information to tweak configuration (e.g. cache size).
Preliminary research has pointed me at: