klutchell / unbound-docker

unofficial unbound multiarch docker image
BSD 3-Clause "New" or "Revised" License

Pihole showing "BOGUS (DNSSEC signature expired)", reason for that maybe missing/incorrect TZ in unbound container? #363

Closed (jaydee73 closed this issue 1 month ago)

jaydee73 commented 9 months ago

Hey,

I am using this container in combination with a separate pihole container (both in a macvlan running on a Synology NAS docker environment) with DNSSEC activated.

That said, I am getting "BOGUS (DNSSEC signature expired)" errors from time to time in the Pi-hole logs. They occur only occasionally and, for whatever reason, the lookup works again after a few retries.

For example, this happened this morning when I wanted to check updates within my Synology NAS and the NAS contacted www.synology.com for that.

Digging a little deeper into this, I found out that DNSSEC relies on a working and correct timezone. And your docker container (I am using a yaml file) doesn't specify any TZ variable.

Can I simply add the correct TZ variable in my yaml file, or does your container ignore this variable, so that the image would have to be updated first to take it into account?

KR, Stefan

klutchell commented 9 months ago

The image is distroless, so I doubt the env var will do anything to set the timezone.

You could try mounting /etc/localtime as readonly from your host? I expect some combination of the correct bind mounts will reflect the correct timezone in unbound.

Here is an example for the New York timezone:

docker run -v /usr/share/zoneinfo/America/New_York:/etc/localtime:ro klutchell/unbound

LawnMo commented 9 months ago

You could try mounting /etc/localtime as readonly from your host? I expect some combination of the correct bind mounts will reflect the correct timezone in unbound.

I thought I was smart using that workaround, but it doesn't work, I'm afraid 😸 It seemed to go away after restarting the stack, but nope. I couldn't investigate further, but something is causing these BOGUS errors. Same as OP, it's sporadic and only lasts a couple of tries, as if the container was "lagging in time".

Edit: smol tip but you can simply pass /etc/localtime:/etc/localtime:ro instead of picking a TZ, that's one less thing to care about and if it's ever added to an example yaml, it's "generic".

LawnMo commented 9 months ago

@klutchell sorry to ping, any reason you're using serve-expired: yes?
I'm digging ( 🥁 ) through the unbound.conf you provide, and this would seem to correlate with the behavior noticed by jaydee73 and me: cache-min-ttl: 0 discards expired records as they get served while unbound updates them, so they end up working after a couple of retries.
In this case prefetch: yes would only pre-fetch domains that are queried often. I'm seeing BOGUS errors on load-balanced services (PayPal primarily) that use subdomains; my query log (in pihole) often shows "paypal.com OK > first-subdomain.paypal.com OK > c6.paypal.com BOGUS". I'll add a custom config to change serve-expired to no and see if it fixes the issue.
I can't seem to find a pihole-related unbound custom conf that makes use of that feature, and pihole is often run on low-resource SBCs, so if it was some kind of preemptive optimization, it might be a good idea to just drop serve-expired?

Cheers.

churchofnoise commented 9 months ago

This config is often used in unbound use cases (I might even have been the one who suggested using it, together with some other tweaks), as it does indeed optimise speed.

That being said, I also use this container with Pi-hole and have not seen this issue, so I would advise refraining from n=1 based decision making. This setting has been used for months, if not years, in klutchell's repository; if it were a general problem, more issues would have been reported, both here and to the developers of unbound (NLnetLabs)...

Could it be linked to this specific case: https://github.com/NLnetLabs/unbound/issues/994

LawnMo commented 9 months ago

"serve-expired-client-timeout" isn't set in klutchell's config, could be, could be not 🤷 .

My initial question about serve-expired wasn't an n=1 decision. Sometimes (and I'm the first to do it), when one goes through a config file, one may get enthusiastic about enabling things that look nice or useful and overlook the details. If the answer is "I did it because...", it's different than "I thought it might...", and that's perfectly fine.

I started using this container because my previous solution was unmaintained and a couple of CVEs popped up in dnsmasq/pihole. I have had this issue happen sporadically, just often enough that, after a couple of cheap tricks to fix it, I searched "bogus" in the issue tracker, just to see if I was the only one :)

There has to be something wrong somewhere, and the laziest solutions are the easiest to find. I found this issue and started looking at the unbound config, saw an unusual unbound setting compared to other pihole+unbound setups, and figured I'd mention it, that's all ;)

jaydee73 commented 9 months ago

Coming back to my initial issue: IF my problem is related to TZ issues, wouldn't it be possible to implement TZ in the image? I'm not quite familiar with the meaning of "distroless", but I have seen other unbound repos on GitHub which are (as they say...) also distroless, but do use the TZ env variable.

LawnMo commented 9 months ago

klutchell's answer should be all you need to test the timezone theory. The container doesn't need tzdata if you mount the correct volume(s) from a configured host:

    volumes:
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro

/etc/timezone is superfluous, but better safe than sorry. Alas, it doesn't fix the issue (for me, ymmv).
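
As a rough illustration of why these mounts may not matter for DNSSEC: RRSIG inception and expiration times are defined as seconds since the Unix epoch in UTC (RFC 4034), so a validator compares them against absolute time rather than local wall-clock time. The Python sketch below (POSIX only, since it relies on `time.tzset()`) shows that switching the process timezone changes the local hour while the epoch value stays the same. The helper name `local_hour` and the `UTC0`/`EST5` zone strings are illustrative choices, not anything taken from the image.

```python
import os
import time


def local_hour(epoch_seconds, tz_string):
    """Return the local wall-clock hour for epoch_seconds under tz_string."""
    os.environ["TZ"] = tz_string  # POSIX TZ string, so no tzdata files are needed
    time.tzset()                  # re-read TZ; only available on Unix
    return time.localtime(epoch_seconds).tm_hour


if __name__ == "__main__":
    moment = 0  # the Unix epoch itself: 1970-01-01 00:00:00 UTC
    for tz in ("UTC0", "EST5"):
        # Same epoch value each time; only the rendered local hour differs.
        print(tz, "epoch:", moment, "local hour:", local_hour(moment, tz))
```

If the clock itself were off (not just the timezone), signature checks could be affected, but containers normally share the host kernel's clock, so that would point at time sync on the host rather than at the image.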

(Is this container really distroless? Dockerfile points to an alpine base)

Turning serve-expired off in custom.conf.d/ seems to work around the BOGUS (DNSSEC signature expired) errors, so it could be that churchofnoise is right about the unbound issue linked.

jaydee73 commented 9 months ago

I'll try, thanks. But as you said in response to my initial post, this workaround hasn't worked for you, so I was sceptical.

But maybe, also as you said, we are on the wrong track and the BOGUS problem isn't related to TZ at all, but to the serve-expired setting instead.
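
One way to check whether a given BOGUS answer really coincides with an expired signature is to look at the RRSIG timestamps themselves. In the standard presentation format (as printed by `dig +dnssec`), the RRSIG record lists the expiration first and then the inception, both as `YYYYMMDDHHMMSS` in UTC. The sketch below is only a minimal helper under that assumption; the function names and the sample timestamps are hypothetical, and it ignores any clock-skew tolerance a real validator might apply.

```python
from datetime import datetime, timezone

RRSIG_TIME_FORMAT = "%Y%m%d%H%M%S"  # presentation format used for RRSIG times


def parse_rrsig_time(text):
    """Parse a YYYYMMDDHHMMSS RRSIG timestamp and mark it as UTC."""
    return datetime.strptime(text, RRSIG_TIME_FORMAT).replace(tzinfo=timezone.utc)


def signature_status(inception, expiration, now=None):
    """Classify 'now' relative to the signature validity window."""
    if now is None:
        now = datetime.now(timezone.utc)
    if now < parse_rrsig_time(inception):
        return "not yet valid"
    if now > parse_rrsig_time(expiration):
        return "expired"
    return "valid"


if __name__ == "__main__":
    # Hypothetical values; copy the real ones from the RRSIG line of a dig answer.
    print(signature_status("20240101000000", "20240115000000"))
```

If the record served at the time of the error had an expiration in the past while the host clock was correct, that would be consistent with a stale cached answer rather than a timezone problem.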

klutchell commented 9 months ago

Is this container really distroless? Dockerfile points to an alpine base

alpine is only used for the build stage, the final stage is from scratch