NLnetLabs / unbound

Unbound is a validating, recursive, and caching DNS resolver.
https://nlnetlabs.nl/unbound
BSD 3-Clause "New" or "Revised" License

DNS cache poison of DNSSEC invalid result? #649

Open PeterDaveHello opened 2 years ago

PeterDaveHello commented 2 years ago

Describe the bug

By default, if the upstream resolver performs DNSSEC validation, a query for a record with a broken signature returns SERVFAIL. Using dig with the +cd flag (which sets the CD (checking disabled) bit in the query) asks the resolver to bypass the DNSSEC check and return the result without validation.

I just noticed that the result of a query with the CD bit set is cached by unbound, and that the cached result is served not only to requests with the CD bit set but also to clients that did not set it. Once a record that fails DNSSEC validation has been cached by a CD-enabled request, subsequent queries without the CD bit get the same unvalidated result, just as if the CD bit were set.

I'm not sure whether this should be considered a bug to be fixed here. It might have security implications, IMO. Thanks for your time and your effort in developing unbound.

To reproduce

Steps to reproduce the behavior:

  1. Set up an unbound resolver. (In this scenario, I compiled unbound from the latest master branch)

  2. Confirm that, by default, a domain with a broken DNSSEC record can't be resolved by unbound:

$ dig +short sigfail.verteiltesysteme.net @127.0.0.1
$ dig +short sigfail.verteiltesysteme.net @127.0.0.1
[1647706695] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN
[1647706697] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN SERVFAIL 2.427531 0 57
[1647706701] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN
[1647706701] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN SERVFAIL 0.000000 1 57
  3. Set the CD bit and check whether we can now get the record:
$ dig +cd +short sigfail.verteiltesysteme.net @127.0.0.1
134.91.78.139
[1647706708] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN
[1647706708] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN NOERROR 0.236822 0 73
  4. Now query without the CD bit, just like before, and see that the name now resolves even though DNSSEC is not valid:
$ dig +short sigfail.verteiltesysteme.net @127.0.0.1
134.91.78.139
[1647706711] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN
[1647706711] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN NOERROR 0.000000 1 73
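
To read the log lines above more easily, here is a minimal Python sketch that splits a reply line into fields. It assumes the field order given for log-replies in the unbound.conf man page (client IP, name, type, class, return code, time to resolve, from-cache flag, response size); the `Reply` type and `parse_reply` function are hypothetical helpers written for this illustration, not part of unbound.

```python
import re
from typing import NamedTuple, Optional


class Reply(NamedTuple):
    timestamp: int
    client: str
    qname: str
    qtype: str
    qclass: str
    rcode: str
    seconds: float     # time unbound needed to resolve the query
    from_cache: bool   # True when the reply was served from cache
    size: int          # response size in bytes


# A reply line has eight fields after "info: "; a query line has only four.
REPLY_RE = re.compile(
    r"\[(\d+)\] unbound\[\d+:\d+\] info: "
    r"(\S+) (\S+) (\S+) (\S+) (\S+) ([\d.]+) ([01]) (\d+)\s*$"
)


def parse_reply(line: str) -> Optional[Reply]:
    """Return the parsed reply, or None for query lines and other log output."""
    m = REPLY_RE.match(line)
    if m is None:
        return None
    ts, client, qname, qtype, qclass, rcode, secs, cached, size = m.groups()
    return Reply(int(ts), client, qname, qtype, qclass, rcode,
                 float(secs), cached == "1", int(size))


if __name__ == "__main__":
    # The two reply lines from steps 3 and 4 of the reproduction.
    lines = [
        "[1647706708] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN NOERROR 0.236822 0 73",
        "[1647706711] unbound[27498:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN NOERROR 0.000000 1 73",
    ]
    for line in lines:
        reply = parse_reply(line)
        print(reply.rcode, "from cache" if reply.from_cache else "resolved upstream")
```

Read this way, the reply in step 3 was resolved upstream (flag 0, about 0.24 s), while the reply to the query without the CD bit in step 4 came from the cache (flag 1, 0 s).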

Expected behavior

Perhaps the same as what Google, Cloudflare and Quad9 do: no poisoned result for DNSSEC-invalid records. For example:

$ dig +short sigfail.verteiltesysteme.net @8.8.8.8
$ dig +short sigfail.verteiltesysteme.net @8.8.8.8
$ dig +cd +short sigfail.verteiltesysteme.net @8.8.8.8
134.91.78.139
$ dig +short sigfail.verteiltesysteme.net @8.8.8.8

System:

Version 1.15.1

Configure line: 
Linked libs: mini-event internal (it uses select), OpenSSL 1.0.2n  7 Dec 2017
Linked modules: dns64 respip validator iterator

BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues

Additional information

This could be a security issue, but I'm not sure what the severity would be. Thanks again for your effort!

wcawijngaards commented 2 years ago

For me, when I try this, it works okay, and it returns SERVFAIL for the final step. It also logs the SERVFAIL that it returns. The CD flag applies only to that one query and not to the next query.

How can it work for me, but not for you? I am using the development version, from the repo. But I do not see how the changes between the 1.15.1 version and the repo could affect the outcome. Perhaps something else is going on. Can you enable verbosity 5 debug logging and try again? Especially the debug logs for the failure part could be interesting. For actual cache responses there is little to no logging, only the lines that you pasted already.

PeterDaveHello commented 2 years ago

That's interesting. Let me try to run the experiment inside a Docker container and see if it's reproducible there.

PeterDaveHello commented 2 years ago

@wcawijngaards would a Dockerfile that lets us test in a very similar environment work for you? I put together a simple Docker image so that we can test it together.

Here's the Dockerfile, which builds both the environment and unbound for us to test:

FROM debian:10

RUN apt-get update && apt-get install build-essential libssl-dev libexpat1-dev bison git dnsutils -y
RUN git clone --depth 1 https://github.com/NLnetLabs/unbound /unbound

WORKDIR /unbound

RUN ./configure && make && make install
RUN echo "forward-zone:\n    name: "."\n    forward-addr: 8.8.8.8" >> /usr/local/etc/unbound/unbound.conf
RUN sed -i 's/# interface: 192.0.2.153/interface: 0.0.0.0/g' /usr/local/etc/unbound/unbound.conf
RUN sed -i -e 's/# log-queries: no/log-queries: yes/g' -e 's/# log-replies: no/log-replies: yes/g' /usr/local/etc/unbound/unbound.conf
RUN useradd unbound
RUN unbound-checkconf

It clones the latest unbound from the GitHub repository and configures it to simply forward requests to 8.8.8.8.

You can build it into a Docker image by putting it into a text file (named Dockerfile by default), running docker build, and then running the image:

$ docker build -t unbound-test -f Dockerfile .
$ docker run --rm -it unbound-test bash

Once inside the Docker container environment, execute unbound, and use dig to test it:

root@beeb691d33b0:/unbound# unbound
root@beeb691d33b0:/unbound# dig +short sigfail.verteiltesysteme.net @127.0.0.1
root@beeb691d33b0:/unbound# dig +short sigfail.verteiltesysteme.net @127.0.0.1
root@beeb691d33b0:/unbound# dig +cd +short sigfail.verteiltesysteme.net @127.0.0.1
root@beeb691d33b0:/unbound# dig +cd +short sigfail.verteiltesysteme.net @127.0.0.1
134.91.78.139
root@beeb691d33b0:/unbound# dig +short sigfail.verteiltesysteme.net @127.0.0.1
134.91.78.139

⬆️ That's not expected, right?

I also installed tmux inside that Docker container and ran unbound in the foreground, without forking into the background, to see its logs:

root@beeb691d33b0:/unbound# unbound -dd
[1647881014] unbound[235:0] notice: init module 0: validator
[1647881014] unbound[235:0] notice: init module 1: iterator
[1647881014] unbound[235:0] info: start of service (unbound 1.15.1).
[1647881018] unbound[235:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN
[1647881021] unbound[235:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN SERVFAIL 2.416024 0 57
[1647881023] unbound[235:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN
[1647881023] unbound[235:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN SERVFAIL 0.000000 1 57
[1647881058] unbound[235:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN
[1647881058] unbound[235:0] info: 127.0.0.1 sigfail.verteiltesysteme.net. A IN NOERROR 0.243145 0 73

wcawijngaards commented 2 years ago

I do not actually use Docker, and I do not understand why it behaves so weirdly. Are you really querying the unbound server, or something else? The tmux output shows a different number of queries and responses. Also, the command-line output you show has one +cd query that prints no output, which should not be possible either.

Could you run this with verbosity 5, and perhaps run unbound on a different port number, with port: 12345 or so, and dig -p 12345? That way you are sure the query actually reaches unbound.

Another thing that should be impossible here: the default config does not enable DNSSEC validation. It has the validator module, but the trust anchor is not configured. So you should not really be getting DNSSEC validation results.

Perhaps this is what is going on: everything is driven by the choice to have 8.8.8.8 as upstream, with no DNSSEC validation enabled?

wcawijngaards commented 2 years ago

After testing: yes. It is all caused by not having DNSSEC enabled on unbound, and only having the forwarder. If you did not have the forwarder but used root hints, and enabled a trust anchor, it would be different.

What happens is that you have no DNSSEC validation on the local machine, so it queries the upstream. The upstream responds with SERVFAIL because it does validation. But there is no timestamp on it, so unbound caches that SERVFAIL for a few seconds. It then depends on how fast you type the next query whether that query gets a copy, from cache, of the previous response.

Then you try the same with the +cd flag, and the upstream does not do validation for you. Unbound caches this for a few seconds, and if you query quickly, you get the cached response for that.
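
The sequence described here can be modeled with a small Python sketch. It is a toy model, not unbound's cache implementation: the `ForwardingCache` class, the `upstream` function, and both TTL values are assumptions made for illustration (the thread only says "a few seconds"). The model forwards to a validating upstream, does no validation itself, and looks up cache entries by name only, ignoring the CD bit.

```python
SERVFAIL_TTL = 5   # assumed lifetime of a cached SERVFAIL, in seconds
ANSWER_TTL = 30    # assumed lifetime of the cached A record, in seconds


def upstream(cd):
    """Validating upstream: returns SERVFAIL unless the query sets the CD bit."""
    if cd:
        return ("134.91.78.139", ANSWER_TTL)
    return ("SERVFAIL", SERVFAIL_TTL)


class ForwardingCache:
    """Non-validating forwarder whose cache lookup ignores the CD bit."""

    def __init__(self):
        self.entries = {}  # qname -> (result, expiry time)

    def query(self, qname, cd, now):
        hit = self.entries.get(qname)
        if hit is not None and now < hit[1]:
            return hit[0]                      # served from cache
        result, ttl = upstream(cd)             # cache miss: ask the upstream
        self.entries[qname] = (result, now + ttl)
        return result


if __name__ == "__main__":
    cache = ForwardingCache()
    name = "sigfail.verteiltesysteme.net."
    # (time in seconds, CD bit) for each query, typed a little apart.
    for now, cd in [(0, False), (1, False), (2, True), (6, True), (7, False)]:
        print(now, "+cd" if cd else "   ", cache.query(name, cd, now))
```

Under these assumptions, the +cd query at t=2 still gets the cached SERVFAIL (matching the empty +cd output in the Docker transcript), the +cd query at t=6 fetches and caches the address, and the query without the CD bit at t=7 receives that cached address.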

PeterDaveHello commented 2 years ago

Thanks for the explanation, @wcawijngaards. Is this something where you'd consider changing the behavior of unbound, such as using the CD bit as part of the cache key?
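
To make the suggestion concrete, here is a hedged Python sketch of a cache that includes the CD bit in its key. This only models the idea in the question; it says nothing about how unbound stores cache entries, and `CdKeyedCache`, `upstream`, and the TTL values are hypothetical names and numbers chosen for the illustration.

```python
SERVFAIL_TTL = 5   # assumed lifetime of a cached SERVFAIL, in seconds
ANSWER_TTL = 30    # assumed lifetime of the cached A record, in seconds


def upstream(cd):
    """Validating upstream: returns SERVFAIL unless the query sets the CD bit."""
    if cd:
        return ("134.91.78.139", ANSWER_TTL)
    return ("SERVFAIL", SERVFAIL_TTL)


class CdKeyedCache:
    """Forwarding cache that keeps CD and non-CD results in separate entries."""

    def __init__(self):
        self.entries = {}  # (qname, cd) -> (result, expiry time)

    def query(self, qname, cd, now):
        key = (qname, cd)                      # the CD bit is part of the key
        hit = self.entries.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]
        result, ttl = upstream(cd)
        self.entries[key] = (result, now + ttl)
        return result


if __name__ == "__main__":
    cache = CdKeyedCache()
    name = "sigfail.verteiltesysteme.net."
    for now, cd in [(0, False), (6, True), (7, False), (8, True)]:
        print(now, "+cd" if cd else "   ", cache.query(name, cd, now))
```

In this model the query without the CD bit at t=7 goes back to the validating upstream and gets SERVFAIL, while +cd queries keep receiving the cached address. The trade-off is that the same name can occupy two cache entries.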

liang-hiwin commented 2 years ago

Hello, I have the same bug. My unbound is installed on Debian x64.

root@debian:~# dig +short sigfail.verteiltesysteme.net @127.0.0.1 -p 5353
root@debian:~# dig +cd +short sigfail.verteiltesysteme.net @127.0.0.1 -p 5353
root@debian:~# dig +cd +short sigfail.verteiltesysteme.net @127.0.0.1 -p 5353
134.91.78.139
root@debian:~# dig +cd +short sigfail.verteiltesysteme.net @127.0.0.1 -p 5353
134.91.78.139
root@debian:~# dig +cd +short sigfail.verteiltesysteme.net @127.0.0.1 -p 5353
134.91.78.139
root@debian:~# dig +short sigfail.verteiltesysteme.net @127.0.0.1 -p 5353
134.91.78.139
root@debian:~# dig +short sigfail.verteiltesysteme.net @127.0.0.1 -p 5353
134.91.78.139
root@debian:~# dig +short sigfail.verteiltesysteme.net @127.0.0.1 -p 5353
134.91.78.139

######################################

unbound -V
Version 1.16.0

Configure line: --enable-subnet --with-libevent --with-pthreads --with-ssl --enable-dnscrypt --with-libhiredis --enable-cachedb --enable-tfo-client --enable-tfo-server --enable-ipset
Linked libs: libevent 2.1.8-stable (it uses epoll), OpenSSL 1.1.1m  14 Dec 2021
Linked modules: dns64 cachedb subnetcache ipset respip validator iterator

I used 8.8.8.8 as the forwarder.

@wcawijngaards