Closed gregwbrooks closed 3 years ago
Hi! Looks like an issue with bind9. What do you see when running these commands?
sudo service bind9 status
sudo journalctl -xe | grep bind9
Also, what's the content of /etc/default/bind9 (relevant for Debian) and /etc/default/named (relevant for Ubuntu)?
sudo service bind9 status
(Results anonymized to EXAMPLE.COM)
sudo: unable to resolve host mailhub.newwtg.com: Temporary failure in name resolution
● named.service - BIND Domain Name Server
Loaded: loaded (/lib/systemd/system/named.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2021-08-07 15:51:50 PDT; 1h 23min ago
Docs: man:named(8)
Main PID: 721 (named)
Tasks: 14 (limit: 2278)
Memory: 42.2M
CGroup: /system.slice/named.service
└─721 /usr/sbin/named -f -u bind -4
Aug 07 17:15:02 mailhub.newwtg.com named[721]: validating 2.ubuntu.pool.ntp.org/A: bad cache hit (o>
Aug 07 17:15:02 mailhub.newwtg.com named[721]: broken trust chain resolving '2.ubuntu.pool.ntp.org/>
Aug 07 17:15:02 mailhub.newwtg.com named[721]: connection refused resolving '.org.DOMAIN.com/A/IN'>
Aug 07 17:15:02 mailhub.newwtg.com named[721]: connection refused resolving '2.ubuntu.pool.ntp.org.>
Aug 07 17:15:02 mailhub.newwtg.com named[721]: connection refused resolving '2.ubuntu.pool.ntp.org.>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving 'mailhub.EXAMPLE.com/A/I>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving 'mailhub.EXAMPLE.com/AAA>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving '.com.EXAMPLE.com/A/IN'>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving 'mailhub.EXAMPLE.com.new>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving 'mailhub.EXAMPLE.com.new
sudo journalctl -xe | grep bind9
sudo: unable to resolve host mailhub.EXAMPLE.com: Temporary failure in name resolution
Aug 07 17:14:07 mailhub.EXAMPLE.com sudo[3578]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service bind9 status
Aug 07 17:15:07 mailhub.EXAMPLE.com sudo[3616]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service bind9 status
Contents of /etc/default/named
#
RESOLVCONF=no
OPTIONS="-u bind -4"
Alright, it is nsd then. bind9 looks fine, but nsd failing shouldn't have caused this :thinking: I messed with the nsd configuration in the last updates so it's possible it messed up your NSD install.
Could you please give me the output of the following:
sudo service nsd status
journalctl -xe | grep nsd
ip a
and the contents of /etc/nsd/nsd.conf?
ip a and nsd.conf may expose public IP addresses - feel free to redact those, but make sure that if you're redacting, for example, 1.1.1.1, you replace it with something unique (like Public IP 1 or something like that).
sudo service nsd status
Unit nsd.service could not be found.
journalctl -xe | grep nsd
Aug 07 17:46:59 mailhub.EXAMPLE.com sudo[4311]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service nsd status
Aug 07 17:47:14 mailhub.EXAMPLE.com sudo[4319]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service nsd start
Aug 07 17:47:44 mailhub.EXAMPLE.com sudo[4339]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/bin/apt install nsd
Aug 07 17:48:29 mailhub.EXAMPLE.com sudo[4392]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service nsd status
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 8a:ce:50:84:76:64 brd ff:ff:ff:ff:ff:ff
inet (SUBNET MASK) brd (BROADCAST IP) scope global ens18
valid_lft forever preferred_lft forever
inet6 2607:ff28:c005:2b:88ce:50ff:fe84:7664/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 2591493sec preferred_lft 604293sec
inet6 fe80::88ce:50ff:fe84:7664/64 scope link
valid_lft forever preferred_lft forever
and the contents of /etc/nsd/nsd.conf
There is no nsd directory nor nsd.conf -- I think we're onto something. :)
Yeah, for sure - is this a brand new install by any chance?
(What are the contents of /etc/resolv.conf?)
Yep -- the cascade:
Makes me wonder if an update to NSD is what's broken.
I don't think so - else I would have easily noticed by now :\
I'm not sure how your network is laid out, but ideally the contents of /etc/resolv.conf should be:
nameserver 127.0.0.1
which is essentially bind9.
In case it doesn't work, as a temporary workaround, you can choose a public DNS server of your liking - for example 1.1.1.1 from Cloudflare or 8.8.8.8 from Google:
# /etc/resolv.conf - sets resolver to Cloudflare's public nameserver
nameserver 1.1.1.1
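To confirm which resolver the machine is actually using before and after editing the file, something like the following may help. It is only a sketch: the function names list_nameservers and uses_local_resolver are made up for illustration, and it assumes the file follows the usual "nameserver ADDRESS" layout shown above.

```shell
# Hypothetical helpers (not part of power-mailinabox).

# Print every nameserver listed in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no path is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed only if the first listed nameserver is the local resolver
# (127.0.0.1, which on this setup is bind9).
uses_local_resolver() {
    first="$(list_nameservers "$1" | head -n 1)"
    [ "$first" = "127.0.0.1" ]
}
```

Running list_nameservers with no argument prints the current entries; uses_local_resolver returns success when the first one is 127.0.0.1, so it can be used in an if statement.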
No luck -- setting resolv.conf to 1.1.1.1 and running the install script gets me to the same break I initially reported.
If this seems more like a singular case of user error, let me know -- I don't want to waste your time having you play tech support for a one-off case.
ACK. Keep me posted in case you find anything interesting
I had the same problem. I am trying to install on a virtual machine running on a home network behind a NAT. I took a look at the nsd.conf file. It listed the local IP address for my machine as well as the public IP address. I commented out the public IP address, then ran the setup/start.sh script again. The install then continued successfully. It is at the point of installing SpamAssassin, so hopefully it will continue without any problems.
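For anyone wanting to try the same workaround, here is a rough sketch of commenting out one address. It assumes the addresses appear on "ip-address:" lines (the standard NSD option name; the actual generated file is not shown in this thread), and both the function name comment_out_ip and the sample addresses are placeholders. It prints the modified file to stdout instead of editing in place, so the result can be reviewed first.

```shell
# Hypothetical helper: comment out the "ip-address:" line for one address
# in an nsd.conf-style file and print the result to stdout.
# Usage: comment_out_ip FILE ADDRESS
comment_out_ip() {
    file="$1"
    addr="$2"
    awk -v addr="$addr" '
        # Exact string match on the address, so dots are not treated as regex wildcards.
        $1 == "ip-address:" && $2 == addr { print "# " $0; next }
        { print }
    ' "$file"
}
```

For example, comment_out_ip /etc/nsd/nsd.conf 203.0.113.7 would print the file with that one line commented out (203.0.113.7 stands in for the public address). Back up the original before replacing it with the output.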
We have continued this issue privately - turns out the fault was on both nsd and bind.
For nsd, I have a fix ready that will be incorporated in the next version. For bind, it was DNSSEC-related and required some specific configuration changes to their machine and some service restarts.
power-mailinabox stopped working (seemed to be a DNS issue) after a recent update of system software. Reinstallation fails on both Debian Buster and Ubuntu 20.04. The text of the Ubuntu error is below. After throwing this error, anything requiring online access -- wget, apt update, etc. -- fails, even following a reboot.
My guess: Something changed with a bind/nsd update.
FAILED: apt-get -y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confnew install ldnsutils openssh-client
Reading package lists...
Building dependency tree...
Reading state information...
openssh-client is already the newest version (1:8.2p1-4ubuntu0.2).
The following NEW packages will be installed:
  ldnsutils libldns2
0 upgraded, 2 newly installed, 0 to remove and 5 not upgraded.
Need to get 278 kB of archives.
After this operation, 1,142 kB of additional disk space will be used.
Err:1 http://mirror.enzu.com/ubuntu focal/universe amd64 libldns2 amd64 1.7.0-4.1ubuntu1
  Temporary failure resolving 'mirror.enzu.com'
Err:2 http://mirror.enzu.com/ubuntu focal/universe amd64 ldnsutils amd64 1.7.0-4.1ubuntu1
  Temporary failure resolving 'mirror.enzu.com'
E: Failed to fetch http://mirror.enzu.com/ubuntu/pool/universe/l/ldns/libldns2_1.7.0-4.1ubuntu1_amd64.deb  Temporary failure resolving 'mirror.enzu.com'
E: Failed to fetch http://mirror.enzu.com/ubuntu/pool/universe/l/ldns/ldnsutils_1.7.0-4.1ubuntu1_amd64.deb  Temporary failure resolving 'mirror.enzu.com'
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
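When the apt output is long, it can help to pull out just the hostnames that failed to resolve. A small sketch, assuming the "Temporary failure resolving 'HOST'" wording seen above; the function name failed_hosts is made up for illustration:

```shell
# Hypothetical helper: read apt output on stdin and print each hostname
# that failed to resolve, once, one per line.
failed_hosts() {
    grep -o "Temporary failure resolving '[^']*'" \
        | sed "s/^Temporary failure resolving '//; s/'\$//" \
        | sort -u
}
```

Piping a saved copy of the apt log into failed_hosts lists the affected hosts; each can then be checked individually with getent hosts HOSTNAME to see whether resolution works outside of apt.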