gregwbrooks commented 3 years ago

power-mailinabox stopped working (it looked like a DNS issue) after a recent update of system software. Reinstallation fails on both Debian Buster and Ubuntu 20.04. The text of the latter's error is below. After this error is thrown, anything requiring online access -- wget, apt update, etc. -- fails, even after a reboot.

My guess: Something changed with a bind/nsd update.

FAILED: apt-get -y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confnew install ldnsutils openssh-client

Reading package lists...
Building dependency tree...
Reading state information...
openssh-client is already the newest version (1:8.2p1-4ubuntu0.2).
The following NEW packages will be installed:
  ldnsutils libldns2
0 upgraded, 2 newly installed, 0 to remove and 5 not upgraded.
Need to get 278 kB of archives.
After this operation, 1,142 kB of additional disk space will be used.
Err:1 http://mirror.enzu.com/ubuntu focal/universe amd64 libldns2 amd64 1.7.0-4.1ubuntu1
  Temporary failure resolving 'mirror.enzu.com'
Err:2 http://mirror.enzu.com/ubuntu focal/universe amd64 ldnsutils amd64 1.7.0-4.1ubuntu1
  Temporary failure resolving 'mirror.enzu.com'
E: Failed to fetch http://mirror.enzu.com/ubuntu/pool/universe/l/ldns/libldns2_1.7.0-4.1ubuntu1_amd64.deb  Temporary failure resolving 'mirror.enzu.com'
E: Failed to fetch http://mirror.enzu.com/ubuntu/pool/universe/l/ldns/ldnsutils_1.7.0-4.1ubuntu1_amd64.deb  Temporary failure resolving 'mirror.enzu.com'
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

ddavness commented 3 years ago

Hi! Looks like an issue with bind9. What do you see when running these commands?

sudo service bind9 status
sudo journalctl -xe | grep bind9

Also, what's the content of /etc/default/bind9 (relevant for Debian) and /etc/default/named (relevant for Ubuntu)?

gregwbrooks commented 3 years ago

sudo service bind9 status (Results anonymized to EXAMPLE.COM)

sudo: unable to resolve host mailhub.newwtg.com: Temporary failure in name resolution
● named.service - BIND Domain Name Server
     Loaded: loaded (/lib/systemd/system/named.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2021-08-07 15:51:50 PDT; 1h 23min ago
       Docs: man:named(8)
   Main PID: 721 (named)
      Tasks: 14 (limit: 2278)
     Memory: 42.2M
     CGroup: /system.slice/named.service
             └─721 /usr/sbin/named -f -u bind -4

Aug 07 17:15:02 mailhub.newwtg.com named[721]: validating 2.ubuntu.pool.ntp.org/A: bad cache hit (o>
Aug 07 17:15:02 mailhub.newwtg.com named[721]: broken trust chain resolving '2.ubuntu.pool.ntp.org/>
Aug 07 17:15:02 mailhub.newwtg.com named[721]: connection refused resolving '.org.DOMAIN.com/A/IN'>
Aug 07 17:15:02 mailhub.newwtg.com named[721]: connection refused resolving '2.ubuntu.pool.ntp.org.>
Aug 07 17:15:02 mailhub.newwtg.com named[721]: connection refused resolving '2.ubuntu.pool.ntp.org.>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving 'mailhub.EXAMPLE.com/A/I>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving 'mailhub.EXAMPLE.com/AAA>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving '.com.EXAMPLE.com/A/IN'>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving 'mailhub.EXAMPLE.com.new>
Aug 07 17:15:07 mailhub.newwtg.com named[721]: connection refused resolving 'mailhub.EXAMPLE.com.new
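A long named log like the one above is easier to read once the repeated messages are tallied. The following is only a rough sketch, not part of the installer: the function name summarize_named_log is made up for illustration, and it counts just the two message types seen in this output. "broken trust chain" is the message BIND typically logs when DNSSEC validation fails, so a non-zero count there may point at DNSSEC rather than at connectivity.

```shell
#!/bin/sh
# Hypothetical helper (not part of power-mailinabox): reads named log
# lines on stdin and counts two message types seen in this thread.
summarize_named_log() {
    awk '
        /broken trust chain/ { trust++ }     # usually a DNSSEC validation failure
        /connection refused/ { refused++ }   # upstream server rejected the query
        END {
            printf "broken trust chain: %d\n", trust
            printf "connection refused: %d\n", refused
        }
    '
}

# Possible usage on a systemd machine where the unit is named.service:
#   sudo journalctl -u named --no-pager | summarize_named_log
```

Piping the journal through the function prints one line per message type with its count.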

sudo journalctl -xe | grep bind9

sudo: unable to resolve host mailhub.EXAMPLE.com: Temporary failure in name resolution
Aug 07 17:14:07 mailhub.EXAMPLE.com sudo[3578]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service bind9 status
Aug 07 17:15:07 mailhub.EXAMPLE.com sudo[3616]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service bind9 status

Contents of /etc/default/named:

#
# run resolvconf?
RESOLVCONF=no

# startup options for the server
OPTIONS="-u bind"
OPTIONS="-u bind -4"

ddavness commented 3 years ago

Alright, it is nsd then. bind9 looks fine, but nsd failing shouldn't have caused this :thinking: I messed with the nsd configuration in the last updates so it's possible it messed up your NSD install.

Could you please give me the output of the following:

sudo service nsd status
journalctl -xe | grep nsd
ip a

and the contents of /etc/nsd/nsd.conf?

ip a and nsd.conf may expose public ip addresses - feel free to redact those, but make sure that if you're redacting, for example, 1.1.1.1, that you're replacing that with something unique (like Public IP 1 or something like that)

gregwbrooks commented 3 years ago

sudo service nsd status

Unit nsd.service could not be found.

journalctl -xe | grep nsd

Aug 07 17:46:59 mailhub.EXAMPLE.com sudo[4311]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service nsd status
Aug 07 17:47:14 mailhub.EXAMPLE.com sudo[4319]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service nsd start
Aug 07 17:47:44 mailhub.EXAMPLE.com sudo[4339]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/bin/apt install nsd
Aug 07 17:48:29 mailhub.EXAMPLE.com sudo[4392]: greg : TTY=pts/0 ; PWD=/home/greg ; USER=root ; COMMAND=/usr/sbin/service nsd status

ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 8a:ce:50:84:76:64 brd ff:ff:ff:ff:ff:ff
    inet (SUBNET MASK) brd (BROADCAST IP) scope global ens18
       valid_lft forever preferred_lft forever
    inet6 2607:ff28:c005:2b:88ce:50ff:fe84:7664/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 2591493sec preferred_lft 604293sec
    inet6 fe80::88ce:50ff:fe84:7664/64 scope link
       valid_lft forever preferred_lft forever

and the contents of /etc/nsd/nsd.conf

There is no nsd directory nor nsd.conf -- I think we're onto something. :)

ddavness commented 3 years ago

Yeah, for sure - is this a brand new install by any chance? (What are the contents of /etc/resolv.conf?)

gregwbrooks commented 3 years ago

Yep -- the cascade:

1. Web interface for Buster-based power MIAB stopped working after an apt-get upgrade.
2. Reboot did nothing, so tried to run the install script again, thinking maybe the upgrade hosed some conf files. No luck.
3. Clean ISO reinstall of Buster, apt update and apt upgrade... and the script died during install, throwing an error.
4. Ditto for a clean reinstall of Ubuntu 20.04, attempted with both an older ISO as well as one just downloaded from Canonical.

Makes me wonder if an update to NSD is what's broken.

ddavness commented 3 years ago

I don't think so - else I would have easily noticed by now :\

ddavness commented 3 years ago

I'm not sure how your network is laid out, but ideally the contents of /etc/resolv.conf should be:

nameserver 127.0.0.1

which is essentially bind9.

In case it doesn't work, as a temporary workaround, you can choose a public DNS resolver of your liking - for example 1.1.1.1 from Cloudflare or 8.8.8.8 from Google:

# /etc/resolv.conf - sets the resolver to Cloudflare's public nameserver
nameserver 1.1.1.1
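
Before editing the file, it may help to see which resolvers are configured right now. This is just a sketch under a couple of assumptions: the function names list_nameservers and uses_local_resolver are invented here, and the file is assumed to use plain "nameserver <address>" lines as in the examples above.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example).

# Print every nameserver address in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no path is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed only if the first configured nameserver is 127.0.0.1,
# i.e. lookups go to the local bind9 first.
uses_local_resolver() {
    [ "$(list_nameservers "$1" | head -n 1)" = "127.0.0.1" ]
}

# Possible usage:
#   list_nameservers
#   uses_local_resolver && echo "queries go to the local bind9"
```

If uses_local_resolver fails, lookups are bypassing the local bind9, which is expected while the temporary workaround is in place.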

gregwbrooks commented 3 years ago

No luck -- setting resolv.conf to 1.1.1.1 and running the install script gets me to the same break I initially reported.

If this seems more like a singular case of user error, let me know -- I don't want to waste your time having you play tech support for a one-off case.

ddavness commented 3 years ago

ACK. Keep me posted in case you find anything interesting

dephillipsmi commented 3 years ago

I had the same problem. I am trying to install on a virtual machine running on a home network behind NAT. I took a look at the nsd.conf file. It listed the local IP address for my machine as well as the public IP address. I commented out the public IP address, then ran the setup/start.sh script again. The install then continued successfully. It is at the point of installing SpamAssassin, so hopefully it will continue without any problems.
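
For anyone wanting to try the same workaround, here is a rough sketch of the edit. It assumes the addresses appear as "ip-address: <address>" lines, which is how NSD's config normally lists them; the function name comment_out_ip and the address 203.0.113.7 are made up for illustration. It keeps a .bak copy of the file before changing it.

```shell
#!/bin/sh
# Hypothetical helper: comment out one "ip-address:" line in an
# nsd.conf-style file, keeping a backup copy next to it.
comment_out_ip() {
    conf_file=$1
    ip=$2
    cp -p "$conf_file" "$conf_file.bak"
    # Compare the address as a literal string so the dots are not
    # treated as regex wildcards; every other line passes through as-is.
    awk -v ip="$ip" '
        $1 == "ip-address:" && $2 == ip { print "#" $0; next }
        { print }
    ' "$conf_file.bak" > "$conf_file"
}

# Possible usage (as root, with your own public address instead of the
# placeholder 203.0.113.7), followed by re-running setup/start.sh:
#   comment_out_ip /etc/nsd/nsd.conf 203.0.113.7
```

The original file stays available as nsd.conf.bak if the change needs to be reverted.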

ddavness commented 3 years ago

We have continued this issue privately - turns out the fault was on both nsd and bind.

For nsd, I have a fix ready that will be incorporated in the next version. For bind, it was DNSSEC-related and required some specific configuration changes to their machine and some service restarts.

ddavness / power-mailinabox

Install failing on Buster and Ubuntu - related to nsd? #22
