MartinX3 opened this issue 2 years ago
Ah, after turning /etc/resolv.conf back into a symlink to the systemd-resolved stub with
sudo ln -rsf /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf
the messages disappear.
I assume it now queries the localhost IP of systemd-resolved instead of the network IP of the outside resolver.
I don't know how to make this plugin write verbose debug logs to the journal, but I'm glad that it is fixed now.
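For anyone checking the same thing on their own machine, the symlink target and the upstream servers that systemd-resolved actually uses can be verified with standard commands (nothing here is specific to aardvark-dns):

readlink -f /etc/resolv.conf   # should resolve to /run/systemd/resolve/stub-resolv.conf
resolvectl status              # lists the 127.0.0.53 stub and the real upstream DNS servers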
And it's back:
dns request failed: request timed out
It seems to happen when I execute nslookup in a container, but I still get a result. Weird.
I'm seeing the same problem on my Raspberry Pi with Fedora and rootless podman containers.
I'm having the exact same issue. DNS lookup is extremely unstable; only about two thirds of lookups succeed.
This is on Fedora Silverblue 37 using rootless containers and the new networking stack.
I'm now using a Nextcloud container which communicates with a PostgreSQL and an LDAP container. I access the Nextcloud container from the internet through an nginx reverse proxy container.
About a third of requests to the Nextcloud web interface result in a 502 Bad Gateway because of the unstable DNS, and my logs are getting spammed with
aardvark-dns[3617]: 50310 dns request failed: request timed out
I hope the next podman release will include the new network stack, which will hopefully fix this issue.
It is impossible to help with these issues as reporters did not provide versions for podman, netavark, and aardvark-dns. Please provide as much relevant information as possible.
podman: 4.3.1 netavark: 1.4.0 aardvark-dns: 1.4.0
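For anyone else reporting, the relevant versions can be gathered with something like the following (the rpm query assumes a Fedora/RPM-based system; use your distribution's package manager otherwise):

podman version                     # client/server version
podman info | grep -i network      # confirms netavark is the network backend
rpm -q netavark aardvark-dns       # installed versions of the network stack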
Would you say your machine/VM is high-performance, or does it maybe have slow I/O, processor, or RAM limitations? I'm trying to understand whether you might be hitting a race.
The first bare-metal server with this issue has its HDD connected via SATA. CPU: Intel Xeon E3-1231 v3, 3.4 GHz, 4 cores. RAM: 32 GB DDR3.
The second bare-metal server has an 870 Evo SSD connected via SATA. CPU: AMD Phenom II X4 955, 3.2 GHz, 4 cores. RAM: 6 GB DDR2.
The second server logs this error very frequently; the first server logs it less often, but still regularly.
@flouthoc wdyt?
Last night my faster server spammed it while the pods weren't being used by any clients.
We used to see similar issues with older versions of netavark and aardvark-dns in Podman CI as well; that was fixed in newer versions by https://github.com/containers/aardvark-dns/pull/220. But I guess there might be some issue that our CI is not reproducing, so I'll try to reproduce this locally.
I'm seeing the same problems on a freshly installed Fedora Server 37 instance as well. The machine is basically completely idle with no load, and my journal is still filled with these errors. This issue was not present before 2022, and I've run similar setups on slower machines without any DNS lookup issues.
podman: 4.3.1 netavark: 1.4.0 aardvark-dns: 1.4.0
Can you test with v1.5?
I think the timeout is fixed, but now I sometimes get an "empty dns response" on all machines.
Does this cause problems for the container, or is it just an error that is logged often?
It's logged often, with long breaks in between, so it appears in bursts most of the time.
I think the services just retry the DNS query until it works, so I would say it mostly costs CPU time and maybe some network bandwidth.
Maybe I just haven't used it long enough to see long-term errors.
Do you have a simple reproducer? What kind of application are you running and how many DNS requests does it make?
Tested with v1.5 and I'm getting a lot of dns request got empty response as well. Here's a list of all the containers I'm running on my system:
eclipse-mosquitto:2
koenkk/zigbee2mqtt
homeassistant/home-assistant:stable
It's notable that none of these containers are particularly demanding on the hardware, and my system load average is generally below 0.1 at all times.
It happens without any workload. The server is just idling.
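Regarding the reproducer question above: not a real application workload, but a crude way to generate a burst of lookups through aardvark-dns from inside a container is something like the following (the network and image names are arbitrary examples):

podman network create dnstest
podman run --rm --network dnstest docker.io/library/alpine sh -c \
  'for i in $(seq 1 200); do nslookup google.com >/dev/null 2>&1 || echo "lookup $i failed"; done'

In theory, every failed iteration should line up with one of the aardvark-dns errors in the journal.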
I'm experiencing the same issue with aardvark-dns reporting lots of dns request got empty response errors in my logs. It seems to be causing problems for at least the containers running Uptime Kuma, Invidious and Jellyseerr. Uptime Kuma starts throwing ECONNRESET when doing GET requests, and Invidious and Jellyseerr similarly start to have their requests fail, with external content taking a long time to load, if at all.
It happens with both 1.5.0 and the latest 1.6.0 from podman-next. For me it seems to start after around 3 days of uptime. I've tried changing machines and switching from onboard Realtek to an Intel i350-T2 controller, but to no avail. Rebooting solves the issue, until uptime reaches 3 days again.
Just ran into this issue as well with Nextcloud + Nginx Proxy Manager. What's funny is that I am using the same docker-compose setup on two different servers and one works fine while the other one doesn't. The only difference is that the one that is breaking isn't publicly accessible on the internet and is instead set up to respond over a .lan domain which is configured on the home router. NPM has a proxy host set up that responds to mydomain.lan and forwards it to the Nextcloud container.
It will work for a bit when I up/down NPM, but then eventually fails after a few hours or even days with 502 Bad Gateway errors, and dns request got empty response starts getting spammed into journalctl.
My setup
Here are my docker-compose files to set up each of them (rootful, btw):
Nextcloud docker-compose.yml
version: '3'
services:
  db:
    image: mariadb
    command: --transaction-isolation=READ-COMMITTED --log-bin=binlog --binlog-format=ROW
    restart: always
    volumes:
      - ./db:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=<pw here>
      - MARIADB_AUTO_UPGRADE=1
      - MARIADB_DISABLE_UPGRADE_BACKUP=1
    env_file:
      - db.env
    networks:
      - backend
  redis:
    image: redis:alpine
    restart: always
    networks:
      - backend
  nextcloud:
    image: nextcloud:apache
    restart: always
    volumes:
      - ./html:/var/www/html
    environment:
      - MYSQL_HOST=db
      - REDIS_HOST=redis
    env_file:
      - db.env
    depends_on:
      - db
      - redis
    networks:
      - nextcloud_frontend
      - backend
  cron:
    image: nextcloud:apache
    restart: always
    volumes:
      - ./html:/var/www/html
    entrypoint: /cron.sh
    depends_on:
      - db
      - redis
    networks:
      - backend
networks:
  nextcloud_frontend:
    external: true
  backend:
db.env
MYSQL_PASSWORD=<pw here>
MYSQL_DATABASE=nextcloud
MYSQL_USER=nextcloud
Nginx Proxy Manager docker-compose.yml
version: '3.8'
services:
  proxy:
    image: 'jc21/nginx-proxy-manager:latest'
    restart: always
    ports:
      # These ports are in format <host-port>:<container-port>
      - '80:80' # Public HTTP Port
      - '443:443' # Public HTTPS Port
      - '81:81' # Admin Web Port
      # Add any other Stream port you want to expose
      # - '21:21' # FTP
    # Uncomment the next line if you uncomment anything in the section
    # environment:
      # Uncomment this if you want to change the location of
      # the SQLite DB file within the container
      # DB_SQLITE_FILE: "/data/database.sqlite"
      # Uncomment this if IPv6 is not enabled on your host
      # DISABLE_IPV6: 'true'
    healthcheck:
      test: ["CMD", "/bin/check-health"]
      interval: 30s
      timeout: 3s
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt
    networks:
      - nextcloud_frontend
networks:
  nextcloud_frontend:
    external: true
Can anyone check if they still see this with aardvark-dns v1.12.2?
@Luap99 At least for me it looks fine. Thank you.
I started getting a fair number of dns request got empty response from aardvark-dns out of the blue yesterday. I'm running podman from COPR with daily updates via dnf-automatic, so I can't say for sure which version it started with. My system has been stable otherwise.
podman version 5.3.0-dev-29eb8ce09 aardvark-dns 1.13.0-dev
I encountered the same issue, and it turned out to be self-inflicted. I had added the podman network gateway IP to my /etc/resolv.conf, which I believe caused an infinite DNS resolution loop since aardvark-dns might be using /etc/resolv.conf for its upstream. Once I removed the podman network gateway IP from /etc/resolv.conf, the issue was resolved.
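For anyone wanting to rule this out on their own setup, it is enough to compare the network's gateway with the nameservers on the host ("podman" here is the default network name; substitute your own):

podman network inspect podman | grep -i gateway   # gateway IP of the podman network
grep '^nameserver' /etc/resolv.conf               # the gateway IP should not appear here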
My resolv.conf seems to be in order.
# Generated by NetworkManager
nameserver 1.1.1.1
nameserver 1.0.0.1
nameserver 185.12.64.1
# NOTE: the libc resolver may not support more than 3 nameservers.
# The nameservers listed below may not be recognized.
nameserver 185.12.64.2
nameserver 2a01:4ff:ff00::add:2
nameserver 2a01:4ff:ff00::add:1
Over the course of a few hours yesterday (21:31 to 23:00 UTC), a bunch of empty response messages were logged, but none have been logged since.
Also network related, and also odd: pasta has recently started logging No external routable interface for IPv6 on my dev machine.
podman run --rm -i --log-driver=passthrough-tty -v /codebase/webapp:/webapp dd2to3 pyup_dirs --recursive --py313-plus /webapp
Oct 16 13:30:53 dd-owen-dev pasta[79493]: No external routable interface for IPv6
% ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:0c:29:67:79:43 brd ff:ff:ff:ff:ff:ff
altname enp3s0
inet 172.16.209.6/24 brd 172.16.209.255 scope global noprefixroute ens160
valid_lft forever preferred_lft forever
15: podman1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether fe:39:d9:c2:2b:8a brd ff:ff:ff:ff:ff:ff
inet 10.89.0.1/24 brd 10.89.0.255 scope global podman1
valid_lft forever preferred_lft forever
inet6 fe80::fc39:d9ff:fec2:2b8a/64 scope link
valid_lft forever preferred_lft forever
16: veth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether e2:4f:e8:89:28:c4 brd ff:ff:ff:ff:ff:ff link-netns netns-9f79fe86-d8c5-bd93-4fef-0f95d4ec782c
inet6 fe80::e04f:e8ff:fe89:28c4/64 scope link
valid_lft forever preferred_lft forever
17: veth1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether b2:12:13:3e:d7:d1 brd ff:ff:ff:ff:ff:ff link-netns netns-264e8cb4-a3a2-84ec-4dad-5d146bff3f72
inet6 fe80::b012:13ff:fe3e:d7d1/64 scope link
valid_lft forever preferred_lft forever
18: veth2@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether fe:fa:ac:70:ca:85 brd ff:ff:ff:ff:ff:ff link-netns netns-10327a44-fdc9-099e-340f-17ddfd759e0d
inet6 fe80::fcfa:acff:fe70:ca85/64 scope link
valid_lft forever preferred_lft forever
19: veth3@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether 1e:20:af:e6:df:23 brd ff:ff:ff:ff:ff:ff link-netns netns-e70477fb-f6e9-ade7-c624-2b4ce47053be
inet6 fe80::1c20:afff:fee6:df23/64 scope link
valid_lft forever preferred_lft forever
20: veth4@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether d2:3f:47:c2:e3:91 brd ff:ff:ff:ff:ff:ff link-netns netns-448fb15e-a452-efde-7ae0-87d925f94091
inet6 fe80::d03f:47ff:fec2:e391/64 scope link
valid_lft forever preferred_lft forever
21: veth5@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether 9e:91:56:22:b6:b3 brd ff:ff:ff:ff:ff:ff link-netns netns-6e0e3dfe-5214-81d4-ea80-043e5844a1e4
inet6 fe80::9c91:56ff:fe22:b6b3/64 scope link
valid_lft forever preferred_lft forever
22: veth6@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether fa:cf:0b:9c:28:c7 brd ff:ff:ff:ff:ff:ff link-netns netns-c3bfdebf-2e88-8d7d-d3cd-2060a8c7428a
inet6 fe80::f8cf:bff:fe9c:28c7/64 scope link
valid_lft forever preferred_lft forever
23: veth7@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether 66:71:f8:9e:67:bc brd ff:ff:ff:ff:ff:ff link-netns netns-c72a6ec6-a03e-579c-8d1f-86a6719d86b7
inet6 fe80::6471:f8ff:fe9e:67bc/64 scope link
valid_lft forever preferred_lft forever
24: veth8@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether 4a:3a:76:80:59:9b brd ff:ff:ff:ff:ff:ff link-netns netns-76238717-eeb9-50a5-f96e-3aac5e4e7d31
inet6 fe80::483a:76ff:fe80:599b/64 scope link
valid_lft forever preferred_lft forever
25: veth9@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master podman1 state UP group default qlen 1000
link/ether c2:c6:30:88:ad:56 brd ff:ff:ff:ff:ff:ff link-netns netns-41c7d6c5-76f6-11f3-f772-06f999175ed6
inet6 fe80::c0c6:30ff:fe88:ad56/64 scope link
valid_lft forever preferred_lft forever
Other than dnf-automatic package installs, not much about my dev machine has changed in the better part of a year.
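Judging by the ip addr output above, ens160 only has an IPv4 address plus link-local IPv6, which would explain pasta's message. If someone wants to double-check, a host without a routable IPv6 path prints nothing for either of these:

ip -6 route show default        # empty when there is no IPv6 default route
ip -6 addr show scope global    # empty when no interface has a global IPv6 address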
I'm using Arch Linux, so the packages should be at the newest version.
I'm using firewalld and rootless podman with netavark and aardvark-dns.
I understand that rootless podman with netavark won't manage my firewalld, but I would like to know which rules I need to enable to avoid the spam in my journal, and whether the rule needs to be on my loopback or network interface. (Also, whether it is enough to allow communication with the host instead of having a port open to the internet.)
My DNS resolver is systemd-resolved.
My journal spam:
aardvark-dns[6156]: 21433 dns request failed: request timed out
The rootless containers themselves can ping google.com. I didn't test whether they can ping a container DNS name.
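As a quick sanity check of container-to-container name resolution (which is what aardvark-dns provides on user-defined networks), something along these lines should work; the network, container, and image names are only examples:

podman network create mynet
podman run -d --name web --network mynet docker.io/library/nginx
podman run --rm --network mynet docker.io/library/alpine nslookup web   # should return web's container IP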