containers / aardvark-dns

Authoritative dns server for A/AAAA container records. Forwards other request to host's /etc/resolv.conf
Apache License 2.0
176 stars 32 forks source link

unexpected end of input reached #151

Open binarious opened 2 years ago

binarious commented 2 years ago

I've setup a nextcloud deployment with podman in combination with docker-compose. It's working fine mostly and dns is also generally working. But around every 30s I see connection issues between containers. I've created a test script which queries the database every second and I often see connection errors (SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo failed: Try again). journalctl shows the following output at these times:

aardvark-dns[70887]: Failed while parsing message: unexpected end of input reached
aardvark-dns[70887]: None received while parsing dns message, this is not expected server will ignore this message

Any ideas?

Deployment Setup **System details** - Fedora 36 - Podman 4.1.0 - Docker Compose 2.6.0 **podman** ``` systemctl --user enable podman.socket systemctl --user start podman.socket sudo dnf install -y podman podman-plugins podman network create nginx-proxy docker-compose up ``` **docker-compose.yml** It's `app` which sometimes cannot connect to `db` and `redis`. ```yml version: '3' services: db: image: docker.io/library/mariadb:10.5 command: --transaction-isolation=READ-COMMITTED --binlog-format=ROW restart: always volumes: - db:/var/lib/mysql env_file: - db.env redis: image: docker.io/library/redis:7.0-alpine restart: always app: image: docker.io/library/nextcloud:24-fpm-alpine restart: always security_opt: - label=disable volumes: - nextcloud:/var/www/html environment: - MYSQL_HOST=db - REDIS_HOST=redis env_file: - db.env depends_on: - db - redis web: build: ./web restart: always security_opt: - label=disable volumes: - nextcloud:/var/www/html:ro depends_on: - app ports: - "8082:8080" environment: VIRTUAL_HOST: REDACTED LETSENCRYPT_HOST: REDACTED LETSENCRYPT_EMAIL: REDACTED cron: image: docker.io/library/nextcloud:24-fpm-alpine restart: always security_opt: - label=disable volumes: - nextcloud:/var/www/html entrypoint: /cron.sh depends_on: - db - redis volumes: nextcloud: driver: local driver_opts: type: none o: bind device: "${PWD}/volumes/nextcloud" db: networks: default: name: nginx-proxy external: true ```
flouthoc commented 2 years ago

Hi @binarious , Thanks for creating the issue. Would it be possible to share a small reproducer. I have tried doing isolated load testing on aardvark server only by running it without netavark and it never fails but i am assuming something might be re-setting connection at netavark's end not sure but it could be something similar to https://github.com/containers/netavark/issues/210.

#!/bin/bash
for (( c=1; c<=100; c++ ))
do 
   dig -p 5533 @<my-interface-ip> testdbctr
done

I am still trying for a reproducer so will update once i am able to reproduce this issue.

binarious commented 2 years ago

Hi @flouthoc, I'll also do my best to create a small reproduction setup. Thank you for looking into it.

ryester19 commented 2 years ago

I often see these errors with my Nextcloud instance when I try to open the site after long periods of inactivity.

Aug 01 08:57:28 VPS1 aardvark-dns[524]: Failed while parsing message: unexpected end of input reached
Aug 01 08:57:28 VPS1 aardvark-dns[524]: None received while parsing dns message, this is not expected server will ignore this message
Aug 01 08:57:28 VPS1 aardvark-dns[524]: Failed while parsing message: unexpected end of input reached
Aug 01 08:57:28 VPS1 aardvark-dns[524]: None received while parsing dns message, this is not expected server will ignore this message
Aug 01 08:57:31 VPS1 nextcloud[8060]: [01-Aug-2022 12:57:31] WARNING: [pool nextcloud] child 521 said into stderr: "NOTICE: PHP message: [owncloud][index][3] {"Exception":"Doctrine\\DBAL\\Exception","Message":"Failed to connect to the database: An exception occurred in the driver: SQLSTATE[08006] [7] could not translate host name \"postgres\" to address: Try again","Code":7,"Trace":[{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php","line":1519,"function":"connect","class":"OC\\DB\\Connection","type":"->"},{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php","line":1041,"function":"getWrappedConnection","class":"Doctrine\\DBAL\\Connection","type":"->"},{"file":"/var/www/nextcloud/lib/private/DB/Connection.php","line":261,"function":"executeQuery","class":"Doctrine\\DBAL\\Connection","type":"->"},{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php","line":345,"function":"executeQuery","class":"OC\\DB\\Connection","type":"->"},{"file":"/var/www/nextcloud/lib/private/DB/QueryBuilder/Q"
Aug 01 08:57:31 VPS1 nextcloud[8060]: [01-Aug-2022 12:57:31] WARNING: [pool nextcloud] child 521 said into stderr: "ueryBuilder.php","line":281,"function":"execute","class":"Doctrine\\DBAL\\Qu..."
Aug 01 08:57:31 VPS1 nextcloud[8060]: [01-Aug-2022 12:57:31] WARNING: [pool nextcloud] child 71 said into stderr: "NOTICE: PHP message: [owncloud][index][3] {"Exception":"Doctrine\\DBAL\\Exception","Message":"Failed to connect to the database: An exception occurred in the driver: SQLSTATE[08006] [7] could not translate host name \"postgres\" to address: Try again","Code":7,"Trace":[{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php","line":1519,"function":"connect","class":"OC\\DB\\Connection","type":"->"},{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php","line":1041,"function":"getWrappedConnection","class":"Doctrine\\DBAL\\Connection","type":"->"},{"file":"/var/www/nextcloud/lib/private/DB/Connection.php","line":261,"function":"executeQuery","class":"Doctrine\\DBAL\\Connection","type":"->"},{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php","line":345,"function":"executeQuery","class":"OC\\DB\\Connection","type":"->"},{"file":"/var/www/nextcloud/lib/private/DB/QueryBuilder/Qu"
Aug 01 08:57:31 VPS1 nextcloud[8060]: [01-Aug-2022 12:57:31] WARNING: [pool nextcloud] child 71 said into stderr: "eryBuilder.php","line":281,"function":"execute","class":"Doctrine\\DBAL\\Qu..."

I have 4 machines running Arch Linux, and I see this error on all of them. Might be unrelated, but I also see this error on startup alot of times:

Jul 27 12:04:57 VPS1 systemd[1]: Started /usr/lib/podman/aardvark-dns --config /run/containers/networks/aardvark-dns -p 53 run.
Jul 27 12:04:57 VPS1 aardvark-dns[565]: Unable to start server unable to start CoreDns server: Address already in use (os error 98)
DynamoFox commented 2 years ago

Hello, I'm affected as well by this issue, both with aardvark-dns v1.0.3 and v.1.1.0.

In the logs below I'm running a container (vaultwarden) which connects to another container (a Postgres database) through a network bridge named "web-apps-compose_vaultwarden-net" with Podman in rootless mode. Vaultwarden's container is configured to connect to Postgres by looking it up via DNS with "postgres-db.dns.podman". The containers are kept together and managed through podman-compose.

Once vaultwarden is started it will sooner or later complain about failed DNS resolutions which are at the same time coupled with logs from aardvark-dns which exactly match the ones reported in this issue. (see more in the logs below) It should be noted that this issue happens even with other containers, such as nextcloud's one.

I tried running aardvark-dns with RUST_LOG=trace but it looks like nothing new is printed about this issue. (see the sample below and eventually the full aardvark-dns logs attached) Is there a way I can get more logs (ENV variable, cli parameter)? (i.e. possibly from trust-dns-* which is the dependency the error comes from)

Podman info
host:
  arch: amd64
  buildahVersion: 1.26.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.3-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.3, commit: ab52a597278b20173440140cd810dc9fa8785c93'
  cpuUtilization:
    idlePercent: 97.48
    systemPercent: 1.3
    userPercent: 1.23
  cpus: 2
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  hostname: Botti-VPS
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.18.15-hardened1-1-hardened
  linkmode: dynamic
  logDriver: journald
  memFree: 91676672
  memTotal: 2075852800
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.5-1
    path: /usr/bin/crun
    version: |-
      crun version 1.5
      commit: 54ebb8ca8bf7e6ddae2eb919f5b82d1d96863dea
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1001/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.2.0-1
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 1037561856
  uptime: 2h 51m 22.51s (Approximately 0.08 days)
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/container/.config/containers/storage.conf
  containerStore:
    number: 9
    paused: 0
    running: 9
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/container/.local/share/containers/storage
  graphRootAllocated: 42410704896
  graphRootUsed: 7771459584
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 9
  runRoot: /run/user/1001/containers
  volumePath: /home/container/.local/share/containers/storage/volumes
version:
  APIVersion: 4.1.1
  Built: 1656056036
  BuiltTime: Fri Jun 24 09:33:56 2022
  GitCommit: f73d8f8875c2be7cd2049094c29aff90b1150241
  GoVersion: go1.18.3
  Os: linux
  OsArch: linux/amd64
  Version: 4.1.1
Logs of the container failing to resolve
/--------------------------------------------------------------------\
|                        Starting Vaultwarden                        |
|                           Version 1.25.2                           |
|--------------------------------------------------------------------|
| This is an *unofficial* Bitwarden implementation, DO NOT use the   |
| official channels to report bugs/features, regardless of client.   |
| Send usage/configuration questions or feature requests to:         |
|   https://vaultwarden.discourse.group/                             |
| Report suspected bugs/issues in the software itself at:            |
|   https://github.com/dani-garcia/vaultwarden/issues/new            |
\--------------------------------------------------------------------/

[INFO] No .env file found.

[WARNING] The following environment variables are being overriden by the config file,
[WARNING] please use the admin panel to make changes to them:
[WARNING] DOMAIN, IP_HEADER, SMTP_HOST, SMTP_PORT, SMTP_FROM, SMTP_FROM_NAME, SMTP_USERNAME, SMTP_PASSWORD

[2022-08-02 09:58:53.252][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 09:58:53.689][vaultwarden::api::notifications][INFO] Starting WebSockets server on 0.0.0.0:3012
[2022-08-02 09:58:53.693][start][INFO] Rocket has launched from http://0.0.0.0:8080
[2022-08-02 10:09:23.218][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:09:23.218][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:09:23.218][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:09:28.230][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:15:53.191][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:15:53.192][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:15:58.598][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:19:53.222][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:30:23.190][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:35:23.189][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:39:23.188][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:39:53.192][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:45:53.192][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:45:58.600][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:51:23.194][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:51:23.195][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 10:59:53.185][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:06:23.191][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:09:53.189][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:16:53.186][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:16:58.590][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:20:23.190][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:36:23.188][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:39:23.192][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:45:23.188][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:46:53.197][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:47:23.189][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:50:53.188][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 11:50:58.593][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 12:17:23.190][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 12:19:23.187][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 12:19:53.192][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 12:19:58.597][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 12:25:23.189][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 12:30:23.194][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 12:36:23.192][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again

[2022-08-02 12:50:23.190][r2d2][ERROR] could not translate host name "postgres-db" to address: Try again
Sample of the logs from aardwark-dns running with RUST_LOG=trace (journalctl -b | grep aard)
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: request source address: 10.89.5.3:54633
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: requested record type: AAAA
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: checking if backend has entry for: "postgres-db.dns.podman."
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: server backend.name_mappings: {"email-apps-compose_postfix-mta-sts-resolver-net": {"postfix-mta-sts-resolver": [10.89.7.2], "email-apps-compose_postfix-mta-sts-resolver_1": [10.89.7.2], "54c43a779c7f": [10.89.7.2]}, "web-apps-compose_nextcloud-net": {"web-apps-compose_nextcloud-webapp_1": [10.89.4.4], "web-apps-compose_postgres-db_1": [10.89.4.2], "6897e4e7219e": [10.89.4.2], "833c41410132": [10.89.4.4], "nextcloud-webapp": [10.89.4.4], "web-apps-compose_nextcloud-cron_1": [10.89.4.5], "nextcloud-cron": [10.89.4.5], "nextcloud-redis": [10.89.4.3], "b8607ed4dadb": [10.89.4.3], "ca52f8e2e029": [10.89.4.5], "web-apps-compose_nextcloud-redis_1": [10.89.4.3], "postgres-db": [10.89.4.2]}, "web-apps-compose_trilium-web-net": {"2c27d07a273f": [10.89.3.2], "web-apps-compose_trilium-webapp_1": [10.89.3.2], "trilium-webapp": [10.89.3.2]}, "web-apps-compose_vaultwarden-net": {"e518ae7c468d": [10.89.5.3], "web-apps-compose_postgres-db_1": [10.89.5.2], "6897e4e7219e": [10.89.5.2], "postgres-db": [10.89.5.2], "web-apps-compose_vaultwarden-webapp_1": [10.89.5.3], "vaultwarden-webapp": [10.89.5.3]}}
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: server backend.ip_mappings: {10.89.5.2: ["web-apps-compose_nextcloud-net", "web-apps-compose_vaultwarden-net"], 10.89.3.2: ["web-apps-compose_trilium-web-net"], 10.89.4.3: ["web-apps-compose_nextcloud-net"], 10.89.4.4: ["web-apps-compose_nextcloud-net"], 10.89.7.2: ["email-apps-compose_postfix-mta-sts-resolver-net"], 10.89.4.2: ["web-apps-compose_nextcloud-net", "web-apps-compose_vaultwarden-net"], 10.89.5.3: ["web-apps-compose_vaultwarden-net"], 10.89.4.5: ["web-apps-compose_nextcloud-net"]}
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: server backend kill switch: true
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: No backend lookup found, try resolving in current resolvers entry
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: [136] success reponse
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: Failed while parsing message: unexpected end of input reached
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: None received while parsing dns message, this is not expected server will ignore this message
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: Failed while parsing message: unexpected end of input reached
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: None received while parsing dns message, this is not expected server will ignore this message
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: Failed while parsing message: unexpected end of input reached
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: None received while parsing dns message, this is not expected server will ignore this message
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: Failed while parsing message: unexpected end of input reached
Aug 02 12:15:48 Botti-VPS aardvark-dns[626]: None received while parsing dns message, this is not expected server will ignore this message
Aug 02 12:15:50 Botti-VPS aardvark-dns[626]: Failed while parsing message: unexpected end of input reached
Aug 02 12:15:50 Botti-VPS aardvark-dns[626]: None received while parsing dns message, this is not expected server will ignore this message
Aug 02 12:15:50 Botti-VPS aardvark-dns[626]: Failed while parsing message: unexpected end of input reached
Aug 02 12:15:50 Botti-VPS aardvark-dns[626]: None received while parsing dns message, this is not expected server will ignore this message
Aug 02 12:15:50 Botti-VPS aardvark-dns[626]: Failed while parsing message: unexpected end of input reached
Aug 02 12:15:50 Botti-VPS aardvark-dns[626]: None received while parsing dns message, this is not expected server will ignore this message
Aug 02 12:15:50 Botti-VPS aardvark-dns[626]: Failed while parsing message: unexpected end of input reached
Aug 02 12:15:50 Botti-VPS aardvark-dns[626]: None received while parsing dns message, this is not expected server will ignore this message
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: parsed message "[19814] parsed message body: postgres-db.dns.podman. A IN edns: false"
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: server name: "web-apps-c"
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: request source address: 10.89.5.3:33697
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: requested record type: A
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: checking if backend has entry for: "postgres-db.dns.podman."
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: server backend.name_mappings: {"email-apps-compose_postfix-mta-sts-resolver-net": {"postfix-mta-sts-resolver": [10.89.7.2], "email-apps-compose_postfix-mta-sts-resolver_1": [10.89.7.2], "54c43a779c7f": [10.89.7.2]}, "web-apps-compose_nextcloud-net": {"web-apps-compose_nextcloud-webapp_1": [10.89.4.4], "web-apps-compose_postgres-db_1": [10.89.4.2], "6897e4e7219e": [10.89.4.2], "833c41410132": [10.89.4.4], "nextcloud-webapp": [10.89.4.4], "web-apps-compose_nextcloud-cron_1": [10.89.4.5], "nextcloud-cron": [10.89.4.5], "nextcloud-redis": [10.89.4.3], "b8607ed4dadb": [10.89.4.3], "ca52f8e2e029": [10.89.4.5], "web-apps-compose_nextcloud-redis_1": [10.89.4.3], "postgres-db": [10.89.4.2]}, "web-apps-compose_trilium-web-net": {"2c27d07a273f": [10.89.3.2], "web-apps-compose_trilium-webapp_1": [10.89.3.2], "trilium-webapp": [10.89.3.2]}, "web-apps-compose_vaultwarden-net": {"e518ae7c468d": [10.89.5.3], "web-apps-compose_postgres-db_1": [10.89.5.2], "6897e4e7219e": [10.89.5.2], "postgres-db": [10.89.5.2], "web-apps-compose_vaultwarden-webapp_1": [10.89.5.3], "vaultwarden-webapp": [10.89.5.3]}}
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: server backend.ip_mappings: {10.89.5.2: ["web-apps-compose_nextcloud-net", "web-apps-compose_vaultwarden-net"], 10.89.3.2: ["web-apps-compose_trilium-web-net"], 10.89.4.3: ["web-apps-compose_nextcloud-net"], 10.89.4.4: ["web-apps-compose_nextcloud-net"], 10.89.7.2: ["email-apps-compose_postfix-mta-sts-resolver-net"], 10.89.4.2: ["web-apps-compose_nextcloud-net", "web-apps-compose_vaultwarden-net"], 10.89.5.3: ["web-apps-compose_vaultwarden-net"], 10.89.4.5: ["web-apps-compose_nextcloud-net"]}
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: server backend kill switch: true
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: No backend lookup found, try resolving in current resolvers entry
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: [19814] success reponse
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: parsed message "[20132] parsed message body: postgres-db.dns.podman. AAAA IN edns: false"
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: server name: "web-apps-c"
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: request source address: 10.89.5.3:33697
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: requested record type: AAAA
Aug 02 12:15:53 Botti-VPS aardvark-dns[626]: checking if backend has entry for: "postgres-db.dns.podman."

aardvark-dns-FULL_LOG.txt (With RUST_LOG=trace; Note that I redacted some parts)

flouthoc commented 2 years ago

@DynamoFox with version 1.1.0 of aardvarkdns you also have to use netavark v1.1.0

DynamoFox commented 2 years ago

@DynamoFox with version 1.1.0 of aardvarkdns you also have to use netavark v1.1.0

Sure,

[2022-07-31T22:25:01+0200] [ALPM] upgraded aardvark-dns (1.0.3-1 -> 1.1.0-1)
[2022-07-31T22:25:01+0200] [ALPM] upgraded netavark (1.0.3-2 -> 1.1.0-1)

They were updated together and I fully rebooted the machine before experimenting the issue again.

Luap99 commented 2 years ago

Can you attach strace to the aardvark process and provide the logs when it happens.

DynamoFox commented 2 years ago

Can you attach strace to the aardvark process and provide the logs when it happens.

Okay, here they are. aardvark_debug_logs.zip

Inside there are two different reports:

I tested the report1 configuration only once, when I have more time I'll try it again and report back, so don't mind it too much for now. I suggest to look at report2 instead.

EDIT: It should be noted that when aardvark is strace-d resolution failures seem to be much higher (even tho I'm not sure about how often vaultwarden, which is the resolving container, does retry dns failures).

It should be noted (not in these tests) that I tried to manually install dig in one of the running containers affected by this problem, and no matter how much I tried resolutions from it, it never failed...

Luap99 commented 2 years ago

Can you add -s 300 to the strace command this will include longer messages so we can see unexpected end of input reached in the strace log

DynamoFox commented 2 years ago

Here's report3, same conditions as report2. strace cmd: strace -t -f -s 300 -o ./aardvark-dns_strace -E RUST_LOG=trace -p 645 report3.zip

Luap99 commented 2 years ago

I don't have time to debug this further but it looks like querys for <name>.dns.podman are working and only <name> is failing

ryester19 commented 1 year ago

I don't have time to debug this further but it looks like querys for <name>.dns.podman are working and only <name> is failing

<name>.dns.podman queries are also failing for me on occasion:

Dec 10 00:37:16 VPS1 nextcloud[754]: [10-Dec-2022 05:37:16] WARNING: [pool nextcloud] child 73 said into stderr: "NOTICE: PHP message: [owncloud][index][3] {"Exception":"Doctrine\\DBAL\\Exception","Message":"Failed to connect to the database: An exception occurred in the driver: SQLSTATE[08006] [7] could not translate host name \"postgres.dns.podman\" to address: Name does not resolve","Code":7,"Trace":[{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php","line":1519,"function":"connect","class":"OC\\DB\\Connection","type":"->"},{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Connection.php","line":1041,"function":"getWrappedConnection","class":"Doctrine\\DBAL\\Connection","type":"->"},{"file":"/var/www/nextcloud/lib/private/DB/Connection.php","line":261,"function":"executeQuery","class":"Doctrine\\DBAL\\Connection","type":"->"},{"file":"/var/www/nextcloud/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php","line":345,"function":"executeQuery","class":"OC\\DB\\Connection","type":"->"},{"file":"/var/www/nextcloud/lib/pri"

DynamoFox commented 1 year ago

It seems like the issue persists even with the latest update of aardvark to 1.5.0

baude commented 1 year ago

has anyone see it in 1.6 or 1.7?