containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Mailcow on podman does not work and produces a funny error message #11719

Closed - Mordecaine closed this issue 2 years ago

Mordecaine commented 2 years ago

/kind bug

Description: I tried to run Mailcow on podman with docker-compose. I get the following error:

[root@abydos mailcow-dockerized]# docker-compose up -d
Creating network "mailcowdockerized_mailcow-network" with driver "bridge"
Creating volume "mailcowdockerized_vmail-vol-1" with default driver
Creating volume "mailcowdockerized_vmail-index-vol-1" with default driver
Creating volume "mailcowdockerized_mysql-vol-1" with default driver
Creating volume "mailcowdockerized_mysql-socket-vol-1" with default driver
Creating volume "mailcowdockerized_redis-vol-1" with default driver
Creating volume "mailcowdockerized_rspamd-vol-1" with default driver
Creating volume "mailcowdockerized_solr-vol-1" with default driver
Creating volume "mailcowdockerized_postfix-vol-1" with default driver
Creating volume "mailcowdockerized_crypt-vol-1" with default driver
Creating volume "mailcowdockerized_sogo-web-vol-1" with default driver
Creating volume "mailcowdockerized_sogo-userdata-backup-vol-1" with default driver
Creating mailcowdockerized_clamd-mailcow_1     ... done
Creating mailcowdockerized_unbound-mailcow_1   ... done
Creating mailcowdockerized_dockerapi-mailcow_1 ... done
Creating mailcowdockerized_watchdog-mailcow_1  ... done
Creating mailcowdockerized_memcached-mailcow_1 ... done
Creating mailcowdockerized_redis-mailcow_1     ... done
Creating mailcowdockerized_sogo-mailcow_1      ... done
Creating mailcowdockerized_solr-mailcow_1      ... done
Creating mailcowdockerized_olefy-mailcow_1     ... done
Creating mailcowdockerized_mysql-mailcow_1     ... done
Creating mailcowdockerized_php-fpm-mailcow_1   ... done
Creating mailcowdockerized_dovecot-mailcow_1   ... error
Creating mailcowdockerized_postfix-mailcow_1   ...
Creating mailcowdockerized_nginx-mailcow_1     ...

ERROR: for mailcowdockerized_dovecot-mailcow_1  Cannot start service dovecot-mailcow: error configuring network namespace for container d8cf73369bfda68ee181fc1ecbdcd51036215f796aab39Creating mailcowdockerized_postfix-mailcow_1   ... error
r range 0: requested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254

ERROR: for mailcowdockerized_postfix-mailcow_1  Cannot start service postfix-mailcow: error configuring network namespace for container 5e27fbfd330a5f31eaabe680ed66db8d8fec33cc705520d74f9786dc721886fc: error adding pod mailcowdockerized_postfix-mailcow_1_mailcowdockerized_postfix-mailcow_1 to CNI network "mailcowdockerized_mailcow-network": failed to allocate foCreating mailcowdockerized_nginx-mailcow_1     ... done
Creating mailcowdockerized_acme-mailcow_1      ... done

ERROR: for dovecot-mailcow  Cannot start service dovecot-mailcow: error configuring network namespace for container d8cf73369bfda68ee181fc1ecbdcd51036215f796aab39c8a9fc5c5f9f33350d: error adding pod mailcowdockerized_dovecot-mailcow_1_mailcowdockerized_dovecot-mailcow_1 to CNI network "mailcowdockerized_mailcow-network": failed to allocate for range 0: requested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254

ERROR: for postfix-mailcow  Cannot start service postfix-mailcow: error configuring network namespace for container 5e27fbfd330a5f31eaabe680ed66db8d8fec33cc705520d74f9786dc721886fc: error adding pod mailcowdockerized_postfix-mailcow_1_mailcowdockerized_postfix-mailcow_1 to CNI network "mailcowdockerized_mailcow-network": failed to allocate for range 0: requested IP address 172.22.1.253 is not available in range set 172.22.1.1-172.22.1.254
ERROR: Encountered errors while bringing up the project.
[root@abydos mailcow-dockerized]#

Steps to reproduce the issue:

  1. Download Mailcow:

    sudo -i
    mkdir ~/sources/mailcow
    cd ~/sources/mailcow
    git clone https://github.com/mailcow/mailcow-dockerized
    cd mailcow-dockerized
  2. Install docker-compose and dependencies for podman:

    curl -L https://github.com/docker/compose/releases/download/$(curl -Ls https://www.servercow.de/docker-compose/latest.php)/docker-compose-$(uname -s)-$(uname -m) > /usr/local/sbin/docker-compose
    chmod +x /usr/local/sbin/docker-compose
    dnf install podman-docker -y
    systemctl enable podman.socket --now


  3. Fix generate_config.sh to work with podman:
```bash
cat <<EOL > ./podman.patch
--- generate_config.sh  2021-09-03 14:05:22.652448594 +0200
+++ generate_config_new.sh      2021-09-03 14:07:27.284209832 +0200
@@ -25,7 +25,7 @@
   exit 1
 fi

-for bin in openssl curl docker-compose docker git awk sha1sum; do
+for bin in openssl curl docker-compose podman git awk sha1sum; do
   if [[ -z \$(which \${bin}) ]]; then echo "Cannot find \${bin}, exiting..."; exit 1; fi
 done
EOL

patch generate_config.sh podman.patch
./generate_config.sh
```
  4. Disable the IPv6 network in docker-compose.yml, because podman cannot use it (see the sketch after this list):
    -        - subnet: ${IPV6_NETWORK:-fd4d:6169:6c63:6f77::/64}
    +        #- subnet: ${IPV6_NETWORK:-fd4d:6169:6c63:6f77::/64}
  5. Follow the install steps from step 5 of https://mailcow.github.io/mailcow-dockerized-docs/i_u_m_install/
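
For reference, the relevant network definition in docker-compose.yml then looks roughly like this (a sketch; the surrounding keys may differ between mailcow releases):

```yaml
networks:
  mailcow-network:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: ${IPV4_NETWORK:-172.22.1}.0/24
        #- subnet: ${IPV6_NETWORK:-fd4d:6169:6c63:6f77::/64}
```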

Describe the results you received: See Description

Describe the results you expected: All containers should start successfully

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

podman version 3.2.3

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.21.3
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - hugetlb
  - pids
  - rdma
  cgroupManager: systemd
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.29-1.module+el8.4.0+11822+6cc1e7d7.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: ae467a0c8001179d4d0adf4ada381108a893d7ec'
  cpus: 4
  distribution:
    distribution: '"rhel"'
    version: "8.4"
  eventLogger: file
  hostname: abydos.localdomain
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.18.0-305.19.1.el8_4.x86_64
  linkmode: dynamic
  memFree: 1387593728
  memTotal: 8145637376
  ociRuntime:
    name: runc
    package: runc-1.0.0-74.rc95.module+el8.4.0+11822+6cc1e7d7.x86_64
    path: /usr/bin/runc
    version: |-
      runc version spec: 1.0.2-dev
      go: go1.15.13
      libseccomp: 2.5.1
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 4257214464
  swapTotal: 4257214464
  uptime: 3h 8m 12.12s (Approximately 0.12 days)
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 15
    paused: 0
    running: 0
    stopped: 15
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 19
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.2.3
  Built: 1627370979
  BuiltTime: Tue Jul 27 09:29:39 2021
  GitCommit: ""
  GoVersion: go1.15.7
  OsArch: linux/amd64
  Version: 3.2.3

Package info (e.g. output of rpm -q podman or apt list podman):

podman-catatonit-3.2.3-0.10.module+el8.4.0+11989+6676f7ad.x86_64
podman-3.2.3-0.10.module+el8.4.0+11989+6676f7ad.x86_64
podman-docker-3.2.3-0.10.module+el8.4.0+11989+6676f7ad.noarch

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md) No

Additional environment details (AWS, VirtualBox, physical, etc.): VirtualBox

Luap99 commented 2 years ago

Maybe related to https://github.com/containers/podman/issues/11493

Luap99 commented 2 years ago

Can you start the podman service with --log-level debug and provide the full log after you run docker-compose up -d?
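
For example, a one-off invocation with debug logging (a sketch; run in a spare terminal instead of editing the unit file):

```bash
# Serve the rootful API socket in the foreground with debug logging;
# --time=0 (-t0) disables the inactivity timeout.
/usr/bin/podman --log-level=debug system service --time=0 unix:///run/podman/podman.sock
```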

Mordecaine commented 2 years ago

I changed the service file /usr/lib/systemd/system/podman.service:

[Unit]
Description=Podman API Service
Requires=podman.socket
After=podman.socket
Documentation=man:podman-system-service(1)
StartLimitIntervalSec=0

[Service]
Type=exec
KillMode=process
- Environment=LOGGING="--log-level=info"
+ #Environment=LOGGING="--log-level=info"
+ Environment=LOGGING="--log-level=debug"
ExecStart=/usr/bin/podman $LOGGING system service

After that I reloaded the Systemd daemon and restarted the podman service:

systemctl daemon-reload
systemctl restart podman.service

After that I deleted all containers, networks, volumes etc. and rebuilt them with docker-compose up -d

Gather logs:

journalctl --no-pager -u podman > podman.log

podman.log

Luap99 commented 2 years ago

Does it work after you run rm -rf /var/lib/cni/networks/mailcowdockerized_mailcow-network?
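
For context, the CNI host-local IPAM plugin keeps one reservation file per allocated address in that directory, so a stale entry there would produce exactly this "requested IP address ... is not available" error. Roughly (the file names are illustrative):

```bash
ls /var/lib/cni/networks/mailcowdockerized_mailcow-network
# 172.22.1.249  172.22.1.250  172.22.1.253  last_reserved_ip.0  lock
```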

Mordecaine commented 2 years ago

Nope. Same error. I did:

podman container stop --all
podman container prune
podman network prune
podman volume prune

rm -rf /var/lib/cni/networks/mailcowdockerized_mailcow-network
docker-compose up -d

Message:

[root@abydos mailcow-dockerized]# docker-compose up -d
Creating network "mailcowdockerized_mailcow-network" with driver "bridge"
Creating volume "mailcowdockerized_vmail-vol-1" with default driver
Creating volume "mailcowdockerized_vmail-index-vol-1" with default driver
Creating volume "mailcowdockerized_mysql-vol-1" with default driver
Creating volume "mailcowdockerized_mysql-socket-vol-1" with default driver
Creating volume "mailcowdockerized_redis-vol-1" with default driver
Creating volume "mailcowdockerized_rspamd-vol-1" with default driver
Creating volume "mailcowdockerized_solr-vol-1" with default driver
Creating volume "mailcowdockerized_postfix-vol-1" with default driver
Creating volume "mailcowdockerized_crypt-vol-1" with default driver
Creating volume "mailcowdockerized_sogo-web-vol-1" with default driver
Creating volume "mailcowdockerized_sogo-userdata-backup-vol-1" with default driver
Creating mailcowdockerized_memcached-mailcow_1 ... done
Creating mailcowdockerized_olefy-mailcow_1     ... done
Creating mailcowdockerized_dockerapi-mailcow_1 ... done
Creating mailcowdockerized_unbound-mailcow_1   ... done
Creating mailcowdockerized_watchdog-mailcow_1  ... done
Creating mailcowdockerized_clamd-mailcow_1     ... done
Creating mailcowdockerized_redis-mailcow_1     ... done
Creating mailcowdockerized_solr-mailcow_1      ... done
Creating mailcowdockerized_sogo-mailcow_1      ... done
Creating mailcowdockerized_mysql-mailcow_1     ... done
Creating mailcowdockerized_php-fpm-mailcow_1   ... done
Creating mailcowdockerized_postfix-mailcow_1   ...
Creating mailcowdockerized_dovecot-mailcow_1   ... error
Creating mailcowdockerized_nginx-mailcow_1     ...

Creating mailcowdockerized_postfix-mailcow_1   ... done
a675b1fdc178e34a3c: error adding pod mailcowdockerized_dovecot-mailcow_1_mailcowdockerized_dovecot-mailcow_1 to CNI network "mailcowdockerized_mailcow-network": failed to allocate foCreating mailcowdockerized_nginx-mailcow_1     ... done
Creating mailcowdockerized_acme-mailcow_1      ... done

ERROR: for dovecot-mailcow  Cannot start service dovecot-mailcow: error configuring network namespace for container 34435c198120aff7940ac1c6e5a58501965c3cbe06fa87a675b1fdc178e34a3c: error adding pod mailcowdockerized_dovecot-mailcow_1_mailcowdockerized_dovecot-mailcow_1 to CNI network "mailcowdockerized_mailcow-network": failed to allocate for range 0: requested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254
ERROR: Encountered errors while bringing up the project.

Luap99 commented 2 years ago

Any chance that you could try this with https://github.com/containers/podman/pull/11751?

Mordecaine commented 2 years ago

Thanks for that! But do you have short documentation on how to compile this fix into my podman?

Update: OK, I found this link: https://podman.io/getting-started/installation#building-from-scratch Is this the right documentation for getting the fix into podman?
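
For reference, the condensed build steps from that page look roughly like this (a sketch; it assumes the build dependencies listed there, such as golang, gpgme-devel and libseccomp-devel, are already installed):

```bash
git clone https://github.com/containers/podman.git
cd podman
# build the binaries with the usual Fedora/RHEL feature tags
make BUILDTAGS="selinux seccomp systemd"
sudo make install PREFIX=/usr
```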

Mordecaine commented 2 years ago

Now I compiled the main branch from https://github.com/containers/podman.git

[root@localhost podman]# podman --version
podman version 4.0.0-dev

But I got the same error:

Creating mailcowdockerized_unbound-mailcow_1   ... done
Creating mailcowdockerized_redis-mailcow_1     ... done
Creating mailcowdockerized_sogo-mailcow_1      ... done
Creating mailcowdockerized_clamd-mailcow_1     ... done
Creating mailcowdockerized_memcached-mailcow_1 ... done
Creating mailcowdockerized_watchdog-mailcow_1  ... done
Creating mailcowdockerized_olefy-mailcow_1     ... done
Creating mailcowdockerized_dockerapi-mailcow_1 ... done
Creating mailcowdockerized_solr-mailcow_1      ... done
Creating mailcowdockerized_mysql-mailcow_1     ... done
Creating mailcowdockerized_php-fpm-mailcow_1   ... done
Creating mailcowdockerized_nginx-mailcow_1     ... error
Creating mailcowdockerized_postfix-mailcow_1   ...
Creating mailcowdockerized_dovecot-mailcow_1   ...

Creating mailcowdockerized_dovecot-mailcow_1   ... error

ERROR: for mailcowdockerized_dovecot-mailcow_1  Cannot start service dovecot-mailcow: error configuring network namespace for container 7c645ef1024fab8ea4706c66d7374dd9b769c8cb16fe57Creating mailcowdockerized_postfix-mailcow_1   ... done
r range 0: requested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254

ERROR: for nginx-mailcow  Cannot create container for service nginx-mailcow: container create: invalid IP address : in port mapping

ERROR: for dovecot-mailcow  Cannot start service dovecot-mailcow: error configuring network namespace for container 7c645ef1024fab8ea4706c66d7374dd9b769c8cb16fe57d09f135ce669b9dfab: error adding pod mailcowdockerized_dovecot-mailcow_1_mailcowdockerized_dovecot-mailcow_1 to CNI network "mailcowdockerized_mailcow-network": failed to allocate for range 0: requested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254
ERROR: Encountered errors while bringing up the project.

Luap99 commented 2 years ago

Can you run git pull --rebase https://github.com/Luap99/libpod net-alias to get my branch and then recompile? Also, before you test, make sure to run rm -rf /var/lib/cni/networks/mailcowdockerized_mailcow-network

Mordecaine commented 2 years ago

Sorry Luap, I did something wrong.

I did your rebase, but I had already installed podman-docker, which requires an older podman version, so dnf downgraded my podman 4.0.0-dev build back to the old packaged version.

So maybe you can help me with that. I did the following steps:

# Install Dependencies
dnf install -y go git
dnf groupinstall -y "Development Tools"
subscription-manager repos --enable=codeready-builder-for-rhel-8-x86_64-rpms

# Compile Podman
cd ~
git clone https://github.com/containers/podman.git
cd podman
git pull --rebase https://github.com/Luap99/libpod net-alias
make package-install

systemctl enable podman.socket --now
curl -L https://github.com/docker/compose/releases/download/$(curl -Ls https://www.servercow.de/docker-compose/latest.php)/docker-compose-$(uname -s)-$(uname -m) > /usr/local/sbin/docker-compose
chmod +x /usr/local/sbin/docker-compose

Normally I would now run dnf install -y podman-docker, but that command triggers the downgrade to the old podman version. If I try to use docker-compose, I get an error because the socket does not work properly.

[root@localhost podman]# export DOCKER_HOST=unix:///run/user/$UID/podman/podman.sock
[root@localhost podman]# curl -H "Content-Type: application/json" --unix-socket /var/run/docker.sock http://localhost/_ping
curl: (7) Couldn't connect to server
[root@localhost podman]#

Maybe you have a hint for me on how to build the podman-docker package against my new podman version.

Luap99 commented 2 years ago

You can just run bin/podman system service -t0 unix:/var/run/docker.sock to start the podman service at the docker socket location; you do not have to install it.
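
In other words, something like this (a sketch; it assumes the build left the binary at ./bin/podman in the source checkout):

```bash
# Serve the Docker-compatible API from the freshly built binary; no install needed.
# -t0 disables the inactivity timeout so the service keeps running.
./bin/podman system service -t0 unix:/var/run/docker.sock &

# docker-compose talks to /var/run/docker.sock by default, so no DOCKER_HOST is needed;
# verify the socket answers before bringing the stack up:
curl -H "Content-Type: application/json" --unix-socket /var/run/docker.sock http://localhost/_ping
```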

Mordecaine commented 2 years ago

Hello Luap,

Sorry for the delay, but I have had a lot to do lately.

I was now able to use your hint with the socket.

The socket works:

[root@localhost mailcow-dockerized]# curl -H "Content-Type: application/json" --unix-socket /var/run/docker.sock http://localhost/_ping
OK[root@localhost mailcow-dockerized]#

If I try the docker-compose up -d command, I get the same error message:

Creating mailcowdockerized_unbound-mailcow_1   ... done
Creating mailcowdockerized_sogo-mailcow_1    ...
Creating mailcowdockerized_redis-mailcow_1     ... error
Creating mailcowdockerized_solr-mailcow_1      ... done
Creating mailcowdockerized_sogo-mailcow_1      ... done
Creating mailcowdockerized_dockerapi-mailcow_1 ... done
Creating mailcowdockerized_memcached-mailcow_1 ... done
Creating mailcowdockerized_watchdog-mailcow_1  ... done
Creating mailcowdockerized_clamd-mailcow_1     ... done
Creating mailcowdockerized_mysql-mailcow_1     ...

ERROR: for mailcowdockerized_redis-mailcow_1  Cannot start service redis-mailcow: plugin type="bridge" failed (add): cni plugin bridge failed: failed to allocate for range 0: requestCreating mailcowdockerized_mysql-mailcow_1     ... done
Creating mailcowdockerized_postfix-mailcow_1   ...
Creating mailcowdockerized_dovecot-mailcow_1   ... error

Creating mailcowdockerized_postfix-mailcow_1   ... done
uested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254

ERROR: for redis-mailcow  Cannot start service redis-mailcow: plugin type="bridge" failed (add): cni plugin bridge failed: failed to allocate for range 0: requested IP address 172.22.1.249 is not available in range set 172.22.1.1-172.22.1.254

ERROR: for dovecot-mailcow  Cannot start service dovecot-mailcow: plugin type="bridge" failed (add): cni plugin bridge failed: failed to allocate for range 0: requested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254
ERROR: Encountered errors while bringing up the project.
[root@localhost mailcow-dockerized]#

However, the wording of the error message is different from before.

Mordecaine commented 2 years ago

@Luap99 Is it okay if I carefully ask whether you have an update on this error?

Cheers and a nice weekend, Mordecaine

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

Luap99 commented 2 years ago

@Mordecaine Sorry, I do not have time to debug this issue further. We are currently working on a new network backend called netavark, which hopefully also fixes this issue.

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

unixfox commented 2 years ago

Still interesting

rhatdan commented 2 years ago

This might be fixed with the new network redesign.

baude commented 2 years ago

should be fixed in 4.0 with netavark, re-open if not.

Zanathoz commented 2 years ago

@baude - Unfortunately, netavark and Podman 4 result in a similar issue. Here's the docker-compose output:

[]$ sudo docker-compose up -d
Creating network "mailcowdockerized_mailcow-network" with driver "bridge"
Creating volume "mailcowdockerized_vmail-vol-1" with default driver
Creating volume "mailcowdockerized_vmail-index-vol-1" with default driver
Creating volume "mailcowdockerized_mysql-vol-1" with default driver
Creating volume "mailcowdockerized_mysql-socket-vol-1" with default driver
Creating volume "mailcowdockerized_redis-vol-1" with default driver
Creating volume "mailcowdockerized_rspamd-vol-1" with default driver
Creating volume "mailcowdockerized_solr-vol-1" with default driver
Creating volume "mailcowdockerized_postfix-vol-1" with default driver
Creating volume "mailcowdockerized_crypt-vol-1" with default driver
Creating volume "mailcowdockerized_sogo-web-vol-1" with default driver
Creating volume "mailcowdockerized_sogo-userdata-backup-vol-1" with default driver
Creating mailcowdockerized_dockerapi-mailcow_1 ... error
Creating mailcowdockerized_unbound-mailcow_1   ... done
Creating mailcowdockerized_watchdog-mailcow_1  ... done
Creating mailcowdockerized_solr-mailcow_1      ... done
Creating mailcowdockerized_clamd-mailcow_1     ... done
Creating mailcowdockerized_sogo-mailcow_1      ... done
Creating mailcowdockerized_memcached-mailcow_1 ... done
Creating mailcowdockerized_redis-mailcow_1     ... done
Creating mailcowdockerized_olefy-mailcow_1     ... done
Creating mailcowdockerized_mysql-mailcow_1     ...
Creating mailcowdockerized_php-fpm-mailcow_1   ...
Creating mailcowdockerized_mysql-mailcow_1     ... done
Creating mailcowdockerized_php-fpm-mailcow_1   ... done
Creating mailcowdockerized_dovecot-mailcow_1   ...
Creating mailcowdockerized_postfix-mailcow_1   ...
Creating mailcowdockerized_dovecot-mailcow_1   ... error
Creating mailcowdockerized_postfix-mailcow_1   ... done
ERROR: for mailcowdockerized_nginx-mailcow_1  Cannot create container for service nginx-mailcow: container create: invalid IP address ":" in port mapping

ERROR: for mailcowdockerized_dovecot-mailcow_1  Cannot start service dovecot-mailcow: plugin type="bridge" failed (add): cni plugin bridge failed: failed to allocate for range 0: requested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254

ERROR: for dockerapi-mailcow  Cannot start service dockerapi-mailcow: crun: cannot disable OOM killer with cgroupv2: OCI runtime error

ERROR: for nginx-mailcow  Cannot create container for service nginx-mailcow: container create: invalid IP address ":" in port mapping

ERROR: for dovecot-mailcow  Cannot start service dovecot-mailcow: plugin type="bridge" failed (add): cni plugin bridge failed: failed to allocate for range 0: requested IP address 172.22.1.250 is not available in range set 172.22.1.1-172.22.1.254
ERROR: Encountered errors while bringing up the project.

This is on a fresh server, running commands as a non-root administrative account.

$ podman info
host:
  arch: amd64
  buildahVersion: 1.24.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpus: 4
  distribution:
    distribution: fedora
    variant: server
    version: "35"
  eventLogger: journald
  hostname: Test
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.16.12-200.fc35.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 143175680
  memTotal: 8324710400
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.4.2-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.2
      commit: f6fbc8f840df1a414f31a60953ae514fa497c748
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 8322281472
  swapTotal: 8324640768
  uptime: 19h 26m 59.76s (Approximately 0.79 days)
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/extservice/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/extservice/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  volumePath: /home/extservice/.local/share/containers/storage/volumes
version:
  APIVersion: 4.0.2
  Built: 1646943965
  BuiltTime: Thu Mar 10 15:26:05 2022
  GitCommit: ""
  GoVersion: go1.16.14
  OsArch: linux/amd64
  Version: 4.0.2

$ rpm -q podman
podman-4.0.2-5.fc35.x86_64

Luap99 commented 2 years ago

Do you run docker-compose against the root socket? Your (rootless) podman info says networkBackend: netavark, which is good, but the error is clearly from CNI. I would guess that rootful mode is still using CNI.
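
A quick way to check which backend each mode actually picked up (a sketch; the template field assumes podman 4.x):

```bash
podman info --format '{{.Host.NetworkBackend}}'        # rootless
sudo podman info --format '{{.Host.NetworkBackend}}'   # rootful
# expected: "netavark" once the new backend is active, "cni" otherwise
```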

Zanathoz commented 2 years ago

Understood. I have not rebooted this server after installing Podman4 but I did reboot my production server after the upgrade and ran into the same error I posted above. I just checked and my production server is still using CNI as well.

I followed this guide to upgrade from Podman3.4 to 4.0.2 - https://podman.io/blogs/2022/02/04/network-usage.html

How can I force root to use the new netavark instead of CNI?

Edit: I also want to mention I stopped all my systemd services, removed all containers and local images, and then ran the podman system reset command before the upgrade on my production server. I then performed another dnf update and rebooted. Running Fedora 35 Server.

Luap99 commented 2 years ago

A reboot should not be required. If you run podman system reset it should delete everything, and the next podman command should pick netavark as the default. Can you check whether you have any CNI networks in /etc/cni/net.d after podman system reset? It should only show the default one (87-podman-bridge.conflist). If there are more, you should delete them (that would be a bug, because podman system reset should already remove them automatically).

Also did you run podman system reset as root?

Zanathoz commented 2 years ago

Thanks, I'm going to work through that now. On my production system I was logged in as root when I ran the reset. This was after all systemd services had been disabled, all containers manually torn down, and all image stores removed.

I was thinking another reset was in order as well. I ran it again on my test system and it appears netavark is the default when running "sudo podman info". I'm going to go through my production system again and clear out the /etc/cni/net.d folder. There are some other files in there besides the default you mention. I'll report back once that's complete!

Zanathoz commented 2 years ago

It appears my root is still using the CNI backend:

# ls /etc/cni/net.d
87-podman.conflist  cni.lock

# rm /etc/cni/net.d/cni.lock
rm: remove regular empty file '/etc/cni/net.d/cni.lock'? y

# ls /etc/cni/net.d
87-podman.conflist

# podman system reset --force
# podman info
host:
  arch: amd64
  buildahVersion: 1.24.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpus: 8
  distribution:
    distribution: fedora
    variant: server
    version: "35"
  eventLogger: journald
  hostname: Production
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.16.12-200.fc35.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 8891695104
  memTotal: 12538998784
  networkBackend: cni
  ociRuntime:
    name: crun
    package: crun-1.4.2-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.2
      commit: f6fbc8f840df1a414f31a60953ae514fa497c748
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 8585736192
  swapTotal: 8589930496
  uptime: 2h 47m 33.57s (Approximately 0.08 days)
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
  - lscr.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.0.2
  Built: 1646943965
  BuiltTime: Thu Mar 10 15:26:05 2022
  GitCommit: ""
  GoVersion: go1.16.14
  OsArch: linux/amd64
  Version: 4.0.2

That lock file reappears after a reset; I assume it is required. We can move this to a new issue if needed, as this is no longer specific to the original title.

Luap99 commented 2 years ago

Yeah, the lockfile will be recreated every time; I was mostly worried about other .conflist files.

The only other reason I can think of for it not choosing netavark is that it is not installed, but that cannot be the case since it worked in rootless mode, so it is definitely installed.

To force netavark you can set it in containers.conf; add the following to /etc/containers/containers.conf:

[network]
network_backend = "netavark"

Zanathoz commented 2 years ago

Well that's strange, that file does not exist on the system. I did read through the podman docs and saw that reference, but wasn't sure whether it might have been installed elsewhere.

Creating it with just that entry gives this error:

[root@]# podman info
Error: could not find "netavark" in one of [/usr/local/libexec/podman /usr/local/lib/podman /usr/libexec/podman /usr/lib/podman].  To resolve this error, set the helper_binaries_dir key in the `[engine]` section of containers.conf to the directory containing your helper binaries.

I'm going to find a standard config file, build one from scratch, and see if that helps anything. It appears my test system also does not have this file present.

Edit: I found a standard config but it's missing the "helper_binaries" section. Where is netavark installed, so I can put that in as a helper_binaries_dir value?

Edit2: I'm deploying a new test server, Fedora 35 with root account enabled to test this with. I will be installing Podman4 out of the gate and can try to reproduce what we've done so far.

Luap99 commented 2 years ago

I guess netavark is not installed; run rpm -ql netavark. It should be installed as /usr/libexec/podman/netavark on Fedora. There is no need to change helper_binaries_dir in the config since that path is in the default list.

Zanathoz commented 2 years ago

Hey, that would help!

# rpm -ql netavark
package netavark is not installed
# dnf install netavark
Last metadata expiration check: 0:39:06 ago on Sun 13 Mar 2022 01:40:19 PM EDT.
Dependencies resolved.
=============================================================================================================================================================================================================================================
 Package                                         Architecture                              Version                                            Repository                                                                                Size
=============================================================================================================================================================================================================================================
Installing:
 netavark                                        x86_64                                    1.0.1-1.fc35                                       copr:copr.fedorainfracloud.org:rhcontainerbot:podman4                                    1.9 M
Installing weak dependencies:
 aardvark-dns                                    x86_64                                    1.0.1-2.fc35                                       copr:copr.fedorainfracloud.org:rhcontainerbot:podman4                                    1.0 M

Transaction Summary
=============================================================================================================================================================================================================================================
Install  2 Packages

Total download size: 2.9 M
Installed size: 12 M
Is this ok [y/N]:

Should this not be installed as part of the Podman4 upgrade?

Edit: After installing and performing another "podman system reset", podman shows the proper backend as netavark!

Luap99 commented 2 years ago

If you already have podman installed it will keep cni to avoid breaking too many people. I think there is some form of conditional in the package, so that you do not need to have both cni and netavark installed. We should update the blog post to make this clearer.

Zanathoz commented 2 years ago

Okay, thanks for the help getting that sorted out; I've made some progress. I turned up a brand-new Fedora 35 server and installed Podman 4; it pulled down netavark on the initial upgrade and all is well there.

Pulled down mailcow-dockerized from their GitHub to the local system and got it running, but I had to make a few tweaks to their compose file. Specifically, I modified the following lines:

Comment out line 513 on the dockerapi-mailcow container:

#      oom_kill_disable: true

This caused the following error after docker-compose if left enabled:

ERROR: for dockerapi-mailcow  Cannot start service dockerapi-mailcow: crun: cannot disable OOM killer with cgroupv2: OCI runtime error

Directly map ports on the nginx-mailcow container. Original variables:

      ports:
        - "${HTTPS_BIND:-:}:${HTTPS_PORT:-443}:${HTTPS_PORT:-443}"
        - "${HTTP_BIND:-:}:${HTTP_PORT:-80}:${HTTP_PORT:-80}"

Changed variables:

      ports:
        - "443:443"
        - "80:80"

The original variables would error out with the following:

ERROR: for nginx-mailcow  Cannot create container for service nginx-mailcow: container create: invalid IP address ":" in port mapping

At this time, all of the containers are now running with the exception of netfilter-mailcow and ipv6nat-mailcow. Those were created but are crashing after one second. I'm new to MailCow though and am not certain if these are required for services to actually run.

This is probably beyond a Podman issue now, as the containers can start. I can probably move on to the mailcow GitHub and open an issue with them, if you feel the above is working as intended? Perhaps the OOM issue needs to be looked into?

Edit: I forgot to mention this was all completed rootful, as the root account. I'm honestly still learning the ins and outs of rootless and wanted to eliminate that as a possible blocking point in my test.

unixfox commented 2 years ago

On Docker, ipv6nat-mailcow is not needed when Docker is running natively with IPv6 or if you don't care about IPv6: https://mailcow.github.io/mailcow-dockerized-docs/post_installation/firststeps-disable_ipv6/

netfilter-mailcow is quite important for a public mail server; it's like a fail2ban service. It prevents brute-force attempts and similar attacks.

Zanathoz commented 2 years ago

Thanks. I figured the IPv6 wasn't needed as I do have it disabled on the host and did find that document prior to deployment, but wanted to just get the stack working before any major changes.

NetFilter definitely sounds important, and it is no longer crashing after one second now that IPv6 is disabled in its entirety (including the ipv6nat container). NetFilter is still crashing after about 20 seconds or so, though, and so is NGINX after changing the port mapping to a direct 80:80 and 443:443 mapping. Seems there are still just a couple of tweaks to make to get it up properly. Going to keep plugging away as this is pretty close to working!

Luap99 commented 2 years ago

Technically speaking, all issues with the compose file against podman are podman bugs. We are trying to match the Docker API; there are a few exceptions, the biggest being that we do not support docker swarm. So as long as it works with docker and does not use swarm, it should work with podman.

If you could create a small reproducer for both problems and open separate issues for them, that would help get them fixed.
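
For what it's worth, a minimal reproducer might look roughly like this (a hypothetical compose file, not taken from mailcow; the service names and image are placeholders):

```yaml
version: "2.1"
services:
  oom-test:
    image: docker.io/library/alpine:latest
    command: sleep 600
    oom_kill_disable: true          # triggers "cannot disable OOM killer with cgroupv2" on crun/cgroup v2
  bind-test:
    image: docker.io/library/alpine:latest
    command: sleep 600
    ports:
      # with HTTP_BIND unset or empty this expands to "::8080:80",
      # which podman 4.0 rejected as an invalid IP in the port mapping
      - "${HTTP_BIND:-:}:8080:80"
```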

Zanathoz commented 2 years ago

So the Nginx error is resolved, somewhat. There are HTTP_BIND/HTTPS_BIND variables in mailcow.conf that those port mappings point to. The conf file says to leave them empty, but in their documentation they are set to 127.0.0.1, pointing back to the host. I entered that into the conf and the error is resolved. NGINX is still crashing for some reason, but at least there are no errors from it when running docker-compose now. I just need to find where to get the logs for that container to see what it's barking about.
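
For reference, the relevant mailcow.conf lines after that change look roughly like this (a sketch):

```bash
# mailcow.conf (sketch): bind nginx explicitly instead of leaving the values empty
HTTP_PORT=80
HTTP_BIND=127.0.0.1
HTTPS_PORT=443
HTTPS_BIND=127.0.0.1
```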

OOM is still an issue and will need a separate issue created. Is isolating it to its own docker-compose file sufficient when opening a new issue, or should I reproduce it in another way? I'm not super versed in Linux troubleshooting. I've managed to get about 20 different services/containers up in Podman on my own, but this is the first actual wall I've hit trying to turn up a new one.

Zanathoz commented 2 years ago

After some further testing, there's a communication issue between some of the containers. NGINX won't load because it can't connect to PHP-FPM. PHP-FPM won't load because it can't resolve the REDIS container.

The docker-compose file has them all using an internal unbound container for resolution. I connected to the local unbound and confirmed it was able to resolve outside IP addresses. When looking up the internal REDIS container, it resolves correctly:

Server:         172.22.1.1
Address:        172.22.1.1:53

Non-authoritative answer:
Name:   redis.dns.podman
Address: 172.22.1.249
Name:   redis.dns.podman
Address: fd4d:6169:6c63:6f77::8d

I tried adding a static entry for redis under the PHP-FPM container, but this did not resolve the connection issue for that container. It can ping the name from the hosts file, but nslookup returns NXDOMAIN:

/var/www/html # nslookup redis
Server:         172.22.1.254
Address:        172.22.1.254:53

** server can't find redis: NXDOMAIN
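
(The static entry was presumably added along these lines, sketched here with compose's extra_hosts and the address unbound returned above:)

```yaml
  php-fpm-mailcow:
    # ...existing service definition...
    extra_hosts:
      - "redis:172.22.1.249"   # hypothetical static entry pointing at the redis container's IP
```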

I think I'm going to give up on this for now and just deploy on Ubuntu with Docker. If anyone else makes any breakthroughs, I'd be happy to collaborate and test configurations out.

Edit: I wanted to add that REDIS was able to resolve the PHP-FPM container through unbound just fine, and unbound was able to resolve both REDIS and PHP-FPM as expected.

Edit2: I also disabled SELinux during testing. I do not think this was the culprit, given the way Podman handles it, but wanted to mention it as well.

Zanathoz commented 2 years ago

Alright, I'm stubborn and my whole ecosystem is Fedora/Podman and I'd like to try and keep it that way. I just spun up Ubuntu with Docker and it worked flawlessly following these steps:

git clone https://github.com/mailcow/mailcow-dockerized /var/mailcow

/var/mailcow/generate_config.sh

<Enter Mail FQDN>
<Enter TimeZone>

/var/mailcow/docker-compose up -d

I spun up a brand new Fedora 35 instance again, leaving IPv6 enabled, updated to Podman 4, followed the same steps as above (no patch, as the previous OP stated) and ran into the same two issues as previously. I had to disable OOM in the docker-compose file, and also point the HTTP_BIND/HTTPS_BIND variables to 127.0.0.1 for docker-compose to work at all. NGINX, IP6NAT and NETFILTER are still crashing as before.

That's it. I'm done for now. Definition of insanity... or is it science? Either way, I'm passing the ball to someone else, as I'm a bit in over my head on where to go from here.