alvistack / ansible-role-podman

Ansible Role for Podman Installation
Apache License 2.0

Install failure on RHEL 8.8 #26

Open joshmcorreia opened 1 month ago

joshmcorreia commented 1 month ago

I'm facing the following error on RHEL 8.8:

$ ansible-playbook install_podman_playbook.yml  
...
TASK [alvistack.podman : yum install] ***
ok: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-dnsname'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-plugins'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-podman-machine'})
FAILED - RETRYING: [obsidian]: yum install (3 retries left).
FAILED - RETRYING: [obsidian]: yum install (2 retries left).
FAILED - RETRYING: [obsidian]: yum install (1 retries left).
failed: [obsidian] (item={'state': 'latest', 'name': 'podman'}) => {"ansible_loop_var": "item", "attempts": 3, "changed": false, "failures": [], "item": {"name": "podman", "state": "latest"}, "msg": "Depsolve Error occurred: \n Problem: problem with installed package podman-catatonit-3:4.4.1-14.module+el8.8.0+19108+ffbdcd02.x86_64\n  - package podman-catatonit-3:4.4.1-14.module+el8.8.0+19108+ffbdcd02.x86_64 requires podman = 3:4.4.1-14.module+el8.8.0+19108+ffbdcd02, but none of the providers can be installed\n  - package podman-catatonit-3:4.4.1-15.module+el8.8.0+19698+00f3cb66.x86_64 requires podman = 3:4.4.1-15.module+el8.8.0+19698+00f3cb66, but none of the providers can be installed\n  - package podman-catatonit-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64 requires podman = 3:4.4.1-16.module+el8.8.0+19993+47c8ef84, but none of the providers can be installed\n  - cannot install both podman-100:5.2.2-1.8.x86_64 and podman-3:4.4.1-14.module+el8.8.0+19108+ffbdcd02.x86_64\n  - cannot install both podman-3:4.4.1-14.module+el8.8.0+19108+ffbdcd02.x86_64 and podman-100:5.2.2-1.8.x86_64\n  - cannot install both podman-3:4.4.1-15.module+el8.8.0+19698+00f3cb66.x86_64 and podman-100:5.2.2-1.8.x86_64\n  - cannot install both podman-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64 and podman-100:5.2.2-1.8.x86_64\n  - cannot install the best update candidate for package podman-3:4.4.1-14.module+el8.8.0+19108+ffbdcd02.x86_64", "rc": 1, "results": []}

Any idea on how I can resolve this?

hswong3i commented 1 month ago

Somehow my greedy OBS repo may not substitute the original upstream package names 1-to-1, so you could try:

yum remove -y podman-catatonit podman

Then re-run my role for testing?
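
If you want to automate that cleanup instead of running it by hand, a pre_task along these lines should work (a rough sketch only, not part of the role; it uses ansible.builtin.dnf on RHEL 8, with the package names taken from the error above):

- name: Remove distro podman packages that conflict with the OBS builds
  ansible.builtin.dnf:
    name:
      - podman-catatonit
      - podman
    state: absent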

hswong3i commented 1 month ago

I have double-confirmed with molecule and it is working fine:

sudo -E molecule test -s rhel-8-libvirt

joshmcorreia commented 1 month ago

@hswong3i I tried removing those packages first and still have the same issue:

[root@obsidian ~]# yum remove -y podman-catatonit podman
Updating Subscription Management repositories.
Dependencies resolved.
========================================================================================================================
 Package                        Arch   Version                                  Repository                         Size
========================================================================================================================
Removing:
 podman                         x86_64 3:4.4.1-16.module+el8.8.0+19993+47c8ef84 @rhel-8-for-x86_64-appstream-rpms  49 M
 podman-catatonit               x86_64 3:4.4.1-16.module+el8.8.0+19993+47c8ef84 @rhel-8-for-x86_64-appstream-rpms 761 k
Removing dependent packages:
 cockpit-podman                 noarch 63.1-1.module+el8.8.0+19993+47c8ef84     @rhel-8-for-x86_64-appstream-rpms 620 k
 containernetworking-podman-machine
                                x86_64 100:0.2.0-17.10                          @home_alvistack                   6.1 M
 toolbox                        x86_64 0.0.99.3-7.module+el8.8.0+19993+47c8ef84 @rhel-8-for-x86_64-appstream-rpms 6.9 M
Removing unused dependencies:
 conmon                         x86_64 3:2.1.6-1.module+el8.8.0+19993+47c8ef84  @rhel-8-for-x86_64-appstream-rpms 172 k
 podman-gvproxy                 x86_64 100:0.7.5-1.8                            @home_alvistack                    12 M
 shadow-utils-subid             x86_64 2:4.6-17.el8                             @rhel-8-for-x86_64-baseos-rpms    205 k

Transaction Summary
========================================================================================================================
Remove  8 Packages

Freed space: 76 M
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                1/1
  Running scriptlet: cockpit-podman-63.1-1.module+el8.8.0+19993+47c8ef84.noarch                                     1/1
  Erasing          : cockpit-podman-63.1-1.module+el8.8.0+19993+47c8ef84.noarch                                     1/8
  Erasing          : containernetworking-podman-machine-100:0.2.0-17.10.x86_64                                      2/8
  Erasing          : podman-gvproxy-100:0.7.5-1.8.x86_64                                                            3/8
  Erasing          : toolbox-0.0.99.3-7.module+el8.8.0+19993+47c8ef84.x86_64                                        4/8
  Running scriptlet: podman-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64                                         5/8
  Erasing          : podman-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64                                         5/8
  Running scriptlet: podman-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64                                         5/8
  Erasing          : podman-catatonit-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64                               6/8
  Running scriptlet: podman-catatonit-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64                               6/8
  Erasing          : conmon-3:2.1.6-1.module+el8.8.0+19993+47c8ef84.x86_64                                          7/8
  Erasing          : shadow-utils-subid-2:4.6-17.el8.x86_64                                                         8/8
  Running scriptlet: shadow-utils-subid-2:4.6-17.el8.x86_64                                                         8/8
  Verifying        : cockpit-podman-63.1-1.module+el8.8.0+19993+47c8ef84.noarch                                     1/8
  Verifying        : conmon-3:2.1.6-1.module+el8.8.0+19993+47c8ef84.x86_64                                          2/8
  Verifying        : containernetworking-podman-machine-100:0.2.0-17.10.x86_64                                      3/8
  Verifying        : podman-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64                                         4/8
  Verifying        : podman-catatonit-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64                               5/8
  Verifying        : podman-gvproxy-100:0.7.5-1.8.x86_64                                                            6/8
  Verifying        : shadow-utils-subid-2:4.6-17.el8.x86_64                                                         7/8
  Verifying        : toolbox-0.0.99.3-7.module+el8.8.0+19993+47c8ef84.x86_64                                        8/8
Installed products updated.

Removed:
  cockpit-podman-63.1-1.module+el8.8.0+19993+47c8ef84.noarch
  conmon-3:2.1.6-1.module+el8.8.0+19993+47c8ef84.x86_64
  containernetworking-podman-machine-100:0.2.0-17.10.x86_64
  podman-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64
  podman-catatonit-3:4.4.1-16.module+el8.8.0+19993+47c8ef84.x86_64
  podman-gvproxy-100:0.7.5-1.8.x86_64
  shadow-utils-subid-2:4.6-17.el8.x86_64
  toolbox-0.0.99.3-7.module+el8.8.0+19993+47c8ef84.x86_64

Complete!

Here's my ansible playbook:

- hosts:
    - podman_hosts
  roles:
    - alvistack.podman

And here's what I get when I run the playbook:

$ ansible-playbook install_podman_playbook.yml

PLAY [podman_hosts] ****************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [obsidian]

TASK [alvistack.podman : include default variables] ********************************************************************
ok: [obsidian]

TASK [alvistack.podman : include release specific variables] ***********************************************************
ok: [obsidian] => (item=/home/josh/.ansible/roles/alvistack.podman/files/../vars/redhat-8.yml)

TASK [alvistack.podman : include release specific tasks] ***************************************************************
included: /home/josh/.ansible/roles/alvistack.podman/tasks/redhat.yml for obsidian => (item=/home/josh/.ansible/roles/alvistack.podman/tasks/./redhat.yml)

TASK [alvistack.podman : rpm --import] *********************************************************************************
ok: [obsidian] => (item={'key': 'http://downloadcontent.opensuse.org/repositories/home:/alvistack/AlmaLinux_8/repodata/repomd.xml.key', 'fingerprint': '789CFFDE0295B8A1F4E5690C4BECC97550D0B1FD', 'state': 'present'})

TASK [alvistack.podman : yum-config-manager --add-repo] ****************************************************************
ok: [obsidian] => (item={'file': 'home:alvistack', 'name': 'home_alvistack', 'description': 'home:alvistack (AlmaLinux_8)', 'baseurl': 'http://downloadcontent.opensuse.org/repositories/home:/alvistack/AlmaLinux_8', 'enabled': True, 'priority': '2', 'module_hotfixes': True, 'gpgcheck': True, 'gpgkey': 'http://downloadcontent.opensuse.org/repositories/home:/alvistack/AlmaLinux_8/repodata/repomd.xml.key', 'state': 'present'})

TASK [alvistack.podman : yum install] **********************************************************************************
ok: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-dnsname'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-plugins'})
changed: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-podman-machine'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'podman'})
FAILED - RETRYING: [obsidian]: yum install (3 retries left).
FAILED - RETRYING: [obsidian]: yum install (2 retries left).
FAILED - RETRYING: [obsidian]: yum install (1 retries left).
failed: [obsidian] (item={'state': 'latest', 'name': 'podman-aardvark-dns'}) => {"ansible_loop_var": "item", "attempts": 3, "changed": false, "failures": [], "item": {"name": "podman-aardvark-dns", "state": "latest"}, "msg": "Unknown Error occurred: Transaction test error:\n  file /usr/libexec/podman/aardvark-dns from install of podman-aardvark-dns-100:1.12.2-1.2.x86_64 conflicts with file from package aardvark-dns-2:1.5.0-2.module+el8.8.0+19993+47c8ef84.x86_64\n", "rc": 1, "results": []}
changed: [obsidian] => (item={'state': 'latest', 'name': 'podman-docker'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'podman-gvproxy'})
FAILED - RETRYING: [obsidian]: yum install (3 retries left).
FAILED - RETRYING: [obsidian]: yum install (2 retries left).
FAILED - RETRYING: [obsidian]: yum install (1 retries left).
failed: [obsidian] (item={'state': 'latest', 'name': 'podman-netavark'}) => {"ansible_loop_var": "item", "attempts": 3, "changed": false, "failures": [], "item": {"name": "podman-netavark", "state": "latest"}, "msg": "Unknown Error occurred: Transaction test error:\n  file /usr/libexec/podman/netavark from install of podman-netavark-100:1.12.2-1.7.x86_64 conflicts with file from package netavark-2:1.5.1-2.module+el8.8.0+19993+47c8ef84.x86_64\n", "rc": 1, "results": []}
changed: [obsidian] => (item={'state': 'latest', 'name': 'python3-podman-compose'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'shadow-utils'})

PLAY RECAP *************************************************************************************************************

Is the galaxy collection up to date? That's what I'm trying to use in my playbook.

hswong3i commented 1 month ago

So now the message says aardvark-dns and netavark conflict, which is a new issue; it could be fixed with:

yum remove -y aardvark-dns netavark

joshmcorreia commented 1 month ago

I was able to get ansible to stop throwing errors, but podman won't launch correctly even though the playbook says it was successful.

I removed the conflicting packages manually:

# dnf remove -y aardvark-dns netavark

And then re-ran the playbook:

$ ansible-playbook install_podman_playbook.yml

PLAY [podman_hosts] ****************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [obsidian]

TASK [alvistack.podman : include default variables] ********************************************************************
ok: [obsidian]

TASK [alvistack.podman : include release specific variables] ***********************************************************
ok: [obsidian] => (item=/home/josh/.ansible/roles/alvistack.podman/files/../vars/redhat-8.yml)

TASK [alvistack.podman : include release specific tasks] ***************************************************************
included: /home/josh/.ansible/roles/alvistack.podman/tasks/redhat.yml for obsidian => (item=/home/josh/.ansible/roles/alvistack.podman/tasks/./redhat.yml)

TASK [alvistack.podman : rpm --import] *********************************************************************************
ok: [obsidian] => (item={'key': 'http://downloadcontent.opensuse.org/repositories/home:/alvistack/AlmaLinux_8/repodata/repomd.xml.key', 'fingerprint': '789CFFDE0295B8A1F4E5690C4BECC97550D0B1FD', 'state': 'present'})

TASK [alvistack.podman : yum-config-manager --add-repo] ****************************************************************
ok: [obsidian] => (item={'file': 'home:alvistack', 'name': 'home_alvistack', 'description': 'home:alvistack (AlmaLinux_8)', 'baseurl': 'http://downloadcontent.opensuse.org/repositories/home:/alvistack/AlmaLinux_8', 'enabled': True, 'priority': '2', 'module_hotfixes': True, 'gpgcheck': True, 'gpgkey': 'http://downloadcontent.opensuse.org/repositories/home:/alvistack/AlmaLinux_8/repodata/repomd.xml.key', 'state': 'present'})

TASK [alvistack.podman : yum install] **********************************************************************************
ok: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-dnsname'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-plugins'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'containernetworking-podman-machine'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'podman'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'podman-aardvark-dns'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'podman-docker'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'podman-gvproxy'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'podman-netavark'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'python3-podman-compose'})
ok: [obsidian] => (item={'state': 'latest', 'name': 'shadow-utils'})

TASK [alvistack.podman : copy templates] *******************************************************************************
ok: [obsidian] => (item={'dest': '/etc/containers/nodocker'})

TASK [alvistack.podman : flush handlers] *******************************************************************************

TASK [alvistack.podman : systemctl start podman.service] ***************************************************************
ok: [obsidian]

PLAY RECAP *************************************************************************************************************
obsidian : ok=9    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

And now I get the following output on the target machine:

# podman version
WARN[0000] Using cgroups-v1 which is deprecated in favor of cgroups-v2 with Podman v5 and will be removed in a future version. Set environment variable `PODMAN_IGNORE_CGROUPSV1_WARNING` to hide this warning.
Error: cni support is not enabled in this build, only netavark. Got unsupported network backend "cni"

joshmcorreia commented 1 month ago

I'm not sure why it's trying to use cni instead of netavark. According to the podman docs:

CNI is deprecated and will be removed in the next major Podman version 5.0, in preference of Netavark
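
For what it's worth, podman itself reports which backend it picked, so you can check on the target host. A minimal sketch in Ansible (the host.networkBackend field name is assumed from recent podman info JSON output):

- name: Gather podman info as JSON
  ansible.builtin.command: podman info --format json
  register: podman_info
  changed_when: false

- name: Show which network backend podman selected
  ansible.builtin.debug:
    msg: "{{ (podman_info.stdout | from_json).host.networkBackend }}"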

hswong3i commented 1 month ago

First of all, in order to stay in sync with CRI-O in Kubernetes, I use CNI by default when testing with molecule; see https://github.com/alvistack/ansible-role-podman/blob/master/molecule/default/converge.yml

On the other hand, this is how podman is packaged in my OBS repo; I didn't specify any CNI or Netavark setup: https://github.com/alvistack/containers-podman/blob/alvistack/v5.2.2/debian/rules

So here we have a working agenda:

  1. Sync my OBS package naming with the RHEL package naming, so that it automatically overrides the RHEL packages correctly during greedy installation
  2. You may first reference my molecule test plan and install both CNI and podman together, which is confirmed to work
  3. I will study how to make podman use Netavark by default (instead of CNI by default), then update the molecule test plan accordingly, too

hswong3i commented 1 month ago

I'm not sure why it's trying to use cni instead of netavark. According to the podman docs:

CNI is deprecated and will be removed in the next major Podman version 5.0, in preference of Netavark

Here is the official documentation on networking: https://github.com/containers/podman/blob/47b85af6351518df314532987e303796237141e2/DISTRO_PACKAGE.md?plain=1#L42-L45

After installation, if you would like to migrate all your containers to use Netavark, you will need to set network_backend = "netavark" under the [network] section in your containers.conf, typically located at: /usr/share/containers/containers.conf

My ansible-role-containers_common config: https://github.com/alvistack/ansible-role-containers_common/blob/master/templates/etc/containers/containers.conf.j2#L330-L359

[network]

# Network backend determines what network driver will be used to set up and tear down container networks.
# Valid values are "cni" and "netavark".
# The default value is empty which means that it will automatically choose CNI or netavark. If there are
# already containers/images or CNI networks preset it will choose CNI.
#
# Before changing this value all containers must be stopped otherwise it is likely that
# iptables rules and network interfaces might leak on the host. A reboot will fix this.
#
# network_backend = ""

# Path to directory where CNI plugin binaries are located.
#
# cni_plugin_dirs = [
#     "/usr/local/libexec/cni",
#     "/usr/libexec/cni",
#     "/usr/local/lib/cni",
#     "/usr/lib/cni",
#     "/opt/cni/bin",
# ]

# List of directories that will be searched for netavark plugins.
#
# netavark_plugin_dirs = [
#     "/usr/local/libexec/netavark",
#     "/usr/libexec/netavark",
#     "/usr/local/lib/netavark",
#     "/usr/lib/netavark",
# ]
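
As an aside, rather than editing the vendor defaults under /usr/share/containers, the same setting can live in the admin override (/etc/containers/containers.conf) or, if the installed containers-common supports it, in a containers.conf.d drop-in. A minimal sketch of the drop-in approach (the file name is illustrative, and per the comment above all containers should be stopped before switching backends):

- name: Ensure the containers.conf.d drop-in directory exists
  ansible.builtin.file:
    path: /etc/containers/containers.conf.d
    state: directory
    mode: "0755"

- name: Force podman to use the netavark network backend
  ansible.builtin.copy:
    dest: /etc/containers/containers.conf.d/99-netavark.conf  # drop-in; assumes containers.conf.d support
    content: |
      [network]
      network_backend = "netavark"
    mode: "0644"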

OK, so now the problems are:

  1. My OBS package naming doesn't override RHEL's package naming, so it needs updating
  2. The installation path shouldn't be /usr/libexec/podman, but /usr/libexec/netavark

I will fix these later ;-)

joshmcorreia commented 1 month ago

On the other hand this is how podman being packaged with my OBS repo, which I didn't specify any CNI nor Netavark setup:

I think I narrowed down why we're seeing cni as the default on RHEL 8: it ships with Podman 4.4.1, which predates the removal of cni. That means it defaults to cni, and the Ansible role needs to update the user's containers.conf to netavark if the user chooses Podman 5.0+, since cni is no longer supported. Do you think it makes sense to have the role update the containers.conf file? Maybe via a string or regex replacement?

Here's a command line example:

$ sed -i 's/network_backend = "cni"/network_backend = "netavark"/g' /usr/share/containers/containers.conf

Or an Ansible solution:

- name: Set podman network_backend settings to 'netavark'
  ansible.builtin.replace:
    path: /usr/share/containers/containers.conf
    regexp: '^network_backend = "cni"$'
    replace: 'network_backend = "netavark"'

I'm willing to bet that if you ran your role against a machine without podman installed it would default to netavark, but since RHEL ships with podman, the out-of-date config is left behind during an upgrade via the role.

joshmcorreia commented 1 month ago

If it helps at all, this is what I'm doing now, which seems to work in conjunction with your role (I run my role before I run yours):

- name: Remove the conflicting 'podman-catatonit' package
  ansible.builtin.package:
    name: podman-catatonit
    state: absent

- name: Remove the conflicting 'aardvark-dns' package
  ansible.builtin.package:
    name: aardvark-dns
    state: absent

- name: Remove the conflicting 'netavark' package
  ansible.builtin.package:
    name: netavark
    state: absent

- name: Check if /usr/share/containers/containers.conf exists
  ansible.builtin.stat:
    path: /usr/share/containers/containers.conf
  register: containers_conf_exists

- name: Update deprecated podman network_backend settings
  ansible.builtin.replace:
    path: /usr/share/containers/containers.conf
    regexp: '^network_backend = "cni"$'
    replace: 'network_backend = "netavark"'
  when: containers_conf_exists.stat.exists

- name: Check if PODMAN_IGNORE_CGROUPSV1_WARNING is already set
  ansible.builtin.lineinfile:
    state: absent
    path: /etc/environment
    regexp: "^PODMAN_IGNORE_CGROUPSV1_WARNING="
  check_mode: true
  changed_when: false
  register: podman_ignore_cgroupsv1_warning_check

- name: Define PODMAN_IGNORE_CGROUPSV1_WARNING if undefined
  ansible.builtin.lineinfile:
    state: present
    path: /etc/environment
    line: "PODMAN_IGNORE_CGROUPSV1_WARNING=true"
  when: podman_ignore_cgroupsv1_warning_check.found == 0

- name: Set PODMAN_IGNORE_CGROUPSV1_WARNING to true
  ansible.builtin.replace:
    path: /etc/environment
    regexp: '^PODMAN_IGNORE_CGROUPSV1_WARNING=.*'
    replace: 'PODMAN_IGNORE_CGROUPSV1_WARNING=true'
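
And the playbook side is roughly this, with those tasks in a small role of my own that runs before yours (my_podman_prep is just a placeholder name for illustration):

- hosts:
    - podman_hosts
  roles:
    - my_podman_prep    # placeholder: contains the cleanup/config tasks above
    - alvistack.podman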