coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

How to make ansible work in Fedora CoreOS again #592

Closed DirkTheDaring closed 2 years ago

DirkTheDaring commented 4 years ago

Background: We still run a lot of workloads on the old CoreOS and are in the middle of moving our production environments to the new Fedora CoreOS. We use a lot of Ansible automation for system setup here, e.g. kubespray, which we use to install AND upgrade our running k8s clusters on CoreOS. Initially, we were happy that we could get the required python3 from the Fedora CoreOS repo. But that turned out to be more difficult, because the package "python3-libselinux" (formerly named "libselinux-python3") depends on libselinux, which is a base package in Fedora CoreOS and cannot be upgraded. During the lifecycle of the updates repo, it can happen that the repo provides a newer version of python3-libselinux, which depends on a newer version of libselinux. See the kubespray issue. Then we are stuck, as we cannot install the base packages that Ansible needs on Fedora CoreOS.

Solution approach: Our requirement is that python3 and python3-libselinux can be installed reliably. The idea is quite simple: deactivate the fedora-updates repo, so that we fall back to the "baseline" of the currently installed Fedora CoreOS and can install python3 and python3-libselinux reliably (in an older version, though). To our current knowledge, rpm-ostree has no option like DNF's that disables the updates repo just for the duration of an install. Therefore we edit the fedora-updates repo file to disable it.

Our other requirement is that as soon as SSH is available, Ansible can start doing its job, meaning python3 and python3-libselinux are already installed. Initially, we just created a simple shell script which we put under /opt/bin/, but Ignition does not set the SELinux attributes during install, so no shell file was allowed to execute in our tests.

Then we came up with a unit file that includes the shell commands directly, and that works for us. After installing python3 and python3-libselinux, it re-enables updates but also puts python3* and python3-selinux* on the exclude list. Then it works again as expected. Putting it into Ignition is left as an exercise for the reader.

[Unit]
Requires=network-online.target
After=network-online.target NetworkManager.service
# Run before Ansible can log in over SSH.
Before=sshd.service

[Service]
Type=oneshot
# Do not run again if the packages were already installed.
ExecCondition=/usr/bin/test ! -f /etc/package-installer.done
# Install at least python3 + python3-libselinux as packages so that Ansible can run.
# There are dependency problems with the updates repo, so we disable the
# fedora-updates repo during the install to at least get the packages we need.
ExecStart=/usr/bin/sed -i '/\[updates\]/,/^\[/ s/^enabled=.*$/enabled=0/'     /etc/yum.repos.d/fedora-updates.repo
ExecStart=/usr/bin/rpm-ostree install python3 libselinux-python3
ExecStart=/usr/bin/sed -i '/\[updates\]/,/^\[/ s/^enabled=.*$/enabled=1/'     /etc/yum.repos.d/fedora-updates.repo
ExecStart=/usr/bin/sed -i '/^\[updates\]/a exclude=python3* python3-selinux*' /etc/yum.repos.d/fedora-updates.repo
ExecStartPost=/usr/bin/touch /etc/package-installer.done
ExecStartPost=/usr/sbin/shutdown -r now

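The sed range expression used in the unit can be tried safely on a scratch file before touching /etc/yum.repos.d. The following is only a sketch: the helper name set_updates_enabled and the sample repo contents are made up for illustration, and it assumes GNU sed (for -i).

```shell
#!/usr/bin/env bash
# Sketch: toggle "enabled=" only inside the [updates] section of a repo file.
# set_updates_enabled and the sample file below are illustrative assumptions.
set -euo pipefail

# Usage: set_updates_enabled FILE VALUE   (VALUE is 0 or 1)
set_updates_enabled() {
  local repo_file=$1 value=$2
  # The address range starts at the "[updates]" header and ends at the next
  # line that begins with "[", so "enabled=" in other sections is untouched.
  sed -i "/^\[updates\]/,/^\[/ s/^enabled=.*$/enabled=${value}/" "$repo_file"
}

demo_repo=$(mktemp)
cat > "$demo_repo" <<'EOF'
[updates]
name=Sample updates repo
enabled=1

[other]
name=Sample unrelated repo
enabled=1
EOF

set_updates_enabled "$demo_repo" 0
grep '^enabled=' "$demo_repo"   # prints enabled=0, then enabled=1
rm -f "$demo_repo"
```

Only the first section changes; the second keeps enabled=1, which is the behaviour the unit relies on when it flips the repo off and back on.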
dustymabe commented 4 years ago

A few comments/questions:

Can you disable SELinux and see if Ansible behaves differently?

DirkTheDaring commented 4 years ago

In the past, we sideloaded Python as a package (mainly because kubespray later does the same thing on old CoreOS), and it already contained the SELinux package, afaik. So basically we would revert to the old mechanism and install a completely pre-built python3 with all necessary libraries. That seems like the only way to keep CoreOS as an option.

We can only suspect that it causes trouble as soon as Ansible (like in kubespray) tries to copy files to the CoreOS host: then the detection kicks in and Ansible STOPS, complaining that the python3-selinux package is not available and that it cannot continue. So it is not an option to "leave out" SELinux as soon as you want to touch files; otherwise, Ansible somehow needs to be patched to ignore SELinux-enabled OSes.

Note: I have not debugged ansible and I do not know of any switch in ansible to make it "selinux-agnostic". On another note: kubespray can also install on Ubuntu/Debian so it is not a requirement of kubespray per-se, as Ubuntu/Debian does not use SELinux.

dustymabe commented 4 years ago

> In the past, we sideloaded Python as a package (mainly because kubespray later does the same thing on old CoreOS), and it already contained the SELinux package, afaik. So basically we would revert to the old mechanism and install a completely pre-built python3 with all necessary libraries. That seems like the only way to keep CoreOS as an option.

It would be the most reliable way right now, yes. We are working to fix the package layering problem (#400), but in general having less layered packages will make updates more reliable in the future too.

All of that is another way of saying: we hope to fix this immediate issue soon, but you may want to consider the tradeoffs between package layering and delivering a self-contained thing (like you were doing before).

> We can only suspect that it causes trouble as soon as Ansible (like in kubespray) tries to copy files to the CoreOS host: then the detection kicks in and Ansible STOPS, complaining that the python3-selinux package is not available and that it cannot continue. So it is not an option to "leave out" SELinux as soon as you want to touch files; otherwise, Ansible somehow needs to be patched to ignore SELinux-enabled OSes.

Right. What is the detection, though? If it just detects whether SELinux is enabled, then you can easily work around that by disabling it.

> Note: I have not debugged ansible and I do not know of any switch in ansible to make it "selinux-agnostic". On another note: kubespray can also install on Ubuntu/Debian so it is not a requirement of kubespray per-se, as Ubuntu/Debian does not use SELinux.

jamescassell commented 4 years ago

Seems like snap solves some of these problems: installing cli tools from a container for use on the host? -- maybe it's possible to auto-mount a container with the requisite packages into the host system $PATH on boot? It's the general problem of "how do I run tools on the host without layering them as an RPM?"

lucab commented 4 years ago

> Right. What is the detection, though? If it just detects whether SELinux is enabled, then you can easily work around that by disabling it.

@dustymabe see https://github.com/coreos/fedora-coreos-tracker/issues/578.

dustymabe commented 4 years ago

> Right. What is the detection, though? If it just detects whether SELinux is enabled, then you can easily work around that by disabling it.

> @dustymabe see #578.

I see. It uses /usr/sbin/selinuxenabled so if you disable SELinux then it shouldn't require you to install the libraries.

[core@cosa-devsh ~]$ getenforce 
Disabled
[core@cosa-devsh ~]$ /usr/sbin/selinuxenabled 
[core@cosa-devsh ~]$ echo $?
1

Obviously not ideal to disable selinux, but if you're just looking for a quick solution that should work.
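The check described above boils down to an exit status: 0 means SELinux is enabled (so the Python bindings are expected), non-zero means it is disabled. A minimal sketch of that decision follows; needs_selinux_bindings is a hypothetical helper written for this thread, not Ansible's actual code.

```shell
#!/usr/bin/env bash
# Sketch: decide whether the Python SELinux bindings are needed on a host,
# based only on the exit status of an "is SELinux enabled?" probe.
# needs_selinux_bindings is a hypothetical helper, not part of Ansible.

needs_selinux_bindings() {
  # $1: probe command; defaults to the binary shown in the transcript above.
  local probe=${1:-/usr/sbin/selinuxenabled}
  if "$probe" 2>/dev/null; then
    echo yes   # exit status 0: SELinux is enabled
  else
    echo no    # non-zero exit status: SELinux is disabled (or probe missing)
  fi
}

needs_selinux_bindings
```

Passing `true` or `false` as the probe makes it easy to see both branches without changing the SELinux state of a real machine.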

dustymabe commented 4 years ago

@DirkTheDaring - does that seem to work ^^?

DirkTheDaring commented 4 years ago

currently on vacation. No engineers are available to test. The solution given at the beginning seems to work fine, with some slight modification (python3 can be removed and selinux libs need to be added). This works for us ("TM") ... :)

trexx commented 4 years ago

@dustymabe

Yep, disabling SELinux means ansible does not complain about python3-libselinux. I've tested this on some of our playbooks. We'll go with this until there's a better solution.

jdoss commented 4 years ago

This might work for your needs @DirkTheDaring without having to turn off SELinux on FCOS.

https://gist.github.com/jdoss/26f535be5b0c0f94834c0e7103883e93

---
variant: fcos
version: 1.0.0
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 AAAA...

storage:
  directories:
  - path: /opt/ansible
    mode: 0755

  files:
  # This uses podman image mount which is not in Podman 2.0.x. This is a 2.1-dev podman binary.
  - path: /usr/local/bin/podman
    mode: 0555
    contents:
      source: https://joedoss.com/downloads/podman.gz
      compression: gzip
      verification:
        hash: sha512-f8eb2001e6f18270ee09d00432f14d3077b80a95cd4162628732add0f49ceaacd4219f7e459cc5babd49b8d46ce632c8d839732c96e30020020f1271f7f75881

  - path: /usr/local/bin/fcos_python
    mode: 0555
    contents:
      inline: |
        #! /usr/bin/env bash

        show_help() {
          echo
          echo "Usage: fcos_python {enable|disable|help}"
        }

        case $1 in

              enable)
                echo "Enabling Python 3.8"
                /usr/local/bin/podman pull quay.io/forem/ansible:2.9.11
                mount --bind "$(/usr/local/bin/podman image mount quay.io/forem/ansible:2.9.11)" /opt/ansible/
                ln -sfv /opt/ansible/usr/bin/python /usr/local/bin/python
                ln -sfv /opt/ansible/usr/lib64/libpython3.8.so.1.0 /usr/local/lib/libpython3.8.so.1.0
                ldconfig
                ;;

              disable)
                echo "Disabling Python 3.8"
                rm -f /usr/local/lib/libpython3.8.so.1.0
                rm -f /usr/local/bin/python
                ldconfig
                umount /opt/ansible/
                /usr/local/bin/podman image umount quay.io/forem/ansible:2.9.11
                ;;

              help)
                show_help
                ;;

              *)
                echo "Unknown command!" >&2
                show_help
                exit 1
                ;;

        esac

  - path: /etc/ld.so.conf.d/python.conf
    mode: 0644
    contents:
      inline: |
        /usr/local/lib

systemd:
  units:
  - name: fcos-python-3.service
    enabled: true
    contents: |
      [Unit]
      Description=FCOS Python 3 service
      Wants=network-online.target
      After=network-online.target

      [Service]
      ExecStart=/usr/local/bin/fcos_python enable
      ExecStop=/usr/local/bin/fcos_python disable

      Type=oneshot
      RemainAfterExit=true

      [Install]
      WantedBy=multi-user.target default.target

I am only using Ansible with FCOS for ad hoc commands and service orchestration, and this works well without having to turn off SELinux. I have no idea if it will work with kubespray.

rtsisyk commented 4 years ago

Is it possible to include python3 in the base system?

bgilbert commented 4 years ago

@rtsisyk We're trying to keep Python out of Fedora CoreOS.

rtsisyk commented 4 years ago

> We're trying to keep Python out of Fedora CoreOS.

OK. This design choice makes sense. MachineConfigOperator + Ignition can completely replace Ansible. The only thing I miss is support for sed-like operations in Ignition.

Update: I wrote a proposal to Ignition repository: https://github.com/coreos/ignition/issues/1099

travier commented 3 years ago

Package layering has been made reliable with the new fedora-archive repo. Could someone try again setting up the Python packages required for Ansible support via package layering on first boot via Ignition?
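For anyone trying this, here is a rough sketch of the layering command a first-boot unit could run. build_layer_cmd is a hypothetical helper and the package list is an assumption taken from earlier in this thread. The script only prints the command, so it can be reviewed before running it as root on a real host.

```shell
#!/usr/bin/env bash
# Sketch: build the rpm-ostree command for layering the Ansible prerequisites.
# build_layer_cmd is a hypothetical helper; the package list is an assumption.
set -euo pipefail

build_layer_cmd() {
  # --idempotent: do nothing for packages that are already layered
  # --reboot: reboot into the new deployment once the operation completes
  printf '%s\n' "rpm-ostree install --idempotent --reboot $*"
}

layer_cmd=$(build_layer_cmd python3 python3-libselinux)
echo "Would run: ${layer_cmd}"
# On a real Fedora CoreOS host (as root), uncomment to execute:
# ${layer_cmd}
```

Keeping the command construction separate from execution makes it easy to check the exact invocation in a container or CI job where rpm-ostree is not available.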

cgwalters commented 2 years ago

We now have https://github.com/coreos/coreos-layering-examples/tree/main/ansible-firewalld and I consider this the canonical way to use Ansible with Fedora CoreOS derivatives. (But you can also layer python on your own)