redhat-cop / ee_utilities

This ansible collection includes a number of roles and tools which can be useful for managing Ansible Execution Environments.
https://galaxy.ansible.com/infra/ee_utilities
GNU General Public License v3.0

infra.ee_utilities.ee_builder has issues related to dnf for python_interpreter section or, system-packages section of build #124

Closed erikhjensen closed 1 year ago

erikhjensen commented 1 year ago

Summary

We're migrating from redhat_cop.ee_utilities.ee_builder with ansible-builder 1.2 to the latest infra.ee_utilities.ee_builder.
I have a new build server and both /bin/sh and /usr/bin/sh are confirmed to be present.

The build is failing here:

"[1/4] STEP 10/15: ARG PKGMGR",
        "--> Using cache c2ff90960421319e08f73d047e05f0606c4939167828d4724466113e3a596a3e",
        "--> c2ff9096042",
        "[1/4] STEP 11/15: RUN $PKGMGR install $PYPKG -y ; if [ -z $PKGMGR_PRESERVE_CACHE ]; then $PKGMGR clean all; fi",
        "/bin/sh: /usr/bin/dnf: No such file or directory",
        "/bin/sh: /usr/bin/dnf: No such file or directory",

In this build context, is this command being run inside the container that is being built or in the builder container? Either way, it seems to have an issue with /usr/bin/dnf. Is that your interpretation? See my follow-up comment below: when we comment out the python_interpreter section of the EE specification, the build fails with a dnf-related error later in the podman build.

Issue Type

Ansible, Collection, Docker/Podman details

"[1/4] STEP 10/15: ARG PKGMGR",
        "--> Using cache c2ff90960421319e08f73d047e05f0606c4939167828d4724466113e3a596a3e",
        "--> c2ff9096042",
        "[1/4] STEP 11/15: RUN $PKGMGR install $PYPKG -y ; if [ -z $PKGMGR_PRESERVE_CACHE ]; then $PKGMGR clean all; fi",
        "/bin/sh: /usr/bin/dnf: No such file or directory",
        "/bin/sh: /usr/bin/dnf: No such file or directory",
ansible --version
ansible [core 2.15.2]

ansible-galaxy collection list

podman --version
podman version 4.4.1

OS / ENVIRONMENT

Build environment: RHEL 8 with an active subscription. Ansible Automation Collection 2.4 repos enabled.

Desired Behavior

The build should not fail with a file-not-found error for /usr/bin/dnf.

Actual Behavior

The RUN statement in the Containerfile looks for the shell at /bin/sh and dnf at /usr/bin/dnf, and errors out because /usr/bin/dnf cannot be found.

Playbook: builds an Execution Environment compatible with version 3 of ansible-builder.

STEPS TO REPRODUCE


---
# See documentation here: https://github.com/redhat-cop/ee_utilities/tree/devel/roles/ee_builder
# https://ansible.readthedocs.io/projects/builder/en/stable/definition/#execution-environment-definition
- name: Create custom execution environment
  hosts: localhost
  gather_facts: false
  collections:
    - infra.ee_utilities
  vars:
    aap_release: ansible-automation-platform-24    
    ee_ah_host: "{{ lookup('env','EEBUILD_AUTO_HUB_HOST') }}"
    ee_ah_token: "{{ lookup('env','EEBUILD_AUTO_HUB_TOKEN') }}"
    ee_registry_dest: "{{ lookup('env','EEBUILD_AUTO_HUB_HOST') }}"

    # This determines if collections are pulled from the Automation Hub or the Web at large
    ee_pull_collections_from_hub: true
    ee_builder_dir_clean: false
    ee_builder_dir: "."
    ee_verbosity: "2"
    ee_list:
      - name: cee-terraform-rhel8-eev3

        images:
          base_image: 
            name: registry.redhat.io/{{ aap_release }}/ee-supported-rhel8:latest

        dependencies:
          ansible_core:
              package_pip: ansible-core==2.15
          ansible_runner:
              package_pip: ansible-runner
          python_interpreter:  #comment this out to have the process error later during system-package install
              package_system: "python39"
              python_path: "/usr/bin/python3.9"
          system:
            - unzip
            - wget
            - krb5-libs [platform:rpm]
            - krb5-workstation [platform:rpm]
            - bind-utils
          python:
            - azure-cli
            - PyGithub
            - pypsrp
            - setuptools
            - git+https://github.com/vmware/vsphere-automation-sdk-python.git
            - pywinrm  #do we need this? isn't it ootb?
          galaxy:
            collections:
              - name: cloud.terraform
              - name: community.general
              - name: community.vmware
              - name: ansible.windows
              - name: community.aws
              - name: amazon.aws
              - name: infra.controller_configuration

        build_steps:          
          append_final:
          # NOTE: Spacing below between the two RUN commands is critical
          #       This is due to the multiple translations this code will go through
          #       before it is actually used. To avoid this, keep each command on
          #       one line. In this case it was done this way for readability.
            - |
              RUN for TF_VERS in 0.12.24 0.14.8 1.3.7 ; do
                              TF_SHORT=$(awk -F '.' '{print $1"."$2}' <<< $TF_VERS) ;
                              wget https://releases.hashicorp.com/terraform/${TF_VERS}/terraform_${TF_VERS}_linux_amd64.zip ;
                              unzip -o terraform_${TF_VERS}_linux_amd64.zip ;
                              cp terraform /bin/terraform-${TF_SHORT} ;
                          done && mv terraform /bin/terraform && rm terraform*
            - RUN wget -O /bin/argocd https://github.com/argoproj/argo-cd/releases/download/v1.5.1/argocd-linux-amd64
            - RUN chmod +x /bin/argocd

        # build_files:
        #   - krb5.conf

  roles:
    - infra.ee_utilities.ee_builder #https://github.com/redhat-cop/ee_utilities/blob/3.0.0/README.md
erikhjensen commented 1 year ago

Following up: when we comment out the whole python_interpreter section, the build fails later and again reports an error about /usr/bin/dnf. In that case, it's when the build process is trying to install the system packages into the resulting container.

Section we commented out:

python_interpreter:
  package_system: "python39"
  python_path: "/usr/bin/python3.9"

Build fails later:

TASK [infra.ee_utilities.ee_builder : Run the Ansible Builder Program] *********
... lines omitted ...
"module_args": {
            "_raw_params": "ansible-builder build -f\n  ./execution_environment.yml\n  -t cee-terraform-rhel8-eev3 --container-runtime=podman\n   --prune-images          --verbosity 2\n"
... lines omitted ...
"stdout_lines": [
        "Running command:",
        "  podman build -f context/Containerfile -t cee-terraform-rhel8-eev3 context",
        "...showing last 20 lines of output...",
... lines omitted ...
"+ /usr/bin/dnf install -y bind-utils dnf gcc krb5-devel libcurl-devel libssh-devel libxml2-devel make openssl-devel python3-devel python3-jmespath python3-lxml python3-netaddr python3-rpm python38-Cython python38-devel python38-lxml python38-pytz python38-pyyaml python38-requests python39-devel unzip wget",
        "/output/scripts/assemble: line 75: /usr/bin/dnf: No such file or directory",
        "Error: building at STEP \"RUN /output/scripts/assemble\": while running runtime: exit status 127"

My understanding is that this dnf command runs either in the builder container or in the target container whose base is ee-supported. Either way, if the process defaults to /usr/bin/dnf, I find it curious that the tool cannot be found in those two out-of-the-box images. This makes me think my assumptions about the build context are incorrect.

erikhjensen commented 1 year ago

As per: https://www.ansible.com/blog/unlocking-efficiency-harnessing-the-capabilities-of-ansible-builder-3.0

The microdnf package manager can be configured in the execution-environment definition:

options:
  package_manager_path: /usr/bin/microdnf
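
For anyone landing here with the same error, a minimal sketch of where that setting sits in a version 3 execution-environment definition may help. The top-level `options` key with `package_manager_path` is part of the ansible-builder 3 schema. The base image and packages are copied from the playbook in this issue, with `aap_release` written out. When building through the ee_builder role, the matching `options` block would presumably go inside the `ee_list` entry next to `images` and `dependencies`; that placement is an assumption, so check the role README.

```yaml
---
# execution_environment.yml (ansible-builder v3 schema), trimmed to the relevant keys
version: 3

images:
  base_image:
    # Image taken from the playbook above; the failing builds found no /usr/bin/dnf in it
    name: registry.redhat.io/ansible-automation-platform-24/ee-supported-rhel8:latest

options:
  # Point the generated Containerfile ($PKGMGR) at microdnf instead of the default /usr/bin/dnf
  package_manager_path: /usr/bin/microdnf

dependencies:
  python_interpreter:
    package_system: "python39"
    python_path: "/usr/bin/python3.9"
  system:
    - unzip
    - wget
```

With this in place, both the `RUN $PKGMGR install $PYPKG` step and the later `/output/scripts/assemble` system-package step should call microdnf rather than /usr/bin/dnf. This has not been verified against the exact build above.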