AcademySoftwareFoundation / aswf-docker

Common container configuration
Apache License 2.0

Evaluate Red Hat's UBI image for the base image #174

Open omenos opened 1 year ago

omenos commented 1 year ago

Overview

With RHEL rebuild distributions in a state of reliability uncertainty, it is worth investigating whether or not the UBI images provided by Red Hat will satisfy the needs of the ASWF images.

Universal Base Images (UBI): Images, repositories, and source code
UBI repositories public CDN

It is already known that, in their present state, the UBI repositories will not. The goal of this issue is to discover which packages the ASWF needs that are missing from the UBI repositories, and to put in a request to Red Hat to add them. Hopefully with backing support from @Bob-Davis :wink:

Methodology

There are two primary types of missing packages: missing sub-packages (e.g. pkg-devel, may be architecture specific) and packages that are not in the released distribution whatsoever. For the purposes of our industry, the primary repositories investigated for the EL8 platform will be BaseOS, AppStream, CodeReady Builder (PowerTools/CRB), and EPEL.

As the ASWF images are ultimately based on NVIDIA's CUDA image, evaluation should start with the UBI variant rather than the Rocky Linux variant.

Personally, I recommend running this process on a system that has been entitled with a Red Hat subscription, keeping everything "RHEL". The Developer Subscription for Individuals suffices for this, and entitling can be done on any distribution (I myself run Fedora as my host OS).

> [pkgmanager] [install] subscription-manager
> subscription-manager register

If you use Podman, there are no extra steps. If you are using Docker, you'll need to mount your /etc/pki/entitlement directory into the container.

By default, the UBI images have their BaseOS, AppStream, and CodeReady Builder repositories enabled and dnf config-manager available. In the current base script, install_yumpackages.sh, the first block checks whether the environment is EL8 and then performs a series of operations, some of which are unnecessary within UBI. I've modified the section like so:

. /etc/os-release
PKGOPTS="-y --nobest --nodocs --setopt install_weak_deps=false --exclude *cuda12*"
BASEOS_MAJORVERSION=$(echo ${VERSION_ID} | cut -d '.' -f 1)
if [ "$BASEOS_MAJORVERSION" -gt "7" ]; then
    dnf config-manager --save --setopt=ubi*.priority=0 --setopt=ubi*.cost=0
    dnf config-manager --enable codeready-builder-for-rhel-$BASEOS_MAJORVERSION-x86_64-rpms
    dnf $PKGOPTS upgrade
    dnf $PKGOPTS install "python(abi) = $PYTHON_VERSION"
fi

The cost and priority settings give the UBI repositories the highest weight when a package is found in multiple repositories, even if the UBI copy is older than the one in a lower-priority repository.
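As a rough illustration of that selection rule: dnf documents `priority` as "lowest value wins" and consults `cost` only when priorities tie. The toy model below mimics that ordering with `sort`. It is only a sketch, not how dnf is implemented; `pick_candidate` is a hypothetical helper, and the repository names and version numbers are made up (99 and 1000 stand in for dnf's documented defaults).

```shell
# Toy model only: each input line is "<repo> <priority> <cost> <version>".
# Sort by priority (field 2), then by cost (field 3), both numerically,
# and keep the first line as the "winning" candidate.
pick_candidate() {
    sort -k2,2n -k3,3n | head -n 1
}

# Made-up candidates: the UBI copy is older but has priority=0 and cost=0,
# while the EPEL copy keeps the default priority (99) and cost (1000).
printf '%s\n' \
    'epel 99 1000 3.26.5' \
    'ubi-8-appstream 0 0 3.20.2' \
    | pick_candidate
```

In this model the older UBI entry is selected, which matches the behavior described above: version is never compared once priority has decided the outcome.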

Additionally, dnf install epel-release does not work on RHEL; you will need to point it at the actual repo RPM, i.e.

dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-${BASEOS_MAJORVERSION}.noarch.rpm
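To see how the URL above is assembled without needing a container, here is a minimal sketch. `el_major_version` and `epel_release_url` are hypothetical helper names, the os-release file is a made-up stand-in, and parameter expansion is used in place of the `echo | cut` pipeline from the script.

```shell
# Hypothetical helper: print the major version from an os-release style file.
# The file path is a parameter so this can be tried outside a container.
el_major_version() {
    # Source the file in a subshell so VERSION_ID does not leak to the caller,
    # then strip everything from the first dot onward (8.8 becomes 8).
    ( . "$1" && printf '%s\n' "${VERSION_ID%%.*}" )
}

# Hypothetical helper: build the EPEL release RPM URL for a major version.
epel_release_url() {
    printf 'https://dl.fedoraproject.org/pub/epel/epel-release-latest-%s.noarch.rpm\n' "$1"
}

# Demo with a made-up os-release file (illustration only).
tmp_release="$(mktemp)"
printf 'ID="rhel"\nVERSION_ID="8.8"\n' > "$tmp_release"
epel_release_url "$(el_major_version "$tmp_release")"
rm -f "$tmp_release"
```

Inside the real image, the file to read would be /etc/os-release, and the printed URL is what gets passed to dnf install.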

If you want to point to a rebuild distribution rather than RHEL, then you'll have to add those repositories yourself and adjust the config-manager statements. For example:

curl -o /etc/yum.repos.d/rebuild.repo "https://git.almalinux.org/rpms/almalinux-release/raw/branch/a8/almalinux.repo"
dnf config-manager --save --setopt=*.priority=0 --setopt=*.cost=0 appstream baseos powertools
dnf config-manager --enable powertools

From this point on, it's just run, rerun, and rerun again! An easy solution would be to use a shell function like so:

ubi-check() {
    podman run --rm -dt --name aswf-ubi docker.io/nvidia/cuda:11.8.0-devel-ubi8
    podman cp scripts/common/install_yumpackages.sh aswf-ubi:/tmp
    podman exec aswf-ubi chmod +x /tmp/install_yumpackages.sh
    podman exec -e ASWF_DTS_VERSION="$1" -e PYTHON_VERSION="$2" aswf-ubi /tmp/install_yumpackages.sh || return 1
    podman exec aswf-ubi dnf list installed | grep -iv -e "@system" -e "@ubi" -e "@epel"
    podman stop aswf-ubi
}

Catalog which packages come from non-UBI repositories, then verify that the catalog is correct. To check, run the discovered non-UBI packages back through dnf and filter for those that the UBI repositories report as "Available". Any package listed by the following command can be removed from the missing-package catalog, as it is in fact present in the UBI repositories.

# RHEL
echo 'dnf --quiet --disablerepo=epel,rhel*,codeready* --showduplicates list $(dnf --quiet list installed | grep -iv -e "@system" -e "@ubi" -e "@epel" | tail -n +2 | awk "{ print \$1 }") | grep "ubi-8" | awk "{ print \$1 }"' | podman exec -i aswf-ubi /bin/bash

# Rebuild
echo 'dnf --quiet --disablerepo=epel,baseos,appstream,powertools --showduplicates list $(dnf --quiet list installed | grep -iv -e "@system" -e "@ubi" -e "@epel" | tail -n +2 | awk "{ print \$1 }") | grep "ubi-8" | awk "{ print \$1 }"' | podman exec -i aswf-ubi /bin/bash
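The inner filter in those one-liners is easier to follow when tried on its own. The sketch below wraps the same grep, tail, and awk stages in a hypothetical function, `non_ubi_packages`, and feeds it made-up text shaped like `dnf list installed` output; the package names, versions, and repository IDs are illustrative, not real results.

```shell
# Hypothetical helper: read `dnf list installed`-style text on stdin and
# print the names of packages that did NOT come from the base system,
# a UBI repository, or EPEL.
non_ubi_packages() {
    # 1. Drop every line whose source repo is @System, @ubi*, or @epel*.
    # 2. Drop the "Installed Packages" header line.
    # 3. Keep only the first column (name.arch).
    grep -iv -e "@system" -e "@ubi" -e "@epel" \
        | tail -n +2 \
        | awk '{ print $1 }'
}

# Made-up sample input for illustration only.
sample='Installed Packages
bash.x86_64          4.4.20-4.el8    @System
glibc.x86_64         2.28-225.el8    @ubi-8-baseos-rpms
cmake.x86_64         3.20.2-5.el8    @epel
libXmu-devel.x86_64  1.1.3-1.el8     @powertools
mesa-libGLU.x86_64   9.0.0-15.el8    @appstream'

printf '%s\n' "$sample" | non_ubi_packages
```

With this sample, only the two packages installed from the rebuild repositories survive the filter. Those are the names the outer dnf call then looks up against the UBI repositories.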

Next Steps

Once missing packages have been identified, there is a decision tree to follow:

omenos commented 1 year ago

Current package status from initial runs of common/install_yumpackages.sh:

omenos commented 1 year ago

I've added the catalog check command to the description... apologies in advance for that horror show.