NVIDIA / nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs
Apache License 2.0

RHEL rpm package transaction test failure with FIPS mode #116

Open gregaf300 opened 1 year ago

gregaf300 commented 1 year ago

Platform Information

ARCH=x86_64
NAME="Red Hat Enterprise Linux"
VERSION="8.8 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.8 (Ootpa)"

Repository Information

# Original repo that was attempted
[cuda-rhel8-x86_64]
name=cuda-rhel8-x86_64
baseurl=https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64
enabled=1
gpgcheck=1
gpgkey=https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/D42D0685.pub
# Second repo attempted in case any deviation
[nvidia-container-toolkit]
name=nvidia-container-toolkit
baseurl=https://nvidia.github.io/libnvidia-container/stable/rpm/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
...

Encountered Issue

A DNF installation of nvidia-container-toolkit and its dependencies fails the transaction test on an instance with FIPS mode enabled.

==================================================================================================================
 Package                                 Architecture     Version               Repository                   Size
==================================================================================================================
Installing:
 nvidia-container-toolkit                x86_64           1.14.2-1              cuda-rhel8-x86_64           975 k
Installing dependencies:
 libnvidia-container-tools               x86_64           1.14.2-1              cuda-rhel8-x86_64            38 k
 libnvidia-container1                    x86_64           1.14.2-1              cuda-rhel8-x86_64           998 k
 nvidia-container-toolkit-base           x86_64           1.14.2-1              cuda-rhel8-x86_64           3.3 M

Transaction Summary
==================================================================================================================
Install  4 Packages

Total download size: 5.2 M
Installed size: 16 M
Is this ok [y/N]: y
Downloading Packages:
(1/4): libnvidia-container-tools-1.14.2-1.x86_64.rpm                              1.2 MB/s |  38 kB     00:00
(2/4): nvidia-container-toolkit-1.14.2-1.x86_64.rpm                                21 MB/s | 975 kB     00:00
(3/4): libnvidia-container1-1.14.2-1.x86_64.rpm                                    19 MB/s | 998 kB     00:00
(4/4): nvidia-container-toolkit-base-1.14.2-1.x86_64.rpm                           78 MB/s | 3.3 MB     00:00
------------------------------------------------------------------------------------------------------------------
Total                                                                              67 MB/s | 5.2 MB     00:00
Running transaction check
Transaction check succeeded.
Running transaction test
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: Transaction test error:
  package nvidia-container-toolkit-base-1.14.2-1.x86_64 does not verify: no digest
  package libnvidia-container1-1.14.2-1.x86_64 does not verify: no digest
  package libnvidia-container-tools-1.14.2-1.x86_64 does not verify: no digest
  package nvidia-container-toolkit-1.14.2-1.x86_64 does not verify: no digest

Expected Outcome

We expected nvidia-container-toolkit to install successfully. It seems that nvidia-container-toolkit and its dependencies lack SHA256 digests, which causes the DNF transaction test to fail when FIPS mode is enabled. The failure was unexpected because the NVIDIA driver rpm packages from the same rpm repository installed successfully via the nvidia-driver DNF module, satisfying the SHA256 digest requirement.
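One way to confirm which digests the downloaded packages actually carry is to query their header tags directly. The sketch below is illustrative and not taken from the thread: the function names `fips_enabled` and `show_rpm_digests` are made up, and it assumes rpm 4.14 or newer on the inspecting host (older versions do not know the `SHA256HEADER` and `PAYLOADDIGEST` tags). A tag that is absent from a package is printed as `(none)`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not from the thread.

# Succeeds when the kernel reports FIPS mode as enabled.
# The optional path argument only exists so another file can be checked.
fips_enabled() {
    local flag_file="${1:-/proc/sys/crypto/fips_enabled}"
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

# Prints the header and payload digests stored in each rpm file given.
# --nodigest/--nosignature skip verification so the query itself does
# not fail on a package that lacks an acceptable digest.
show_rpm_digests() {
    local pkg
    for pkg in "$@"; do
        rpm -qp --nodigest --nosignature --queryformat \
            '%{NAME}-%{VERSION}-%{RELEASE} sha1header=%{SHA1HEADER} sha256header=%{SHA256HEADER} payloaddigest=%{PAYLOADDIGEST}\n' \
            "$pkg"
    done
}
```

Since DNF keeps the downloaded packages in its cache after the failed transaction, something like `show_rpm_digests /var/cache/dnf/*/packages/*.rpm` should work on a default setup (the cache path is an assumption about the local configuration).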

Seeking Guidance

Disabling FIPS mode is not an option for us due to security requirements.

klueska commented 1 year ago

For RHEL, I would recommend pulling from the CUDA download repository instead of the nvidia-container-toolkit repository. We started publishing the packages in both places a few releases ago, and the only reason we haven't updated the documentation is that not all supported OSs are available on the CUDA download repository yet.

http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/

gregaf300 commented 1 year ago

Thank you for the advice, but unfortunately that is the repo we have been using. We only used the nvidia-container-toolkit repository to test whether the offending packages there carried a SHA256 header digest, since the transaction test had already failed with the CUDA download repository.

joelcomp1 commented 1 year ago

I am seeing the same issue on RHEL8 with exactly the same setup as above. Is there any proposed resolution to this?

wagneran commented 11 months ago

We are also seeing the same on RHEL8 - watching this issue for any hopeful resolution.

EDIT: Last month we were able to install 1.13.5. This month it looks like 1.14.3 is the current version, and it is giving us issues (we updated to the latest 1.14 documentation). We went ahead and locked the repository to RHEL8.8 with the following commands

distribution=$(echo "rhel8.8") \
   && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
echo "Copied the Nvidia container toolkit"
sudo yum clean expire-cache -y
sudo yum install nvidia-container-toolkit -y
echo "Installed nvidia-container-toolkit"

and our RHEL8.9 image is working fine now.

jgforbes commented 4 months ago

Now that RHEL-7 is EOL, can the rpm packages be built on RHEL-8 so that the rpm version is >=4.14? We would like to install on a FIPS-enabled system, and that will only work if a SHA256 or SHA512 digest is used.
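To see which rpm version a given package was built with, the `RPMVERSION` header tag can be queried and compared against 4.14. This is a rough sketch following the premise of the comment above; the helper names `version_ge` and `check_build_rpm_version` are hypothetical, and the comparison relies on GNU `sort -V`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not from the thread.

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reports the rpm version that built the package file $1.
# Returns 0 if it is >= 4.14, 1 if older, 2 if the query failed.
check_build_rpm_version() {
    local pkg="$1" built_with
    built_with="$(rpm -qp --nodigest --nosignature --queryformat '%{RPMVERSION}' "$pkg")" || return 2
    if version_ge "$built_with" "4.14"; then
        echo "$pkg: built with rpm $built_with (>= 4.14)"
    else
        echo "$pkg: built with rpm $built_with (< 4.14)"
        return 1
    fi
}
```

For example, `check_build_rpm_version nvidia-container-toolkit-1.14.2-1.x86_64.rpm` would report whether that file came from an older build host.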

rwd5213 commented 3 months ago

Has anyone found a fix for this? Seems wild that they are not FIPS compliant.

jgforbes commented 3 months ago

The RPMS available on github were built for RHEL-7. If you build the rpms for RHEL-8, they will work on a FIPS compliant RHEL-8 system.

rwd5213 commented 3 months ago

Any docs on that? It looks like the developer guide only has targets for CentOS 7 and Ubuntu.

jgforbes commented 3 months ago

  1. Install docker, as the rpms are built inside of a container. Podman will not work as is.

  2. clone the repository

  3. cd nvidia-container-toolkit/scripts

  4. ./build-all-components.sh centos8-x86_64

You should then have the rpms in

nvidia-container-toolkit/dist/centos8/x86_64
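The steps above can be sketched as a small script. This is only an illustration under stated assumptions: the `run` and `build_toolkit_rpms` names and the `DRY_RUN` variable are made up here, the clone URL is the repository this issue belongs to, and the real build needs docker and network access.

```shell
#!/usr/bin/env bash
# Sketch of the build steps; DRY_RUN and the function names are hypothetical.

# Runs a command, or only prints it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_toolkit_rpms() {
    local target="${1:-centos8-x86_64}"
    local repo_url="https://github.com/NVIDIA/nvidia-container-toolkit.git"

    # Step 1: the build runs inside a container, so docker must be present.
    if [ "${DRY_RUN:-0}" != "1" ] && ! command -v docker >/dev/null 2>&1; then
        echo "docker is required for the build" >&2
        return 1
    fi

    # Step 2: clone the repository.
    run git clone "$repo_url" || return 1

    # Steps 3 and 4: build from the scripts directory. The subshell keeps
    # the caller's working directory unchanged.
    (
        run cd nvidia-container-toolkit/scripts &&
        run ./build-all-components.sh "$target"
    )
}
```

Running `DRY_RUN=1 build_toolkit_rpms` prints the commands without executing them; a real run should leave the rpms in the dist directory mentioned above.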


rwd5213 commented 3 months ago

Okay, I followed that and am now encountering "libseccomp2 is needed by libnvidia-container-libseccomp2-1.16.1-1.x86_64". Any thoughts on how to solve that? The libseccomp package is installed and there is no libseccomp2. The rest of the rpms appear to install fine.