Open ruta-04 opened 7 months ago
Hi @ruta-04, did you configure cluster-wide-entitlement? See https://docs.nvidia.com/networking/display/kubernetes2310/network+operator#src-144713486_NetworkOperator-Cluster-wideEntitlement
@rollandf
Yes, I added the cluster-wide entitlement and ran the test pod provided in the instructions. It gave the following output, which matches the example output.
[ansible@csah-pri entitlement]$ oc logs cluster-entitled-build-pod -n default
Updating Subscription Management repositories.
Unable to read consumer identity
subscription-manager is operating in container mode.
Red Hat Enterprise Linux 9 for x86_64 - BaseOS 14 MB/s | 17 MB 00:01
Red Hat Enterprise Linux 9 for x86_64 - AppStre 25 MB/s | 29 MB 00:01
Red Hat Universal Base Image 9 (RPMs) - BaseOS 481 kB/s | 515 kB 00:01
Red Hat Universal Base Image 9 (RPMs) - AppStre 2.2 MB/s | 1.8 MB 00:00
Red Hat Universal Base Image 9 (RPMs) - CodeRea 321 kB/s | 192 kB 00:00
====================== Name Exactly Matched: kernel-devel ======================
kernel-devel-5.14.0-70.13.1.el9_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-70.17.1.el9_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-70.22.1.el9_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-70.26.1.el9_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-70.30.1.el9_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-162.6.1.el9_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-162.23.1.el9_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-162.12.1.el9_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-162.22.2.el9_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-162.18.1.el9_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-284.11.1.el9_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-284.25.1.el9_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-284.18.1.el9_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-284.30.1.el9_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-362.8.1.el9_3.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-362.13.1.el9_3.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-5.14.0-362.18.1.el9_3.x86_64 : Development package for building kernel modules to match the kernel
========================== Name Matched: kernel-devel ==========================
kernel-devel-matched-5.14.0-70.13.1.el9_0.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-70.17.1.el9_0.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-70.26.1.el9_0.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-70.22.1.el9_0.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-70.30.1.el9_0.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-162.6.1.el9_1.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-162.23.1.el9_1.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-162.22.2.el9_1.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-162.12.1.el9_1.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-162.18.1.el9_1.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-284.11.1.el9_2.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-284.25.1.el9_2.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-284.18.1.el9_2.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-362.8.1.el9_3.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-284.30.1.el9_2.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-362.13.1.el9_3.x86_64 : Meta package to install matching core and devel packages for a given kernel
kernel-devel-matched-5.14.0-362.18.1.el9_3.x86_64 : Meta package to install matching core and devel packages for a given kernel
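One detail worth checking in this output: the `kernel-devel` list goes up to 5.14.0-284.30.1.el9_2 for the el9_2 stream, while the MOFED pod log in this issue reports kernel 5.14.0-284.40.1.el9_2. Below is a minimal Python sketch for making that comparison on saved `oc logs` text. The function name `available_kernel_devel` and the regex are illustrative assumptions, not part of any NVIDIA or Red Hat tooling, and the sample is only a two-line excerpt of the output above.

```python
import re

# Matches "kernel-devel-<version>.<arch>" but not "kernel-devel-matched-...",
# because the version is required to start with a digit.
KERNEL_DEVEL_RE = re.compile(r"\bkernel-devel-(\d[\w.\-]*?)\.(?:x86_64|aarch64)\b")


def available_kernel_devel(search_output: str) -> set[str]:
    """Return the kernel-devel versions listed in `dnf search` style output."""
    return {m.group(1) for m in KERNEL_DEVEL_RE.finditer(search_output)}


# Excerpt of the test pod output quoted in this thread.
sample = (
    "kernel-devel-5.14.0-284.25.1.el9_2.x86_64 : Development package for "
    "building kernel modules to match the kernel\n"
    "kernel-devel-5.14.0-284.30.1.el9_2.x86_64 : Development package for "
    "building kernel modules to match the kernel\n"
    "kernel-devel-matched-5.14.0-284.30.1.el9_2.x86_64 : Meta package to "
    "install matching core and devel packages for a given kernel\n"
)

# Kernel reported by the MOFED pod ("No OFED driver found for kernel ...").
node_kernel = "5.14.0-284.40.1.el9_2"

versions = available_kernel_devel(sample)
print(node_kernel in versions)  # prints False for this excerpt
```

A `False` result only shows that the version is absent from the repositories that were searched; it does not by itself say which repository or subscription would provide it.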
Any update on this?
It looks like the entitlement is set up correctly. Did you restart MOFED after setting up the entitlement?
BTW, in version 24.1 the entitlement is not needed, as compilation is done with DTK.
I set up the entitlement before starting the MOFED pods.
Anything else we can try?
@e0ne I remember that we encountered an issue where the Red Hat subscription should be of a certain type. Do you recall something related?
Any update on this one?
Would you be able to use a more recent Network Operator release? Starting with 24.1, we no longer require the cluster-wide entitlement on OpenShift. Latest: https://docs.nvidia.com/networking/display/kubernetes2411
I am using NVIDIA Network Operator 23.10.0 on OpenShift Container Platform (RHCOS 4.14).
Creating the NicClusterPolicy spins up MOFED pods that crash continuously (they show as Running but never become Ready).
` [ansible@csah-pri entitlement]$ oc logs mofed-rhcos4.14-ds-ncj2n
Unsetting driver ready state
No OFED driver found for kernel 5.14.0-284.40.1.el9_2.x86_64
Enabling RHOCP and EUS RPM repos...
ID="rhcos" VERSION_ID="4.14" RHEL_VERSION="9.2"
Updating Subscription Management repositories.
Unable to read consumer identity
subscription-manager is operating in container mode.
Updating Subscription Management repositories.
Unable to read consumer identity
subscription-manager is operating in container mode.
cuda 589 B/s | 3.5 kB 00:06
cuda 294 kB/s | 1.2 MB 00:04
Red Hat Enterprise Linux 9 for x86_64 - AppStre 0.0 B/s | 0 B 00:10
Errors during downloading metadata for repository 'rhel-9-for-x86_64-appstream-rpms':
Red Hat Enterprise Linux 9 for x86_64 - AppStre 5.0 MB/s | 24 MB 00:04
Red Hat Enterprise Linux 9 for x86_64 - BaseOS 13 MB/s | 16 MB 00:01
Red Hat Enterprise Linux 9 for x86_64 - BaseOS 11 MB/s | 14 MB 00:01
Red Hat Universal Base Image 9 (RPMs) - BaseOS 3.1 kB/s | 3.8 kB 00:01
Red Hat Universal Base Image 9 (RPMs) - BaseOS 802 kB/s | 515 kB 00:00
Red Hat Universal Base Image 9 (RPMs) - AppStre 22 kB/s | 4.2 kB 00:00
Red Hat Universal Base Image 9 (RPMs) - AppStre 2.3 MB/s | 1.8 MB 00:00
Red Hat Universal Base Image 9 (RPMs) - CodeRea 22 kB/s | 4.2 kB 00:00
Red Hat Universal Base Image 9 (RPMs) - CodeRea 345 kB/s | 192 kB 00:00
Metadata cache created.
Installing dependencies
Error in POSTTRANS scriptlet in rpm package kernel-core
Installed: cryptsetup-libs-2.6.0-3.el9.x86_64
device-mapper-9:1.02.195-3.el9.x86_64
device-mapper-libs-9:1.02.195-3.el9.x86_64
dracut-057-44.git20230822.el9.x86_64
kbd-2.4.0-9.el9.x86_64
kbd-legacy-2.4.0-9.el9.noarch
kbd-misc-2.4.0-9.el9.noarch
kernel-5.14.0-284.40.1.el9_2.x86_64
kernel-core-5.14.0-284.40.1.el9_2.x86_64
kernel-modules-5.14.0-284.40.1.el9_2.x86_64
kernel-modules-core-5.14.0-284.40.1.el9_2.x86_64
kpartx-0.8.7-22.el9.x86_64
libkcapi-1.3.1-3.el9.x86_64
libkcapi-hmaccalc-1.3.1-3.el9.x86_64
linux-firmware-20230310-138.el9_2.noarch
linux-firmware-whence-20230310-138.el9_2.noarch
pigz-2.5-4.el9.x86_64
systemd-udev-252-18.el9.x86_64
Downgraded: elfutils-0.188-3.el9.x86_64
elfutils-debuginfod-client-0.188-3.el9.x86_64
elfutils-libelf-0.188-3.el9.x86_64
elfutils-libs-0.188-3.el9.x86_64
Installed: createrepo_c-0.20.1-1.el9.x86_64
createrepo_c-libs-0.20.1-1.el9.x86_64
elfutils-libelf-devel-0.188-3.el9.x86_64
kernel-rpm-macros-185-12.el9.noarch
numactl-libs-2.0.16-1.el9.x86_64
zlib-devel-1.2.11-40.el9.x86_64
Installing Linux kernel headers...
Error: Unable to find a match: kernel-headers-5.14.0-284.40.1.el9_2.x86_64 kernel-devel-5.14.0-284.40.1.el9_2.x86_64
Command "dnf -q -y --releasever=9.2 install kernel-headers-5.14.0-284.40.1.el9_2.x86_64 kernel-devel-5.14.0-284.40.1.el9_2.x86_64" failed with exit code: 1
Terminate event caught
Terminating container
Unsetting driver ready state
Keeping currently loaded Mellanox OFED Driver...`
I can't pinpoint the issue here. Is it that the OS's kernel version is not currently supported by the OFED driver?
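Going by the log itself, the step that fails is the `dnf` install of `kernel-headers` and `kernel-devel` for the running kernel; the log contains no explicit "unsupported kernel" message. As a rough aid for triaging logs like this, here is a small Python sketch that pulls out the kernel version, the packages `dnf` could not match, and the failing command. The function name `summarize_mofed_failure` and the regular expressions are assumptions based only on the log lines quoted in this issue, not on a documented log format.

```python
import re


def summarize_mofed_failure(log: str) -> dict[str, object]:
    """Extract the key failure details from a MOFED container log."""
    kernel = re.search(r"No OFED driver found for kernel (\S+)", log)
    # Package names follow the message on the same line, each ending in an arch.
    missing = re.search(
        r"Unable to find a match:((?:[ \t]+\S+\.(?:x86_64|aarch64|noarch))+)", log
    )
    command = re.search(r'Command "([^"]+)" failed with exit code: (\d+)', log)
    return {
        "kernel": kernel.group(1) if kernel else None,
        "missing_packages": missing.group(1).split() if missing else [],
        "failed_command": command.group(1) if command else None,
        "exit_code": int(command.group(2)) if command else None,
    }


# Excerpt of the MOFED pod log quoted in this issue.
log = (
    "No OFED driver found for kernel 5.14.0-284.40.1.el9_2.x86_64\n"
    "Installing Linux kernel headers...\n"
    "Error: Unable to find a match: "
    "kernel-headers-5.14.0-284.40.1.el9_2.x86_64 "
    "kernel-devel-5.14.0-284.40.1.el9_2.x86_64\n"
    'Command "dnf -q -y --releasever=9.2 install '
    "kernel-headers-5.14.0-284.40.1.el9_2.x86_64 "
    'kernel-devel-5.14.0-284.40.1.el9_2.x86_64" failed with exit code: 1\n'
)

summary = summarize_mofed_failure(log)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The summary narrows the question to why those two packages are not resolvable from the enabled repositories. Whether the driver itself supports the kernel is a separate question that this sketch does not answer.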