NVIDIA / yum-packaging-precompiled-kmod

NVIDIA precompiled kernel module packaging for RHEL
Apache License 2.0

Building modules for CentOS Stream #31

Open jcpunk opened 2 years ago

jcpunk commented 2 years ago

Hello,

This ticket is a more formalized method of reviewing https://github.com/NVIDIA/yum-packaging-precompiled-kmod/issues/21 for CentOS Stream 9 (and following).

I’d like to outline my understanding of your process, to make sure I know what resources CentOS needs to provide for this to be reconsidered. I'm hopeful this pile of information will help in the process of reviewing and building for CentOS Stream.

  1. You are notified of a new RHEL kernel via “some process”
  2. The driver team reviews relevant changes to the various kernel interfaces
  3. Code is written/updated to reflect the new RHEL kernel behavior
  4. A pre-compiled kmod is published with a specific target of the new kernel.

Each pre-compiled kmod is specific to a released kernel. The kABI is not used in part because of the limited introspection, in part because 1:1 mappings are much easier to validate, and in part because the DRM subsystem is not on the stable kABI list.

Much of what I’ve got written here is covered in much more detail at https://wiki.centos.org/Events/Dojo/FOSDEM2022 “Tracking Kernel Rate of Change”. I’ve gone into (painful?) detail on exact change metrics on real world data within that presentation.

At minor releases within the RHEL product lifecycle (8.x, 9.x) the RHEL kernel is updated with changes published in the CentOS Stream kernel since the last minor release (x-1).

I believe the clearest benefit to NVIDIA adding pre-compiled kmods for CentOS Stream 9 is in reducing the time pressure on engineering for steps 2-4 regarding RHEL sync up.

Since the existing drivers compile against the kernel.org kernel, engineering is probably spending most of its time on step 2. The changes to the RHEL kernel come as backports from the kernel.org kernel, so I suspect much of the codebase is ready by the time those patches make it to RHEL. There are doubtless places where new connective hooks need to be inserted, or where the complex “ifdef” maze of supporting multiple kernels needs to be extended.

I believe CentOS Stream 9 can help with this in two ways:

  A) You can see exactly how the kernel is changing in RHEL 9 at https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9
  B) You can run your validation process early against what will change in RHEL.

On element (B), this is perhaps most clearly expressed in the RHEL 8.5 kernel vs the CentOS Stream 8 kernel. The RHEL 8.5 GA kernel was released into CentOS Stream 8 a total of 19 days before RHEL 8.5 GA. All kmods tracking CentOS Stream 8 were ready for the RHEL 8.5 release kernel weeks before RHEL 8.5 was published. The two kernels are identical at that point.

There are some kernels released in CentOS Stream 8 that serve as a “stepping stone” to the next big kernel. However, I view this as an advantage to folks who need to review the actual function changes: these “stepping stone” kernels have smaller patch sets, and thus less to review. The changes are backported from kernel.org, so this isn’t like supporting a separate distribution kernel. The code comes from kernel.org (which already works) and goes into RHEL (which you are targeting).

If we can establish a clear method for notification when new packages are published in CentOS Stream, is this something you’d be willing to explore - packaging for CentOS Stream in addition to RHEL?

I don't love the idea of regular polling, but today anyone can pull down https://git.centos.org/api/0/rpms/kernel/git/tags?with_commits=True and filter against 'imports/c8s/' or 'imports/c9s/' to get the list of current tagged kernels. The commit hashes listed there can be transformed into a kernel source via the scripts in centos-git-common or centpkg. In theory a repoquery of BaseOS should provide a clear list of kernels in the release to be compared against the listed tags.
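As a rough illustration of that polling approach, here is a minimal Python sketch. It assumes the endpoint returns a JSON object with a "tags" mapping of tag name to commit hash, and that tag names look like 'imports/c9s/kernel-<version>'; I have not verified either against the live service. The function names are hypothetical, not part of any existing tooling.

```python
import json
import urllib.request

# URL taken from the comment above; the response layout is an assumption.
TAGS_URL = "https://git.centos.org/api/0/rpms/kernel/git/tags?with_commits=True"


def fetch_tags(url=TAGS_URL, timeout=30):
    """Download and decode the tag listing (performs a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def filter_kernel_tags(payload, prefix):
    """Return {tag without prefix: commit hash} for tags starting with prefix.

    Assumes payload["tags"] is a dict of tag name -> commit hash. A plain
    list of tag names is also tolerated, with the commit hash set to None.
    """
    tags = payload.get("tags", {})
    if isinstance(tags, list):
        tags = {name: None for name in tags}
    return {
        name[len(prefix):]: commit
        for name, commit in tags.items()
        if name.startswith(prefix)
    }


def unseen_kernels(tagged, already_built):
    """Return the sorted tagged kernels that are not in already_built."""
    return sorted(set(tagged) - set(already_built))
```

Something like `filter_kernel_tags(fetch_tags(), 'imports/c9s/')` would give the current CentOS Stream 9 tags, and `unseen_kernels` could then compare them against whatever list of kernels you have already built for (or against the repoquery output from BaseOS).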

With the existing published content for CentOS Stream 9, you could begin prep work for RHEL 9 and reduce any delay in support for that platform after launch.