kubernetes-sigs / node-feature-discovery

Node feature discovery for Kubernetes
Apache License 2.0
803 stars 245 forks source link

`kernel.loadedmodule` doesn't work on all OS #1025

Closed budimanjojo closed 1 year ago

budimanjojo commented 1 year ago

What would you like to be added: Currently, kernel.loadedmodule feature only check for the content of /proc/modules for modules compiled as module and not built in tree module. Maybe there's a way to check for built-in kernel that is in /lib/modules/<kernel-version>/modules.builtin. I think this can be done by either improve the function getLoadedModules or allow the user to choose what kind of test to get the information.

Why is this needed:

Currently Talos Linux will just silently not creating nodeFeatureRule when I put matchFeatures that check for kernel.loadedmodule even if the module is loaded.

marquiz commented 1 year ago

Thanks @budimanjojo for bringing this forth, I can understand the problem.

Maybe we should have a feature called kernel.enabledmodule or smth that would list both the loaded dynamic modules and the builtin modules(?)

I guess /lib/modules/<KVERSION>/modules.builtin is the canonical source of the information. At first I was thinking at /sys/module/ instead but I guess that doesn't list all builtin modules.

Any thoughts @zvonkok @mythi?

@budimanjojo for now you should be able to bridge the gap using kernel.config and the matchAny. I.e. have two matchFeatures rules under matchAny: first for CONFIG=m && loadedmodule=yand the other just CONFIG=y.

marquiz commented 1 year ago

/help

k8s-ci-robot commented 1 year ago

@marquiz: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to [this](https://github.com/kubernetes-sigs/node-feature-discovery/issues/1025): >/help Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
budimanjojo commented 1 year ago

@marquiz Thanks! I don't know whether /lib/modules/<KVERSION>/modules.builtin or /sys/module/ is the better choice and this need some deeper investigation. In Talos, I see there are significantly bigger list in /lib/modules though:

 talosctl -n 192.168.200.21 cat /lib/modules/5.15.85-talos/modules.builtin | wc -l
812
 talosctl -n 192.168.200.21 ls /sys/module | wc -l
214
marquiz commented 1 year ago

Thanks! I don't know whether /lib/modules/<KVERSION>/modules.builtin or /sys/module/ is the better choice and this need some deeper investigation.

Looking at the kernel documentation (https://www.kernel.org/doc/Documentation/ABI/stable/sysfs-module and https://www.kernel.org/doc/Documentation/kbuild/kbuild.txt) looks like /lib/modules/ would be the right choice as that is sure to list ALL builtin modules. /sys/modules is missing builtin modules that neither have a version nor any parameters.

budimanjojo commented 1 year ago

@marquiz I'm fine with that. Should this replace the current check or a choice for the users? Because the problem is the current kernel.loadedmodule doesn't mean the module is truly loaded.

marquiz commented 1 year ago

Should this replace the current check or a choice for the users? Because the problem is the current kernel.loadedmodule doesn't mean the module is truly loaded.

At a quick thought I think we could/should leave kernel.loadedmodule as is, just clearly document that it lists loaded dynamic kernel modules. Just have a new feature for listing all modules, dynamic and builtin which would be practically a union of kernel.loadedmodule and modules.builtin. WDYT?

budimanjojo commented 1 year ago

That's okay. @marquiz but, /sys/modules doesn't only show builtin modules. /lib/modules/<kVersion>/modules.builtin is the way to check for builtin modules I think? I'm not sure if it's standard in all OSes though.

marquiz commented 1 year ago

/sys/modules doesn't only show builtin modules. /lib/modules/<kVersion>/modules.builtin is the way to check for builtin modules I think? I'm not sure if it's standard in all OSes though.

Exactly. What I tried to say is to use /lib/modules/<kVersion>/modules.builtin and combine that with /proc/modules (which we already have) to get the list of ALL modules enabled in the system. AFAIU /lib/modules/<kVersion> is the standard in pretty much all OSes. And if that doesn't exist then we just don't populate the new feature as we cannot get info about ALL modules.

budimanjojo commented 1 year ago

@marquiz someone found this commit (the deleted code), hope it can give u some ideas https://github.com/kubernetes/kubernetes/pull/114669/files 🙂

AhmedGrati commented 1 year ago

/assign

mythi commented 1 year ago

Maybe there's a way to check for built-in kernel that is in /lib/modules/<kernel-version>/modules.builtin.

I was first confused what module means in this context. If I understand it correctly, you're not looking for "module is available but not loaded" but rather "module is either =y or =m in the kernel config". Would kernel.config feature source work for you?

For example, I have a need for vfio-pci that moved to a built-in in the recent Ubuntu versions so I can have:

- feature: kernel.config
  matchExpressions:
    VFIO_PCI:
      op: Exists
marquiz commented 1 year ago

I guess the problem is that with that you are not able (with a simple rule) to determine whether the driver has been loaded (if it happens to be =m)

mythi commented 1 year ago

AFAIU, #1086 does not solve that either so custom logic would be needed. I have no strong opinions either way but to me it looks NFD provides the necessary NodeFeatureRules capabilities to solve this.

E.g.,

my-module=ok if config=y or loadedmodule exists my-module=needs-loading if config=m and loadedmodule doesnotexist

marquiz commented 1 year ago

Yes, for determining if a module needs to be loaded a more complex rule would be needed. But just to check if a driver/module has been loaded you would be served by a simple check

my-module=ok if config=y or loadedmodule exists

If loadedmodule would contain also builtin modules it should be enough to do

my-module=ok if loadedmodule exists

right?

my-module=needs-loading if config=m and loadedmodule doesnotexist

This would be true. Unless you're 100% sure that the driver has been enabled (either m 0r y) you could do just

my-module=needs-loading if loadedmodule doesnotexist
mythi commented 1 year ago

If loadedmodule would contain also builtin modules it should be enough to do

my-module=ok if loadedmodule exists

right?

Yes, true. But I tend to agree with your comment that maybe keep the existing loadedmodule unchanged and have something new for this combined functionality.

Unless you're 100% sure that the driver has been enabled (either m 0r y) you could do just

perhaps a bit offtopic but how should we deal with out-of-tree modules?

marquiz commented 1 year ago

@zvonkok any thoughts on any of this?