Closed mlorenzofr closed 1 month ago
@mlorenzofr: This pull request references MGMT-18923 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.
Attention: Patch coverage is 73.23944% with 19 lines in your changes missing coverage. Please review.
Project coverage is 59.70%. Comparing base (0ae2fc8) to head (c89a326). Report is 4 commits behind head on master.
/lgtm
/lgtm
/restest
/lgtm
/lgtm
/hold
Looks good to me, @CrystalChun would u mind taking a quick look in case I am missing something?
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mlorenzofr, rccrdpccl
The full list of commands accepted by this bot can be found here.
The pull request process is described here
/cc @ori-amizur
Does the change allow detecting other (non-GPU) PCI devices?
Yes. In fact, Gaudi devices are FPGAs and are classified under the processing accelerators class (0x12 instead of 0x03). However, they will still be added to the GPU list.
But for this development, the requirement was to detect Nvidia and Gaudi cards as GPUs so that they can be found in the inventory. RHEL AI uses them, and detecting this hardware through the inventory would make some work easier.
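To illustrate the class-code distinction discussed above, here is a minimal sketch, not the code from this PR. It assumes the class value is read as a Linux sysfs-style string such as "0x030200" (base class, subclass, programming interface). The names baseClass and isGPULike are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PCI base class codes relevant to the discussion.
const (
	classDisplayController     = 0x03 // VGA and 3D controllers (most GPUs)
	classProcessingAccelerator = 0x12 // processing accelerators, e.g. Gaudi
)

// baseClass extracts the base class byte from a sysfs-style class string
// such as "0x030200" (base class 0x03, subclass 0x02, prog-if 0x00).
// Hypothetical helper, not part of ghw or of this PR.
func baseClass(sysfsClass string) (uint8, error) {
	s := strings.TrimPrefix(strings.TrimSpace(sysfsClass), "0x")
	v, err := strconv.ParseUint(s, 16, 32)
	if err != nil {
		return 0, fmt.Errorf("parse PCI class %q: %w", sysfsClass, err)
	}
	return uint8(v >> 16), nil
}

// isGPULike reports whether a device should land in the GPU list under the
// assumption made here: display controllers and processing accelerators
// both count. Unparseable class strings are treated as non-GPU.
func isGPULike(sysfsClass string) bool {
	c, err := baseClass(sysfsClass)
	if err != nil {
		return false
	}
	return c == classDisplayController || c == classProcessingAccelerator
}

func main() {
	// 0x0302: 3D controller, 0x1200: processing accelerator, 0x0200: Ethernet.
	for _, class := range []string{"0x030200", "0x120000", "0x020000"} {
		fmt.Printf("%s gpu-like=%v\n", class, isGPULike(class))
	}
}
```

Matching on the base class alone is a deliberately coarse rule. A real filter would probably also need vendor or device IDs to avoid listing unrelated accelerators, which may be what the configuration option is for.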
The changes added here should:
/retest
@mlorenzofr: all tests passed!
Full PR test history. Your PR dashboard.
/lgtm
/unhold
[ART PR BUILD NOTIFIER]
Distgit: ose-agent-installer-node-agent This PR has been included in build ose-agent-installer-node-agent-container-v4.18.0-202410010938.p0.gaab6665.assembly.stream.el9. All builds following this will include this PR.
These changes fix an issue in inventory when GPUs are discovered on the system. Because of how the ghw library works, only devices with a DRM interface are considered GPUs, which excludes devices like the Nvidia A100 or Habana Gaudi because they do not create that interface.
inventory will now check all PCI devices looking for GPU devices. To provide more flexibility in the future, a new configuration option has been added to inventory. The configuration can be set via a configuration file or via the INVENTORY_GPU_CONFIG environment variable (documentation and examples are provided in the README).
The inventory binary has been built (make build) and tested on my own laptop, a Beaker machine, a VM, and a physical machine with Nvidia A100 hardware. In all cases, the GPUs were detected correctly, and filtering via the environment variable and the configuration file was also successful.
List all the issues related to this PR
How was this code tested?
Checklist
Reviewers Checklist