bsper2 closed this issue 5 months ago
A couple of possible ways to tell if an NVIDIA GPU is installed in a given host:
# SEARCH PCI DEVICES - PROBABLY BETTER FOR A CUSTOM FACT
lspci | grep -i nvidia | grep -Eiqw '3D|Tesla' && echo true || echo false
# RUN nvidia-smi - REQUIRES THE NVIDIA SOFTWARE TO BE INSTALLED
nvidia-smi 2>/dev/null | grep -q NVIDIA && echo true || echo false
I think the fact could be written something like the following:
Facter.add(:nvidia_gpu) do
  setcode do
    # Returns the string "true" or "false" printed by the shell pipeline.
    `lspci | grep -i nvidia | grep -Eiqw '3D|Tesla' && echo true || echo false`.strip
  end
end
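A slightly more defensive variant of the same idea, as a rough sketch only: it applies the same match as the lspci pipeline in plain Ruby and returns a real boolean rather than the strings "true"/"false". The module name NvidiaGpuFact and its present? helper are hypothetical names made up for illustration. The sketch also assumes the installed Facter provides Facter::Core::Execution.execute with the on_fail option and that confining to the Linux kernel is acceptable for this module.

```ruby
# Hypothetical helper module (name is an assumption, not part of any existing code).
module NvidiaGpuFact
  # True when any lspci line mentions NVIDIA and also contains the whole
  # word "3D" or "Tesla" (case-insensitive), mirroring
  # `grep -i nvidia | grep -Eiqw '3D|Tesla'`.
  def self.present?(lspci_output)
    lspci_output.each_line.any? do |line|
      line.match?(/nvidia/i) && line.match?(/\b(?:3D|Tesla)\b/i)
    end
  end
end

# Register the fact only when Facter is loaded, so the helper above can
# also be exercised on its own (for example in a unit test).
if defined?(Facter)
  Facter.add(:nvidia_gpu) do
    # Assumption: lspci is only expected on Linux nodes.
    confine kernel: 'Linux'
    setcode do
      # on_fail: '' yields an empty string if lspci is missing or fails,
      # so the fact resolves to false instead of raising.
      output = Facter::Core::Execution.execute('lspci', on_fail: '')
      NvidiaGpuFact.present?(output)
    end
  end
end
```

Like the shell pipeline, this only matches NVIDIA devices whose lspci line contains "3D" or "Tesla"; an NVIDIA card listed as a "VGA compatible controller" would not be detected, so the pattern may need widening depending on the hardware in use.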
Eventually it may also be useful to list the type(s) of NVIDIA GPUs installed, but that is beyond the scope of this issue.
Right now we enable the DCGM install and Telegraf collection by default, but ideally both should be gated on a fact so that they are only set up when a node has an NVIDIA GPU.
Once that fact is in place, the README.md should be updated to remove the comments about turning DCGM off on nodes without NVIDIA GPUs.