CCI-MOC / esi

Elastic Secure Infrastructure project

Extend ESI introspection to include GPU information #531

Closed (tzumainn closed 5 months ago)

tzumainn commented 5 months ago

In the long term, automatic identification of GPU nodes and their associated traits will be valuable for users who have specific GPU requirements. Baremetal introspection should be extended to collect that information.

tzumainn commented 5 months ago

Actually, it looks like this already exists. Here are the steps:

a) Create an accelerators.yaml file and place it in /etc/ironic-inspector

pci_devices:
  - vendor_id: "10de"
    device_id: "1eb8"
    type: GPU
    device_info: NVIDIA Corporation Tesla T4
  - vendor_id: "10de"
    device_id: "1df6"
    type: GPU
    device_info: NVIDIA Corporation GV100GL
  - vendor_id: "10de"
    device_id: "20b0"
    type: GPU
    device_info: NVIDIA Corporation GA100
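
For context on how these entries get used: as far as I understand it, the accelerators hook compares each PCI device reported during introspection against this list by vendor ID and device ID. Here is a rough Python sketch of that matching. It illustrates the idea and is not the actual ironic-inspector code; the `find_accelerators` helper and the `product_id` key on reported devices are assumptions made for the example.

```python
# Known devices, mirroring the accelerators.yaml entries above.
KNOWN_DEVICES = [
    {"vendor_id": "10de", "device_id": "1eb8", "type": "GPU",
     "device_info": "NVIDIA Corporation Tesla T4"},
    {"vendor_id": "10de", "device_id": "1df6", "type": "GPU",
     "device_info": "NVIDIA Corporation GV100GL"},
    {"vendor_id": "10de", "device_id": "20b0", "type": "GPU",
     "device_info": "NVIDIA Corporation GA100"},
]


def find_accelerators(reported_devices, known_devices=KNOWN_DEVICES):
    """Return the known-device entries that match reported PCI devices.

    Hypothetical helper: assumes each reported device is a dict with
    'vendor_id' and 'product_id' keys holding hex strings.
    """
    # Index known devices by (vendor, device) ID for direct lookup.
    index = {
        (d["vendor_id"].lower(), d["device_id"].lower()): d
        for d in known_devices
    }
    matches = []
    for dev in reported_devices:
        key = (dev["vendor_id"].lower(), dev["product_id"].lower())
        if key in index:
            matches.append(index[key])
    return matches


if __name__ == "__main__":
    reported = [
        {"vendor_id": "8086", "product_id": "1521"},  # example non-GPU device
        {"vendor_id": "10de", "product_id": "1eb8"},  # matches the T4 entry
    ]
    for match in find_accelerators(reported):
        print(match["type"], match["device_info"])
```

A GPU model that is not listed in accelerators.yaml would simply not match, so new hardware means adding an entry to the file.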

b) Update ironic-inspector.conf and restart the ironic-inspector service

[accelerators]
known_devices=/etc/ironic-inspector/accelerators.yaml
.
.
.
[processing]
processing_hooks=$default_processing_hooks,extra_hardware,lldp_basic,local_link_connection,physnet_cidr_map,accelerators
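
Since it's easy to set one of these two options and forget the other, here is a small sanity check, sketched in Python with the standard-library `configparser`. The `check_inspector_conf` helper is hypothetical and is not part of ironic-inspector; it only confirms that the two settings shown above are present and does not validate anything else in the file.

```python
import configparser


def check_inspector_conf(text):
    """Return a list of problems found in ironic-inspector.conf content.

    Hypothetical helper: checks only the two settings from step b.
    """
    # interpolation=None keeps "$default_processing_hooks" as a literal;
    # strict=False tolerates repeated sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)

    problems = []
    known = parser.get("accelerators", "known_devices", fallback=None)
    if not known:
        problems.append("[accelerators] known_devices is not set")

    hooks = parser.get("processing", "processing_hooks", fallback="")
    if "accelerators" not in [h.strip() for h in hooks.split(",")]:
        problems.append(
            "accelerators is missing from [processing] processing_hooks")
    return problems


SAMPLE = """
[accelerators]
known_devices=/etc/ironic-inspector/accelerators.yaml

[processing]
processing_hooks=$default_processing_hooks,extra_hardware,lldp_basic,local_link_connection,physnet_cidr_map,accelerators
"""

if __name__ == "__main__":
    for problem in check_inspector_conf(SAMPLE):
        print(problem)
```

An empty result means both settings are in place; the known_devices file itself still has to exist at that path.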

c) Update ironic.conf and restart ironic_conductor

[inspector]
extra_kernel_params = ipa-inspection-collectors=default,pci-devices

tzumainn commented 5 months ago

The next step is to put this in esi-pilot so that it's installed automatically.

tzumainn commented 5 months ago

https://github.com/CCI-MOC/esi-pilot/pull/58