nerc-project / operations

Issues related to the operation of the NERC OpenShift environment

OpenShift Virtualization: Figure out resourcequotas for GPUs that are passthroughed #820

Open naved001 opened 3 days ago

naved001 commented 3 days ago

Motivation

To offer GPUs in VMs in OpenShift Virtualization, we need to figure out how ResourceQuotas will apply when the GPUs are configured in PCI passthrough mode. This also ties into how we will manage these quotas from ColdFront and ultimately generate invoices for them.

Completion Criteria

We can enforce quotas on GPUs configured in passthrough mode.

Description

When a GPU is configured in passthrough mode, it shows up as a new allocatable resource. For instance, nvidia.com/A100_SXM4_40GB appears as a resource for A100 GPUs, and nvidia.com/GV100GL_Tesla_V100 for V100 GPUs. As a result, we will need to update the resource quotas for projects and adjust the ColdFront configuration to manage access to these resources.
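As a possible starting point, here is a minimal sketch of what a per-project quota on these resources might look like. It relies on the standard Kubernetes rule that extended resources can only be capped in a ResourceQuota through the `requests.` prefix. The two resource names come from the description above; the quota name, the namespace, and the `build_gpu_quota` helper are hypothetical and only for illustration. Whether such a quota is actually enforced against VMs using passthrough GPUs is exactly what this issue needs to confirm.

```python
import json

# Resource names as reported in this issue for passthrough GPUs.
GPU_RESOURCES = {
    "A100": "nvidia.com/A100_SXM4_40GB",
    "V100": "nvidia.com/GV100GL_Tesla_V100",
}


def build_gpu_quota(namespace, gpu_counts, name="gpu-passthrough-quota"):
    """Build a ResourceQuota manifest (as a dict) capping passthrough GPUs.

    `name` and `namespace` are hypothetical placeholders. Extended resources
    can only be limited with the `requests.` prefix in a ResourceQuota.
    """
    hard = {}
    for model, count in gpu_counts.items():
        resource = GPU_RESOURCES[model]  # KeyError for an unknown GPU model
        hard[f"requests.{resource}"] = str(count)
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"hard": hard},
    }


if __name__ == "__main__":
    # Hypothetical project allowed one A100 and no V100s.
    manifest = build_gpu_quota("example-project", {"A100": 1, "V100": 0})
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `oc apply -f` in a test project, after which starting a VM that requests one of these GPUs would show whether the quota is counted and enforced.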

Original issue where this is discussed: https://github.com/nerc-project/operations/issues/725#issuecomment-2474796687

Completion dates

Desired - 20YY-MM-DD
Required - TBD