Closed mambelli closed 1 day ago
Some clarifications. Not setting GPUs and setting them to 0 should behave the same: when GLIDEIN_Resource_Slots is not defined, does not include GPUs, or has GPUs=0, the slot should have no GPU (via Machine_resource_gpus=0). The GPU is not physically disabled or otherwise altered; it is just ignored by HTCondor and not usable by the jobs. The HTCondor configuration is created in condor_startup.sh, and that script already parses the GLIDEIN_Resource_Slots attribute when it is present. GLIDEIN_Resource_Slots is documented in https://glideinwms.fnal.gov/doc.v3_6/factory/custom_vars.html Here are some examples:
<attr name="GLIDEIN_Resource_Slots" const="True" glidein_publish="True" job_publish="False" parameter="True" publish="True" type="string" value="GPUs,1,type=main"/>
<attr name="GLIDEIN_Resource_Slots" const="True" glidein_publish="True" job_publish="False" parameter="True" publish="True" type="string" value="ioslot,2,disk=1GB;monitor;GPUs,3,,main"/>
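The rule described above can be sketched in a few lines of Python. This is only an illustration of the intended decision, not the code in condor_startup.sh (which is a shell script). The function names are hypothetical, and the sketch makes several assumptions. Entries are separated by ";" and fields by ",", as in the examples above. A missing or non-numeric count means one instance. The resource name is matched case-insensitively. The knob is written as spelled in this issue.

```python
def parse_resource_slots(value):
    """Map each resource name in a GLIDEIN_Resource_Slots string to its count."""
    counts = {}
    for entry in (value or "").split(";"):
        fields = [field.strip() for field in entry.split(",")]
        name = fields[0]
        if not name:
            continue  # skip empty entries, e.g. an undefined attribute or a trailing ';'
        number = fields[1] if len(fields) > 1 else ""
        # Assumption: a missing or non-numeric second field means one instance.
        counts[name] = int(number) if number.isdigit() else 1
    return counts


def gpu_config_lines(value):
    """Return the extra config lines needed to keep GPUs out of the slots."""
    counts = parse_resource_slots(value)
    # Assumption: the resource name is matched case-insensitively.
    gpus = sum(n for name, n in counts.items() if name.lower() == "gpus")
    # Undefined, no GPUs entry, and GPUs set to 0 are all treated the same way.
    return [] if gpus > 0 else ["Machine_resource_gpus = 0"]
```

With this sketch, an undefined attribute, a value such as "ioslot,2,disk=1GB;monitor", and "GPUs,0" all yield the line that sets the GPU count to 0, while both example values above yield no extra line.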
Is your feature request related to a problem? Please describe.
HTCondor changed its behavior. When GPUs are available on the host, it now sets them up in the machine unless explicitly told not to. This is part of its changes to encourage explicit settings and to distinguish them from values left undefined: not setting a resource is different from setting it to 0. Factory operators still expect not to have any GPU in the machine unless they ask for it explicitly by setting GLIDEIN_Resource_Slots.
There are multiple ways to tell HTCondor not to consider GPUs:
After discussing with TJ in a meeting on 10/9, it seems that the last two are the preferred solutions.
Describe the solution you'd like
When GLIDEIN_Resource_Slots is not defined or does not include GPUs, set Machine_resource_gpus=0 in the configuration of the slots. This should go in the generated condor config made for the glidein (in condor_startup.sh).
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Info (please complete the following information): Stakeholders and components can be a comma-separated list or on multiple lines. If you add a new stakeholder or component, not on the sample list, add it on a line on its own.
Additional context NA