Closed Qubitium closed 1 week ago
nvidia-smi lists GPUs in PCI_BUS_ID order, but a Python program may be launched with the default CUDA device order, which is not PCI_BUS_ID. If the two orderings do not match, the wrong GPU is returned for gpu_id. This PR validates the env and raises an error if a mismatch exists.
TESTS
@microsoft-github-policy-service agree
@LeiWang1999 Ready for review. The CUDA device order env must be validated (it has to match nvidia-smi) in a multi-GPU environment, or we get the wrong GPU back.
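For illustration, the kind of check described here could look roughly like the sketch below. It is not the code from this PR: the function name `validate_cuda_device_order` and its parameters are hypothetical, and it assumes the env in question is CUDA's `CUDA_DEVICE_ORDER` variable, which defaults to `FASTEST_FIRST` when unset.

```python
import os


def validate_cuda_device_order(gpu_count, env=None):
    """Raise if the CUDA device order may differ from nvidia-smi's order.

    Hypothetical helper for illustration only, not the PR's implementation.
    `gpu_count` is the number of GPUs on the machine; `env` defaults to
    os.environ and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    # CUDA falls back to FASTEST_FIRST when the variable is unset.
    order = env.get("CUDA_DEVICE_ORDER", "FASTEST_FIRST")
    # With a single GPU there is only one possible ordering, so skip the check.
    if gpu_count > 1 and order != "PCI_BUS_ID":
        raise RuntimeError(
            "CUDA_DEVICE_ORDER is %r but nvidia-smi uses PCI_BUS_ID order; "
            "set CUDA_DEVICE_ORDER=PCI_BUS_ID so gpu_id maps to the same GPU."
            % order
        )
    return order
```

Launching the program with `CUDA_DEVICE_ORDER=PCI_BUS_ID` set in the environment would make such a check pass on a multi-GPU machine.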