Feature: drain selected GPUs on boot

This PR adds the ability to take faulty GPUs out of service by setting the variable nvidia_drain_devices at the host level. If the variable is defined, our nvidia role creates a boot-time service that passes the selected devices to nvidia-smi drain. As a result, a drained device is no longer advertised as a CUDA device but remains visible to lspci: it is hidden from end-user programs, while an administrator can still run validation routines on it.
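
A minimal sketch of how a host might opt in, for illustration only. The variable name nvidia_drain_devices comes from this PR. The file path, the host name, the list-of-PCI-bus-IDs format, and the example address are assumptions and may not match what the role actually expects.

```yaml
# host_vars/gpu-node-01.yml  (hypothetical host name and file path)
#
# Assumed format: a list of PCI bus IDs as printed by lspci -D.
# Check the role defaults for the actual expected format.
nvidia_drain_devices:
  - "0000:3b:00.0"   # hypothetical address of the faulty GPU

# With this set, the boot-time service would presumably run a command
# along these lines for each entry (exact invocation is an assumption):
#   nvidia-smi drain -p 0000:3b:00.0 -m 1
```

Leaving the variable undefined on a host should leave that host unchanged, since the service is only created when the variable is defined.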

IKIM-Essen / EMCP-config #140