hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Support for Nvidia GPUs #3499

Closed diptanu closed 5 years ago

diptanu commented 6 years ago

Support for NVIDIA GPU cards as resources in Nomad is needed so that applications which require GPUs can be scheduled onto machines that have GPU cards.

Here are the requirements that we have:

a. Schedule applications in a fleet where only a subset of the cluster has GPU cards.
b. Schedule applications on specific GPU cards, such as Tesla. Some of the machines might have older hardware, so users might want to choose a specific generation.
c. Provide tasks with a GPU device and the default devices such as /dev/nvidiactl and /dev/nvidia-uvm.
d. Collect GPU metrics such as temperature, memory used, utilization, fan speed, and the number of PIDs using the cards, and expose them as part of the node metrics.

Implementation:

a. Fingerprint the GPUs using nvidia-smi and enumerate the GPU devices.
b. Add NvidiaGPUResource in Resources, and implement all the logic for copying, adding, and indexing GPU devices. We need an indexing mechanism like network devices here (probably simpler), since we are handing out one GPU device per allocation (or task?).
c. Add an API and a statistics collector on the client for exposing the GPU stats.
d. Add scheduling logic (similar to network devices).
e. Support GPUs in the jobspec so that users can ask for a GPU device.

We can land the PRs in roughly the same order. The reason I am making Nvidia specific data models is that I am not sure how the AMD GPUs look like or how they are being used by developers.

diptanu commented 6 years ago

@caiush Let me know if you have any additional use cases that aren't covered in my description.

dadgar commented 6 years ago

What are you thinking for e? Just gpu = n or its own stanza with additional details?

diptanu commented 6 years ago

@dadgar There are many ways of doing this; I'm not sure what the best way is. Users could ask for n GPUs in the resources block and add constraints such as GPU model, driver version, etc. We could also do a GPU-specific stanza, which is probably more work.
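For illustration, a minimal sketch of the first option. The gpu key in resources and the ${attr.gpu.model} attribute are hypothetical names, not an existing Nomad API; this is only what a jobspec could look like under that design:

job "ml-train" {
    group "train" {
        # hypothetical: constrain placement to nodes fingerprinted with a given GPU model
        constraint {
            attribute = "${attr.gpu.model}"
            value     = "Tesla P100"
        }

        task "train" {
            driver = "docker"

            resources {
                # hypothetical: ask for n GPUs directly in the resources block
                gpu = 2
            }
        }
    }
}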

acbrewbaker commented 6 years ago

Hi guys. How can I help with this effort?

ryanmickler commented 6 years ago

Well, the first use case to provide for is the new nvidia-docker runtime.

To do this, we'd like to be able to detect that the Docker runtime 'nvidia' is present on the host, as well as the number of GPUs present. Then we'd pass the argument --runtime=nvidia to the docker run command, as well as -e NVIDIA_VISIBLE_DEVICES={1,3}, where 1 and 3 are the IDs of the GPUs on the host. Say we want to schedule a task that uses 2 GPUs and the host has 8 GPUs; we'd like Nomad to be able to manage and schedule the use of this resource.

So perhaps in our job's task stanza with driver = "docker", inside the config stanza we would have runtime = "nvidia", and in the resources stanza we would have gpu = 2.
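A minimal sketch of that proposal, assuming hypothetical runtime and gpu keys (neither exists in Nomad at this point; this only shows what the jobspec could look like):

task "gpu-task" {
    driver = "docker"

    config {
        image   = "nvidia/cuda:9.0-base"
        # hypothetical key: would translate to `docker run --runtime=nvidia`
        runtime = "nvidia"
    }

    resources {
        # hypothetical key: reserve two of the node's GPUs
        gpu = 2
    }
}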

How does that sound as a first pass?

see: https://github.com/NVIDIA/nvidia-docker https://github.com/NVIDIA/nvidia-container-runtime

dadgar commented 6 years ago

Hey all, this is very much a priority for us. There is, however, some prerequisite work that we would like to tackle first to better organize the client and allow for this type of resource manager. As such, I would caution against community contributions until that lands!

ryanmickler commented 6 years ago

OK, I'm posting here a workaround we are using until an official runtime=nvidia feature is added.

Here is a job file that mimics the effect of the nvidia-docker command. It seems to work, but I'm not sure it is as full-featured, and it hard-codes a lot of version numbers that could be made dynamic.

job "gpu-job" {
    type = "service"
    group "gpu-group" {
        task "gpu-task" {

            env {
                # from nvidia-docker
                PATH                       = "/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
                CUDA_VERSION               = "9.0.176"
                CUDA_PKG_VERSION           = "9-0=9.0.176-1"
                LD_LIBRARY_PATH            = "/usr/local/cuda/extras/CUPTI/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64"
                NVIDIA_VISIBLE_DEVICES     = "all"
                NVIDIA_DRIVER_CAPABILITIES = "compute,utility"
                NVIDIA_REQUIRE_CUDA        = "cuda>=9.0"
                NCCL_VERSION               = "2.1.2"
                CUDNN_VERSION              = "7.0.5.15"
            }

            driver = "docker"

            config {
                # to access the GPU device, we need to be privileged
                privileged = true

                volumes = [
                    # mount the nvidia-docker driver (NOT SURE IF WE NEED THIS?)
                    "/var/lib/nvidia-docker/volumes/nvidia_driver/$${meta.libcuda_version}:/usr/local/nvidia",
                    # all the libraries /usr/lib/x86_64-linux-gnu/libcuda*
                    "/usr/lib/x86_64-linux-gnu/libcuda.so:/usr/lib/x86_64-linux-gnu/libcuda.so",
                    "/usr/lib/x86_64-linux-gnu/libcuda.so.1:/usr/lib/x86_64-linux-gnu/libcuda.so.1",
                    "/usr/lib/x86_64-linux-gnu/libcuda.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libcuda.so.$${meta.libcuda_version}",
                    # all the libraries /usr/lib/x86_64-linux-gnu/libnvidia*
"/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so:/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so",
"/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1.0.0:/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1.0.0",
"/usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1.0.1:/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1.0.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-encode.so:/usr/lib/x86_64-linux-gnu/libnvidia-encode.so",
"/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so:/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so",
"/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-gtk2.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-gtk2.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-gtk3.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-gtk3.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-ifr.so:/usr/lib/x86_64-linux-gnu/libnvidia-ifr.so",
"/usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-ml.so:/usr/lib/x86_64-linux-gnu/libnvidia-ml.so",
"/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so:/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so",
"/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1:/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1",
"/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.$${meta.libcuda_version}",
"/usr/lib/x86_64-linux-gnu/libnvidia-tls.so.$${meta.libcuda_version}:/usr/lib/x86_64-linux-gnu/libnvidia-tls.so.$${meta.libcuda_version}",
                    # all the nvidia devices /dev/nvidia*
                    "/dev/nvidia0:/dev/nvidia0",
                    "/dev/nvidia1:/dev/nvidia1",
                    "/dev/nvidiactl:/dev/nvidiactl",
                    "/dev/nvidia-uvm:/dev/nvidia-uvm",
                    "/dev/nvidia-uvm-tools:/dev/nvidia-uvm-tools",
                ]

                # e.g. use Google Tensorflow gpu container.
                image = "gcr.io/tensorflow/tensorflow:latest-gpu"
            }
        }
    }
}

Secondly, to set the $${meta.libcuda_version} meta value on the host, use the following script while setting up nomad.hcl (you will need to have a meta block already in the file):

LC_VERSION=$( find /usr/lib/x86_64-linux-gnu/ -name "libcuda.so.*.*" -print0 | cut -d'.' -f 3-4 )
sed -i -- "s|meta {|meta {\n    libcuda_version = \"$LC_VERSION\"|g" /etc/nomad.d/nomad.hcl
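For reference, after the sed command runs, the client configuration should contain something like the following (assuming the meta block sits inside the client stanza; the "390.48" driver version is only an example value):

client {
    meta {
        libcuda_version = "390.48"
    }
}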

Hope this helps.

zezke commented 6 years ago

I am looking into how to get this working as well this week. CUDA_VISIBLE_DEVICES can probably be set via environment variables, but as far as I know we are currently unable to pass extra parameters to docker run commands. Maybe adding that would be an easy first step?

hmeine commented 6 years ago

@dadgar I have exactly the same requirements and goals as @ryanmickler. Could you give us an update on the timeline of this feature? You mentioned that you want to "first … better organize the client to allow for this type of resource manager"; are there any related publicly visible issues that would shed light on the progress there?

hmeine commented 6 years ago

@dadgar Sorry to bother you again, but could you give a brief status report pretty-please? (See previous comment.)

adragoset commented 5 years ago

Any update on this? Right now I'm just setting the Docker daemon to default to the nvidia runtime and dedicating certain hosts to GPU work only. Messy, but I don't know of a better option until this feature makes it in.

dadgar commented 5 years ago

@adragoset We are actively working on this feature and it should be part of 0.9.0.

momania commented 5 years ago

@dadgar Great news! Is there any info available on what the feature will look like? Will the GPUs, for instance, be added to the resources and as such be configurable per job?

endocrimes commented 5 years ago

Hey folks, this has shipped as part of the Nomad 0.9 beta!

GPU support was enabled by implementing the Device Plugin API. You can find the docs for the Nvidia plugin here, and for usage inside your job configuration here.
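For example, with the Nvidia device plugin enabled on a client, a task can request a GPU through a device stanza inside its resources block. The sketch below is adapted from the direction described in those docs (the image, count, and memory constraint are illustrative; consult the linked documentation for the exact syntax):

job "gpu-test" {
    datacenters = ["dc1"]

    group "smi" {
        task "smi" {
            driver = "docker"

            config {
                image   = "nvidia/cuda:9.0-base"
                command = "nvidia-smi"
            }

            resources {
                # request one Nvidia GPU from the device plugin
                device "nvidia/gpu" {
                    count = 1

                    # optionally constrain which GPUs are eligible
                    constraint {
                        attribute = "${device.attr.memory}"
                        operator  = ">="
                        value     = "2 GiB"
                    }
                }
            }
        }
    }
}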

Thanks :)

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.