NVIDIA / dcgm-exporter

NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
Apache License 2.0

Collect container name even when not using K8S #238

Open BryanQuigley opened 7 months ago

BryanQuigley commented 7 months ago

I'm not using K8S, but I want to collect the container name as part of the metrics. Each job runs in a container, and the container name matches the job ID we want to query by.

I'm hoping I'm just missing an option to have the container name collected as a metric label via docker (or podman) environment variables. docker inspect shows which devices are visible via NVIDIA_VISIBLE_DEVICES.
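For example (a rough sketch; the container name 12345.hpcq is just an illustration, and this assumes the runtime sets NVIDIA_VISIBLE_DEVICES in the container env):

```
# Read NVIDIA_VISIBLE_DEVICES from a container's environment
docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' 12345.hpcq \
  | grep '^NVIDIA_VISIBLE_DEVICES='
```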

Or grab the output from a file which lists device names like /dev/nvidia5.

Any other approaches welcome!

nvvfedorov commented 7 months ago

@BryanQuigley, thank you for the suggestion. Today, dcgm-exporter uses the following logic:

When the no-hostname config option is false (the default), dcgm-exporter attempts to read the hostname from the NODE_NAME environment variable, falling back to the container hostname.
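In shell terms the fallback is roughly this (a sketch of the described behavior, not dcgm-exporter's actual code):

```
# Use NODE_NAME if set, otherwise fall back to the container hostname
node_label="${NODE_NAME:-$(hostname)}"
echo "hostname label: ${node_label}"
```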

Can you check what you see on the PBS instance?

BryanQuigley commented 7 months ago

Thanks for the quick reply! Sorry, I missed some key bits. We can't run dcgm-exporter in each container, which I believe is what was described above.

We want to deploy dcgm-exporter on the host and have it report on all containers on that host. The hostname field currently works as we want. We want to add a container field, similar to what is done in k8s.

The simple case is just one container to parse, but on some nodes there may be 8 containers we want metrics from (and the reason we are moving away from nvidia-smi metrics is MIG support, so that means potentially a lot of individual containers).

Docker inspect path

  1. Loop through all running containers
  2. docker inspect 12345.hpcq (container name)
  3. Under Config -> Env there is NVIDIA_VISIBLE_DEVICES, which lists the GPU device IDs.
  4. Add container=12345.hpcq as the container name to the metrics for those GPU device IDs (see the sketch after this list).
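Something like this hypothetical shell sketch of the idea (not existing dcgm-exporter behavior):

```
# Map running containers to the GPUs they can see via NVIDIA_VISIBLE_DEVICES
for name in $(docker ps --format '{{.Names}}'); do
  devices=$(docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$name" \
    | sed -n 's/^NVIDIA_VISIBLE_DEVICES=//p')
  echo "container=${name} gpus=${devices}"
done
```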

PBS Specific

  1. Loop through all GPU files in /var/spool/pbs/mom_priv/jobs/*.GPU
  2. cat /var/spool/pbs/mom_priv/jobs/12345.GPU to get the NVIDIA device names.
  3. Add container=12345 as the container name to the metrics for those GPU device IDs (see the sketch after this list).
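Again, a rough shell sketch of the PBS idea (the path and file naming are from our setup, not a dcgm-exporter convention):

```
# Map PBS job IDs to the GPU devices they were given
for f in /var/spool/pbs/mom_priv/jobs/*.GPU; do
  job=$(basename "$f" .GPU)        # e.g. 12345.hpcq
  gpus=$(tr '\n' ' ' < "$f")       # e.g. "/dev/nvidia2 "
  echo "container=${job} gpus=${gpus}"
done
```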

There may be other ways to associate it with a process or control group, but I don't see an obvious way to get back to the container name. Thanks!

nvvfedorov commented 7 months ago

@BryanQuigley, please run the hostname command inside the docker container.
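For example (assuming a container named 12345.hpcq):

```
docker exec 12345.hpcq hostname
```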

nvvfedorov commented 7 months ago

@BryanQuigley, I see where the problem is. Today, dcgm-exporter uses the k8s API to get a list of pods and containers and uses this information to map containers to devices. We need to evaluate and prioritize this feature request.

nvvfedorov commented 5 months ago

@BryanQuigley, what is PBS?

BryanQuigley commented 5 months ago

It's a high-performance computing workload manager (it has open-source and closed-source versions; I believe for this purpose they are the same): https://openpbs.org/ https://altair.com/pbs-professional/

nvvfedorov commented 4 months ago

@BryanQuigley, re: the GPU files in /var/spool/pbs/mom_priv/jobs/*.GPU - is this configurable? What is inside the ".GPU" file? Can the file contain a job name?

If the file contains a job name, we may read the files and provide labels: GPU => Job Name, for example.

BryanQuigley commented 4 months ago

So currently it's

cat /var/spool/pbs/mom_priv/jobs/12345.hpcq.GPU
/dev/nvidia2

Are you saying it could work today if it was instead:

cat /var/spool/pbs/mom_priv/jobs/12345.hpcq.GPU
/dev/nvidia2=12345.hpcq

Or have one file mapping all the job IDs to NVIDIA devices?

nvvfedorov commented 4 months ago

I have a similar request about another workflow manager for HPC. We are considering a file format something like this (see the example after this list):

  1. File name: the GPU ID.
  2. File content: the job ID, container ID, or any string the workflow manager can put into the file to describe the job. I assume that one GPU runs one job, so the file will have one record.
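For illustration, this is what a workflow manager could write (the directory path and job names are hypothetical, not dcgm-exporter defaults):

```
# One file per GPU, named by GPU ID, containing the job label
mkdir -p /path/to/job-mapping
echo "12345.hpcq" > /path/to/job-mapping/0   # GPU 0 runs job 12345.hpcq
echo "67890.hpcq" > /path/to/job-mapping/1   # GPU 1 runs job 67890.hpcq
```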

Is that something you can configure in your environment?

BryanQuigley commented 4 months ago

I'll check on how configurable that is.

We do have containers that use more than one GPU, so the file ends up looking like /dev/nvidia7 ... /dev/nvidia0

BryanQuigley commented 4 months ago

We should be able to create other custom files to your spec. Which files to pull from will be configurable, yes?

As long as a GPU can be associated with multiple containers, we should be good. This should work with MIG device names too?

nvvfedorov commented 4 months ago

@BryanQuigley, what do you mean by "configurable what files to pull from"? We can do the following, for example: you pass a path to a directory where dcgm-exporter can find files, where each file name is a GPU ID (numeric: 0, 1, 2, etc.), and each line of the file is assumed to be a job label value.
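For example (a hypothetical directory and hypothetical job names; the path is only an illustration), one file per GPU with one job label per line, so a GPU serving several jobs simply has several lines:

```
$ cat /path/to/job-mapping/0
12345.hpcq
$ cat /path/to/job-mapping/7
12345.hpcq
67890.hpcq
```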

BryanQuigley commented 4 months ago

path to a directory works ^

BryanQuigley commented 4 months ago

Sorry for the delay. Yes, it is configurable. Ideally we can map one job to multiple GPUs, though.

nvvfedorov commented 3 months ago

@BryanQuigley, you can try our HPC integration in the new version: https://github.com/NVIDIA/dcgm-exporter/releases/tag/3.3.6-3.4.2. Link to the readme: https://github.com/NVIDIA/dcgm-exporter/blob/main/README.md#how-to-include-hpc-jobs-in-metric-labels

BryanQuigley commented 3 months ago

Thanks! Will give it a try.

michaelact commented 2 months ago

Hi, I have a question regarding environments that are not running on top of an HPC scheduler (only a single server with a GPU). How can I get the container name in metrics?