NVIDIA / dcgm-exporter

NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
Apache License 2.0
896 stars, 155 forks

How to tell whether the current device's MIG mode is single or mixed? #293

Open lengrongfu opened 7 months ago

lengrongfu commented 7 months ago

I have an A800 device. When I enable MIG mode, dcgm-exporter does not expose any metric that tells us whether the device's MIG mode is single or mixed.

nvvfedorov commented 7 months ago

Please answer the following questions to get better assistance:

What happened?

Tell us what happened and provide as many details as possible, including logs.

What did you expect to happen?

Tell us about expected behaviour.

What is the GPU model?

Tell us about the hardware configuration of the GPU, including the output of 'nvidia-smi'.

What is the environment?

Is DCGM-Exporter running on bare metal or in a virtual environment, container, pod, etc?

How did you deploy the dcgm-exporter and what is the configuration?

Tell us how you deployed DCGM-Exporter. Did you use helm, build from source or use the GPU Operator?

How can we reproduce the issue?

Clear and concise steps to reproduce an issue can help everyone by allowing us to identify and fix problems more quickly.

What is the version?

Tell us the DCGM-Exporter version.

nvvfedorov commented 6 months ago

@lengrongfu, GPU Feature Discovery (https://github.com/NVIDIA/gpu-feature-discover) provides the "nvidia.com/mig.strategy" node label. Do you want to see this label as part of the metric output?

Can you tell us your use case?

lengrongfu commented 2 weeks ago

> @lengrongfu, GPU Feature Discovery (https://github.com/NVIDIA/gpu-feature-discover) provides the "nvidia.com/mig.strategy" node label. Do you want to see this label as part of the metric output?
>
> Can you tell us your use case?

Thanks, I know about the nvidia.com/mig.strategy node label, but our Grafana dashboard can't display the MIG mode from it.

If this feature is needed, we can contribute a PR.
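
To make the request concrete, here is a minimal sketch of how the node label could be turned into an info-style metric that a Grafana panel can query. It is only an illustration: the metric name `mig_strategy_info`, the function `mig_strategy_metric`, and the node name `gpu-node-1` are hypothetical and are not part of dcgm-exporter, and the node labels are passed in as a plain dict rather than read from the Kubernetes API.

```python
# Node label published by GPU Feature Discovery, as named in this thread.
MIG_STRATEGY_LABEL = "nvidia.com/mig.strategy"


def mig_strategy_metric(node, labels):
    """Render one Prometheus text-format line for the node's MIG strategy.

    Returns None when the label is missing or empty, so nothing is exported.
    Assumes the node name and label value are plain ASCII without quotes,
    so no label-value escaping is performed.
    """
    strategy = labels.get(MIG_STRATEGY_LABEL)
    if not strategy:
        return None
    # Hypothetical metric name; dcgm-exporter does not define it.
    return f'mig_strategy_info{{node="{node}",strategy="{strategy}"}} 1'


if __name__ == "__main__":
    # Example input only; real labels would come from the node object.
    print(mig_strategy_metric("gpu-node-1", {MIG_STRATEGY_LABEL: "mixed"}))
```

An info-style metric with a constant value of 1 keeps the strategy in a label, so a dashboard can show it directly or join it with other series on the `node` label.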