utkuozdemir / nvidia_gpu_exporter

Nvidia GPU exporter for prometheus using nvidia-smi binary
MIT License
828 stars, 101 forks

[Feature Request] Memory Usage per GPU Process #190

Open bibo318 opened 3 months ago

bibo318 commented 3 months ago

We recommend adding a visualization of RAM usage by GPU processes to your product, to help users monitor process performance more completely and accurately.

bibo318 commented 3 months ago

Can I contribute that feature to the project?

utkuozdemir commented 3 months ago

Thank you for the recommendation and the offer. Can you elaborate on what you mean by adding RAM usage? Are you talking about GPU memory usage or system memory? And do you mean only the processes that are using the GPU (like using nvidia-smi --query-compute-apps)?

The reason I'm asking is that I want to keep this project focused on monitoring Nvidia GPU-related things only. For everything else, there are far better solutions, like the node exporter.

bibo318 commented 3 months ago

I am only referring to GPU RAM. I want to add a metric that tracks the GPU memory used by each process, similar to what this command reports:

nvidia-smi --query-compute-apps=pid,process_name,used_gpu_memory --format=csv

This would make it possible to monitor GPU memory usage in more detail.
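To make the request concrete, here is a minimal Python sketch of how the output of that query could be turned into a per-process metric. It is only an illustration under stated assumptions, not how the exporter does or will implement it: the metric name example_gpu_process_used_memory_bytes, the helper names, and the sample rows are all hypothetical, and it assumes nvidia-smi is run with --format=csv,noheader,nounits so that memory is reported as a bare MiB number.

```python
import csv
import io
import subprocess

# Fields taken from the command quoted in this thread.
QUERY_FIELDS = "pid,process_name,used_gpu_memory"


def query_compute_apps() -> str:
    """Run nvidia-smi and return its raw CSV output (needs an Nvidia driver)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            f"--query-compute-apps={QUERY_FIELDS}",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_compute_apps(csv_text: str) -> list:
    """Parse 'pid, name, used MiB' rows; skip rows without a numeric value."""
    rows = []
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for record in reader:
        if len(record) != 3:
            continue  # blank or malformed line
        pid, name, used_mib = (field.strip() for field in record)
        try:
            used_bytes = int(used_mib) * 1024 * 1024
        except ValueError:
            continue  # non-numeric placeholder such as "[N/A]"
        rows.append({"pid": pid, "process_name": name, "used_bytes": used_bytes})
    return rows


def to_prometheus(rows, metric="example_gpu_process_used_memory_bytes") -> str:
    """Render rows in the Prometheus text exposition format (hypothetical metric name)."""
    lines = [f"# TYPE {metric} gauge"]
    for row in rows:
        pid = row["pid"]
        # Escape backslashes and double quotes inside the label value.
        name = row["process_name"].replace("\\", "\\\\").replace('"', '\\"')
        value = row["used_bytes"]
        lines.append(f'{metric}{{pid="{pid}",process_name="{name}"}} {value}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample rows, so the sketch runs on a machine without a GPU.
    sample = "1234, python3, 2048\n5678, /usr/bin/trainer, [N/A]\n"
    print(to_prometheus(parse_compute_apps(sample)), end="")
```

On a machine with a GPU, the sample string could be replaced by query_compute_apps(). Using the PID as a label gives detailed per-process series but creates a new time series for every new process, which is a trade-off any real implementation would need to weigh.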

utkuozdemir commented 3 months ago

I see, thank you. You are very welcome to contribute. Please just keep in mind that it might take quite some time for me to review it and give feedback, so it will need some patience (see the maintenance status warning in the README).

kakuRgs commented 1 month ago

This is what I want as well. Looking forward to an update soon.

utkuozdemir commented 1 month ago

This is the next thing I want to do, but I don't know when I will find the time.