XuehaiPan / nvitop

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
https://nvitop.readthedocs.io
Apache License 2.0

[Question] How to count the total amount of one user's GPU memory on all graphics cards by using nvitop? #95

Closed: Charlie0257 closed this issue 1 year ago

Charlie0257 commented 1 year ago

Questions

Thanks for the wonderful project!

How can I use nvitop to compute the total GPU memory used by a single user across all graphics cards?

Thanks for any suggestions! :) @XuehaiPan

XuehaiPan commented 1 year ago

Hi @Charlie0257, thanks for raising this. Here is a code snippet that sums GPU memory and host memory per user across all devices:

# memory_usage.py

import itertools
from collections import defaultdict

from nvitop import Device, GpuProcess, NA, bytes2human

devices = Device.all()  # a list of all GPUs on the system
all_gpu_processes = list(  # a list of all GPU processes on all devices
    itertools.chain.from_iterable(
        (device.processes().values() for device in devices),
    ),
)

used_gpu_memory = defaultdict(int)
used_host_memory = defaultdict(int)
with GpuProcess.failsafe():  # ignore NoSuchProcess and AccessDenied exceptions
    for process in all_gpu_processes:
        username = process.username()
        if username is NA:
            continue  # the process is gone

        # int(memory) converts an N/A value to 0, so missing readings add nothing
        used_gpu_memory[username] += int(process.gpu_memory())
        used_host_memory[username] += int(process.host_memory())

print('used_gpu_memory:', dict(used_gpu_memory))  # dict of {username: used_gpu_memory in bytes}
print('used_host_memory:', dict(used_host_memory))  # dict of {username: used_host_memory in bytes}
print()

print(
    'used_gpu_memory:',
    {username: bytes2human(memory) for username, memory in used_gpu_memory.items()},
)
print(
    'used_host_memory:',
    {username: bytes2human(memory) for username, memory in used_host_memory.items()},
)

Here is an example output:

$ python3 memory_usage.py
used_gpu_memory: {'me': 509974937600}
used_host_memory: {'me': 334980202496}

used_gpu_memory: {'me': '475.0GiB'}
used_host_memory: {'me': '312.0GiB'}
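
As a quick sanity check on the two forms of output above, the byte counts can be converted by hand. The sketch below assumes binary units (1 GiB = 1024**3 bytes), as the `GiB` suffix suggests; `to_gib` is a hypothetical helper written for this illustration and is not part of nvitop.

```python
GIB = 1024**3  # bytes per gibibyte (binary unit, assumed from the 'GiB' suffix)


def to_gib(num_bytes: int) -> str:
    """Format a byte count as GiB with one decimal place (hypothetical helper)."""
    return f'{num_bytes / GIB:.1f}GiB'


print(to_gib(509974937600))  # GPU memory total from the example output
print(to_gib(334980202496))  # host memory total from the example output
```

Both values round to the human-readable figures shown in the example output.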

Hope this resolves your question.
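
One caveat worth checking: the snippet walks the processes device by device, so a process that holds memory on several GPUs should appear once per device. Summing GPU memory per entry is what the question asks for, but host memory belongs to the process as a whole and could be added more than once. The sketch below shows the same aggregation with host memory counted once per PID, and then looks up a single user's totals. It runs on plain stand-in records rather than live nvitop objects; `ProcessRecord` and `usage_by_user` are hypothetical names introduced here for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple

GIB = 1024**3  # bytes per gibibyte


class ProcessRecord(NamedTuple):
    """Hypothetical stand-in for one (device, process) entry."""

    device_index: int
    pid: int
    username: str
    gpu_memory: int  # bytes used by this process on this device
    host_memory: int  # bytes of host RAM used by the whole process


def usage_by_user(
    records: Iterable[ProcessRecord],
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Sum GPU memory per user over all entries; count host memory once per PID."""
    gpu = defaultdict(int)
    host = defaultdict(int)
    seen_pids = set()
    for record in records:
        # GPU memory is per device, so every entry contributes
        gpu[record.username] += record.gpu_memory
        # Host memory is per process, so skip PIDs that were already counted
        if record.pid not in seen_pids:
            seen_pids.add(record.pid)
            host[record.username] += record.host_memory
    return dict(gpu), dict(host)


# Made-up sample data: PID 1001 runs on two GPUs, PID 2002 on one
records = [
    ProcessRecord(0, 1001, 'alice', 4 * GIB, 2 * GIB),
    ProcessRecord(1, 1001, 'alice', 6 * GIB, 2 * GIB),
    ProcessRecord(1, 2002, 'bob', 3 * GIB, 1 * GIB),
]

gpu_usage, host_usage = usage_by_user(records)
user = 'alice'
print(user, 'GPU bytes:', gpu_usage.get(user, 0))
print(user, 'host bytes:', host_usage.get(user, 0))
```

With live data, the records would be built from the `username()`, `gpu_memory()` and `host_memory()` calls used in the snippet above, plus each process's PID.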

Charlie0257 commented 1 year ago

Thanks for this answer! :)

I will close this issue.