XuehaiPan / nvitop

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
https://nvitop.readthedocs.io
Apache License 2.0
4.72k stars 148 forks

[Question] Memory bandwidth utilization of GPUs? #103

Closed walkieq closed 10 months ago

walkieq commented 11 months ago

Questions

Is there a way to measure the runtime memory bandwidth utilization of GPUs? Or is it possible to estimate/calculate the memory bandwidth utilization using the numbers reported in nvitop?

XuehaiPan commented 11 months ago

Is there a way to measure the runtime memory bandwidth utilization of GPUs?

@walkieq The memory bandwidth utilization rate can be retrieved by:

from nvitop import Device

devices = Device.all()
for device in devices:
    print(f'Memory usage percentage:           {device.memory_percent() / 100:%}')
    print(f'Memory bandwidth utilization rate: {device.memory_utilization() / 100:%}')

Also, you can get the PCIe and NVLink data throughput via:

device.pcie_throughput()
device.pcie_tx_throughput()
device.pcie_rx_throughput()
device.pcie_tx_throughput_human()
device.pcie_rx_throughput_human()

device.nvlink_throughput()
device.nvlink_tx_throughput()
device.nvlink_rx_throughput()
device.nvlink_tx_throughput_human()
device.nvlink_rx_throughput_human()

walkieq commented 11 months ago

Thank you so much! I have tested the API but I got some interesting results.

I am running the following:

from nvitop import Device

devices = Device.all()
for device in devices:
    print(f'Memory usage percentage:           {device.memory_percent() / 100:%}')
    print(f'Memory bandwidth utilization rate: {device.memory_utilization() / 100:%}')
    print(device.pcie_throughput())
    print(device.pcie_tx_throughput())
    print(device.pcie_rx_throughput())
    print(device.pcie_tx_throughput_human())
    print(device.pcie_rx_throughput_human())

and the results are:

Memory usage percentage:           94.600000%
Memory bandwidth utilization rate: 0.000000%
ThroughputInfo(tx=177000, rx=177000)
175000
182000
177.7MiB/s
168.9MiB/s

It looks like device.memory_utilization() reports something different from device.pcie_throughput(). Which one should I refer to?

XuehaiPan commented 11 months ago

It looks like device.memory_utilization() reports something different from device.pcie_throughput(). Which one should I refer to?

Device.memory_utilization refers to:

Percent of time over the past sample period during which global (device) memory was being read or written.

The sample period may be between 1 second and 1/6 second depending on the product.
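
Because each reading only covers one short sample period, a single call can return 0% even when memory is almost full, as long as nothing is reading or writing it at that moment. A minimal sketch of averaging several readings while a workload runs is shown below. The helper name `average_reading` and its parameters are hypothetical, not part of nvitop, and the sketch assumes that an unavailable value is returned as something non-numeric.

```python
import time


def average_reading(read_sample, samples=10, interval=0.5):
    """Average several readings taken `interval` seconds apart.

    `read_sample` is any zero-argument callable returning a number
    (hypothetical helper; not part of nvitop). Non-numeric readings,
    such as an 'N/A' placeholder, are skipped.
    """
    readings = []
    for _ in range(samples):
        value = read_sample()
        if isinstance(value, (int, float)):
            readings.append(value)
        time.sleep(interval)
    # Return None when no numeric reading was collected.
    return sum(readings) / len(readings) if readings else None


# Intended use on a machine with an NVIDIA GPU (not executed here):
#   from nvitop import Device
#   device = Device.all()[0]
#   print(average_reading(device.memory_utilization))
```

The interval should be at least as long as the sample period, otherwise consecutive calls may report the same underlying sample.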

Device.pcie_throughput refers to:

The current PCIe throughput in KiB/s.

This function is querying a byte counter over a 20ms interval and thus is the PCIe throughput over that interval.
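
The numbers posted above are consistent with this unit: dividing a raw KiB/s value by 1024 gives the MiB/s figure in the same form the `*_human()` methods printed. The small sketch below illustrates the conversion only. The helper name is hypothetical, and it is not the formatting code nvitop itself uses.

```python
def kib_per_sec_to_mib_text(kib_per_sec):
    """Format a KiB/s value as MiB/s with one decimal place.

    Hypothetical helper for illustration; nvitop's own human-readable
    formatting may choose units differently for other magnitudes.
    """
    return f'{kib_per_sec / 1024:.1f}MiB/s'


print(kib_per_sec_to_mib_text(182000))  # prints 177.7MiB/s
```

Each call takes a fresh sample over its own 20 ms window, which would explain why `pcie_throughput()`, `pcie_tx_throughput()`, and `pcie_tx_throughput_human()` returned slightly different values in the same run.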

You can see the detailed definition via:

nvidia-smi --help-query-gpu | less --pattern=utilization

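The memory utilization rate and the PCIe throughput measure different things: the former is the fraction of time during which device memory was being read or written, while the latter is the data rate over the PCIe link. You can also calculate the PCIe bandwidth utilization rate by: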

PCIe utilization rate = 100 * current PCIe throughput / theoretical PCIe throughput (via PCIe generation and PCIe width)
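
The formula above can be sketched in Python as follows. This is an illustration under stated assumptions, not nvitop's implementation: the function name and the per-lane table are hypothetical, the theoretical figure accounts only for line encoding (8b/10b for generations 1 and 2, 128b/130b for generations 3 to 5) and ignores packet overhead, and the throughput is taken to be in KiB/s for a single direction.

```python
# Approximate usable bandwidth per PCIe lane and direction, in bytes/s:
# raw transfer rate (transfers/s) * encoding efficiency / 8 bits per byte.
PCIE_LANE_BYTES_PER_SEC = {
    1: 2.5e9 * 8 / 10 / 8,      # Gen1: 2.5 GT/s, 8b/10b encoding
    2: 5.0e9 * 8 / 10 / 8,      # Gen2: 5 GT/s, 8b/10b encoding
    3: 8.0e9 * 128 / 130 / 8,   # Gen3: 8 GT/s, 128b/130b encoding
    4: 16.0e9 * 128 / 130 / 8,  # Gen4: 16 GT/s, 128b/130b encoding
    5: 32.0e9 * 128 / 130 / 8,  # Gen5: 32 GT/s, 128b/130b encoding
}


def pcie_utilization_percent(throughput_kib_per_sec, generation, width):
    """Return 100 * current throughput / theoretical throughput.

    Hypothetical helper: `generation` and `width` describe the current
    PCIe link (for example generation 3 with 16 lanes).
    """
    theoretical = PCIE_LANE_BYTES_PER_SEC[generation] * width
    return 100.0 * throughput_kib_per_sec * 1024 / theoretical


# Example with the 175000 KiB/s reading from this thread, ASSUMING a
# Gen3 x16 link (the actual link of that GPU was not reported).
print(f'{pcie_utilization_percent(175000, generation=3, width=16):.2f}%')
```

The current link generation and width can be read with `nvidia-smi --query-gpu=pcie.link.gen.current,pcie.link.width.current --format=csv`. Under the Gen3 x16 assumption, the reading above corresponds to roughly 1% of the link's theoretical bandwidth.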
XuehaiPan commented 10 months ago

Closing due to inactivity. Please feel free to ask for a reopening if you have more questions.