intel / xpumanager

Integrated GPU support #66

Closed lamalexck closed 10 months ago

lamalexck commented 1 year ago

Is the iGPU supported? Every metric shows N/A for the iGPU, and the reported GPU frequency is not the real-time frequency but the maximum frequency the SoC supports.

$ xpu-smi stats -d 0000:00:02.0
+-----------------------------+--------------------------------------------------------------------+
| Device ID                   | 0                                                                  |
+-----------------------------+--------------------------------------------------------------------+
| GPU Utilization (%)         | N/A                                                                |
| EU Array Active (%)         | N/A                                                                |
| EU Array Stall (%)          | N/A                                                                |
| EU Array Idle (%)           | N/A                                                                |
|                             |                                                                    |
| Compute Engine Util (%)     | N/A                                                                |
| Render Engine Util (%)      | N/A                                                                |
| Media Engine Util (%)       | N/A                                                                |
| Decoder Engine Util (%)     | N/A                                                                |
| Encoder Engine Util (%)     | N/A                                                                |
| Copy Engine Util (%)        | N/A                                                                |
| Media EM Engine Util (%)    | N/A                                                                |
| 3D Engine Util (%)          | N/A                                                                |
+-----------------------------+--------------------------------------------------------------------+
| Reset                       | N/A                                                                |
| Programming Errors          | N/A                                                                |
| Driver Errors               | N/A                                                                |
| Cache Errors Correctable    | N/A                                                                |
| Cache Errors Uncorrectable  | N/A                                                                |
| Mem Errors Correctable      | N/A                                                                |
| Mem Errors Uncorrectable    | N/A                                                                |
+-----------------------------+--------------------------------------------------------------------+
| GPU Power (W)               | N/A                                                                |
| GPU Frequency (MHz)         | 1400                                                               |
| Media Engine Freq (MHz)     | N/A                                                                |
| GPU Core Temperature (C)    | N/A                                                                |
| GPU Memory Temperature (C)  | N/A                                                                |
| GPU Memory Read (kB/s)      | N/A                                                                |
| GPU Memory Write (kB/s)     | N/A                                                                |
| GPU Memory Bandwidth (%)    | N/A                                                                |
| GPU Memory Used (MiB)       | N/A                                                                |
| GPU Memory Util (%)         | N/A                                                                |
| Xe Link Throughput (kB/s)   | N/A                                                                |
+-----------------------------+--------------------------------------------------------------------+
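To see at a glance which metrics are missing from output like the table above, it can be parsed into name/value pairs. This is a minimal sketch that assumes the two-column `| name | value |` layout shown here; `parse_stats_table` and `unavailable` are hypothetical helper names, not part of xpu-smi.

```python
from typing import Dict, List


def parse_stats_table(text: str) -> Dict[str, str]:
    """Map each metric name in a two-column '| name | value |' table to its value."""
    metrics: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border rows ("+----+") and the command line itself
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or not cells[0]:
            continue  # skip spacer rows whose name cell is empty
        metrics[cells[0]] = cells[1]
    return metrics


def unavailable(metrics: Dict[str, str]) -> List[str]:
    """Return the names of the metrics reported as N/A, in table order."""
    return [name for name, value in metrics.items() if value == "N/A"]


# Shortened copy of the table above, used only as sample input.
SAMPLE = """
+-----------------------------+------+
| Device ID                   | 0    |
+-----------------------------+------+
| GPU Utilization (%)         | N/A  |
|                             |      |
| GPU Power (W)               | N/A  |
| GPU Frequency (MHz)         | 1400 |
+-----------------------------+------+
"""

if __name__ == "__main__":
    for name in unavailable(parse_stats_table(SAMPLE)):
        print(name)
```

The script would be fed the captured text of `xpu-smi stats -d 0000:00:02.0`; only the table layout matters, not the column widths.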

yupengzh-intel commented 1 year ago

Integrated GPUs are not formally supported by XPUM, so the output is unpredictable. You could try running the command with sudo to see whether more data becomes available.

eero-t commented 11 months ago

@lamalexck In general, most of the metrics in the Level-Zero Sysman API (used by XPUM) are available only on discrete GPUs, not on integrated ones.

(An integrated GPU shares resources such as memory and power with the CPU, so there is no GPU-specific metric for them that could be reported. The kernel may offer some per-context memory usage info in sysfs, but Sysman has no API for that, only for device-level memory usage.)
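As an illustration of reading a value straight from kernel sysfs, bypassing Sysman, here is a minimal sketch for the GPU frequency asked about in this issue. It assumes the i915 kernel driver, which exposes `gt_act_freq_mhz` (actual frequency) and `gt_max_freq_mhz` (upper limit) under the DRM card directory, and it assumes the iGPU is `card0`; other drivers and systems may use a different layout or card index. `read_freq_mhz` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def read_freq_mhz(card_dir: str = "/sys/class/drm/card0",
                  attr: str = "gt_act_freq_mhz") -> Optional[int]:
    """Read one frequency attribute (in MHz) from a DRM card directory.

    Both defaults are assumptions: the i915 attribute name and card index 0.
    Returns None when the file is missing, unreadable, or not an integer.
    """
    path = Path(card_dir) / attr
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("actual:", read_freq_mhz())
    print("max:   ", read_freq_mhz(attr="gt_max_freq_mhz"))
```

If the actual value changes under load while the maximum stays constant, the kernel is tracking the real-time frequency even when a higher-level tool shows only the maximum.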

The best approach is to test directly what Sysman supports, using the tester from the Sysman backend project (TAG_TESTER must first be set to the compute-runtime tag or branch to fetch):

$ TESTER=zello_sysman
$ wget https://raw.githubusercontent.com/intel/compute-runtime/$TAG_TESTER/level_zero/tools/test/black_box_tests/$TESTER.cpp
$ g++ -O2 -Wall -o $TESTER $TESTER.cpp -lze_loader -locloc

(Building requires the Level-Zero headers and frontend to be installed; both are available from the Intel driver repositories and from the latest distro repositories.)

The zello_sysman tester has a separate option for querying each metric type: https://spec.oneapi.io/level-zero/latest/sysman/api.html

If it reports a metric value but XPUM does not (when both use the same Level-Zero backend library), it's an XPUM issue. Otherwise the issue is on the backend driver side, or the kernel/FW/HW is not providing that information.
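The triage rule above can be written down as a small decision function. This is only a sketch of the reasoning: `locate_issue` is a hypothetical name, the two arguments are whatever value each tool printed for the same metric, and the result only holds when both tools use the same Level-Zero backend library.

```python
from typing import Optional

# Values treated as "no metric reported".
MISSING = (None, "", "N/A")


def locate_issue(tester_value: Optional[str], xpum_value: Optional[str]) -> str:
    """Say where a missing metric most likely originates.

    tester_value: what the Sysman tester reported for the metric.
    xpum_value:   what XPUM reported for the same metric.
    """
    tester_has = tester_value not in MISSING
    xpum_has = xpum_value not in MISSING
    if tester_has and not xpum_has:
        # The backend delivers the value, so XPUM is dropping it.
        return "XPUM"
    if not tester_has:
        # Sysman itself has nothing to report for this metric.
        return "backend driver or kernel/FW/HW"
    return "no discrepancy"


if __name__ == "__main__":
    print(locate_issue("1400", "N/A"))
    print(locate_issue("N/A", "N/A"))
```

For the table in this issue, a metric such as GPU Power would be passed as `xpum_value="N/A"`, with `tester_value` taken from the tester's output for the same device.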

lamalexck commented 10 months ago

Thanks for the explanation.